For those who are interested in something similar to C1 and C2 in an industrial setting, there is another paper by Dr. Schmittlein and Dr. Peterson, "Customer Base Analysis: An Industrial Purchase Process Application". From here on I will refer to it as C3.
Okay, so I want to continue documenting how maths helps figure out when a particular customer will make a purchase. We can't really claim to know exactly when the purchase will occur, but we can claim to know, with a reasonable degree of assurance, the chance of the customer making a purchase given that he has shown certain buying behavior.
That chance is defined in C1 as:
P(Customer Active|Purchasing Information)
If we are persistent enough, we might be able to answer questions like: what is the expected number of transactions in a certain time period, and what is the probability of those transactions occurring? Finally, if we are brave enough to venture ahead, we will unlock the potential of an individual account in terms of its expected transactions, provided we have the purchasing information, defined as the number of transactions in a given time period and the time of the last transaction, which gives us the recency.
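To make the "purchasing information" concrete, here is a minimal sketch of how the two numbers could be pulled out of one customer's purchase dates. The function name `summarize_customer`, the sample dates, and the convention of measuring recency in days from the start of the observation window are my own assumptions for illustration, not definitions taken from C1, C2 or C3.

```python
from datetime import date

def summarize_customer(purchase_dates, period_start, period_end):
    """Return (frequency, recency_days, observation_days) for one customer.

    Assumed conventions (hypothetical, for illustration only):
      frequency        = number of purchases inside the window
      recency_days     = days from window start to the last purchase (0 if none)
      observation_days = length of the window in days
    """
    in_window = sorted(d for d in purchase_dates if period_start <= d <= period_end)
    frequency = len(in_window)
    observation_days = (period_end - period_start).days
    recency_days = (in_window[-1] - period_start).days if in_window else 0
    return frequency, recency_days, observation_days

# Made-up purchase history for a single customer.
purchases = [date(2023, 1, 10), date(2023, 2, 3), date(2023, 5, 20)]
summary = summarize_customer(purchases, date(2023, 1, 1), date(2023, 12, 31))
print(summary)  # prints (3, 139, 364)
```

Three purchases, the last one 139 days into a 364-day window: that pair of frequency and recency is all the purchasing information this kind of analysis asks for per customer.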
Most companies nowadays have past purchases nicely stored in a database. But if we are keeping transactions with Al Capone's bookkeeper in a general ledger, then this type of analysis will probably not work. Still, I am sure almost everybody has enough historical data to play with. The analysis holds as long as we can establish a long-run transaction rate and get individual customer retention/dropout rates (which are basically a function of the time period chosen for analysis), the transactions are independent, and there is the heterogeneity in transaction and dropout rates mentioned in C1 and C2. Here the maths starts getting thicker, because we have to assume that the heterogeneity (the transaction rates are different for different customers, as are the dropout rates) follows certain mathematical distributions: gamma in the case of C1 and geometric in the case of C2.
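To get a feel for what heterogeneity in transaction rates does, here is a rough simulation sketch. It assumes each customer buys at random at his own constant rate, and that those rates are spread across customers according to a gamma distribution. The shape and scale values, the 52-week horizon, and the helper name `purchase_count` are all made up for illustration; none of them come from C1 or C2.

```python
import random

def purchase_count(rate, horizon, rng):
    """Count purchases in [0, horizon] for a customer buying at a constant random rate."""
    count = 0
    t = rng.expovariate(rate)       # waiting time until the first purchase
    while t <= horizon:
        count += 1
        t += rng.expovariate(rate)  # waiting time until the next purchase
    return count

rng = random.Random(7)
shape, scale = 2.0, 0.25            # hypothetical gamma parameters, mean rate = 0.5 per week
horizon = 52.0                      # weeks of observation

# Every customer gets his own transaction rate drawn from the gamma distribution.
rates = [rng.gammavariate(shape, scale) for _ in range(5000)]
counts = [purchase_count(r, horizon, rng) for r in rates]

mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / (len(counts) - 1)
print(round(mean_count, 1), round(var_count, 1))
```

If every customer shared the same rate, the variance of the counts would be about equal to their mean. With gamma-distributed rates the variance comes out many times larger than the mean, which is the signature of a customer base made of a few heavy buyers and many light ones.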
Now I gawk at these things just as much as any sane marketing professional would. Why make life complicated when a logistic regression with some basic assumptions (or something even simpler) will give us a reasonable estimate that works in most cases? But then again, accuracy and precision are an art, and connected themes make better sense than disconnected answers to connected questions (this thought is blatantly stolen from C1). More on this later, but for now we get more into C1, C2 and C3.
The first step in predicting a customer's future purchases is to identify the mathematical process behind the customer's transactions. In practice, customers make purchases just like you and I do: some on impulse, some out of need, and some for god only knows what reason (peer pressure, love of shopping, the economics of behavior); we can go in many directions. In the end, it is a random process, and each purchase (we can assume) is made independently of the next purchase or of what was bought earlier. This behavior is equivalent to a Poisson process.
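As a quick sanity check on the independence assumption, here is a small sketch that simulates one customer's purchases by drawing independent, exponentially distributed waiting times between them, which is one standard way to generate this kind of random process. The rate of 0.5 purchases per week, the 52-week horizon, and the function name `simulate_purchase_times` are assumptions chosen purely for illustration.

```python
import random

def simulate_purchase_times(rate, horizon, rng):
    """Simulate purchase times on [0, horizon] with independent exponential gaps."""
    times = []
    t = rng.expovariate(rate)       # time of the first purchase
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(rate)  # gap to the next purchase, independent of the past
    return times

rng = random.Random(42)
rate = 0.5       # hypothetical: purchases per week
horizon = 52.0   # weeks

# Repeat the same customer-year many times and look at the purchase counts.
counts = [len(simulate_purchase_times(rate, horizon, rng)) for _ in range(5000)]
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / (len(counts) - 1)
print(round(mean_count, 1), round(var_count, 1))  # both should land near rate * horizon = 26
```

The mean and the variance of the counts both land close to rate times horizon. That equality is what we get when every purchase is independent of the others and the rate stays constant.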
Next we will dive into the gamma and geometric distributions and try to figure out what these particular distributions mean for the assumptions behind the solutions presented in C1 and C2.