Sunday, August 9, 2015

Statistical Distributions and Customer Analytics

This blog is more like a diary, sometimes a reference, and sometimes just a scratch pad for me to get my head wrapped around a few concepts. With that, I am disregarding all my earlier promises about what I would write and will delve right into the topics of interest and the complication at hand.

After listening to a few lectures by Dr. Peter Fader, I had to dig through the articles about customer segmentation and the predictive analysis surrounding it.

Customers who buy in a non-contractual setting are my area of interest for now. There are tons of papers on customer segmentation, but they all point back to one paper that started it all and was ahead of its time. Almost all the later papers mention how difficult it was to turn its theoretical framework into meaningful practice, mainly because of the computational complexity and the computing horsepower required, which was unavailable or scarce at the time. We will get into the why of that later, but that paper was by Dr. David Schmittlein, Donald Morrison, and the late Dr. Richard Colombo.

That research paper was Counting Your Customers: Who Are They and What Will They Do Next? (1987) by Dr. Schmittlein and his co-authors (I will refer to it as C1). I will skip the many papers in between that reference it and focus on a more recent one, namely “Counting Your Customers” the Easy Way: An Alternative to the Pareto/NBD Model, by Dr. Fader and his co-authors (I will refer to it as C2).

We might digress a bit into other papers where we need them to understand a few things in these two. To at least enjoy the story being told, one needs some math background. The implementation is yet another discussion. Still, it is worth getting a feel for how changing the practical story changes the math behind it, and how that in turn affects the implementation; it is complex but interesting.

Each paper points to a plethora of other research papers, and after going through a few I have come to the conclusion that we need to look at the negative binomial distribution and the beta-geometric distribution on the side while we go through these papers. That will be the story of the next few posts, until we reach Bruce Hardie's Excel implementation.
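As a scratch-pad preview of those two distributions, here is a minimal Python sketch of their textbook probability mass functions. It is not taken from C1, C2, or the Excel implementation: the function names and the parameter names (r, alpha, t, a, b) are my own assumptions, chosen to match the usual gamma-Poisson and beta-geometric mixture forms.

```python
import math


def nbd_pmf(x, r, alpha, t=1.0):
    """P(X = x) for a negative binomial count over a period of length t.

    Assumed setup: each customer buys at a Poisson rate lambda, and
    lambda varies across customers as Gamma(shape=r, rate=alpha).
    """
    log_p = (
        math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
        + r * math.log(alpha / (alpha + t))
        + x * math.log(t / (alpha + t))
    )
    return math.exp(log_p)


def log_beta(a, b):
    """Log of the beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_geometric_pmf(k, a, b):
    """P(K = k), k = 1, 2, ..., for a (shifted) beta-geometric.

    Assumed setup: each customer drops out at every opportunity with
    probability p, and p varies across customers as Beta(a, b).
    """
    return math.exp(log_beta(a + 1, b + k - 1) - log_beta(a, b))


if __name__ == "__main__":
    # Illustrative parameter values only, not estimates from any data.
    for x in range(4):
        print("NBD  P(X=%d) = %.4f" % (x, nbd_pmf(x, r=2.0, alpha=3.0)))
    for k in range(1, 5):
        print("BG   P(K=%d) = %.4f" % (k, beta_geometric_pmf(k, a=1.0, b=1.0)))
```

Working on the log scale with math.lgamma keeps the gamma functions from overflowing for large counts, which matters once these terms show up inside a likelihood.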