Data Mining: How to Significantly Improve Acquisition Modelling Performance: The Ability to Create Individual-Level Variables

Predictive analytics practitioners will almost universally agree that data limitations are the most significant challenge in building acquisition models. Traditional solutions have used aggregated postal-area data, either from Statistics Canada or from a range of other data service providers, which also supply cluster-type variables. Using external sources such as aggregate postal-area demographics alone, we do indeed obtain solutions that can rank-order prospects by response rate. As a result, external data sources have always been the critical element in building these types of tools. Given the success of this data in better targeting prospects to become customers, many of the data service providers in this area also offer predictive analytics solutions as a complementary service to the source data itself. Companies that want more targeted acquisition results look to these providers as partners in delivering them.

All practitioners will attest to the significant improvement in predictive analytics results when individual-level information is used. One just has to look at customer models such as cross-sell, upsell, and attrition models to see the significant improvement in model performance. Because of the greater success here, predictive analytics efforts have in many cases focused more on existing customers, with negligible effort on acquisition targeting. But what if there were opportunities within the acquisition data itself that allowed results to improve significantly?

The key to creating these opportunities is to derive that all-important individual-level information from name and address, and thereby create new individual-level variables. Of course, the question is: how does one do this? Working with one of our business partners, we were able to create algorithms based on name and address that produced gender and age type variables. Furthermore, information was created on the source of the list that yielded the prospect name. We also overlaid the traditional external postal-area demographic information onto our prospect list and then began our model building process. In our final model, nine variables were identified as the key predictors of acquisition response. Interestingly enough, the external postal-area variables were the two weakest variables in the model. The model yielded a lift in acquisition response of 5 to 1 between the top 5% of scored names and the bottom 5% of scored names. Using just postal-area demographic variables yielded a lift of only 2 to 1 between the top 5% and the bottom 5%. The use of individual-level information in this case clearly provided that additional response lift advantage.
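For readers who want to reproduce the "top 5% versus bottom 5%" comparison on their own campaigns, here is a minimal Python sketch of how such a lift ratio might be computed. It is not the code used in the work described above. The function name top_bottom_lift, the 0/1 response coding, and the small synthetic data set are all assumptions made purely for illustration.

```python
def top_bottom_lift(scores, responses, fraction=0.05):
    """Response rate of the top `fraction` of scored names divided by
    the response rate of the bottom `fraction`.

    scores    -- model scores, higher meaning more likely to respond
    responses -- 1 if the name responded, 0 otherwise (assumed coding)
    """
    if len(scores) != len(responses):
        raise ValueError("scores and responses must have the same length")
    n = len(scores)
    if n == 0:
        raise ValueError("no names to score")

    # Number of names in each tail (at least one name).
    k = max(1, int(n * fraction))

    # Rank names from highest score to lowest.
    ranked = sorted(zip(scores, responses), key=lambda pair: pair[0], reverse=True)

    top_rate = sum(r for _, r in ranked[:k]) / k
    bottom_rate = sum(r for _, r in ranked[-k:]) / k

    if bottom_rate == 0:
        # No responders in the bottom tail: the ratio is unbounded.
        return float("inf")
    return top_rate / bottom_rate


# Synthetic illustration only (not data from the project described above):
# 100 names, all five top-scored names respond, one of the five
# bottom-scored names responds.
example_scores = list(range(100))
example_responses = [0] * 100
for i in range(95, 100):
    example_responses[i] = 1
example_responses[0] = 1

print(top_bottom_lift(example_scores, example_responses))  # 5.0
```

On real campaign data, the tails would normally contain thousands of names, and the same calculation can be run once on a model built with postal-area variables only and once on a model that adds the individual-level variables, so the two lift ratios can be compared directly.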

Of course, more work with more practical examples needs to occur before overarching conclusions can be drawn about this approach. Nevertheless, this initial work in this isolated case does suggest that the approach should be explored in other acquisition programs. Our belief has always been that significant improvements to predictive analytics solutions come less from the mathematical or advanced statistical approach and more from the actual data itself. Our work in this area with this client certainly reinforces this belief, and it is our hope that more organizations at least explore converting name and address information to individual-level information as a viable option for improving their targeted acquisition efforts.


Saturday, November 3, 2012
