Bridging the Interpretability Gap in Customer Segmentation


Customer segmentation is a vital tool for any business seeking to understand its customers better and develop more effective marketing strategies. Historically, there have been two main approaches to segmentation: rules-based and machine learning-driven. Rules-based segmentation offers simplicity: new customers are easy to categorize, and development and marketing teams can readily see how to appeal to different groups. However, it risks oversimplifying the complexity of a large customer base and merely amplifying received wisdom about customer habits. Machine learning-driven segmentation, on the other hand, is a powerful tool for identifying previously unknown relationships between consumer traits and buying behavior. This approach enables precise identification of customer subgroups, but the resulting segments can be difficult to explain.

In this talk I will present a new hybrid approach that combines the best aspects of both methods. The process begins with careful observation of the customer data and an assessment of whether it contains naturally formed clusters. It continues with the selection of a clustering algorithm and the fine-tuning of a model to create clusters. Additional exploratory data analysis then establishes what differentiates each cluster from the others. Finally, linear approximation is used to create a simple representation of the machine learning clustering algorithm. This representation is highly interpretable in context and can be used for a variety of business purposes. Bridging the gap between accuracy and simplicity, this hybrid method offers both the precise identification of customer groups produced by machine learning clustering methods and the simple business profiles yielded by rules-based segmentation methods. This allows data scientists to fulfill their role of identifying previously undiscovered relationships between data elements while still serving the goals of business stakeholders.
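
The cluster, profile, and approximate steps described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the speaker's implementation: the abstract does not name a clustering algorithm or a linear method, so plain k-means and a perceptron stand in for them here, and the toy data, feature names, and helper functions (`standardize`, `kmeans`, `fit_linear_rule`, `predict`) are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical toy data: (monthly_spend, visits_per_month) for eight customers.
customers = [(20, 1), (25, 2), (30, 1), (22, 2),
             (200, 8), (220, 10), (210, 9), (190, 9)]

def standardize(rows):
    """Z-score each column so no single feature dominates the distances."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    scaled = [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]
    return scaled, mus, sds

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))

def kmeans(points, k, max_iter=100):
    """Plain k-means with a deterministic farthest-first initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(max_iter):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(tuple(mean(col) for col in zip(*members)) if members else centroids[c])
        if updated == centroids:  # assignments have stopped changing
            break
        centroids = updated
    return [nearest(p, centroids) for p in points], centroids

def fit_linear_rule(points, labels, max_epochs=100):
    """Perceptron: find w, b so that (w . x + b > 0) reproduces a two-cluster labeling."""
    w, b = [0.0] * len(points[0]), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, lab in zip(points, labels):
            y = 1 if lab == 1 else -1
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b

def predict(raw_row, w, b, mus, sds):
    """Apply the linear rule to a customer given in original (unscaled) units."""
    z = [(v - m) / s for v, m, s in zip(raw_row, mus, sds)]
    return 1 if sum(wi * zi for wi, zi in zip(w, z)) + b > 0 else 0

# 1. Cluster the standardized data.
scaled, mus, sds = standardize(customers)
labels, centroids = kmeans(scaled, k=2)

# 2. Profile each cluster in original units to see what sets it apart.
for c in sorted(set(labels)):
    members = [r for r, lab in zip(customers, labels) if lab == c]
    print("cluster", c, "mean (spend, visits):", [round(mean(col), 1) for col in zip(*members)])

# 3. Fit a linear stand-in for the clustering and measure how often it agrees.
w, b = fit_linear_rule(scaled, labels)
fidelity = sum(predict(r, w, b, mus, sds) == lab for r, lab in zip(customers, labels)) / len(labels)
print("linear rule weights:", [round(v, 2) for v in w], "bias:", b, "fidelity:", fidelity)
```

The fidelity score, the share of customers for whom the linear rule and the clustering agree, is one way to check how much accuracy is given up for interpretability. On real data with more clusters or non-linear cluster shapes, it would typically fall below 1 and the choice of linear method would matter more.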


Evie Fowler is a data scientist based in Pittsburgh, Pennsylvania. She currently works in the healthcare sector, leading a team of data scientists who develop predictive models centered on the patient care experience. She has a particular interest in the ethical application of predictive analytics and in exploring how qualitative methods can inform data science work. She holds an undergraduate degree from Brown University and a master's degree from Carnegie Mellon.

Open Data Science




