Abstract: ING bank offers banking services to 12 million commercial customers in over 40 countries. One of its most important services, payment transactions, connects a multitude of companies into a massive payment transaction network.
We, the ING Wholesale Banking Advanced Analytics team, built a peer detection model: given a company, find the most similar companies (peers) among 12 million candidates. The model has been used by relationship managers and risk managers across several departments in scenarios such as cross-selling and prospect recommendation.
In this talk, we will first explain the business case and how we abstracted it into a network-based recommendation problem. Then, we will discuss several ways of extracting node representations from a graph, ranging from a graph neighborhood perspective to a network embedding perspective. Next, we will demonstrate how we computed node similarity efficiently: we developed a Python package to accelerate top-n cosine similarity computation for sparse matrices, and we used Spark locality-sensitive hashing for dense matrices. Finally, we will describe an active learning framework we built to improve peer detection through user feedback, and how we used Airflow to put the whole model pipeline into production.
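To make the top-n cosine similarity step concrete, here is a minimal sketch in plain Python. It is not the package mentioned above. The function names, the dict-based sparse vectors, and the toy data are all assumptions made for illustration. The idea it shows is that, with sparse data, only pairs sharing at least one non-zero feature need to be scored, and only the n best scores per item need to be kept.

```python
import heapq
import math
from collections import defaultdict


def normalize(vec):
    """Scale a sparse vector (dict: feature -> weight) to unit L2 norm."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {f: w / norm for f, w in vec.items()} if norm else {}


def top_n_cosine(vectors, n):
    """For each item, return its n most similar other items by cosine similarity."""
    unit = {name: normalize(vec) for name, vec in vectors.items()}

    # Inverted index: feature -> list of (item, weight). Items that share no
    # feature have cosine similarity 0 and are never visited.
    index = defaultdict(list)
    for name, vec in unit.items():
        for feature, weight in vec.items():
            index[feature].append((name, weight))

    result = {}
    for name, vec in unit.items():
        scores = defaultdict(float)
        for feature, weight in vec.items():
            for other, other_weight in index[feature]:
                if other != name:
                    # Dot product of unit vectors equals cosine similarity.
                    scores[other] += weight * other_weight
        # Keep only the n highest-scoring candidates.
        result[name] = heapq.nlargest(n, scores.items(), key=lambda kv: kv[1])
    return result


# Hypothetical toy data: each company is described by the counterparties it pays.
companies = {
    "A": {"x": 1.0, "y": 1.0},
    "B": {"x": 1.0, "y": 1.0, "z": 1.0},
    "C": {"z": 1.0},
    "D": {"w": 1.0},
}
peers = top_n_cosine(companies, n=2)
```

In this toy example, company D shares no counterparty with any other company, so its peer list is empty. At the scale of millions of companies, a compiled sparse-matrix implementation or an approximate method such as locality-sensitive hashing would be needed in place of this pure-Python loop.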
Bio: Zhe Sun is currently a senior data scientist in the ING Wholesale Banking Advanced Analytics team, where he has applied machine learning techniques to problems ranging from entity matching to large-scale payment transaction network analysis. Together with the team, he aims to change the way the bank operates through data-driven analytics and machine learning. He has 9 years of industry experience in data science and software engineering at a range of international companies in the banking and telecommunications sectors.