Abstract: Fraud detection in credit card transactions is a broad and complex field. Over the years, a number of techniques have been proposed, mostly stemming from the anomaly detection branch of data science. That said, most techniques fall into one of two situations, depending on the available dataset:
Situation 1: The dataset has a sufficient number of fraud examples.
Situation 2: The dataset has no (or just a negligible number of) fraud examples.
The first situation is the more standard one. Here we can treat fraud detection as a classic supervised classification problem: the transactions are labeled as fraudulent or legitimate, and any supervised machine learning algorithm for classification will do, e.g. Random Forest or Logistic Regression.
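As a rough illustration of Situation 1, here is a minimal sketch of logistic regression trained by batch gradient descent. It is plain Python rather than a KNIME workflow, and everything in it is an assumption made for illustration: the synthetic two-feature data, the function names, the learning rate, and the number of epochs.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one transaction
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    # Estimated probability that transaction x is fraudulent
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Synthetic labeled data (hypothetical): 200 legitimate and 40 fraudulent
# transactions, each described by two standardized features.
rng = random.Random(0)
legit = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
fraud = [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(40)]
X = legit + fraud
y = [0] * len(legit) + [1] * len(fraud)

w, b = train_logistic_regression(X, y)
p_fraud_like = predict_proba(w, b, [3.0, 3.0])
p_legit_like = predict_proba(w, b, [0.0, 0.0])
```

Real fraud data is usually far more imbalanced than this toy set, so in practice the decision threshold on the predicted probability would need to be chosen with care.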
The second situation is a bit trickier. Here, we have no examples of fraudulent transactions, so we need to be more creative. We can use techniques from outlier detection or anomaly detection, e.g. autoencoders and isolation forests.
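To give a feel for Situation 2, the following is a simplified sketch of the isolation forest idea: points that random splits isolate after only a few steps receive a higher anomaly score. It is plain Python rather than a KNIME workflow, and the synthetic data, the two features (amount and hour), the function names, and the parameter values are all assumptions made for illustration.

```python
import math
import random

def c_factor(n):
    # Average path length of an unsuccessful binary search tree lookup
    # over n points; used to normalize path lengths.
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    # Stop when the depth limit is reached or the point is isolated
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)  # random split value on a random feature
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {"feature": feature, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}

def path_length(tree, x, depth=0):
    if "size" in tree:
        # Leaf: add the expected depth of the unbuilt subtree
        return depth + c_factor(tree["size"])
    child = tree["left"] if x[tree["feature"]] < tree["split"] else tree["right"]
    return path_length(child, x, depth + 1)

def anomaly_score(forest, x, sample_size):
    # Score in (0, 1]; values closer to 1 indicate easier isolation
    mean_path = sum(path_length(t, x) for t in forest) / len(forest)
    return 2.0 ** (-mean_path / c_factor(sample_size))

# Unlabeled synthetic transactions (hypothetical): [amount, hour of day]
rng = random.Random(42)
transactions = [[rng.gauss(50, 10), rng.gauss(12, 3)] for _ in range(500)]

sample_size = 128
max_depth = math.ceil(math.log2(sample_size))
forest = [build_tree(rng.sample(transactions, sample_size), 0, max_depth, rng)
          for _ in range(100)]

typical = anomaly_score(forest, [50.0, 12.0], sample_size)
unusual = anomaly_score(forest, [500.0, 3.0], sample_size)
```

No fraud labels are used anywhere in this sketch. The forest only learns what ordinary transactions look like, and the cutoff above which a score counts as suspicious remains a separate choice.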
In this hands-on tutorial you will learn how to handle both situations using logistic regression, an isolation forest, or an autoencoder.
The tool of choice for this tutorial is the open source KNIME Analytics Platform. After a short introduction to the tool, we will split into two groups, each focusing on one of the two situations.
Please bring your own laptop with KNIME Analytics Platform pre-installed. To install KNIME Analytics Platform, follow the instructions provided in these YouTube videos:
If you would like to get familiar with KNIME Analytics Platform, you can explore the content of our E-learning course (https://www.knime.com/knime-introductory-course).
Bio: Kathrin Melcher is a Data Scientist at KNIME. She holds a Master's Degree in Mathematics from the University of Konstanz, Germany. She joined the evangelism team at KNIME in 2017 and has a strong interest in data science and machine learning algorithms. Kathrin enjoys teaching and sharing her data science knowledge with the community, for example in the book "From Excel to KNIME", as well as in various blog posts and at training courses, workshops, and conference presentations.
Data Scientist | KNIME