Imagededup – Finding Duplicate Images Made Easy!


The problem of finding duplicates in an image collection is widespread. Many online businesses rely on image galleries to deliver a good customer experience and, consequently, to generate more revenue, so these galleries need to be of the highest quality. Duplicates in such galleries can degrade the customer experience. In addition, image-based machine learning models can produce misleading results when duplicates are present in the training, evaluation, or test sets.

Finding and removing duplicates is therefore an important requirement across several use cases. In this talk, we present imagededup, a Python package we built to find exact and near-duplicates in an image collection. We will cover the motivation behind building it and its functionality, and give a demo.


Dat is the Head of AI at Axel Springer Ideas Engineering, the innovation unit of Axel Springer SE, which is the largest digital publishing house in Europe. He established and leads Axel Springer AI, where his goal is to make AI more accessible within Axel Springer and thereby drive innovation within the group. His ultimate plan is to turn Axel Springer into an AI-first company.
Dat's interests range from traditional machine learning and deep learning to AI in general and computer vision. Previously, he co-headed a data team where he built up the machine learning team from scratch. His team focused mainly on computer vision problems, from teaching a computer to understand aesthetics to upscaling low-resolution images. He is a regular speaker and has presented at several renowned conferences. He also blogs about his work on Medium. His background is in Operations Research and Econometrics. Dat received his MSc in Economics from the Humboldt University of Berlin.

Open Data Science




