Clustering YouTube: A Top Down and Bottom Up Approach


At ZEFR, we know that when an advertisement on YouTube is relevant to the content a user is watching, the experience is better for both the user and the advertiser. To facilitate this, we discover billions of videos on YouTube and cluster them into concepts that advertisers and brands can buy to align with their particular creatives.

To serve our clients, we use two clustering strategies: a top-down supervised learning approach and a bottom-up unsupervised learning approach. The top-down approach relies on human-annotated data and a fast, robust machine learning model deployment system that addresses problems with model drift.

Our clients are also interested in discovering topics on YouTube. To serve this need, we use unsupervised clustering of videos to surface relevant clusters. This type of clustering allows ZEFR to highlight what users are currently interested in. We show how Latent Dirichlet Allocation can help solve this problem, and along the way we share some of the tricks that produce an accurate unsupervised learning system.

This talk will touch on some common machine learning engines, including Keras, TensorFlow, and Vowpal Wabbit. We will also introduce Aloha, our open source Scala DSL for model representation. We show how Aloha solves a key problem in a typical data scientist's workflow: ensuring that feature functions make it from the data scientist's machine to production with zero changes.


Jon Morra is the Vice President of Data Science at ZEFR. In this role, he leads a team of data scientists responsible for creating data-driven models. Jon and his team focus on applying ZEFR's wealth of information about video on the internet to better serve customers' needs and meet market demands. Previously, Jon was the Director of Data Science at eHarmony, where he helped grow the data science team to support multiple business facets.

Jon holds a B.S. from Johns Hopkins and a Ph.D. from UCLA, both in Biomedical Engineering.
