Deep Learning Pipelines on Apache Spark


Deep learning has produced remarkable results in fields such as Computer Vision and Natural Language Processing, but barriers to entry keep many data scientists from using it as an everyday tool. Existing deep learning frameworks require significant programming, and scaling up via distributed computing requires even more work. In this talk, we discuss our new open source library, which is meant to address some of these challenges.

Deep Learning Pipelines is an open source library that integrates popular deep learning libraries with Apache Spark. The library aims to help engineers and data scientists who are familiar with Spark train deep learning models and deploy them within their existing workflows. This package simplifies deep learning in three major ways:

(1) It offers simple APIs built on top of Spark MLlib’s Pipeline APIs, so it integrates well with enterprise machine learning workflows.
(2) It uses Spark under the hood to automatically scale out common deep learning tasks such as inference, featurization and tuning.
(3) It enables data scientists to publish deep learning models as Spark SQL User Defined Functions, which can be used by collaborators with no knowledge of the underlying models.
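
The three ideas above can be illustrated with a small, dependency-free Python sketch. It is not the Deep Learning Pipelines or Spark MLlib API: every name in it (`Featurizer`, `ThresholdClassifier`, `Pipeline`, `mean_pixel`, `register_udf`, `udf_registry`) is hypothetical, and the "pretrained model" is a stand-in function. It only shows the pattern: stages chained in a pipeline, per-row work that an engine such as Spark could distribute, and a model published under a name that collaborators can call.

```python
# Toy illustration of the pipeline pattern described above.
# Assumption: all names here are hypothetical; this is NOT the sparkdl or pyspark API.
from typing import Callable, Dict, List

Row = Dict[str, object]


class Featurizer:
    """Stage that applies a (stand-in) pretrained model to one column."""

    def __init__(self, model: Callable[[List[float]], List[float]],
                 input_col: str, output_col: str) -> None:
        self.model = model
        self.input_col = input_col
        self.output_col = output_col

    def transform(self, rows: List[Row]) -> List[Row]:
        # Each row is processed independently, which is the property that
        # lets a distributed engine scale this step out.
        return [{**r, self.output_col: self.model(r[self.input_col])} for r in rows]


class ThresholdClassifier:
    """Stage that labels a row 1 if its first feature exceeds a threshold, else 0."""

    def __init__(self, threshold: float, input_col: str, output_col: str) -> None:
        self.threshold = threshold
        self.input_col = input_col
        self.output_col = output_col

    def transform(self, rows: List[Row]) -> List[Row]:
        return [{**r, self.output_col: 1 if r[self.input_col][0] > self.threshold else 0}
                for r in rows]


class Pipeline:
    """Runs its stages in order, feeding each stage's output to the next."""

    def __init__(self, stages: list) -> None:
        self.stages = stages

    def transform(self, rows: List[Row]) -> List[Row]:
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


def mean_pixel(pixels: List[float]) -> List[float]:
    # Stand-in for a pretrained network: one feature, the mean pixel value.
    return [sum(pixels) / len(pixels)]


pipeline = Pipeline([
    Featurizer(mean_pixel, input_col="image", output_col="features"),
    ThresholdClassifier(0.5, input_col="features", output_col="prediction"),
])

# Publishing the model under a name, loosely analogous to registering a SQL UDF.
udf_registry: Dict[str, Callable[[List[float]], int]] = {}


def register_udf(name: str, model: Pipeline) -> None:
    udf_registry[name] = lambda pixels: model.transform([{"image": pixels}])[0]["prediction"]


register_udf("is_bright", pipeline)

rows = pipeline.transform([{"image": [0.9, 0.8, 0.7]}, {"image": [0.1, 0.2, 0.0]}])
print([r["prediction"] for r in rows])  # prints [1, 0]
```

A collaborator who knows only the registered name can call `udf_registry["is_bright"]([1.0, 1.0])` without seeing the stages behind it, which mirrors the intent of point (3).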

To illustrate these benefits, we will run a live demo showing how Deep Learning Pipelines can be applied to an example use case: image classification.


Joseph Bradley is an Apache Spark PMC member and Machine Learning Software Engineer at Databricks. Previously, he was a postdoc at UC Berkeley after receiving his Ph.D. in Machine Learning from Carnegie Mellon in 2013.
