Engineering For Data Science


Practicing data scientists typically spend the bulk of their time developing models for a particular inference or prediction application, and likely give substantially less time to the equally complex problems stemming from system infrastructure. We might loosely think of these two often orthogonal concerns as the modeling problem and the engineering problem. The typical data scientist is trained to solve the former, often in an extremely rigorous manner, but can wind up developing a series of ad hoc solutions to the latter.

This talk will discuss Docker as a tool for the data scientist, in particular in conjunction with the popular interactive programming platform, Jupyter, and the cloud computing platform, Amazon Web Services (AWS). Using Docker, Jupyter, and AWS, the data scientist can take control of their environment configuration, prototype scalable data architectures, and easily clone their work for replicability and communication. This talk will work toward developing a set of best practices for Engineering for Data Science.

