
Abstract: Managers and users of enterprise data science, analytics, and machine learning platforms struggle to balance the competing priorities of openness and control. Open platforms let users experiment flexibly with new tools and methods that can help a business innovate and stay competitive, but such platforms can be risky and difficult to control and maintain. Closed platforms offer greater predictability and supportability and mitigate many risks, but they can lead to overreliance on vendors and can keep data scientists from using tools and methods that might turn out to be transformational.
The speaker offers guidance on how to reduce this tension by making sound decisions in the architecture of a data science platform and establishing the right patterns for its use. The speaker discusses:
* How to be smart—not afraid—when data scientists want to try new tools and packages
* How to architect an open source data science platform to enable experimentation while minimizing risk
* How to think more holistically about support
* How to avoid pitfalls by practicing reproducibility and expanding the use of version control
Bio: Jordan Volz is a Systems Engineer at Cloudera. He helps clients design and implement big data solutions using Cloudera’s Distribution of Hadoop, across a variety of industry verticals.
Previously, he worked as a consultant for HP Autonomy, delivering compliance archiving, e-Discovery, and electronic surveillance solutions to regulated financial services companies, and as a developer at Epic Systems, building HIPAA-compliant EMR software.