Abstract: Applied Scientist is a job title that is becoming more common across tech companies, including Amazon, Facebook, and Microsoft. This new discipline sits at the intersection of data science and machine learning engineering, and while industry demand for applied scientists is growing, it can be difficult to build relevant experience. This talk explores four themes to focus on when putting together a portfolio for applied science jobs.
The first theme is getting hands-on experience with different cloud platforms and coding environments. I recommend that aspiring applied scientists build this experience even if it means doing so on their own time. Both Amazon Web Services (AWS) and Google Cloud Platform (GCP) offer free tiers that make this possible, and trying out different systems can provide a broader perspective on how to build production-grade data products.
The second theme is using large-scale datasets when putting together a portfolio of projects. Ideally, you should be pushing the limits of single-instance computing and exploring distributed computing approaches to solve problems at scale. This means moving beyond Kaggle datasets that can be loaded into memory and instead exploring options such as BigQuery’s public datasets.
The third theme is getting hands-on experience integrating different components within a data platform. For example, instead of just setting up a web endpoint, explore building a consumer for the service that generates a large volume of requests. Applied science often requires building end-to-end data and model pipelines, and integrating components within a cloud platform is a great way of demonstrating experience in this area.
The final theme is authoring content about your projects. While putting your code and documentation on platforms like GitHub is great for sharing your work, it’s also good to build experience writing long-form content about it. This can take the form of white papers or blog posts, and authoring this type of content demonstrates written communication skills.
Bio: Ben Weber is a distinguished data scientist at Zynga with past experience at Twitch, Electronic Arts, Daybreak Games, and Microsoft Studios. He received his PhD in computer science from UC Santa Cruz.