ODSC West 2022

Preliminary Schedule

more sessions added weekly


West 2022 Schedule

We are delighted to announce our West 2022 Preliminary Schedule!
Please note: In-person attendees will have access to virtual sessions. If you have a virtual pass, please note that we will not live-stream any in-person sessions. Only virtual sessions will be recorded.

 

ODSC West Talks
Tuesday, 1st November
Wednesday, 2nd November
Thursday, 3rd November
ODSC West Training & Workshops/Tutorials
Monday, 31st October
Tuesday, 1st November
Wednesday, 2nd November
Thursday, 3rd November
AI in a Minefield: Learning from Poisoned Data

Virtual | Talk | Machine Learning | Beginner – Intermediate

 

 

In this talk, we will present the challenges of learning from dirty data, give an overview of data poisoning attacks on systems such as spam detection, image classification, and rating systems, discuss the problem of learning from web traffic (probably the dirtiest data in the world), and explain different approaches to learning from dirty and poisoned data. We will focus on threshold-learning mitigation for data poisoning, which aims to reduce the impact of any single data source, and discuss a mundane but crucial aspect of threshold learning: memory complexity. We will present a robust learning scheme optimized to work efficiently on streamed data with bounded memory consumption. We will give examples from the web security arena with robust learning of URLs, parameters, character sets, cookies and more…

Johnathan Roy Azaria
Data Scientist Tech Lead | Imperva
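
As a rough illustration of the threshold-learning idea described in the abstract above, the following Python sketch accepts a value (for example, a URL) only after a minimum number of distinct sources have reported it, and caps the number of candidate values it tracks so that candidate memory stays bounded. The class name, the default threshold, the cap, and the eviction rule are all assumptions made for this illustration; they are not the scheme presented in the talk.

```python
from collections import OrderedDict


class ThresholdLearner:
    """Learn a value only after `min_sources` distinct sources report it.

    Hypothetical sketch: names, defaults, and the eviction policy are
    illustrative assumptions, not the speaker's actual scheme.
    """

    def __init__(self, min_sources=3, max_candidates=1000):
        self.min_sources = min_sources
        self.max_candidates = max_candidates
        self.learned = set()  # grows only with accepted values
        # candidate value -> set of sources seen so far (fewer than min_sources)
        self.candidates = OrderedDict()

    def observe(self, source, value):
        """Process one streamed (source, value) pair; return True if learned."""
        if value in self.learned:
            return True
        sources = self.candidates.setdefault(value, set())
        self.candidates.move_to_end(value)  # mark as most recently seen
        sources.add(source)
        if len(sources) >= self.min_sources:
            # Enough independent sources agree: promote and free the candidate.
            self.learned.add(value)
            del self.candidates[value]
            return True
        if len(self.candidates) > self.max_candidates:
            # Bound candidate memory by evicting the least recently seen value.
            self.candidates.popitem(last=False)
        return False
```

With this design, a single source that repeats a poisoned value any number of times still counts as one source, so it cannot push that value over the threshold on its own.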
Reasoning About the Probabilistic Behavior of Classifiers

Virtual | Talk | Machine Learning | Research Frontiers | Intermediate-Advanced

 

This talk will give an overview of our recent work on reasoning about the behavior of learned classifiers. I will cover techniques to inject domain knowledge into machine learning models, for example by enforcing monotonicity constraints or logical structure on neural network outputs…

Guy Van den Broeck, PhD
Director | Associate Professor | StarAI (Statistical and Relational Artificial Intelligence Lab) | UCLA
Anomaly Detection with Python and R

In-person | Half-Day Training | Beginner-Intermediate

 

 

This course will examine anomaly detection through the example of fraud, but all these techniques can be applied to other areas as well. We will start with the importance of feature creation and transformation. We will then cover statistics-based approaches to anomaly detection. Finally, we will cover machine-learning-based approaches, so that the learner can approach anomalies from any angle and for any industry need…

Aric LaBarr, PhD
Associate Professor of Analytics | Institute for Advanced Analytics at NC State University
Operationalizing Organizational Knowledge with Data-Centric AI

Virtual | Talk | All Levels

 

 

Data-centric AI broadly describes the idea that *data*, rather than models, is increasingly the crux of success or failure in AI for many settings and use cases. More specifically, data-centric AI defines ML development workflows that center on iterating on the *training data* (e.g. labeling, sampling, slicing, augmenting) rather than on the model architecture. In this talk, I’ll describe how programmatic or weak supervision can facilitate these data-centric workflows in ways that manual labeling cannot and, more importantly, give an overview of how it can serve as an API for rich organizational knowledge sources, presenting recent technical results and user case studies…

Alex Ratner, PhD
Assistant Professor | Co-founder & CEO | UW | Snorkel AI
Building Production-Ready Recommender Systems with Feast

In-person | Talk

 

In this talk, we explore:

  • Challenges of building recommender systems
  • Strategies for reducing latency, while balancing requirements for freshness
  • Challenges in mitigating data quality issues
  • Technical and organizational challenges feature stores solve
  • How to integrate Feast, an open-source feature store, into an existing recommender system to support production systems…
Danny Chiao
Engineering Lead | Tecton
Denoising Diffusion-based Generative Modeling

Virtual | Talk | All Levels

 

 

Coming Soon!

Stefano Ermon
Assistant Professor | Stanford University
Detecting Changes Over Time with Bayesian Change Point Analysis in R

In-person | Talk | Beginner-Intermediate

 

 

Did my data change after a certain intervention? This is a common question with data observed over time. Classical statistical and engineering approaches include control charts to see if the series falls outside of the normal boundaries of expected data. A Bayesian approach to this problem calculates the probability that the data series changes at every point along the series. Bayesian change point analysis allows the analyst to evaluate a whole series and see where the highest probability of change occurred. Has the financial asset lost value after the recent financial report? Are the healthcare outcomes at this hospital better after our new process to help patients? Did the manufacturing process improve after upgrading the machinery? All these questions and more can be answered with these techniques, which will be shown in R…

Aric LaBarr, PhD
Associate Professor of Analytics | Institute for Advanced Analytics at NC State University
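
To make the idea of "the probability that the series changes at every point" concrete, here is a minimal Python sketch (the talk itself uses R). It is a simplified model, not necessarily the method shown in the session: it assumes exactly one change in the mean, Gaussian noise with a known standard deviation `sigma`, a Normal prior on each segment mean, and a uniform prior over the change location. The function names and default values are hypothetical.

```python
import math


def segment_log_evidence(xs, sigma, prior_mean, prior_sd):
    """Log marginal likelihood of xs ~ N(mu, sigma^2), mu ~ N(prior_mean, prior_sd^2)."""
    n = len(xs)
    s2, t2 = sigma ** 2, prior_sd ** 2
    d = [x - prior_mean for x in xs]
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)
    # Quadratic form and log-determinant of the covariance s2*I + t2*ones*ones^T.
    quad = (sum_d2 - t2 * sum_d ** 2 / (s2 + n * t2)) / s2
    log_det = n * math.log(s2) + math.log(1 + n * t2 / s2)
    return -0.5 * (n * math.log(2 * math.pi) + log_det + quad)


def change_point_posterior(xs, sigma=1.0, prior_mean=0.0, prior_sd=10.0):
    """Posterior probability that the mean changes after the k-th point, k = 1..n-1.

    Entry i of the result corresponds to k = i + 1 (assumed single change).
    """
    n = len(xs)
    log_post = [
        segment_log_evidence(xs[:k], sigma, prior_mean, prior_sd)
        + segment_log_evidence(xs[k:], sigma, prior_mean, prior_sd)
        for k in range(1, n)
    ]
    top = max(log_post)  # subtract the max before exponentiating for stability
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Made-up series whose mean shifts after the fifth observation.
series = [0.1, -0.2, 0.0, 0.3, -0.1, 5.1, 4.8, 5.2, 4.9, 5.0]
posterior = change_point_posterior(series)
most_likely_k = posterior.index(max(posterior)) + 1
```

Plotting `posterior` against `k` gives the kind of probability-of-change curve the abstract describes; handling several change points requires a richer model than this sketch.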
Transforming The Retail Industry with Transformers

In-person | Talk | Deep Learning | Machine Learning | ML Ops and Data Engineering | All Levels

 

 

In this talk, we discuss these challenges and share our findings and recommendations from working on real-world examples at SPINS, a data/tech company focused on the natural grocery industry. More specifically, we describe how we leverage state-of-the-art language models to automate parts of SPINS’ data ingestion workflow and drive substantial business outcomes. We provide a walk-through of our end-to-end MLOps system and discuss how using the right tools and methods has helped to mitigate some of these challenges. We also share findings from our experiments and provide insights on when to use these massive transformer models instead of classical ML models. Because our use cases pose a variety of challenges, from an ill-defined label space to a huge number of classes (~86,000) and massive data imbalance, we believe our findings and recommendations can be applied to most real-world settings. We hope that the learnings from this talk can help you solve your own problems more effectively and efficiently!…

Azin Asgarian
Applied Research Scientist | Georgian
Elliot Henry
Data Science Manager | SPINS
Human Factors of Explainable AI

In-person | Talk | Responsible AI | Beginner

 

 

There are many types of users and stakeholders that require Explainable AI. Without explanations, end-users are less likely to trust and adopt ML-based technologies. Without a means of understanding model decision-making, business stakeholders have a difficult time assessing the value and risks associated with launching a new ML-based product. And without insights into why an ML application is behaving in a certain way, application developers have a harder time troubleshooting issues, and ML scientists have a more difficult time assessing their models for fairness and bias. To further complicate an already challenging problem, the audiences for ML model explanations come from varied backgrounds, have different levels of experience with statistics and mathematical reasoning, and are subject to cognitive biases…

Meg Kurdziolek, PhD
Sr. UX Researcher | Google
Scalable, Real-Time Heart Rate Variability Biofeedback for Precision Health: A Novel Algorithmic Approach

In-person | Talk | Machine Learning | Data Science Research Frontiers | All Levels

 

 

Heart rate variability biofeedback (HRV-B) is a clinically effective therapy in which patients can improve their mental and physical well-being through real-time monitoring of heart rate and specialized breathing techniques. HRV-B can improve health outcomes in a number of medical or wellness-related conditions, ranging from depression and anxiety to cardiovascular disease, asthma, cancer fatigue, women’s health, better sleep, peak athletic performance, and stress resilience…

Kirstin Aschbacher, PhD
Head of Health Data Science | Meru Health
Extensible Hosted Jupyter Notebook Platform for Accelerating Data Insights

In-person | Talk | ML Ops and Data Engineering | Data Science Research Frontiers | Data Analytics | Beginner – Intermediate

 

 

We at LinkedIn leverage Jupyter notebooks extensively for ad hoc data analysis, and our data scientists, engineers, and developers spend a lot of time iterating over the query development lifecycle. We have created a hosted notebook platform for our internal users called Darwin (Data Analytics & Relevance Workbench at LinkedIn), a one-stop solution for the complete Jupyter notebook/query life cycle (query development, query testing, and query productionizing)…

Swasti Kakker
Sr. Software Engineer | LinkedIn
Manu Ram Pandit
Sr. Software Engineer | LinkedIn
Building Modern Search Pipelines with Haystack, Large Language Models and Hybrid Retrieval

In-person | Talk

 

In this talk, we navigate through the latest buzz around semantic search and separate the noise from the meaningful advancements. Is dense retrieval better than BM25’s keyword search? Do large language models outperform smaller transformers? How well do the models generalize to industry corpora? How can we leverage Question Answering? We will benchmark different methods, share best practices from industry use cases and show how you can use the open source framework Haystack to build, test and deploy stellar search pipelines easily yourself…

Malte Pietsch
CTO & Co-Founder | deepset
Navigating the Pitfalls of Applying Machine Learning in Practice

Virtual | Talk | Machine Learning | All Levels

 

 

Coming Soon!

Jacob Schreiber
PhD Candidate | University of Washington
MLOps for Deep Learning

In-person | Talk | Deep Learning | ML Ops and Data Engineering | Intermediate – Advanced

 

Only a minority of companies unlock the true potential of AI, as trained models gather dust due to challenges in MLOps. Serving reliable AI predictions to customers involves cost, effort, and planning to set up a continuous deployment pipeline, and MLOps for deep learning demands a carefully crafted one. We discuss our open-source project, a robust continuous deployment pipeline that integrates our unique drift detection and model retraining algorithms for serving DL models. We show how to efficiently deploy, monitor, and maintain DL models in production using our solution, a Kubernetes-native proof of concept…

Diego Klabjan, PhD
Professor | Northwestern University
Yegna Jambunath
ML Ops Research Specialist | Northwestern University
Responsible AI Is Not an Option

In-person | Talk | Machine Learning | Data Analytics | Responsible AI | Intermediate

 

In his talk “Responsible AI Is Not an Option,” Dr. Scott Zoldi draws on his decades of experience delivering analytic innovation in a highly regulated environment to underscore the urgency with which the topics of AI fairness and bias must be brought onto Boards of Directors’ agendas. In his dynamic presentation, Dr. Zoldi spells out the “why” and “how” of fulfilling the social covenant and, soon, the regulatory requirements for enterprises to use AI ethically, transparently, securely, and in their customers’ best interests…

Scott Zoldi, PhD
Chief Analytics Officer | FICO
Optimizing Recommendations for Competing Business Objectives

In-person | Talk | Deep Learning | Machine Learning | Intermediate

 

In this talk, I will give an overview of this problem, what makes it difficult, how we have been addressing it at Wayfair, and the lessons that we have learned so far. I will start by giving an overview of the different types of competing business objectives that can arise in the e-commerce use case and how they compete against each other in practice. Along the way I will introduce some of the fundamental concepts of multi-objective optimization, including Pareto Efficiency and the Pareto Frontier, and how they relate to this problem. Finally, I will discuss the pros and cons of various strategies for making recommendation systems profit aware…

Ali Vanderveld, PhD
Senior Staff Data Scientist | Wayfair
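
The Pareto Frontier mentioned in the abstract above can be illustrated with a small Python sketch. An item is Pareto efficient when no other item is at least as good on every objective and strictly better on at least one. The item names, the two objectives (relevance and profit margin), and all the numbers below are made up for the illustration and do not come from the talk.

```python
def pareto_frontier(items):
    """Return the items that no other item dominates.

    Each item is a (name, relevance, margin) tuple; higher is better for
    both objectives. The objectives are hypothetical examples.
    """
    def dominates(a, b):
        at_least_as_good = a[1] >= b[1] and a[2] >= b[2]
        strictly_better = a[1] > b[1] or a[2] > b[2]
        return at_least_as_good and strictly_better

    return [x for x in items if not any(dominates(y, x) for y in items)]


# Made-up candidate recommendations: (name, relevance, margin).
candidates = [
    ("sofa", 0.90, 0.10),
    ("lamp", 0.70, 0.30),
    ("rug", 0.60, 0.25),
    ("desk", 0.40, 0.50),
    ("vase", 0.40, 0.20),
]

frontier = [name for name, _, _ in pareto_frontier(candidates)]
print(frontier)  # ['sofa', 'lamp', 'desk']
```

Here "rug" and "vase" drop out because "lamp" beats them on both objectives, while the three remaining items each represent a different trade-off between relevance and margin.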
Introduction to Differential Privacy Concepts

In-person | Talk

 

Differential privacy can be viewed as a technical solution for protecting individual privacy to meet legal or policy requirements for disclosure limitation while analyzing and sharing personal data. Using examples and some mathematical formalism, this talk will introduce differential privacy concepts including the definition of differential privacy, how differentially private analyses are constructed, and how these can be used in practice…

Veena B. Mendiratta, PhD
Adjunct Faculty | Network Reliability and Analytics Researcher
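
One standard way a differentially private analysis is constructed is the Laplace mechanism: add noise drawn from a Laplace distribution whose scale is the query's sensitivity divided by the privacy parameter epsilon. The Python sketch below applies it to a counting query, which has sensitivity 1 because adding or removing one person changes the count by at most 1. The function names and the example data are assumptions for illustration and are not taken from the talk.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=random):
    """Release a count with epsilon-differential privacy (Laplace mechanism).

    A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Made-up data: how many people are 40 or older?
ages = [23, 35, 41, 52, 67, 29, 38]
noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5)
```

A smaller epsilon means a larger noise scale and stronger privacy, at the cost of a less accurate released count.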
Causal/Prescriptive Analytics in Business Decisions

In-person | Talk

 

In this talk, I will provide a holistic review of the research methods and tools for causal analytics in business decisions. I focus especially on causal inference in data science. I will discuss a decision tree that helps data scientists to identify the best causal research method based on the problem, context, and the nature of the data. I will draw on my proprietary research on prescriptive analytics (https://docs.google.com/document/d/1b8yaDzriVB2JyIBNQMsUn-uz4bXnsdFe6hTLLTOs1q4/edit)…

Victor Zitian Chen, PhD, CFA
Director of Experimental Design and Causal Inference | Fidelity Investments
Make Your Data Science Environment Just Right With Saturn Cloud

In-person | Demo Talk

 

In this demo, we’ll cover the best practices for setting up data science environments, how to store them so you can replicate your code, and how to share a setup with your coworkers…

Dr. Jacqueline Nolis
Head of Data Science | Saturn Cloud
An Introduction to Data Literacy with SQL

Virtual | Bootcamp | Data Analytics

 

Data literacy is an essential foundational topic for anyone considering data science or machine learning. This session will help you understand core data workflow concepts, including what data is, data generation and collection, data profiling, data transformation, data shaping, and other essential workflow topics. As this is an interactive training session, in addition to covering these data topics, we will layer on SQL and an introduction to relational databases…

Sheamus McGovern
Founder and CEO | ODSC
Beyond the Basics: Data Visualization in Python

In-person | Half-Day Training | Machine Learning | Data Visualization | Intermediate-Advanced

 

While there are many plotting libraries to choose from, the prolific Matplotlib library is always a great place to start. Since various Python data science libraries utilize Matplotlib under the hood, familiarity with Matplotlib itself gives you the flexibility to fine-tune the resulting visualizations (e.g., add annotations, animate, etc.). This session will also introduce interactive visualizations using HoloViz, which provides a higher-level plotting API capable of using Matplotlib and Bokeh (a Python library for generating interactive, JavaScript-powered visualizations) under the hood…

Stefanie Molin
Data Scientist, Software Engineer, Author of Hands-On Data Analysis with Pandas | Bloomberg
Machine Learning with Python: A Hands-On Introduction

In-person | Half-Day Training | Machine Learning | All Levels

 

 

By completing this hands-on workshop, you will develop an understanding of machine learning concepts and methodologies and learn how to fit, tune, and evaluate the predictive performance of a variety of parametric and non-parametric models for classification and regression. You will become familiar with how to preprocess data, build, tune, and cross-validate predictive models, and make predictions with the models in Python…

 

 

Clinton Brownley, PhD
Data Scientist | Meta
Large Scale Deep Learning using the High-Performance Computing Library OpenMPI and DeepSpeed

Virtual | Workshop | Deep Learning | MLOps & Data Engineering | Intermediate

 

 

OpenMPI is used for high-performance computing at supercomputing centers as part of distributed computing systems. In the first half of this workshop, we will get to know more about OpenMPI and work with it hands-on using a Python interface. The second half of the workshop will focus on using DeepSpeed on OpenMPI and work through a few examples. Examples for MPI basics will include inferring pi using distributed computing and running Python scripts using OpenMPI…

Jennifer Dawn Davis, PhD
Staff Field Data Scientist | Domino Data Lab
Any Way You Want It: Integrating Complex Business Requirements into ML Forecasting Systems

Virtual | Workshop

 

 

Regardless of their concrete application, the primary goal of forecasting systems is to produce the most accurate forecast possible. However, while beating benchmarks is important, a forecast usable in business processes needs to fulfill many more criteria, which significantly increases the complexity of real-world solutions…

David Koll
Senior Data Scientist | Continental AG
Self-Supervised and Unsupervised Learning for Conversational AI and NLP

Virtual | Workshop | NLP | Machine Learning | Beginner – Intermediate

 

In this talk, I will give some background on conversational AI and NLP, along with self-supervised and unsupervised techniques. Transformer-based large language models (LLMs) such as GPT-3, Jurassic, and T5 have been foundational to the advances that we see. I will walk the audience through hands-on examples and show how they can leverage transformers and large language models for few-shot and zero-shot learning in a variety of NLP applications such as text classification, summarization and question-answering…

Chandra Khatri
Chief Scientist and Head of AI | Got It AI
Fighting Churn With Data

In-person | Workshop | Machine Learning | Beginner-Intermediate

 

This workshop will teach hands-on coding techniques covering all the foundations of a complete churn-fighting data pipeline, including churn measurement, feature engineering from raw data, data set creation, and machine learning. The workshop is taught using Python and SQL from the open source fightchurn package (PyPI: pypi.org/project/fightchurn/, GitHub: github.com/carl24k/fight-churn). Participants should bring their own laptop with Python and an IDE such as PyCharm or VSCode installed, so that they can set breakpoints in the code that will be demonstrated and discussed…

Carl Gold, PhD
Director of Data Science | Migo
Foundations of Deep Reinforcement Learning

Virtual | Tutorial | Deep Learning | Intermediate-Advanced

 

Deep Reinforcement Learning equips AI agents with the ability to learn from their own trial and error. Success stories include learning to play Atari games, Go, and Dota 2, and robots learning to run, jump, and manipulate. This tutorial will cover the foundations of Deep Reinforcement Learning, including MDPs, DQN, Policy Gradients, TRPO, PPO, DDPG, SAC, TD3, and model-based RL, as well as current research frontiers.

Pieter Abbeel, PhD
Professor and Director of the Robot Learning Lab at UC Berkeley | Co-Founder at covariant.ai | Co-Founder at Gradescope | Advisor at OpenAI
Practical Tutorial on Uncertainty and Out-of-distribution Robustness in Deep Learning

In-person | Tutorial | Deep Learning | All Levels

 

 

Deep neural networks can make overconfident errors and assign high confidence predictions to inputs far away from the training data. Well-calibrated predictive uncertainty estimates are important to know when to trust a model’s predictions, especially for safe deployment of models in applications where the train and test distributions can be different. I’ll first present some concrete examples that motivate the need for uncertainty and out-of-distribution (OOD) robustness in deep learning…

Balaji Lakshminarayanan, PhD
Building a Semantic Search Engine

In-person | Half-Day Training | Deep Learning | Machine Learning | Beginner – Intermediate

 

 

Most production information retrieval systems are built on top of Lucene, which uses TF-IDF and BM25. Current state-of-the-art techniques use embeddings for retrieval. This workshop aims to demystify what is involved in building such a system…

Nidhin Pattaniyil
Machine Learning Engineer | Walmart
Self-Supervised and Unsupervised Learning for Conversational AI and NLP

In-person | Workshop | NLP | Machine Learning | Beginner – Intermediate

 

In this talk, I will give some background on conversational AI and NLP, along with self-supervised and unsupervised techniques. Transformer-based large language models (LLMs) such as GPT-3, Jurassic, and T5 have been foundational to the advances that we see. I will walk the audience through hands-on examples and show how they can leverage transformers and large language models for few-shot and zero-shot learning in a variety of NLP applications such as text classification, summarization and question-answering…

Chandra Khatri
Chief Scientist and Head of AI | Got It AI
Running Any ML Code in Any ML Framework

Virtual | Workshop | All Levels

 

 

Coming Soon!

Daniel Lenton, PhD
CEO | Ivy
Bagging to BERT – A Tour of Applied NLP

In-person | Workshop | All Levels

 

 

In this workshop, we will explore some of these popular NLP techniques that have broad applicability. From the basics of bagging and word vectors to the creation of contextualized representations of words and sentences, the workshop will equip participants with the tools they need to turn messy text data into useful insights. The focus of the workshop will be building NLP approaches of increasing complexity. Each step in the progression will build on the previous ones, and the approaches will be evaluated against one another…

Benjamin Batorsky, PhD
Data Scientist | ThriveHive
Separating the Signal from the Noise: Signal Processing and Feature Extraction Techniques for Biological Data

In-person | Workshop | All Levels

 

 

Your model is only as good as the data that goes into it, so removing the maximum amount of noise while retaining signal is vital. This talk will introduce basic signal processing and motion artefact removal techniques, such as low-pass filtering, Kalman filtering, and Savitzky-Golay filtering. The session will take you through practical examples that you can apply straight away to your biological data. I will also make recommendations for data collection to consider when designing a sensor so that you can get the best possible data (because prevention is better than treatment!). Those interested in machine learning and signal analysis for biomedical processing, electrical and optical signals, and wearable technology will enjoy this talk!…

Michelle Hoogenhout
Lead Data Scientist | Hydrostasis, Inc
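
As a minimal illustration of the low-pass filtering mentioned in the abstract above, the Python sketch below implements a centered moving average, one of the simplest low-pass filters: it attenuates fast sample-to-sample fluctuations while keeping slower trends. The function name, the window length, and the sample data are assumptions for this example; the session may well use other filters or library routines.

```python
def moving_average(signal, window=5):
    """Centered moving average, a simple FIR low-pass filter.

    `window` must be a positive odd integer; near the edges the window
    shrinks to the samples that are available.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


# Made-up sensor trace: a slow upward trend plus alternating noise.
raw = [0.5 * t + (1.0 if t % 2 == 0 else -1.0) for t in range(20)]
clean = moving_average(raw, window=5)
```

A wider window removes more noise but also blurs genuine fast changes in the signal, which is the trade-off that filters such as Savitzky-Golay are designed to soften.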
StructureBoost: Gradient Boosting with Categorical Structure

In-person | Workshop | Machine Learning | Intermediate-Advanced

 

Often, categorical variables possess a natural structure that is not linear or ordinal in nature. The months of the year have a circular structure, while the US states have a structure that can be represented by a graph. StructureBoost uses novel techniques that allow this known structure to be exploited to yield better predictions. Recently, StructureBoost has been enhanced to utilize structure in the target variable (i.e. in multi-class classification) as well as in the predictor variables. This hands-on workshop will demonstrate how to use StructureBoost in different problems involving categorical variables with known structure…

Brian Lucena, PhD
Principal | Numeristical

Register for ODSC West 2022 - November 1-3rd

See our Program Summary for an Event Overview

How It Works

  • Virtual conference experience includes networking lounge area, speaker auditorium, expo halls, and prizes

  • Access multiple livestream tracks on Tuesday, Wednesday, Thursday

  • Switch between sessions or tracks as your interests dictate

  • Multiple focus areas including deep learning, machine learning, NLP, research frontiers, AI X for business, and more

  • Sessions you missed can be viewed on demand at your leisure

  • Engage virtually with fellow attendees, speakers, and Expo partners

  • Participate in Q&A sessions with your speaker over live chat

  • Directly download slides and other session materials

  • (Training only) Access training and workshop prerequisites, notebooks, and other materials before the training session starts

  • (Training only) Access hands-on training and workshops with instructor-led code labs and notebooks.


Open Data Science
One Broadway
Cambridge, MA 02142
info@odsc.com
