Limited discount | Register now & Save 40%

For East 2022, please refer to the event app for the in-person conference and to our virtual platform for the most current virtual schedule.

We are delighted to announce our East 2023 Preliminary Schedule!

90+ Additional Sessions Coming Soon

All sessions are scheduled in EST time zone (Eastern Standard Time)

  • ODSC Talks/Keynotes are scheduled from Tuesday May 9 to Thursday May 11. In-person sessions are available to Gold, Platinum, Mini-Bootcamp, and VIP Pass holders. Business talks are available to Ai x Pass holders. Virtual Sessions are available to Virtual Premium, Virtual Platinum & Virtual Mini-Bootcamp pass holders.
  • ODSC Trainings are scheduled from Tuesday May 9 to Thursday May 11. In-person sessions are available to Platinum, Mini-Bootcamp, and VIP Pass holders. Virtual Sessions are available to Virtual Platinum & Virtual Mini-Bootcamp pass holders.
  • ODSC Workshops/Tutorials are scheduled from Tuesday May 9 to Thursday May 11. All in-person sessions are available to VIP, Platinum, Mini-Bootcamp, and Gold pass holders. Silver Pass holders can attend only on Wednesday and Thursday. Virtual Sessions are available to Virtual Premium, Virtual Platinum & Virtual Mini-Bootcamp pass holders.
  • ODSC Bootcamp Sessions are scheduled VIRTUALLY on Monday May 8 as pre-conference training. They are ONLY available to in-person Mini-Bootcamp and VIP Pass holders and to Virtual Mini-Bootcamp pass holders.
Speakers and session times are subject to change.

Bootcamp Sessions
  • Pre-Bootcamp live training warm up
  • Monday, 8th May (Virtual)
Trainings/Workshops
  • Tuesday, 9th May
  • Wednesday, 10th May
  • Thursday, 11th May
Keynotes/Talks
  • Tuesday, 9th May
  • Wednesday, 10th May
  • Thursday, 11th May
09:00 - 11:45
Introduction to Machine Learning

Virtual | Bootcamp | Machine Learning | Beginner

 

The Introduction to Machine Learning Workshop will build upon the attendee’s foundation of math and coding knowledge to develop a basic understanding of the most popular machine learning algorithms used in industry today. We will answer such questions as: What are the different types of ML algorithms? What is overfitting and how can we avoid it? Why does XGBoost consistently outperform other algorithms?…more details

Julia Lintern
Director of Data Science | Gartner
09:00 - 11:45
Programming with Data: Python and Pandas

Virtual | Bootcamp | Machine Learning | MLOps | Intermediate

 

 

In this training, you will learn how to accelerate your data analyses using the Python language and Pandas, a library specifically designed for tabular data analysis. We start by learning the core Pandas data structures, the Series and DataFrame. From these foundations, we will learn to use the split-apply-combine paradigm for grouped computations, manipulate time series, and perform advanced joins between datasets. Specifically, we will cover loading, filtering, grouping, and transforming data. Having completed this workshop, you will understand the fundamentals and advanced features of Pandas, be aware of common pitfalls, and be ready to perform your own analyses…more details

Daniel Gerlanc
Sr. Director - Data Science & ML Engineering | Ampersand
March 2, 2023: ODSC East Bootcamp Warmup: Data Primer Course

 

This course is aimed at helping people begin their AI journey and gain valuable insights that we will build up in subsequent SQL, programming and AI courses…more details

Sheamus McGovern
CEO and Software Architect, Data Engineer, and AI expert | ODSC
March 14, 2023: ODSC East Bootcamp Warmup: SQL Primer Course

 

This course covers topics such as database design and normalization, data wrangling, aggregate functions, subqueries, and join operations. Students will learn how to design and write SQL code to solve real-world problems …more details

Sheamus McGovern
CEO and Software Architect, Data Engineer, and AI expert | ODSC
April 6, 2023: ODSC East Bootcamp Warmup: Programming Primer Course with Python

 

This course aims to provide a basic foundation in Python and help participants develop the skills needed to progress in the field of data science and machine learning…more details

Sheamus McGovern
CEO and Software Architect, Data Engineer, and AI expert | ODSC
April 26, 2023: ODSC East Bootcamp Warmup: AI Primer Course

 

This AI literacy course is designed to introduce participants to the basics of artificial intelligence (AI) and machine learning.

Upon completion, individuals will have a foundational understanding of machine learning and its capabilities…more details

Sheamus McGovern
CEO and Software Architect, Data Engineer, and AI expert | ODSC
12:15 - 15:00
Mathematics for Data Science

Virtual | Bootcamp | Machine Learning | Beginner

 

 

Data science uses a combination of mathematics, statistics and computer science to help us solve questions of importance in a large number of fields. In this workshop we will introduce the underlying mathematical principles of the field, with example problems gleaned from a number of different industries. By the end of the workshop the participant will know enough data science to explore their own problems, and be ready for more intermediate and advanced courses…more details

Eric Eager, PhD
VP of Research and Development | SumerSports
12:15 - 15:00
An Introduction to Data Wrangling with SQL

Virtual | Bootcamp | Machine Learning | Beginner

 

 

Data wrangling is an essential foundational topic for anyone considering a role in data engineering, data science, or machine learning. This session will help you understand core data wrangling concepts, including what data is, data generation and collection, data cleaning, profiling, transformation, and other essential data wrangling topics. As this is an interactive training session, in addition to covering these topics, we will layer on hands-on SQL training and an introduction to relational databases. As we journey through the data workflow we will use SQL to wrangle and transform the data as needed. SQL consistently makes the top 5 job requirements list for data scientists, data analysts, machine learning engineers, and other related data roles. The SQL standard is the universal go-to tool for manipulating structured data stores including relational databases…more details

Sheamus McGovern
CEO and Software Architect, Data Engineer, and AI expert | ODSC
09:40 - 12:55
Introduction to scikit-learn: Machine Learning in Python

In-person | Half-Day Training | Machine Learning | Deep Learning | Beginner

 

Scikit-learn is a Python machine-learning library used by data science practitioners from many disciplines. We will start this training by learning about scikit-learn’s API for supervised machine learning. scikit-learn’s API mainly consists of three methods: fit to build models, predict to make predictions from models, and transform to modify data…more details

Thomas J. Fan
Staff Software Engineer | Quansight Labs
09:40 - 10:55
Idiomatic Pandas

In-person | Workshop | Machine Learning | Beginner (Recommended for Bootcamp)

 

Pandas can be tricky, and there is a lot of bad advice floating around. This tutorial will cut through some of the biggest issues I’ve seen with Pandas code after working with the library for a while and writing three books on it…more details

Matt Harrison
Python & Data Science Corporate Trainer | Consultant | MetaSnake
09:40 - 17:15
Beyond the Basics: Data Visualization in Python

In-person | Full-Day Training | Data Visualization & Data Analysis | Machine Learning | Intermediate-Advanced

 

The human brain excels at finding patterns in visual representations, which is why data visualizations are essential to any analysis. Done right, they bridge the gap between those analyzing the data and those consuming the analysis. However, learning to create impactful, aesthetically-pleasing visualizations can often be challenging. This session will equip you with the skills to make customized visualizations for your data using Python…more details

Stefanie Molin
Data Scientist | Bloomberg | Author of Hands-On Data Analysis with Pandas
09:40 - 12:55
Advanced Gradient Boosting (II): Calibration, Probabilistic Regression and Conformal Prediction

In-person | Half-Day Training | Machine Learning | Intermediate-Advanced

 

Gradient Boosting remains the most effective method for classification and regression problems on tabular data. This session is Part Two of two, covering advanced topics that are newer and may be less familiar. First, we will discuss how to calibrate the probabilities of classification models, reviewing the major techniques. Next, we will discuss Probabilistic Regression, wherein the goal is to predict the full probability distribution of the numerical target given the features, demonstrating different approaches to this problem. Finally, we will present tools for Conformal Prediction – a hot topic which can provide prediction intervals with strong theoretical guarantees…more details

Brian Lucena, PhD
Principal | Numeristical
09:40 - 10:55
Graph Viz: Exploring, Analyzing and Visualizing Graphs and Networks with Gephi and ChatGPT

In-person | Workshop | Data Visualization & Data Analysis | Beginner-Intermediate

 


By completing this workshop, you will learn how to create compelling visualizations, using directed and undirected graphs, dynamic graphs, and clustering. You will also learn about centrality metrics and network density. Additionally, you will learn about different layout algorithms, as well as the strategies for interpreting and communicating the graph data in meaningful ways…more details

Tamilla Triantoro, PhD
Associate Professor of Computer Information Systems | Quinnipiac University
09:40 - 10:55
Hyper-productive NLP with Hugging Face Transformers

In-person | Workshop | NLP | Machine Learning | Beginner-Intermediate (Recommended for Bootcamp)

 

 

In this workshop, you’ll walk through a complete end-to-end example of using Hugging Face Transformers, involving both our open-source libraries and some of our commercial products. Starting from a dataset containing real-life product reviews from Amazon.com, you’ll train and deploy a text classification model predicting the star rating for similar reviews…more details

Julien Simon
Chief Evangelist | Hugging Face
09:40 - 12:55
Advanced Fraud Modeling & Anomaly Detection with Python & R part 1

In-person | Half-Day Training | Machine Learning Safety and Security | All Levels

 

This course outlines the typical fraud framework at an organization and where data science can play a role, and lays out how to build an analytically advanced fraud system…more details

Aric LaBarr, PhD
Associate Professor of Analytics | Institute for Advanced Analytics at NC State University
09:40 - 12:55
Getting Started with Hyperparameter Optimisation

In-person | Half-Day Training | Machine Learning | Intermediate

 

The workshop participants will then get a chance to complete a set of tasks revolving around the various optimisation techniques and observe the outcomes. The tasks will include hyperparameter optimisation for a deep neural network and optimization of the parameters of one ensemble model (Random Forest)…more details

Nikolay Manchev, PhD
Head of Data Science for EMEA | Domino Data Lab
10:40 - 13:25
Creative AI

Virtual | Half-Day Training | NLP | Intermediate

 

This workshop is designed to explore how artificial intelligence can be used to generate creative outputs and to inspire technical audiences to use their skills in new and creative ways. The workshop will also include a series of code exercises designed to give participants hands-on experience working with AI models to generate creative outputs…more details

Leonardo De Marchi
VP of Labs | Thomson Reuters
10:40 - 16:40
NLP Fundamentals

Virtual | Full-Day Training | NLP | Beginner (Recommended for Bootcamp)

 

In this course we will go through Natural Language Processing fundamentals, such as pre-processing techniques, tf-idf, embeddings, and more. It will be followed by practical coding examples, in Python, to teach how to apply the theory to real use cases…more details

Leonardo De Marchi
VP of Labs | Thomson Reuters
Laura Skylaki, PhD
Manager of Applied Research | Thomson Reuters Labs
10:40 - 12:10
From Big Data to NLP insights: Getting started with PySpark and Spark NLP

Virtual | Workshop | NLP | Machine Learning | Deep Learning, Data Engineering & Big Data | Beginner-Intermediate (Recommended for Bootcamp)

 

This workshop will introduce you to the fundamentals of PySpark (Spark’s Python API), the Spark NLP library and other best practices in Spark programming when working with textual or natural language data…more details

Akash Tandon
Co-Founder | Co-author, Advanced Analytics with PySpark | Looppanel | O'Reilly Media
10:40 - 12:10
Mastering Adversarial Evaluation for NLP: A Practical Workshop

Virtual | Workshop | NLP | Beginner-Intermediate

 

This workshop will equip participants with the skills and knowledge to conduct adversarial evaluation of NLP systems. Through active exercises and examples, we will discuss how to identify and address system weaknesses and explore how this approach can improve accuracy, reduce risk, and uncover potential blind spots. Participants will gain a greater understanding of how to use adversarial evaluation to detect and prevent errors in their NLP systems…more details

Panos Alexopoulos, PhD
Head of Ontology | Textkernel BV
11:00 - 12:15
Modern NLP: Pre-training, Fine-tuning, Prompt Engineering, and Human Feedback

In-person | Workshop | NLP | Intermediate

 


Leaving this workshop, you will understand each of these topics, and you will have gained the practical, hands-on expertise to start integrating modern NLP in your domain. Participants will fine-tune and prompt engineer state-of-the-art models like BART and XLM-RoBERTa, and they will peer behind the curtain of world-shaking technologies like ChatGPT to understand their utility and architectures…more details

Daniel Whitenack, PhD
Data Scientist | SIL International
11:00 - 12:15
Bagging to BERT – A Tour of Applied NLP

In-person | Workshop | NLP | Deep Learning | Intermediate

 

In this workshop we will explore some popular NLP techniques that have broad applicability. From the basics of bagging and word vectors to the creation of contextualized representations of words and sentences, the workshop will equip participants with the tools they need to turn raw text data into useful insights…more details

 

Benjamin Batorsky, PhD
Senior Data Scientist | Institute for Experiential AI at Northeastern University
11:00 - 12:15
Machine Learning with XGBoost

In-person | Workshop | Machine Learning | Intermediate

 

This workshop will show how to use XGBoost. It will demonstrate model creation, model tuning, model evaluation, and model interpretation…more details

Matt Harrison
Python & Data Science Corporate Trainer | Consultant | MetaSnake
11:00 - 12:15
When Privacy Meets AI – Your Kick-Start Guide to Machine Learning with Synthetic Data

In-person | Tutorial | Machine Learning Safety and Security | Machine Learning | Deep Learning | NLP | All Levels

 

Join Alexandra for a hands-on tutorial on synthetic data fundamentals to learn how to create synthetic data you can trust, assess its quality, and use it for privacy-preserving ML training. As a bonus, we’ll look into boosting your ML performance with smart upsampling…more details

Alexandra Ebert
Chief Trust Officer | Chair of the IEEE Synthetic Data IC Expert Group | AI, Privacy & GDPR Expert | MOSTLY AI | IEEE Standards Association | #humanAIze
11:25 - 12:40
AI4Cyber: An Overview of Artificial Intelligence for Cybersecurity and an Open-Source Virtual Machine

In-person | Workshop | Machine Learning Safety and Security | Machine Learning | Deep Learning | Data Engineering | Intermediate

 

The workshop will present an overview of the VM’s operations. Sample illustrations of AI for cybersecurity will be demonstrated, including detecting vulnerable code on GitHub repositories and emerging threats from the Dark Web for proactive cyber threat intelligence capabilities…more details

Sagar Samtani, PhD
Assistant Professor and Grant Thornton Scholar | Indiana University
11:25 - 12:40
How to Build Stunning Data Science Web Applications in Python – Taipy Tutorial

In-person | Workshop | Machine Learning | All Levels

 

This workshop presents Taipy, a new low-code Python package that allows you to create complete Data Science applications, including graphical visualization and managing algorithms, pipelines, and scenarios…more details

Florian Jacta
Customer Success Manager | Taipy
Albert Vu
Customer Success Manager | Quantitative Analyst | MSTS
12:00 - 13:15
Ace the Data Science Interview with Nick Singh

In-person | Career Workshop | Beginner

 

Get a crash course on the most common types of technical interview questions that show up in FAANG Data Science, ML, and Data Analyst interviews, and how to best solve them. Then, practice what you learned, by collaboratively solving a real SQL, Statistics, ML, and Product Analytics interview question with Nick Singh, an Ex-Facebook & Google employee turned best-selling author of Ace the Data Science Interview…more details

Nick Singh
Co-Author | Ace The Data Science Interview
12:20 - 13:50
Self-Supervised and Unsupervised Learning for Conversational AI and NLP

Virtual | Workshop | NLP | Machine Learning | Beginner-Intermediate (Recommended for Bootcamp)

 

 

In this talk, I will give some background on Conversational AI and NLP, along with self-supervised and unsupervised techniques. Transformer-based large language models (LLMs) such as GPT-3, Jurassic, and T5 have been foundational to the advances that we see. I will walk the audience through hands-on examples and show how they can leverage transformers and large language models for few-shot and zero-shot learning in a variety of NLP applications such as text classification, summarization and question-answering…more details

Chandra Khatri
Chief Scientist, Head of AI and Co-Founder | Got It AI
12:20 - 13:50
Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training

Virtual | Tutorial | Deep Learning | Intermediate-Advanced

 

In this tutorial, we introduce Colossal-AI, which is a unified parallel training system designed to seamlessly integrate different paradigms of parallelization techniques including data parallelism, pipeline parallelism, multiple tensor parallelism, and sequence parallelism…more details

James Demmel, PhD
Professor of Mathematics and Computer Science | UC Berkeley
Yang You, PhD
Presidential Young Professor | National University of Singapore
14:00 - 15:15
Relational Dataset Analytics for Clear Customer Insights

In-person | Workshop | Machine Learning | Data Analytics | Beginner-Intermediate

 

HPCC Systems is a completely free, open source Big Data/Data Lake platform created by LexisNexis Risk Solutions and used by companies globally. The workshop attendee will be provided with interactive code examples and solutions on an actual cluster created for ODSC attendees…more details

Bob Foreman
Software Engineering Lead | LexisNexis Risk Solutions
14:00 - 15:15
Learn how to Build Interactive Data Apps with Plotly Dash

In-person | Workshop | Data Visualization & Data Analysis | Machine Learning | Deep Learning | NLP | Data Engineering | All Levels

 

In today’s data-driven world, static dashboards are no longer sufficient for the needs of data consumers and businesses. People need quick access to information and must be able to rapidly create and deploy data applications for both creating and consuming analytics, models, and more…more details

Mingo Sanchez
Senior Sales Engineer | Plotly
14:00 - 17:15
Advanced Gradient Boosting (I): Fundamentals, Interpretability, and Categorical Structure

In-person | Half-Day Training | Machine Learning | Intermediate-Advanced

 

Gradient Boosting remains the most effective method for classification and regression problems on tabular data. This session is Part One of two. We will start with the fundamentals of how boosting works and best practices for model building and hyper-parameter tuning. Next, we will discuss how to interpret the model, understanding what features are important generally and for a specific prediction. Finally, we will discuss how to exploit categorical structure, when the different values of a categorical variable have a known relationship to one another…more details

Brian Lucena, PhD
Principal | Numeristical
14:00 - 15:15
Introduction to Interpretability in ML (XAI)

In-person | Workshop | Responsible AI | Machine Learning Safety and Security | Intermediate

 

 

Machine learning projects are rarely like a Kaggle competition. It is thrilling to see your name jump up on the leaderboard, which makes competitions exciting and, dare I say, addictive. However, predictive power in real life is much less important than Kaggle competitions would have you believe. Often it is just as important to understand why a model makes a certain prediction. The ‘why’ plays an important role during the model development phase as well as after deployment. Complex machine learning pipelines are difficult to debug and issues can go unnoticed. One way to help increase trust in the model during the development stage is to improve its interpretability…more details

 

Andras Zsom, PhD
Assistant Professor of the Practice | Data Science Initiative, Brown University
14:00 - 15:15
Hybrid AI for Complex Applications with Scruff

In-person | Workshop | Machine Learning | Intermediate-Advanced

 


In this workshop, I will explain the core principles of Scruff and the main programming concepts. I will then demonstrate how we used Scruff to create a tool for wildfire risk assessment and mitigation that includes climate models, historical fire data, and fire propagation simulators. Finally, we will work through a hands-on session of getting up and running with Scruff and implementing and running simple models…more details

Avi Pfeffer, PhD
Author, Chief Scientist | Charles River Analytics
Sanja Cvijic, PhD
Senior Scientist | Charles River Analytics
14:00 - 17:15
Advanced Fraud Modeling & Anomaly Detection with Python & R part 2

In-person | Half-Day Training | Machine Learning Safety and Security | All Levels

 

This course outlines the typical fraud framework at an organization and where data science can play a role, and lays out how to build an analytically advanced fraud system…more details

Aric LaBarr, PhD
Associate Professor of Analytics | Institute for Advanced Analytics at NC State University
14:00 - 15:15
Interpreting Features in Deep Networks

In-person | Tutorial | NLP | Intermediate-Advanced

 

This tutorial is targeted at learners who have experience with neural network models and are interested in gaining a deeper understanding of how they work…more details

Jacob Andreas, PhD
Assistant Professor | MIT
14:00 - 15:15
Building Computer Vision Models and Optimizing Hyperparameters using PyTorch and SAS Viya

In-person | Workshop | Deep Learning | All Levels

 

Deep learning is an area of machine learning that has become synonymous with artificial intelligence. PyTorch provides a comprehensive framework for the development of deep learning models. However, project requirements often extend beyond the model development process. SAS has a rich set of established and unique capabilities that support model development and deployment, including some new features that use the TorchScript language. In this workshop, we will demonstrate how to integrate PyTorch with SAS to leverage the benefits of both technologies. The workshop will focus on computer vision applications, but the framework can easily be extended to other deep learning tasks…more details

Robert Blanchard
Sr. Analytical Training Consultant | SAS
Ari Zitin
Analytical Training Consultant | SAS
14:00 - 15:15
Unifying ML With One Line of Code

In-person | Tutorial | Deep Learning | Machine Learning | All Levels

 

Why should we try to unify the ML frameworks? Won’t we just create a new incompatible standard and make the ML fragmentation even worse? I will argue that the answer to these sensible and important questions is no…more details

Daniel Lenton, PhD
CEO | Ivy
14:00 - 17:15
Intermediate Machine Learning with scikit-learn: Pandas Interoperability, Categorical Data, Parameter Tuning, and Model Evaluation

In-person | Half-Day Training | Machine Learning | Deep Learning | Intermediate

 

Scikit-learn is a Python machine-learning library used by data science practitioners from many disciplines. We will learn about Pandas interoperability, categorical data, parameter tuning, and model evaluation. For Pandas interoperability, the ColumnTransformer applies data transformations on different columns from a Pandas DataFrame…more details

Thomas J. Fan
Staff Software Engineer | Quansight Labs
14:00 - 17:15
Deep Learning with PyTorch and TensorFlow part 1

In-person | Full-Day Training | Deep Learning | Machine Learning | Beginner-Intermediate (Recommended for Bootcamp)

 

 

Obscure until recently, Deep Learning is ubiquitous today across data-driven applications as diverse as machine vision, natural language processing, generative A.I., and superhuman game-playing.

This training is an introduction to Deep Learning that brings high-level theory to life with interactive examples featuring PyTorch, TensorFlow 2, and Keras — all three of the principal Python libraries for Deep Learning. Essential theory will be covered in a manner that provides students with a complete intuitive understanding of Deep Learning’s underlying foundations…more details

Dr. Jon Krohn
Chief Data Scientist, Author of Deep Learning Illustrated | Nebula.io
14:20 - 15:50
Next-Level Data Visualization in Python: A Practical Guide to Upgrading Your Plots by Making the Most of Matplotlib and More

Virtual | Tutorial | Data Visualization & Data Analysis | All Levels

 

This tutorial will discuss ways to level-up your quick plots for use in published papers, automated reporting, and any other scenario where crisp and/or custom visualizations are called for…more details

Melanie Veale, PhD
Data Solutions Architect | Anomalo
14:20 - 15:50
Creating a Custom Vocabulary for NLP Tasks Using exBERT and spaCy

Virtual | Tutorial | NLP | Machine Learning | Deep Learning | Intermediate

 

In this tutorial, we show an approach to creating a custom vocabulary that can then be used for any NLP task…more details

Swagata Ashwani
Senior Data Scientist | Boomi
14:20 - 15:50
The Data Cards Playbook: A Toolkit for Transparency in Dataset Documentation

Virtual | Tutorial | Responsible AI | Machine Learning | All Levels

 

 

Data Cards are transparency artifacts that provide structured summaries of ML datasets with explanations of processes and rationale that shape the data. They also describe how the data may be used to train or evaluate ML models. In practice, two critical factors determine the success of a transparency artifact: (1) the ability to identify the information decision-makers use and (2) the establishment of processes and guidance needed to acquire that information. To initiate practice-oriented foundations in transparency that support responsible AI development in cross-functional groups and organizations, we created the Data Cards Playbook — an open-sourced, self-service, comprehensive toolkit consisting of participatory activities, frameworks, and guidance designed to address specific challenges faced by teams, product areas and companies when setting up an AI dataset transparency effort…more details

Andrew Zaldivar, PhD
Senior Developer Relations Engineer | Google Research
Mahima Pushkarna
Senior User Experience Designer | Google
14:20 - 15:50
Automate Machine Learning Workflows with PyCaret 3.0

Virtual | Workshop | Machine Learning | Data Engineering & Big Data | NLP | Deep Learning | Intermediate

 

In this talk, we will explore how PyCaret 3, an open-source machine learning library in Python, can significantly accelerate machine learning workflows. PyCaret 3 offers a low-code approach to building, training, and deploying machine learning models, making it an ideal tool for data scientists and developers who want to focus on the business problem rather than the technical details…more details

Moez Ali
Inventor and creator of PyCaret | Product Director - Artificial Intelligence | antuit.ai
15:30 - 16:45
Introduction to AutoML: Hyperparameter Optimization and Neural Architecture Search

In-person | Tutorial | Machine Learning | Intermediate

 

 

This tutorial is aimed at data scientists and machine learning engineers who work on machine learning and deep learning models.
Given a task, one is interested in finding a well-performing model to solve that task. Very often, this involves tweaking the model, either by changing its hyperparameters or by modifying its architecture, in order to find a better-performing model. In the past, this was always done manually. But with the advent of Automated Machine Learning, we can now leave that to the machines. In this tutorial, we will provide an overview of Hyperparameter Optimization (HPO) and Neural Architecture Search (NAS)…

Tejaswini Pedapati
Research Engineer | IBM TJ Watson
15:45 - 17:00
Learn how to Efficiently Build and Operationalize Time Series Models in 2023

In-person | Workshop | Machine Learning | Deep Learning | All Levels

 

During this workshop, you will build complex time series forecasting and anomaly detection models from the ground up, perform feature engineering and selection, assess their accuracy, and use the ModelZoo browser and root cause analysis functionalities to investigate the outcomes…

Philip Wauters
Customer Success Manager and Value Engineer | Tangent Works
15:45 - 17:00
Patient Level Prediction with Supervised Learning Models in Federated Data Networks

In-person | Tutorial | ML for Biotech and Pharma | Machine Learning | Intermediate-Advanced

 

In this tutorial, we will review the fundamentals of standardized data sources in federated health data networks and describe the data standards that enable open-source software development. We will demonstrate how to use a suite of patient-level prediction tools to develop data-driven prediction models using standardized observational health data…

Frank DeFalco
Director, Observational Health Data Analytics | Janssen Research & Development
Jenna Reps, PhD
Director, Observational Health Data Analytics | Janssen Research & Development
15:45 - 17:00
Building Recommendation Systems

In-person | Workshop | Deep Learning | NLP | Machine Learning | Beginner-Intermediate (Recommended for Bootcamp)

 

 

Recommendation is the task of suggesting items tailored to a particular user. As generative AI creates an explosion of digital content, personalization will be more important than ever! Whether the application is sneaker designs, blog posts, or even pre-trained machine learning model weights, most recommendation tasks have a similar underlying structure. We need some way to represent items and users, typically as vectors, as well as a way to index them for fast computation. We also need to design intuitive APIs that expose the recommendation system to application developers. Weaviate is an open-source vector search database that has many unique search and database features…

Connor Shorten, PhD
Research Scientist | SeMI Technologies
15:45 - 17:00
Deepfakes: How’re They Made, Detected, and How They Impact Society

In-person | Tutorial | Deep Learning | All Levels

 

Deepfake photos and videos are already impacting many industries and sectors of society, in both positive and negative ways. In this session I’ll weave between the social context of deepfakes (how they’ve been used and what impact they’ve had) and the technical side of them (how they’re made, and some approaches to detecting them). This is the multifaceted story of deepfakes…

Noah Giansiracusa, PhD
Assistant Professor of Mathematics, Data Science | Bentley University
15:45 - 17:00
A Practical Tutorial on Building Machine Learning Demos with Gradio

In-person | Workshop | Machine Learning | Deep Learning | Data Engineering | NLP | Beginner (Recommended for Bootcamp)

 

In this workshop, we will cover how to build machine learning web applications using the Gradio (www.gradio.dev) library…

Freddy Boulton
Senior Software Engineer | Alteryx Innovation Labs
09:00 - 09:30
ODSC Keynote – Secrets of Successful AI Projects

Virtual | Keynote | Machine Learning | All Levels

 

 

AI is red hot, but in practice many projects still fail. This talk will cover some of the key things you need to know to succeed, including:
– What current AI is and is not good for
– The difference between a demo and a product
– Pitfalls to avoid
– Organizing AI teams…

Pedro Domingos, PhD
Professor Emeritus | University of Washington
09:00 - 09:30
ODSC Keynote – Confidential Data Computing and Collaboration for Data Scientists

In-person | Keynote | Responsible AI | Machine Learning Safety and Security | All Levels

 


In this talk, I will describe our open-source platform MC2 (multi-party confidential computing), which enables data owners to encrypt their data and data scientists to run analytics or machine learning on the encrypted data without having access to the data. MC2 is based on years of research at UC Berkeley and on publications at top-tier security and privacy conferences…

Raluca Ada Popa, PhD
Associate Professor | Co-Founder | Berkeley | PreVeil | Opaque Systems
09:40 - 10:25
Unlock the Power of Data Science for Real Change: A Blueprint for Decision Intelligence

In-person | Track Keynote | All | All Levels

 

In this talk, we’ll explore the critical role that decision intelligence workflows play in driving business value with data-derived insights. We’ll discuss the building blocks of these workflows, from ML-based recommendations to user-friendly interfaces that facilitate action-taking, and best practices to influence your decision-makers to act. Join us to learn how to effect change and give your analytical outputs the best chance of driving measurable business value…

Joe Dery, PhD
VP & Dean, Data Analytics | Western Governors University
10:20 - 10:50
Development Principles for Biotech Data Teams

In-person | Business Talk | ML for Biotech and Pharma | All Levels

 

This presentation is aimed at technical leaders in biotech organizations who are ready to take on the challenge of making their data teams, their bench scientists, and everyone in between work more effectively with data and digital tools. Maybe you entered biotech from a tech background. Or maybe you became a data expert starting from a background in biology or chemistry. Either way, if you know what an organization that uses data effectively looks like, the principles in this presentation will help you build that within your own organization…

Jesse Johnson
Vice President, Data Science & Engineering | Dewpoint Therapeutics
10:40 - 11:25
Semantic Search

Virtual | Talk | NLP | Beginner-Intermediate

 

Semantic search is especially helpful for multilingual and cross-lingual search. Previously, you had to spend a lot of time tuning lexical search systems for each language individually so that they could handle, for example, synonyms, spelling variations, and spelling mistakes. With semantic search, this becomes far simpler: within minutes, you get a system that works amazingly well on 100+ languages…

Nils Reimers
Director of Machine Learning | cohere.ai
11:00 - 11:45
Five Ways to Improve Your Algorithms for Circular Business

In-person | Talk | Responsible AI | Beginner-Intermediate

 

This ODSC talk will discuss five design challenges, each paired with a down-to-earth design solution, that must be addressed to develop algorithms for circular business…

Eric van Heck, PhD
Professor | Erasmus University Rotterdam
11:00 - 11:45
Enabling Data Mesh With Event-Driven Data Architecture

In-person | Talk | Data Engineering & Big Data | All Levels

 


In this talk, we will explore how adopting an event-based data architecture can enable an organization’s sustainable transition to Data Mesh. This will include an overview of event-based architecture, architectural patterns for event-based data systems, and organizational considerations…

Elliott Cordo
Head of Data | Capsule
11:00 - 11:45
Machine Learning Models for Quantitative Finance and Trading

In-person | Talk | Machine Learning | Deep Learning | Intermediate

 


This talk will provide a brief overview of the following topics:
• The broad application of machine learning in finance: opportunities and challenges.
• Use of alternative data, such as news and geolocational/extreme-weather data, to build signals and trading strategies for the financial markets.
• Machine learning techniques for asset pricing and for enhancing complex quant models (i.e., PDE solutions, Monte Carlo simulations) to efficiently price derivative and illiquid securities using data-driven methods.

Arun Verma, PhD
Head of Quant Research Solutions | Bloomberg
11:00 - 11:45
Ace the Data Job Hunt

In-person | Career Talk | Beginner

 

Want to land your dream job in data? Learn what makes a data resume stand out, how a portfolio project is a job-hunting cheat code when you avoid these 6 mistakes, why cold email is a networking superpower, and how to craft a winning personal story for the behavioral interview. These tips led Nick Singh, best-selling author of Ace the Data Science Interview, to work at Facebook & Google, and helped 200+ of his coaching clients land top jobs in tech…

Nick Singh
Co-Author | Ace The Data Science Interview
11:00 - 11:45
Democracy and the Pursuit of Randomness

In-person | Talk | Responsible AI and Social Good | All Levels

 

Sortition is a storied paradigm of democracy built on the idea of choosing representatives through lotteries instead of elections. In recent years this idea has found renewed popularity in the form of citizens’ assemblies, which bring together randomly selected people from all walks of life to discuss key questions and deliver policy recommendations…

Ariel Procaccia, PhD
Professor | Harvard University
11:00 - 11:45
Causation, Collision, and Confusion: Avoiding the most dangerous error in statistics

In-person | Talk | Deep Learning | Beginner

 

In this talk, I will present examples of collision bias and show how it can be caused by a biased sampling process or induced by inappropriate statistical controls, and I will introduce causal diagrams as a tool for representing causal hypotheses and diagnosing collision bias…