TRAINING & WORKSHOPS
Learn about the latest advances in data science and machine learning, like Apache Superset and more, and how you can utilize these tools in your own work from some of the best and brightest in the industry.
Featured World-Class Data Science Experts

Dr. Jon Krohn
Jon Krohn is Co-Founder and Chief Data Scientist at the machine learning company Nebula. He authored the book Deep Learning Illustrated, an instant #1 bestseller that was translated into seven languages. He is also the host of SuperDataScience, the data science industry’s most listened-to podcast. Jon is renowned for his compelling lectures, which he offers at leading universities and conferences, as well as via his award-winning YouTube channel. He holds a PhD from Oxford and has been publishing on machine learning in prominent academic journals since 2010.
Deep Learning with PyTorch and TensorFlow (Training)
NLP with GPT-4 and other LLMs: From Training to Deployment with Hugging Face and PyTorch Lightning (Training)

Matt Harrison
Matt Harrison has been using Python since 2000. He runs MetaSnake, a Python and Data Science consultancy and corporate training shop. In the past, he has worked across the domains of search, build management and testing, business intelligence, and storage.
He has presented and taught tutorials at conferences such as Strata, SciPy, SCALE, PyCon, and OSCON as well as local user conferences.
Machine Learning with XGBoost (Workshop)
Idiomatic Pandas (Workshop)

Stefanie Molin
Stefanie Molin is a software engineer and data scientist at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around data wrangling/visualization, building tools for gathering data, and knowledge sharing. She is also the author of “Hands-On Data Analysis with Pandas,” which is currently in its second edition. She holds a bachelor’s of science degree in operations research from Columbia University’s Fu Foundation School of Engineering and Applied Science, as well as a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.

Thomas J. Fan
Thomas J. Fan is a Staff Software Engineer at Quansight Labs and is a maintainer for scikit-learn, an open-source machine learning library for Python. Previously, Thomas worked at Columbia University to improve interoperability between scikit-learn and AutoML systems. He is a maintainer for skorch, a neural network library that wraps PyTorch. Thomas has a Masters in Mathematics from NYU and a Masters in Physics from Stony Brook University.
Introduction to scikit-learn: Machine Learning in Python (Training)

Irina Rish, PhD
Irina Rish is an Associate Professor in the Computer Science and Operations Research Department at the Université de Montréal (UdeM) and a core faculty member of MILA – Quebec AI Institute. She holds the Canada Excellence Research Chair (CERC) in Autonomous AI and a Canadian Institute for Advanced Research (CIFAR) Canada AI Chair. She received her MSc and PhD in AI from the University of California, Irvine and an MSc in Applied Mathematics from Moscow Gubkin Institute. Dr. Rish’s research focus is on machine learning, neural data analysis and neuroscience-inspired AI. Before joining UdeM and MILA in 2019, Irina was a research scientist at the IBM T.J. Watson Research Center, where she worked on various projects at the intersection of neuroscience and AI, and led the Neuro-AI challenge. She received multiple IBM awards, including IBM Eminence & Excellence Award and IBM Outstanding Innovation Award in 2018, IBM Outstanding Technical Achievement Award in 2017, and IBM Research Accomplishment Award in 2009. Dr. Rish holds 64 patents, has published over 80 research papers in peer-reviewed conferences and journals, several book chapters, three edited books, and a monograph on Sparse Modeling.
Recent Advances in Foundation Models: Scaling Laws, Emergent Behaviors, and AI Democratization (Talk)

Leonardo De Marchi
Leonardo De Marchi holds a Master’s in Artificial Intelligence and has worked as a Data Scientist in the sports world, with clients such as the New York Knicks. He now works at Thomson Reuters as VP of Labs, and also provides consultancy and training for small and large companies. His previous experience includes being Head of Data Science and Analytics at Bumble, the largest dating site with over 500 million users, leading the team through an acquisition and an IPO.
Generative AI (Training)

Brian Lucena, PhD
Brian Lucena is Principal at Numeristical, where he advises companies of all sizes on how to apply modern machine learning techniques to solve real-world problems with data. He is the creator of three Python packages: StructureBoost, ML-Insights, and SplineCalib. In previous roles he has served as Principal Data Scientist at Clover Health, Senior VP of Analytics at PCCI, and Chief Mathematician at Guardian Analytics. He has taught at numerous institutions including UC-Berkeley, Brown, USF, and the Metis Data Science Bootcamp.
Uncertainty Quantification: Approaches and Methods (Training)

Tejaswini Pedapati
Tejaswini Pedapati works at IBM Research. Her research is focused on interpretability and automating deep learning. To that end, she was involved in developing tools and algorithms to provide these capabilities for IBM products. She has a master’s degree from Columbia University.
Introduction to AutoML: Hyperparameter Optimization and Neural Architecture Search (Tutorial)

Aric LaBarr, PhD
A Teaching Associate Professor in the Institute for Advanced Analytics, Dr. Aric LaBarr is passionate about helping people solve challenges using their data. There he helps design the innovative program to prepare a modern workforce to wisely communicate and handle a data-driven future at the nation’s first Master of Science in Analytics degree program. He teaches courses in predictive modeling, forecasting, simulation, financial analytics, and risk management. Previously, he was Director and Senior Scientist at Elder Research, where he mentored and led a team of data scientists and software engineers. As director of the Raleigh, NC office he worked closely with clients and partners to solve problems in the fields of banking, consumer product goods, healthcare, and government. Dr. LaBarr holds a B.S. in economics, as well as a B.S., M.S., and Ph.D. in statistics — all from NC State University.

Jacob Andreas, PhD
Jacob Andreas is the X Consortium Assistant Professor at MIT. His research aims to build intelligent systems that can communicate effectively using language and learn from human guidance. Jacob earned his Ph.D. from UC Berkeley, his M.Phil. from Cambridge (where he studied as a Churchill scholar) and his B.S. from Columbia. As a researcher at Microsoft Semantic Machines, he founded the language generation team and helped develop core pieces of the technology that powers conversational interaction in Microsoft Outlook. He has been the recipient of Samsung’s AI Researcher of the Year award, MIT’s Kolokotrones teaching award, and paper awards at NAACL and ICML.
Interpreting Features in Deep Networks (Tutorial)

Daniel Lenton, PhD
Daniel Lenton is the creator of Ivy, which is an open-source framework with an ambitious mission to unify all other ML frameworks. Prior to starting Ivy, Daniel was a PhD student at Imperial College London, where he published research in the areas of machine learning, robotics and computer vision.
Unifying ML With One Line of Code (Tutorial)

Akash Tandon
Akash Tandon is co-founder and CTO of Looppanel where he builds software to help product teams record, store and analyze user research data. He is a co-author of Advanced Analytics with PySpark, published by O’Reilly. Previously, Akash worked as a senior data engineer at Atlan, SocialCops and RedCarpet where he built data infrastructure for enterprise, government and finance use-cases. He has also been a participant and mentor in the Google Summer of Code program with the R Project for Statistical Computing.
From Big Data to NLP insights: Getting started with PySpark and Spark NLP (Workshop)

Nikolay Manchev, PhD
Nikolay is an experienced Data Science professional who currently leads the EMEA Data Science team at Domino Data Lab. He holds an MSc in Software Technologies, an MSc in Data Science, and is currently undertaking postgraduate research at King’s College London. His area of expertise is Statistics, Mathematics, and Data Science in general, and his research interests are in Neural Networks with emphasis on biological plausibility. He writes articles and blogs regularly and speaks at various European conferences (ODSC, Big Data Spain, Strata, Big Data London etc.) to build awareness about data science and artificial intelligence. He is also the organizer of the London Data Science and Machine Learning meetup and recipient of several technical mastery awards like the Oracle ACE Award and the IBM Outstanding Technical Achievement Award.

Moez Ali
Innovator, Technologist, and a Data Scientist turned Product Manager with proven track record of building and scaling data products, platforms, and communities. Experienced in building and leading teams of data scientists, data engineers, and product managers. Strongly opinionated tech visionary and a thought partner to C-level leadership.
Moez Ali is the inventor and creator of PyCaret, an open-source, low-code machine learning software. Ranked in top 1%, 8M+ downloads, 7K+ GitHub stars, 100+ contributors, and 1000+ citations.
Globally recognized personality for open-source work on PyCaret. Keynote speaker and top ten most-read writer in the field of artificial intelligence. Teaching AI and ML courses at Cornell, NY and Queen’s University, CA. Currently building the world’s first hyper-focused Data and ML Platform.
Automate Machine Learning Workflows with PyCaret 3.0 (Workshop)

Benjamin Batorsky, PhD
Ben is a Senior Data Scientist at the Institute for Experiential AI at Northeastern University. He obtained his Masters in Public Health (MPH) from Johns Hopkins and his PhD in Policy Analysis from the Pardee RAND Graduate School. Since 2014, he has been working in data science for government, academia and the private sector. His major focus has been on Natural Language Processing (NLP) technology and applications. Throughout his career, he has pursued opportunities to contribute to the larger data science community. He has presented his work at conferences, published articles, taught courses in data science and NLP, and is co-organizer of the Boston chapter of PyData. He also contributes to volunteer projects applying data science tools for public good.
Bagging to BERT – A Tour of Applied NLP (Workshop)

Freddy Boulton
Freddy Boulton started his career as a data scientist for Nielsen where he built predictive models of television viewing behavior to make television ratings more accurate. This gave him a firsthand view of one of the biggest challenges faced by industry data scientists – being able to easily communicate and share machine learning models with stakeholders. He is currently solving that problem by working on Gradio, an open-source Python library that lets data scientists create fully interactive demos of machine learning models with just a few lines of code.
A Practical Tutorial on Building Machine Learning Demos with Gradio (Workshop)

Panos Alexopoulos, PhD
Panos Alexopoulos has been working since 2006 at the intersection of data, semantics, and software, building intelligent systems that deliver value to business and society. Born and raised in Athens, Greece, he currently works as Head of Ontology at Textkernel, in Amsterdam, Netherlands, where he leads a team of Data Professionals in developing and delivering a large cross-lingual Knowledge Graph in the HR and Recruitment domain. Panos holds a PhD in Knowledge Engineering and Management from National Technical University of Athens, and has published more than 60 papers at international conferences, journals and books. He is the author of the book “Semantic Modeling for Data – Avoiding Pitfalls and Breaking Dilemmas” (O’Reilly, 2020), and a regular speaker and trainer in both academic and industry venues.

Julien Simon
Julien is currently Chief Evangelist at Hugging Face. He previously spent 6 years at Amazon Web Services where he was the Global Technical Evangelist for AI & Machine Learning. Prior to joining AWS, Julien served for 10 years as CTO/VP Engineering in large-scale startups.
Hyper-productive NLP with Hugging Face Transformers (Workshop)

Chandra Khatri
Chandra Khatri is the Chief Scientist and Head of AI at Got It AI, where his team is transforming the AI space by leveraging state-of-the-art technologies to deliver the world’s first fully autonomous Conversational AI system. Under his leadership, Got It AI is democratizing Conversational AI and related ecosystems through automation. Prior to Got It AI, Chandra led various applied and research AI groups at Uber, Amazon Alexa and eBay.
At Uber, he led Conversational AI, Multi-modal AI, and Recommendation Systems. At Amazon, he was a founding member of the Alexa Prize Competition and Alexa AI, where he led R&D and helped advance the field of Conversational AI, particularly Open-domain Dialog Systems, which is considered the holy grail of Conversational AI and is one of the open-ended problems in AI. At eBay, he drove applied research projects in NLP, Deep Learning, and Recommendation Systems.
He graduated from Georgia Tech with a specialization in Deep Learning in 2015 and holds an undergraduate degree from BITS Pilani, India. His current areas of research include Artificial and General Intelligence, Democratization of AI, Reinforcement Learning, Language and Multi-modal Understanding, and Introducing Common Sense within Artificial Agents.
Truth Checker: Generative Large Language Models and Hallucinations (Talk)

Andras Zsom, PhD
Andras Zsom is an Assistant Professor of the Practice and Director of Graduate Studies at the Data Science Initiative at Brown University, Providence, RI. He is teaching two mandatory courses in the data science master’s program, and helps the students navigate through their studies and curriculum. He also supervises interns on various research projects related to missing data, interpretability, and developing machine learning pipelines.

Connor Shorten, PhD
Connor Shorten is a Research Scientist at Weaviate, an Open-Source Vector Search Database. Connor has had a role in the development of Ref2Vec, Hybrid Search, Generative Search, Weaviate’s Pipe API, and Re-Ranking. Connor has also hosted 34 episodes of the Weaviate podcast featuring guests from OpenAI, Cohere, You.com, MosaicML, Jina AI, Deepset, Neural Magic and many others! Connor also co-hosts Weaviate meetups in Boston and New York City! Prior to Weaviate, Connor earned a Ph.D. in Computer Science from Florida Atlantic University. Connor’s Ph.D. primarily focused on Data Augmentation in Deep Learning and Applications of Deep Learning for COVID-19. Connor’s publication “A survey on image data augmentation in deep learning” has achieved over 5,000 citations.
Building Recommendation Systems (Workshop)

Frank DeFalco
Frank DeFalco is the Director of Epidemiology Analytics at Janssen Research and Development where he architects software solutions and data platforms for the analysis and application of observational data sources. He is currently the leader and Benevolent Dictator of the OHDSI open source architecture working group. Frank is a presenter and panelist at OHDSI symposiums and has served as faculty for OHDSI symposium tutorial classes on architecture and common data model vocabulary. In addition to leading the OHDSI Architecture working group, Frank initiated development of a standardized platform for observational analytics known as ATLAS. He is an active contributor to the open source software repositories developed and released by OHDSI including ATLAS, WebAPI, Achilles, Circe, Arachne, Visualizations, Hermes, Helios and others. Frank’s areas of expertise include computational epidemiology, large scale data platforms, software development and architecture, data visualization and informatics. Prior to joining Janssen Research and Development, Frank held the position of Senior Principal and Director of Collaboration and Analytics at British Telecom where he was a strategic advisor for multiple Fortune 100 companies across sectors including Consumer Products, Telecommunications and Pharmaceuticals. Frank received his undergraduate degrees in Computer Science and Psychology at Rutgers University.
Patient Level Prediction with Supervised Learning Models in Federated Data Networks (Tutorial)

Matteo Pirotta
Bio Coming Soon!
Exploration in Reinforcement Learning (Tutorial)

Joe Dery, PhD
Joe Dery joined Western Governors University’s College of IT as the VP & Dean of Data Analytics in summer 2022. At WGU, Joe is working to help more than 3,000 current analytics students learn how to effect change in their professional roles – surgically balancing a combination of mathematics, data management, programming, and business influence skills. Prior to joining academia full-time, Joe spent much of his corporate career working for EMC – and later, Dell Technologies – where he joined as a “hands-on-keyboard” Data Scientist in 2011. Joe went on to hold leadership positions in Dell’s Sales, Finance, and Supply Chain organizations driving efforts in Data Science, Business Intelligence, Digital Strategy, and Digital Transformation. Across these domains, Joe’s efforts touched a wide variety of business problems, including ML-driven sales quota allocations, sales forecasting & opportunity prioritization, customer cross-sell/whitespace targeting, addressable marketing opportunity sizing, sales territory optimization, supply chain planning optimization, data/analytics literacy training, and self-service BI. Building from his experiences, Joe is often invited to speak on the crucial role of decision intelligence frameworks, change management, and “improv” in bringing analytics solutions to life. Joe holds a Ph.D. in Business Analytics & an M.S. in Marketing Analytics, both from Bentley University.
Unlock the Power of Data Science for Real Change: A Blueprint for Decision Intelligence (Track Keynote)

Avi Pfeffer, PhD
Dr. Avi Pfeffer is Chief Scientist at Charles River Analytics. Dr. Pfeffer is a leading researcher on a variety of computational intelligence techniques including probabilistic reasoning, machine learning, and computational game theory. Dr. Pfeffer has developed numerous innovative probabilistic representation and reasoning frameworks, such as probabilistic programming, which enables the development of probabilistic models using the full power of programming languages, and statistical relational learning, which provides the ability to combine probabilistic and relational reasoning. He is the lead developer of Charles River Analytics’ Figaro™ probabilistic programming language. As an Associate Professor at Harvard, he developed IBAL, the first general-purpose probabilistic programming language. While at Harvard, he also produced systems for representing, reasoning about, and learning the beliefs, preferences, and decision making strategies of people in strategic situations. Prior to joining Harvard, he invented object-oriented Bayesian networks and probabilistic relational models, which form the foundation of the field of statistical relational learning. Dr. Pfeffer serves as Action Editor of the Journal of Machine Learning Research and served as Associate Editor of Artificial Intelligence Journal and as Program Chair of the Conference on Uncertainty in Artificial Intelligence. He has published many journal and conference articles and is the author of a text on probabilistic programming. Dr. Pfeffer received his Ph.D. in computer science from Stanford University and his B.A. in computer science from the University of California, Berkeley.
More instructors added weekly
Free Pre-Bootcamp Primer Courses
Data, Coding, and AI preparation courses for ODSC Mini-Bootcamps
ODSC Bootcamp Primer Courses
These primer courses can be taken standalone or as part of our Mini-Bootcamp series. This foundations series is built from the ground up to boost your understanding of data-centric AI.

Hosted on Ai+ Training and included FREE as part of your ODSC AI Mini-Bootcamp Pass.
Pre-Bootcamp Workshop Dates *
Pre-Bootcamp Warmup Workshops are available both live and on-demand (post-date). * Schedule is subject to change.
Data Primer – available on-demand
SQL – available on-demand
Programming Primer Course with Python – available on-demand
AI Primer – Thursday, October 5th, 2023
Data Wrangling with Python – Thursday, October 19th, 2023
LLMs, Prompt Engineering, & Gen AI – Date to be Announced
Beginner to Advanced Level Training
From the Leading Instructors in the Industry
Machine Learning
Meta-learning for Machine Learning
Self Supervised learning; new techniques
Federated Learning for Data Privacy
Explainable AI and Bias in machine learning
Machine Learning at Scale using Apache Spark
Safety & Robustness in Machine Learning Modeling
Semi-supervised learning
Causal Inference with Machine Learning
Deep Learning
Deep Reinforcement learning
Deep Learning with PyTorch & TensorFlow
Deep Learning Deep Dive
Computer Vision 1/2 Day Training
Deep Learning with Keras
Introduction to Deep learning
Deepfakes Tutorial
Graph Representation Learning
NLP
Self Supervised learning; new techniques
Transfer Learning in NLP
Introduction to NLP and Topic Modeling
NLP Pre-trained Transformer Models with BERT, ERNIE, and GPT-2
State-of-the-Art NLP with PyTorch and TensorFlow
Semi-supervised learning
Hugging Face Transformer Library Workshop
Applications of NLP; Sentiment Analysis, Dialog Systems, and Semantic Search
ADDITIONAL TUTORIALS & WORKSHOPS
Machine Learning for Cyber Security
Real-time Streaming Analytics
MLOps and Machine Learning Pipelines
Introduction to Machine Learning Using scikit-learn
Auto Machine Learning (AutoML)
Distributed Machine Learning
Introduction to Data Analysis with Python Pandas
Machine Learning Workflow with Kubeflow & Kubernetes
Ai+ Training Tracks
Access included with select passes

Ai+ is the only hands-on training platform solely developed for AI practitioners. Keep training with the top names in the industry.
Choose your Pass
Virtual Bootcamp Orientation Sessions for in-person and virtual attendees
Monday | Virtual Mini-Bootcamp Training Sessions
Premium 1-Year Subscription to AI+ Training (value = $700)
Access to All Virtual Sessions & Events (Tue-Thu)
ODSC Keynotes & Talks (Wed-Thu)
4 Pre-Bootcamp live tutorials on Data Literacy, AI Literacy, Programming, and SQL (value = $796)
On-demand Access to All Conference recordings
Access to AI Solution Showcase Expo Area (Wed-Thu)
Access to ODSC In-Person Workshops & Training Sessions (Tue & Thu)
Access to In-person Mini-Bootcamp Training Sessions
Need More Reason To Sign Up?
ODSC Training Includes
Opportunities to form working relationships with some of the world’s top data scientists.
Access to 40+ training sessions and 70+ workshops.
Hands-on experience with the latest frameworks and breakthroughs in data science.
Affordable training: equivalent training at other conferences costs much more.
Professionally prepared learning materials, custom-tailored to each course.
Opportunities to connect with other ambitious, like-minded data scientists.
ODSC Newsletter
Stay current with the latest news and updates in open source data science. In addition, we’ll inform you about our many upcoming virtual and in-person events in Boston, NYC, Sao Paulo, San Francisco, and London. And keep a lookout for special discount codes, only available to our newsletter subscribers!