ODSC West 2022

 Schedule

more sessions added weekly

REGISTER NOW

Please Note: In-Person attendees will have access to virtual sessions. If you have a virtual pass, please note that we will not live-stream any in-person sessions. Only virtual sessions will be recorded.  The schedule overview is available HERE.

50+ Sessions Added

100+ more coming soon.

Bootcamp/Pre-Bootcamp

Pre-Bootcamp: Introduction to Data Course

Pre-Bootcamp: Introduction to Programming with Python Course

Pre-Bootcamp: Data Wrangling with Python Course

Pre-Bootcamp: Introduction to SQL Course

Pre-Bootcamp: Introduction to AI Course

70+ Sessions Added

100+ more coming soon.

West Keynotes / Talks
  Tuesday, 31st October
  Wednesday, 1st November
  Thursday, 2nd November
West Trainings / Workshops
  Monday, 30th October
  Tuesday, 31st October
  Wednesday, 1st November
  Thursday, 2nd November
West Bootcamp
  Monday, 30th October
  Pre-Bootcamp live training warm up
  Tuesday, 31st October
  Wednesday, 1st November
  Thursday, 2nd November
ODSC Keynote – Neural Networks Make Stuff up. What Should We do About it?

In-person | Keynote | Machine Learning | All Levels

 

In this talk, I’ll discuss some ways that we might cope with and address the unreliability of neural network models. As an initial coping strategy, I will first discuss a technique for detecting whether some content was generated by a machine learning model, leveraging the probability distribution that the model assigns to different content. Next, I will describe an approach for enabling neural network models to better estimate what they don’t know, such that they can refrain from making predictions on such inputs (i.e. selective classification). I will lastly describe methods for adapting models with small amounts of data to improve their accuracy under distribution shift…more details

Chelsea Finn
Assistant Professor | Stanford University
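The selective classification idea mentioned in the abstract above (refraining from a prediction when the model is unsure) can be sketched with a simple baseline. This is a minimal illustration, not the speaker's method: it assumes the model exposes raw class scores (logits), and the function names and the 0.8 threshold are hypothetical choices made for the example.

```python
import math
from typing import List, Optional


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def selective_predict(logits: List[float], threshold: float = 0.8) -> Optional[int]:
    """Return the predicted class index, or None to abstain.

    Assumption: the largest softmax probability is used as the confidence
    score. This is a common baseline, not necessarily a well-calibrated one.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None


# One clearly separated input and one ambiguous input (made-up scores).
confident = selective_predict([4.0, 0.5, 0.1])   # predicts class 0
uncertain = selective_predict([1.0, 0.9, 1.1])   # abstains (None)
```

Raising the threshold trades coverage for accuracy: the model answers fewer inputs, but the ones it does answer are those where it is most confident.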
Towards Explainable and Language-Agnostic LLMs

Virtual | Talk | NLP & LLMs | Intermediate

 

Large language models (LLMs) have achieved a milestone that undeniably changed many held beliefs in artificial intelligence (AI). However, there remain many limitations of these LLMs when it comes to true language understanding, limitations that are a byproduct of the underlying architecture of deep neural networks. Moreover, due to their subsymbolic nature, whatever knowledge these models acquire about how language works will always be buried in billions of microfeatures (weights), none of which is meaningful on its own, making such models hopelessly unexplainable…more details

Walid S. Saba
Senior Research Scientist | Institute for Experiential AI at Northeastern University
A Semi-Supervised Anomaly Detection System Through Ensemble Stacking Algorithm

Virtual | Talk | Machine Learning | Intermediate

 

Many retail giants are experiencing huge inventory loss and shrinkage problems because they process a huge number of transactions every day across the U.S. and offer a liberal shopping policy to provide a convenient customer shopping and return experience. Investigating customers’ return behaviors and preventing fraudulent activities is challenging for two reasons: 1) the transaction data sets are highly imbalanced and complex, since the volume of transactions is enormous while the different types of anomalies are exceedingly rare; 2) predefined labels are seldom available, since it is not feasible to have human experts manually review every transaction and identify anomalies…more details

Chuying Ma
Senior Data Scientist | Walmart
Representation Learning on Graphs and Networks

Virtual | Talk | Machine Learning | Intermediate

 

In this talk, I will attempt to provide several “bird’s eye” views on GNNs. Following a quick motivation on the utility of graph representation learning, I will derive GNNs from first principles of permutation invariance and equivariance. We will discuss how we can build GNNs that are not strictly reliant on the input graph structure…more details

Dr. Petar Veličković
Staff Research Scientist | Affiliated Lecturer | DeepMind | University of Cambridge
Connecting Large Language Models – Common Pitfalls & Challenges

Virtual | Talk | NLP & LLMs | Beginner

 

 

Generative models have demonstrated how helpful they can be on general knowledge, helping students with their writing assignments. But as soon as you want to run them in a professional setting with prompts like “what are the three main feature requests from our largest customers?”, they demonstrate their lack of knowledge. In this session, I will introduce how Large Language Models (LLMs) can be connected to your data via semantic search. As I will present, there are many pitfalls and challenges. Some can be solved with the right technologies; others are still open problems…more details

Nils Reimers
Director of Machine Learning | cohere.ai
Delivering Gen AI Solutions to Executives: A Nimble Framework for Rapid and Tailored Deployment

In-person | Case Study

 

By adopting the proposed framework, organizations can effectively harness the potential of Gen AI, empowering all employees with invaluable insights to make data-driven decisions. This talk aims to inspire organizations to embrace a safe agile approach and cultivate a culture of experimentation and collaboration for the successful deployment of Gen AI solutions…more details

Shea Watrin
Senior Manager Data Sciences | Amgen
Implementing Gen AI in Practice

In-person | Track Keynote | Generative AI | All Levels

 

 

Generative AI has taken the world by storm, but building Gen AI applications for production comes with a unique set of challenges. Questions around cost performance, risk, simplifying implementation for production, setting guardrails, adding automation where possible and leveraging CI/CD for ML all become even more important when Gen AI is involved…more details

Yaron Haviv
Co-Founder and CTO | Iguazio (acquired by McKinsey & Company)
Troubleshooting Large Language Models in Production with Embeddings and Evals

In-person | Talk | MLOps | Data Engineering & Big Data | Machine Learning | Deep Learning | Intermediate

 

In this presentation, Amber Roberts, Machine Learning Engineer at Arize AI, will present findings from research on ways to measure vector/embedding drift for image and language models. With lessons learned from testing different approaches (including Euclidean and Cosine distance) across billions of streams and use cases, Roberts will dive into how to detect whether two unstructured language datasets are different — and, if so, how to understand that difference using techniques such as UMAP…more details

Amber Roberts
Data Scientist, Growth Lead | Arize AI
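The abstract above names Euclidean and cosine distance as ways to measure embedding drift. A minimal sketch of that idea follows, under the assumption (not stated in the abstract) that each dataset is summarized by the centroid of its embedding vectors; the helper names and the tiny two-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def centroid(vectors: List[Vector]) -> List[float]:
    """Average the embedding vectors of one dataset, dimension by dimension."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean_distance(a: Vector, b: Vector) -> float:
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity: 0 for the same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Made-up embeddings: the "production" set points in a different direction.
baseline = [[1.0, 0.0], [0.9, 0.1], [1.1, -0.1]]
production = [[0.0, 1.0], [0.1, 0.9], [-0.1, 1.1]]

base_c = centroid(baseline)
prod_c = centroid(production)
euclid_drift = euclidean_distance(base_c, prod_c)
cos_drift = cosine_distance(base_c, prod_c)
```

Comparing centroids is only one possible summary; it says that the two sets differ on average, not how they differ, which is where a projection technique such as the UMAP mentioned in the abstract would come in.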
Completing Knowledge Discovery Fast at High Quality with AI

In-person | Talk | Machine Learning | All Levels

 

In this presentation, our speaker will commence by conducting a comprehensive review of a list of common failure factors. Additionally, they will present an AI-driven ecosystem approach for knowledge discovery and discuss a real-world test of this approach involving 14 knowledge discovery projects. Through this, attendees will gain valuable insights into the crucial connection between the success of data-driven knowledge discovery projects and advancements in AI technology…more details

Alex Liu, Ph.D.
Founder and Director | RMDS Lab
Building a Data-Driven Workforce

In-person | Talk | Data Visualization & Data Analysis | All Levels

 

Save time and money by equipping everyone with fundamental data literacy and analytics skills. It sounds great, but most organizations bite off more than they can chew. Instead of trying to turn everyone into a data analyst, teach people the most vital descriptive analytics skills, and eliminate the flow of tedious work requests to your data experts…more details

Dominic Bohan
Co-Founder | StoryIQ
Building Robust and Scalable Recommendation Engines for Online Food Delivery

Virtual | Talk | Machine Learning | Beginner – Intermediate

 

In this training session, we will delve into the intricacies of building robust and scalable recommendation engines specifically tailored for online food delivery services. We will introduce a newly released dataset called the Delivery Hero Recommendation Dataset (DHRD) and understand how this can be used for training different recommendation models. We will explore the challenges faced in this domain and discuss the techniques and best practices to overcome them, ensuring our recommendation systems can handle large-scale operations and adapt to changing customer preferences…more details

Raghav Bali
Staff Data Scientist | Delivery Hero
Vishal Natani
Manager, Data Science | Delivery Hero
Causality and LLMs

Virtual | Talk | Machine Learning | LLMs | Intermediate

 

A highlight of this workshop is the introduction to the “causal LLM”, a pioneering concept where LLMs are designed using foundational causal principles. By the end of this session, participants will be equipped with the skills and knowledge to employ LLMs effectively in discerning complex causal models and pioneering advancements in the field of causal AI…more details

Robert Osazuwa Ness, PhD
Senior Researcher | Microsoft
Attribution and Moral Rights in Generative AI

Virtual | Talk | Generative AI | All Levels

 

 

Human creators have provided the works (whether prose writing, source code, or visual works) that act as the basis for training huge DNNs, which arguably creates an obligation for the AI outputs. This feels most evident when prompts ask AIs to create something “in the style of such-and-such-human.” While such outputs are often flawed in interesting ways, they are also usually recognizable in their connection to the prompted human creator. What rights should those source humans have to control those uses, including the moral right simply to be formally recognized as the source? Few laws exist governing attribution and moral rights in generative AI, but many will come to exist soon. Laws and technical standards may follow good or bad principles, both ethical and technical…more details

David Mertz, Ph.D.
Director of Epistemology | KDM Training
Attack on Machine Learning, Defend with MLOps

Virtual | Talk | MLOps and Data Engineering | Intermediate

 

 

With the wide adoption of generative artificial intelligence (AI), ensuring the robustness of machine learning models is becoming more crucial than ever. One of the most concerning security threats to machine learning (ML) is the adversarial attack, a technique that exploits vulnerabilities of models to cause incorrect output. Such attacks are imperceptible to humans but sufficient to make machine learning models misclassify the data, potentially harming end users. It is therefore important to consider including adversarial training in the ML lifecycle as you build out your model and prepare it for production usage. Join this talk to learn how the symbiosis between ML security open source projects like Adversarial Robustness Toolbox and an ML operations (MLOps) project, Kubeflow, can streamline your machine learning workflow and improve your model’s robustness and security. Use the numerous benefits of MLOps for generating and defending your model to accelerate your development of secure machine learning models…more details

Anna Jung
Sr. ML Open Source Engineer | VMware
PyTorch 2.1 – New Developments

Virtual | Talk | Machine Learning | Deep Learning | Intermediate-Advanced

 

In this session we will deep dive into all the new developments and techniques in PyTorch and provide recommendations on how you can accelerate your models using native PyTorch code…more details

Supriya Rao
Engineering Manager | Meta
How to Deliver Contextually Accurate LLMs

In-person | Talk | LLMs | Machine Learning | Intermediate

 

 

In the realm of advanced computational linguistics, the efficacy of Large Language Models (LLMs) is intrinsically tied to their contextual precision. In this session presented by Jake from Cloudera (not State Farm), we’ll navigate the complexities of ensuring LLMs yield contextually accurate results, a necessity in today’s intricate data environments. Crucially, attendees will be treated to a live demonstration showcasing the utilization of RAG (Retrieval-Augmented Generation) and PEFT (Parameter Efficient Fine-Tuning) techniques, two of the leading approaches for this task that underpin the success of Cloudera’s Applied ML Prototypes (AMPs)…more details

 

Jake Bengtson
Sr. Product Marketing Manager | Cloudera
General and Efficient Self-supervised Learning with data2vec

In-person | Talk | NLP & LLMs | GenAI | Advanced

 

In this talk, I will present data2vec, a framework for general self-supervised learning that uses the same learning method for either speech, NLP or computer vision. The core idea is to predict latent representations of the full input data based on a masked view of the input in a self-distillation setup using a standard Transformer architecture. Instead of predicting modality-specific targets such as words, visual tokens or units of human speech which are local in nature, data2vec predicts contextualized latent representations that contain information from the entire input. Experiments on the major benchmarks of speech recognition, image classification, and natural language understanding demonstrate a new state of the art or competitive performance to predominant approaches…more details

Michael Auli
Principal Research Scientist | Director | FAIR | Meta AI
The Crucial Role of Digital Experimentation and A/B Testing in the AI Landscape

Virtual | Talk | Responsible AI | Intermediate

 

 

Digital experimentation and A/B testing are invaluable methodologies within the AI domain, facilitating the validation and continuous improvement of models, solutions, and systems. This talk will delve into the intricate role of these testing mechanisms in the AI landscape.
We will start by introducing the concept of digital experimentation and A/B testing, elaborating on their integral role in testing hypotheses and making data-driven decisions. The discussion will further touch upon the traditional uses of these methodologies in the digital marketing sphere, enabling businesses to optimize their online content and increase user engagement…more details

Alessandro Romano
Senior Data Scientist | Kuehne+Nagel
Data Morph: A Cautionary Tale of Summary Statistics

In-person | Talk | Machine Learning | Beginner

 

Statistics do not come intuitively to humans; they always try to find simple ways to describe complex things. Given a complex dataset, they may feel tempted to use simple summary statistics like the mean, median, or standard deviation to describe it. However, these numbers are not a replacement for visualizing the distribution…more details

Stefanie Molin
Software Engineer, Data Scientist, Chief Information Security Office, Author of Hands-On Data Analysis with Pandas | Bloomberg
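The point of the abstract above, that summary statistics are not a replacement for looking at the distribution, can be illustrated with a small sketch. The two datasets below are made up for this example (they are not from the talk or from the Data Morph tool): they share the same mean, median, and population standard deviation, yet one is split into two clusters and the other is concentrated at the center.

```python
import math
import statistics

# Hypothetical data: two clusters at -1 and +1.
two_clusters = [-1.0, -1.0, 1.0, 1.0]

# Hypothetical data: mass at 0 with two symmetric outliers at +/- sqrt(2).
center_heavy = [-math.sqrt(2), 0.0, 0.0, math.sqrt(2)]


def summarize(data):
    """Return (mean, median, population standard deviation) for a dataset."""
    return (
        statistics.mean(data),
        statistics.median(data),
        statistics.pstdev(data),
    )


summary_a = summarize(two_clusters)
summary_b = summarize(center_heavy)

# The summaries agree (up to floating-point rounding), but the values do not.
same_summary = all(abs(a - b) < 1e-9 for a, b in zip(summary_a, summary_b))
same_values = sorted(two_clusters) == sorted(center_heavy)
```

Plotting either dataset (for example as a histogram) would reveal the difference immediately, which is the case the abstract makes for visualizing before summarizing.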
Security First, Create a Robust Machine Learning Model

Virtual | Talk | Machine Learning Safety and Security | Intermediate

 

This session will focus on security threats to machine learning models, including a demonstration of the similarities and differences between adversarial attacks in the domains of computer vision and natural language processing (NLP) with examples from open source projects like Adversarial Robustness Toolbox and TextAttack. Join the session to discuss applying adversarial research to real-world systems and learn how to be proactive with machine learning security…more details

Teodora Sechkova
Open Source Software Engineer | VMware
The Devil in the Details: How defining an NLP task can undermine or catalyze its successful implementation

Virtual | Talk | NLP | Beginner

 


In this talk, we will delve into the intricate relationship between NLP task definition and the outcomes of NLP system development, exploring how the precise formulation of tasks can either propel or hinder progress. Through case studies and examples, attendees will understand how ambiguous or biased task definitions can lead to misguided model objectives, high data acquisition costs, and misleading system evaluations, and learn how to strike the right balance between specificity and adaptability in NLP task design…more details

 

 

Panos Alexopoulos, PhD
Head of Ontology | Textkernel BV
The Open Source ML Advantage

In-person | Talk | All Levels

 

Open source technology has historically driven technological progress, from operating systems, to web development, and now, AI and machine learning. We’re at a point in history where the canonical “open source ML stack” has yet to be discovered. In this talk, you’ll hear about HPE’s approach to building an end-to-end ML stack, including Determined AI for model training and Pachyderm for data preparation. You’ll also see a live demo of how to use Determined AI for a transfer learning use case in the biomedical image domain. We’ll also touch on some key collaborations, including our partnerships with the AI Infrastructure Alliance, NVIDIA, and Aleph Alpha…more details

Isha Ghodgaonkar
Machine Learning Developer Advocate | Hewlett Packard Enterprise (HPE)
Using Machine Learning to Discover Business Insights

In-person | Ai X Talk | Intermediate

 

In an era of Generative AI and Large Language Models, using old school machine learning might sound quaint. The reality is, however, that hidden within your corporate data is a trove of information and business patterns. Leveraging key techniques from Machine Learning can help businesses uncover hidden patterns that will have an immediate impact on your bottom line and on your ROI…more details

Ryohei Fujimaki, PhD
CEO | dotData
Understanding the Landscape of Large Models

In-person | Talk | NLP & LLMs | GenAI | Intermediate

 

There seems to be a new large ML model grabbing headlines every week. Whether it’s OpenAI’s big releases like GPT-3, Dalle-2 or Whisper, or one of the many open source projects generating state-of-the-art models, like Stable Diffusion, OpenFold or Craiyon, these models have found their way into the mainstream. Lukas will map the landscape for you and share how these teams use W&B to accelerate their work…more details

Lukas Biewald
Founder | Weights & Biases
Intersection of Gen AI and Legal, or How to Apply LLMs as Agents in Production Features.

In-person | Ai X Talk

 

Abstract Coming Soon!

Cai GoGwilt
Co-Founder and Chief Architect | Ironclad
Adopting Language Models Requires Risk Management — This is How

Virtual | Talk | LLMs | ML/AI Safety and Security | All Levels

 

Language models are incredible engineering breakthroughs but require auditing and risk management before productization. These systems raise concerns about toxicity, transparency and reproducibility, intellectual property licensing and ownership, disinformation and misinformation, supply chains, and more. How can your organization leverage these new tools without taking on undue or unknown risks? While language models and associated risk management are in their infancy, a small number of best practices in governance and risk are starting to emerge. If you have a language model use case in mind, want to understand your risks, and do something about them, this presentation is for you…more details

Patrick Hall
Assistant professor | Principal Scientist | George Washington University School of Business | BNH.AI
Building Generative AI Applications: An LLM Case Study

In-person | Talk | Generative AI | NLP | Machine Learning | Research Frontiers (Research topics in ML, DL, Data Science etc) | All Levels

 

This talk will dive into the end-to-end process of and framework for building a generative AI application, leveraging a fun and engaging case study with open-source tooling (e.g., HuggingFace models, Python, PyTorch, Jupyter Notebooks). We will guide attendees through key stages from model selection and training to deployment, while also addressing fine-tuning versus prompt-engineering for specific tasks, ensuring the quality of output, and mitigating risks…more details

Michelle Yi
Board Member | Women in Data
AI and Video Games: The Evolution

In-person | Talk | Beginner

 

 

Little is known about the history of Neural Networks or, for instance, “Artificial Intelligence.” The history goes back to around 1946, when researchers noticed that the linear-algebra mathematics of stretched materials, in which neighboring atoms are affected, was best modeled with a branch of math known as tensors. Tensors are used today to create neural networks. Neural networks are run on a powerful Graphics Processing Unit (GPU). GPUs came about because of video games and entertainment. Thus we can say that video games laid the groundwork for AI…more details

Jack McCauley
Board Trustee at University of California, Berkeley, Former co-founder and Engineer, Oculus VR. Faculty Member Jacobs Institute, McCauley Chair in Drug Policy Innovation at RAND Corporation, MSRI Trustee | Black Lab LLC
The English SDK for Apache Spark™

In-person | Talk | LLMs | Machine Learning | Intermediate

 

 

In the fast-paced world of data science and AI, we will explore how large language models (LLMs) can elevate the development process of Apache Spark™ applications.

We’ll demonstrate how LLMs can simplify SQL query creation, data ingestion, and DataFrame transformations, leading to faster development and more precise code that’s easier to review and understand. We’ll also show how LLMs can assist in creating visualizations and clarifying data insights, making complex data easy to understand…more details

Allison Wang
Senior Software Engineer | Databricks
Gengliang Wang
Senior Software Engineer | Databricks
Capturing CAP in a Kappa Data Architecture

In-person | Talk | Data Analytics and Big data | Machine Learning | MLOps | Beginner – Intermediate

 

When choosing between architectures for your enterprise data environment, you always want to know how to manage the trade-offs in the CAP theorem. The CAP theorem states that you cannot have consistency, availability, and partition-tolerance at the same time, but what if choosing a Kappa architecture makes it possible to have it all?…more details

Joep Kokkeler
Senior Data Engineer | Dataworkz NL
Hidden Insights in Financial Audio

In-person | Talk | All Levels

 

Attendees will learn how to apply NLP to financial data. Earnings data covers all industries in the US; attendees will be surprised to learn how much information is hiding in plain sight. Earnings calls are public information and I’m going to show you what you’re able to learn from them using ASR and NLP tools.


Speech recognition technology has come a long way in recent years, but there are still some larger market challenges when it comes to transcribing speech accurately. This comes into play especially for domain-specific terminology around certain industries. For example, in the renewable energy industry, transcription might include terms like “hydro DM,” “biomass conversion,” and “renewable portfolio standard,” that are outside normal conversation…more details

 

Katie Kuzin
Product Lead, Voice to Text | Kensho Technologies, S&P Global Market Intelligence
Data Science Applied to Manufacturing Problems

In-person | Talk | Beginner – Intermediate

 

Modern manufacturers are looking for data science solutions to help move their three key metrics: produce more, be efficient and optimize resource utilization, and ship with the highest possible quality. Given these three KPIs, we will see how data science influences each of them and take a deep dive into the key projects that have moved them. We drill down on specific projects, e.g. how we use recursive hypothesis testing to reduce the testing frequency, how we understand the physics behind a test and optimize it, and how we make mass production decisions based on design of experiments and small distribution analysis…more details

Angad Arora
Manufacturing Data Scientist | Google
Keras Core: Keras for TensorFlow, JAX, and PyTorch

In-person | Talk | Deep Learning | Machine Learning | Intermediate

 

Keras, the popular deep learning library, is now becoming multi-backend with support for TensorFlow, JAX, and PyTorch. Available now, Keras Core allows developers to create models with all of the simple high-level components of Keras while interchanging between frameworks to take advantage of the benefits of each. Existing users of each framework can seamlessly integrate their code, including backend-specific code, with Keras. Additionally, Keras Core’s ops suite, which includes the NumPy API, allows you to develop custom components for use on any framework. For current users of tf.keras, Keras Core with the TensorFlow backend serves as a drop-in replacement, meaning that changing your imports is all that is necessary to start taking advantage of the framework-agnostic future…more details

Neel Kovelamudi
Software Engineer on Keras Team | Google
Integrating Language Models for Automating Feature Engineering Ideation

In-person | Talk | LLMs | Machine Learning | Intermediate

 

 

In this presentation, we explore an approach that utilizes LLMs to guide feature engineering. By leveraging the contextual understanding within LLMs, we have developed a system for LLM-assisted feature engineering. Our research demonstrates the practical benefits of this synergy – from feature ideation to improved feature relevance to enhanced model interpretability and efficiency…more details

 

Sergey Yurgenson
Head of Semantic Data Science | Featurebyte
ODSC Keynote – Human Centered AI

In-person | Keynote | Machine Learning | All Levels

 

We have seen amazing technical progress in AI applications in recent years. This talk considers the human side rather than the technical side: how can we gain confidence that our applications will be fair, just, truthful, beneficial, and well-suited for their users, the other stakeholders, and society at large…more details

Peter Norvig
Engineering Director | Education Fellow | Google | Stanford Institute for Human-Centered Artificial Intelligence (HAI)
Driving Success for Sellers by Infusing AI in CRM Platform

In-person | Business Talk | All Levels

 

By harnessing the power of machine learning and AI, companies can unlock advanced capabilities and gain valuable insights that streamline sales processes and enhance sellers’ performance. This infusion eliminates the laborious task of manual information analysis across multiple disparate data sources, empowering sellers to dedicate their time and expertise to building meaningful customer relationships. In architecting AI-based solutions, it is highly important to understand the users’ daily workflow and challenges, and to surface not just insights but the right information at the right place to simplify users’ workflow…more details

Sarah Kefayati
Associate Principal Data Scientist | IBM
Generative AI in Enterprises: Unleashing Potential and Navigating Challenges

In-person | Ai X Talk | Generative AI

 

In this session you will learn about some early experiences with, and best practices for, deploying Generative AI solutions in enterprises…more details

Rama Akkiraju
VP AI/ML for IT | NVIDIA
Accelerate your AI/ML Initiatives and Deliver Business Value Quickly by Implementing Practical and Innovative Approaches

In-person | Ai X Talk | Intermediate

 

A critical success factor for enterprise-level AI adoption and sponsorship is the ability to rapidly deploy AI technologies, solve use cases quickly, and deliver iterative business value to stakeholders. In this session, we will discuss the challenges of AI adoption and share some pragmatic approaches that enterprises can adopt to accelerate their AI / ML initiatives and deliver quick and iterative business value. Examples include leveraging pre-integrated AI frameworks and automation techniques such as AutoML, along with ready-to-use industry-specific AI solutions. The faster and more incremental the delivery of business value through AI, the higher the likelihood of successful adoption and implementation of AI within enterprises…more details

Durga Kota
Chief Technology Officer | Fujitsu North America, Inc.
From Raw Data through Vectors to a Comprehensive Recommendation Model

In-person | Talk | Intermediate-Advanced

 

In today’s data-driven world, recommendation systems have become ubiquitous, driving user engagement and increasing revenue for businesses across various domains. This talk will take you on a journey from the raw data to vectorization techniques, ultimately culminating in the creation of a robust recommendation system. Whether you’re a seasoned data scientist or just starting your journey into recommendation systems, this presentation will provide valuable insights and practical takeaways for building powerful recommendation engines…more details

From Raw Data through Vectors to a Comprehensive Recommendation Model image
Hudson Buzby
Solution Architect | Qwak
Bridging the Interpretability Gap in Customer Segmentation

In-person | Talk | Machine Learning | Deep Learning | Intermediate

 

In this talk I will present a new, hybrid approach which combines the best aspects of both methods. The process begins with a careful observation of customer data and assessment of whether there are naturally formed clusters in the data. It continues with the selection of a clustering algorithm and the fine-tuning of a model to create clusters…more details

Bridging the Interpretability Gap in Customer Segmentation image
Evie Fowler
Senior Data Scientist | Fulcrum Analytics
Machine Learning Has Become Necromancy

In-person | Talk | Machine Learning | Intermediate

 

Machine learning has undergone a profound transformation with open source, from a technology that could only do naive curve fitting to a technology that could potentially end humanity’s dominion. In 2017, Ali Rahimi declared that machine learning is the new alchemy, and we’d like to go further and claim that machine learning is the new necromancy: a forgotten science with a passionately strong open source ethos that was eventually destroyed by the Catholic Church…more details

Machine Learning Has Become Necromancy image
Mark Saroufim
Author Breaking Stagnation
Fine-tuning LLMs on Slack Messages

In-person | Talk | NLP & LLMs | Intermediate-Advanced

 

 

In this session, we will take a deep dive into a novel application of AI: training Large Language Models (LLMs) on individual employees’ Slack messages. The first portion of our discussion is dedicated to the technical aspects of this process, where we will explain the steps involved in fine-tuning the LLM. We will demonstrate how such models, tailored to mimic specific individuals’ textual styles, can serve as the foundation for applications in text generation and automated question answering systems.

Transitioning into the second part of our talk, we will spotlight the often-underemphasized side of AI deployment – the risks and ethical concerns. Using our project as a case study, we will expose you to the potential pitfalls we encountered and the proactive measures taken to manage these risks. Our aim is to underline the necessity of an encompassing risk management framework for AI…more details

Fine-tuning LLMs on Slack Messages image
Eli Chen
CTO and Co-Founder | Credo.AI
Running Data Quality Checks in Your Data Pipelines

In-person | Talk | MLOps | Intermediate

 

 

Ensuring proper data quality is critical in the effective implementation of data pipelines for ML, data science, geospatial analysis, or general analytics.
Most engineering teams address data quality and pipeline orchestration as two separate tasks. In this presentation, Sandy Ryza will explain the benefits of a model in which arbitrary checks are included in the data orchestration logic, resulting in better control and integration of data quality checks at various steps in the pipeline…more details

Running Data Quality Checks in Your Data Pipelines image
Sandy Ryza
Lead Engineer on the Dagster Project | Dagster Labs
Scope of LLMs and GPT Models in Security Domain

In-person | Talk | Generative AI | Machine Learning Safety and Security | Beginner-Intermediate

 

In today’s digital world, we have access to everything at our fingertips. We generate and consume data at lightning speed, and data is growing at an exponential rate. But how safe is your data? Our data is always surrounded by advanced persistent threats (APTs). At the same time, Data Science, Machine Learning, and Artificial Intelligence (AI), with sub-domains like Deep Learning and Large Language Models (LLMs), are making tremendous progress. In this talk, I will focus on advanced technologies like LLMs and GPT models in the domain of cyber security, although some of these concepts apply to other industries as well…more details

Scope of LLMs and GPT Models in Security Domain image
Nirmal Budhathoki
Senior Data Scientist | Microsoft
Battle Scars from the MLOps Trenches of the Robotaxi Industry

In-person | Talk | Machine Learning | MLOps and Data Engineering | All Levels

 

The robotaxi industry has one of the most extreme use cases for Machine Learning, with very large image/LiDAR datasets and resource-hungry training jobs. I will recount my experience leading the ML Infrastructure team at Cruise, the #1 robotaxi company. From training on desktop GPUs to being entirely cloud-based, I will detail the challenges we faced and how we solved them. I will also detail what guarantees need to be built into any production-grade ML platform to ensure fast, robust, and safe ML development…more details

Battle Scars from the MLOps Trenches of the Robotaxi Industry image
Emmanuel Turlay
Founder/CEO | Sematic
Scaling your Data Science Workflows by Changing a Single Line of Code

In-person | Talk | MLOps | Data Engineering & Big Data | All Levels

 

Over the past decade, the democratization of data science tooling, particularly through Python libraries like pandas and NumPy, has empowered practitioners of all levels to work with data efficiently. Yet, despite the popularity of these tools, they present challenges as practitioners look to scale their workflows to production. In this talk, we explore the limitations of these tools and pain points that data scientists encounter when working with data at scale. I will share how our open-source project Modin (10M+ downloads) addresses this issue by seamlessly scaling up your pandas code with just a single line of code change…more details

Scaling your Data Science Workflows by Changing a Single Line of Code image
Doris Lee
CEO and Cofounder | Ponder
Democratizing Fine-tuning of Open-Source Large Models with Joint Systems Optimization

In-person | Talk | Deep Learning | MLOps and Data Engineering | Intermediate – Advanced

 

In this talk, we’ll provide an overview of the core ideas behind Saturn, how it works on a technical level to reduce runtimes & costs, and the process of using Saturn for large-model finetuning. We’ll demonstrate how Saturn can accelerate and optimize large-model workloads in just a few lines of code and describe some high-value real-world use cases we’ve already seen in industry & academia…more details

Democratizing Fine-tuning of Open-Source Large Models with Joint Systems Optimization image
Kabir Nagrecha
PhD Student | UC San Diego
MLOps v LMOps – What’s Different?

In-person | Talk | Machine Learning

 

 

Deploying advanced Machine Learning technology to serve customers and/or business needs requires a rigorous approach and production-ready systems, which has led to the development of MLOps. Large models make rigorous engineering and scalable architectures even more important, which in turn has led to the emergence of LMOps. Just the size of the models themselves, and the datasets used for training, require highly efficient infrastructure. More complex pipeline topologies which include transfer learning, fine tuning, instruction tuning, and evaluation along a collection of complex dimensions, require a high degree of flexibility for customization. Add to this the complex inference-time systems and requirements, and LMOps starts to look somewhat challenging to implement…more details

MLOps v LMOps – What’s Different? image
Robert Crowe
Product Manager, MLOps and TF OSS | Google
A Unified and User Friendly Approach to Develop ML Solutions in MySQL HeatWave AutoML

In-person | Talk | Deep Learning | NLP | Machine Learning | MLOps and Data Engineering | Beginner – Intermediate

 

We discuss a unified and user-friendly approach to developing ML applications on MySQL HeatWave AutoML. We describe a simple-to-use MySQL API for HeatWave AutoML, its extension for different use cases, and its ability to interface with third-party applications. We also discuss how this facilitates the development of classification, regression, anomaly detection, forecasting, and recommendation system use cases, and compare it with other platforms that offer ML features for databases. Lastly, we present how a user can fine-tune the AutoML pipeline for their needs…more details

A Unified and User Friendly Approach to Develop ML Solutions in MySQL HeatWave AutoML image
Sanjay Jinturkar
Senior Director, MySQL HeatWave | Oracle
A Unified and User Friendly Approach to Develop ML Solutions in MySQL HeatWave AutoML image
Sandeep Agrawal, PhD
Consulting Principal Member of Technical Staff | Oracle
Uncertainty Quantification: Approaches and Methods

In-person | Half-Day Training | Machine Learning | Intermediate-Advanced

 

 

As machine learning models have become more present in our lives, there has been increasing attention on the reliability of these models. A major component of this is understanding how uncertain the model is about its prediction. No model is exactly right 100% of the time, so we need methods and approaches by which we can quantify the level of uncertainty around a prediction.

Approaches to uncertainty quantification (UQ) vary and depend on the type of problem. For classification problems, the primary approach is probability calibration: making sure that the model outputs corresponding to each class “behave well” as probabilities. For regression problems, there are several different approaches. One can configure models to output an interval, rather than a single point prediction, along with a “coverage” value that specifies the probability that the interval covers the true value. The framework of Conformal Prediction provides theoretical guarantees around such interval predictions. Alternatively, one can use methods that output an entire conditional density for y given X. This is called probabilistic regression, or conditional density estimation. Several parametric and non-parametric approaches exist for this problem, including PrestoBoost, Coarsage, and NGBoost…more details
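
As a rough illustration of the interval-plus-coverage idea described above, here is a minimal sketch of split conformal prediction for regression. It is not taken from the training material: the synthetic data, the simple least-squares model, and the helper names `fit_line` and `conformal_halfwidth` are all assumptions made for the example.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def conformal_halfwidth(abs_residuals, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration residuals."""
    n = len(abs_residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this coverage level
    return sorted(abs_residuals)[k - 1]


def sample(n, rng):
    """Synthetic data (an assumption): y = 1 + 2x + unit Gaussian noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


rng = random.Random(0)
x_train, y_train = sample(500, rng)   # used to fit the model
x_cal, y_cal = sample(500, rng)       # held out to calibrate interval width
x_test, y_test = sample(2000, rng)    # used only to check empirical coverage

a, b = fit_line(x_train, y_train)
cal_residuals = [abs(y - (a + b * x)) for x, y in zip(x_cal, y_cal)]
q = conformal_halfwidth(cal_residuals, alpha=0.1)  # target 90% coverage

# Prediction interval for any x is [a + b*x - q, a + b*x + q].
covered = sum(abs(y - (a + b * x)) <= q for x, y in zip(x_test, y_test))
coverage = covered / len(x_test)
print(f"half-width: {q:.2f}, empirical coverage: {coverage:.3f}")
```

The half-width `q` comes only from held-out residuals, so the same recipe should work with any point predictor in place of the fitted line.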

Uncertainty Quantification: Approaches and Methods image
Brian Lucena, PhD
Principal | Numeristical
Change the Game with Graph ML

In-person | Half-Day Training | Machine Learning | All Levels

 

In this 90-minute hands-on lab, you’ll learn how to start your Graph journey with ArangoDB. We will walk you through getting a basic deployment up and running in minutes. We will then help you create a deployment in the ArangoGraph insights platform and learn how to load and query your data…more details

Change the Game with Graph ML image
Asif Kazi
VP of Technical Customer Success | ArangoDB
Change the Game with Graph ML image
Arthur Keen, PhD
Principal Solution Architect | ArangoDB
Idiomatic Pandas

In-person | Workshop | Machine Learning | Beginner

 

Pandas can be tricky, and there is a lot of bad advice floating around. This tutorial will cut through some of the biggest issues I’ve seen with Pandas code after working with the library for a while and writing three books on it…more details

Idiomatic Pandas image
Matt Harrison
Python & Data Science Corporate Trainer | Consultant | MetaSnake
Generative AI, Autonomous AI Agents, and AGI – How new Advancements in AI will Improve the Products we Build

Virtual | Training | Generative AI | All Levels

 

This engaging workshop provides a hands-on journey into the world of Generative AI and Autonomous Agents, crucial building blocks towards achieving Artificial General Intelligence (AGI). Participants will delve into the transformative shift in AI, gaining insights into the revolutionary impact of Generative AI across various domains such as text, image, video, and 3D object generation, as well as data augmentation. The workshop will equip attendees with cutting-edge tools to substantially boost their AI capabilities and efficiency. A key feature of the session is a practical exercise in crafting an Autonomous AI Agent, a technology poised to work in synergy with us in the near future. This succinct yet thorough workshop is tailored for those eager to maintain a competitive edge in the swiftly advancing field of AI…more details

Generative AI, Autonomous AI Agents, and AGI – How new Advancements in AI will Improve the Products we Build image
Martin Musiol
Co-Founder and Instructor | Principal Data Science Manager | Generative AI.net | Infosys Consulting
MLOps: Monitoring and Managing Drift

In-person | Training | MLOps and Data Engineering | Machine Learning | Deep Learning | All Levels

 

Our objective is to ensure that you are equipped with the essential knowledge and practical tools to proficiently manage your machine-learning models in a real-world production environment…more details

MLOps: Monitoring and Managing Drift image
Oliver Zeigermann
Blue Collar ML Architect | Freelancer
Overview of Mojo🔥: Usability of Python, Performance of C

In-person | Workshop | Machine Learning | Deep Learning | NLP | ALL | Intermediate

 

In this hands-on, example-driven workshop, we’ll introduce Mojo🔥 language features by starting with Python code and making minor changes to convert it into high-performance Mojo🔥 code. Mojo🔥 provides full interoperability with the Python ecosystem, and we’ll show you how to integrate Mojo🔥 into your existing Python workflows. We’ll share the workshop material as a hosted tutorial with several Mojo🔥 scripts and Jupyter Notebooks which you can use as a starting point for your projects…more details

Overview of Mojo🔥: Usability of Python, Performance of C image
Shashank Prasanna
AI Developer Advocate | Modular
Bridging the Gap: Light Code Solutions to Uniting Social Science and Modern Knowledge Graphs

In-person | Workshop | Data Engineering & Big Data | Machine Learning | Generative AI | Data Visualization | Beginner

 

In this presentation, Alison will take you through the example of one independent researcher’s journey from no-code to knowledge graph master. She will walk you through the steps the researcher took, including how to format data stores, what data to collect, ways to leverage LLMs in the process, and ultimately how to end up with a working knowledge graph in a dashboard with minimal coding experience…more details

Bridging the Gap: Light Code Solutions to Uniting Social Science and Modern Knowledge Graphs image
Alison Cossette
GDS Developer Advocate - Data Scientist | Neo4j, Inc.
Missing Data: A Synthetic Data Approach for Missing Data Imputation

Virtual | Workshop | Machine Learning | Intermediate

 

In this talk, we will cover the use of generative models, such as LLMs and GANs, to generate smart synthetic data that can be leveraged to impute missing data. By using a generative model to impute missing data, we can generate new samples that are representative of the underlying data distribution, which can help to reduce the impact of missing data on our models. In addition, these models can be fine-tuned to specific datasets, allowing us to generate synthetic data that is tailored to our particular use case…more details

Missing Data: A Synthetic Data Approach for Missing Data Imputation image
Fabiana Clemente
Co-founder and CDO | YData
Deploying Trustworthy Generative AI

Virtual | Tutorial | GenAI | LLM | All Levels

 

The tutorial will consist of the following parts: (1) Introduction and overview of the generative AI landscape, (2) Technical and ethical challenges with generative AI, and (3) Solutions for alleviating the challenges with real-world use cases and case studies (including practical challenges and lessons learned in industry)…more details

Deploying Trustworthy Generative AI image
Krishnaram Kenthapadi
Chief AI Officer & Chief Scientist | Fiddler AI
Causal AI: from Data to Action

Virtual | Workshop | Machine Learning | Intermediate

 

 

In this talk, we will explore and demystify the world of Causal AI for data science practitioners, with a focus on understanding cause-and-effect relationships within data to drive optimal decisions. We will cover:

* from Shapley to DAGs: the dangers of using post-hoc explainability methods as tools for decision making, and why traditional ML isn’t suited to situations where we want to perform interventions on the system.
* discovering causality: how do we figure out what is causal and what isn’t, with a brief introduction to methods of structure learning and causal discovery
* optimal decision making: by understanding causality, we can accurately estimate the impact we can make on our system. How do we use this knowledge to derive the best possible actions to take?…more details
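
To make the gap between observing and intervening concrete, here is a small simulation sketch. It is not from the workshop: the three-variable system (a confounder `z` driving both `x` and `y`), its coefficients, and the helper names are assumptions chosen for illustration. The true causal effect of `x` on `y` is set to 2.0, yet a regression on observational data should land near 3.5 because of the confounder.

```python
import random


def slope(xs, ys):
    """Least-squares slope of y on x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n, intervene, rng):
    """Draw (x, y) pairs from an assumed system with a hidden confounder z.

    Observational regime: x depends on z.
    Interventional regime: x is set independently of z, as in do(x).
    """
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = rng.gauss(0, 1) if intervene else z + rng.gauss(0, 1)
        y = 2.0 * x + 3.0 * z + rng.gauss(0, 1)  # true effect of x is 2.0
        xs.append(x)
        ys.append(y)
    return xs, ys


rng = random.Random(0)
observed_slope = slope(*simulate(20000, intervene=False, rng=rng))
interventional_slope = slope(*simulate(20000, intervene=True, rng=rng))

print(f"slope from observational data:  {observed_slope:.2f}")
print(f"slope under intervention on x:  {interventional_slope:.2f}")
```

A purely predictive model fit to the observational data would be accurate for forecasting but would overstate what changing `x` achieves, which is the kind of situation the first bullet warns about.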

Causal AI: from Data to Action image
Dr. Andre Franca
CTO | connectedFlow
The AI Paradigm Shift: Under the Hood of a Large Language Models

Virtual | Workshop | LLMs | GenAI | Beginner

 

Abstract Coming Soon!

The AI Paradigm Shift: Under the Hood of a Large Language Models image
Valentina Alto
Azure Specialist - Data and Artificial Intelligence | Microsoft
Beyond Demos and Prototypes: How to Build Production-Ready Applications Using Open-Source LLMs

Virtual | Workshop | NLP & LLMs | Intermediate

 

The workshop will be hands-on and composed of 4 modules, targeting data scientists, machine learning experts, software engineers, and product managers. Prior exposure to LLMs is not necessary. At the end of the workshop, participants will have gained insights on how to effectively make the best of open-source LLMs and learned the necessary steps to bridge the gap between prototype and production-ready applications…more details

Beyond Demos and Prototypes: How to Build Production-Ready Applications Using Open-Source LLMs image
Suhas Pai
Chief Technology Officer | Bedrock AI
Evaluation Techniques for Large Language Models

In-person | Tutorial | Generative AI | Machine Learning | Beginner – Intermediate

 

The tutorial will cover the existing research on the capabilities of LLMs versus small traditional ML models. If an LLM is the best solution, the tutorial covers several evaluation techniques, including evaluation suites like the EleutherAI Harness, head-to-head competition approaches, and using LLMs for evaluating other LLMs. The tutorial will also touch on subtle factors that affect evaluation, including the role of prompts, tokenization, and requirements for factual accuracy. Finally, a discussion of model bias and ethics will be integrated into the working examples…more details
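
One common way to turn head-to-head comparisons into a ranking is an Elo-style rating. The abstract does not say which scheme the tutorial uses, so the sketch below is only an assumption for illustration: the model names, the match outcomes, and the K-factor of 32 are all made up.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(ratings, winner, loser, k=32.0):
    """Shift rating points from loser to winner after one comparison."""
    gain = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += gain
    ratings[loser] -= gain


# Hypothetical models, all starting from the same rating.
ratings = {"model_a": 1000.0, "model_b": 1000.0, "model_c": 1000.0}

# Each tuple is (winner, loser) for one prompt, as judged by a human or by
# another LLM; these outcomes are invented for the example.
matches = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
    ("model_a", "model_b"),
]

for winner, loser in matches:
    update_elo(ratings, winner, loser)

leaderboard = sorted(ratings, key=ratings.get, reverse=True)
for name in leaderboard:
    print(f"{name}: {ratings[name]:.1f}")
```

Because every update moves the same number of points from loser to winner, the total rating stays constant, and the final order depends on the order in which matches are processed.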

Evaluation Techniques for Large Language Models image
Rajiv Shah, PhD
Machine Learning Engineer | Hugging Face
Facial Recognition from Scratch with Python and JS

In-person | Workshop | Deep Learning | All Levels

 

Facial recognition systems are everywhere. Of course, they are where you would expect them, such as airports, border crossings, and government offices. However, they are also in some public surveillance cameras, all over social media, embedded in smart home solutions, and even in your phone. Have you ever wondered how facial recognition systems work? In this hands-on session, we will build a facial recognition system from scratch using open-source technologies and publicly available pre-trained models…more details