Europe 2023 Schedule!
All sessions are scheduled in the GMT time zone (UK time zone)
- ODSC Talks schedule includes Wednesday, June 14th – Thursday, June 15th. In-person sessions are available to Platinum and Mini-Bootcamp pass holders. Virtual Sessions are available to Virtual Premium, Virtual Platinum & Virtual Mini-Bootcamp pass holders.
- ODSC Trainings are scheduled from Wednesday, June 14th to Thursday, June 15th. In-person sessions are available to Platinum & Mini-Bootcamp pass holders. Virtual Sessions are available to Virtual Platinum & Virtual Mini-Bootcamp pass holders.
- ODSC Workshop/Tutorials are scheduled from Wednesday, June 14th to Thursday, June 15th. All in-person sessions are available to Platinum & Mini-Bootcamp pass holders. Virtual Sessions are available to Virtual Premium, Virtual Platinum & Virtual Mini-Bootcamp pass holders.
- ODSC Bootcamp Sessions are scheduled VIRTUALLY on Tuesday, June 13th as pre-conference training. They are ONLY available to In-person Mini-Bootcamp, VIP Pass, and Virtual Mini-Bootcamp holders.
Speaker and speaker schedule times are subject to change.
Please Note: In-person attendees will have access to virtual sessions. If you have a virtual Pass, please note that we will not live-stream any in-person sessions. Only virtual sessions will be recorded.
The prerequisites to the workshop and training sessions are available HERE
Please review the final schedule:
– for in-person: Download TBM Engage app
Enter the app code: europe2023 in the App Store
– for virtual: live.odsc.com (agenda section). Log in using the email address you used when registering.
In-person | Bootcamp | Machine Learning | Beginner
In this workshop, you will get acquainted with the pandas library, which is the most widely used package for reading, analyzing and exporting datasets in Python. You will also learn how to visualize many kinds of tabular data using the plotnine package, along with some tips and tricks on how to make your visualizations stand out. Lastly, you will have the opportunity to make predictions and take decisions using data, based on basic statistical methods…more details
Leonidas (Leo) is a Senior Data Scientist at AstraZeneca. His work is focused on machine learning in oncology, including clinical and non-clinical applications. He is also enthusiastic about NLP applications in oncology and how these can be used to improve patient treatment. He is also a workshop facilitator at the European Leadership University (ELU), NL, and has been a data science educator at DataCamp. He holds a PhD in bioinformatics and ML from the University of Warwick, UK, an MSc in Statistics from Imperial College London, UK, and a BSc in Statistics and Insurance Science from the University of Piraeus, GR.
Virtual | Bootcamp | Machine Learning | Beginner
The Introduction to Machine Learning Workshop will build upon the attendee’s foundation of math and coding knowledge to develop a basic understanding of the most popular machine learning algorithms used in industry today. We will answer such questions as: What are the different types of ML algorithms? What is overfitting and how can we avoid it? Why does XGBoost consistently outperform other algorithms?…more details
Julia Lintern currently works as a Director of Data Science at Gartner. Previously, she worked as a Data Scientist for the New York Times. Julia began her career as a structures engineer designing repairs for damaged aircraft. Julia holds an MA in applied math from Hunter College, where she focused on visualizations of various numerical methods and discovered a deep appreciation for the combination of mathematics and visualizations. During certain seasons of her career, she has also worked on creative side projects such as Lia Lintern, her own fashion label.
Virtual | Bootcamp | Beginner
In this class, students will install Anaconda Python and JupyterLab. Using the JupyterLab interface, I will cover the basics of Python programming. Topics will include built-in data structures, functions, looping, decisions, and importing other libraries…more details
Phil Tracton is an IC design engineer at Medtronic and an instructor at UCLA Extension. He has worked at Medtronic for over 20 years and has experience in implementing firmware, FPGAs, and custom ASICs. Many thousands of people have his work implanted in them. Most of these devices are focused on neuromodulation. He has recently joined an internal team focused on long-term research for implantable devices.
At UCLA he teaches multiple Python-based courses including Learning Python and Python on the Raspberry Pi.
He is interested in low-power AI on edge devices.
He will be running the Fundamentals of Python training class. This is his second time teaching at an ODSC event.
In-person | Talk | AI Safety | Machine Learning | Deep Learning | Advanced
In this talk, I will cover several new and practical tools for improving evaluation in safety-critical settings that improve statistical guarantees of estimates, as well as provide more insights on how to perform robust evaluation in situations where traditional assumptions cannot be met. I will draw connections with topics from interpretability, causal inference and uncertainty estimation and discuss how these are all key for evaluation…more details
Sonali is an Assistant Professor and leader of the AI for Actionable Impact Group at Imperial College London. Her research focuses on decision-making in uncertainty, causal inference and building interpretable models to improve clinical care and deepen our understanding of human health, with applications in areas such as HIV and critical care. Prior to this, Sonali was a postdoctoral research fellow at Harvard. Her work has been published at a number of machine learning conferences (NeurIPS, AAAI, ICML, AISTATS) and medical journals (Nature Medicine, Nature Communications, AMIA, PLoS One, JAIDS). She was also a Swiss National Science Fellow and was named a Rising Star in AI in 2021. Sonali received her PhD (summa cum laude) in 2019 from the University of Basel, Switzerland, where she built intelligent models for understanding the interplay between host and virus in the fight against HIV. Apart from her research, Sonali is also passionate about encouraging more discussion about the role of ethics in developing machine learning technologies to improve society.
In-person | Track Keynote | Machine Learning | Intermediate
In this talk I will present two new open-source packages that make up a powerful and state-of-the-art marketing analytics toolbox. Specifically, PyMC-Marketing is a new library built on top of the popular Bayesian modeling library PyMC. PyMC-Marketing allows robust estimation of customer acquisition costs (via media mix modeling) as well as customer lifetime value…more details
Thomas Wiecki is co-creator of PyMC, the industry-standard tool for statistical data science in Python. To help businesses solve advanced analytical problems he founded PyMC Labs (www.pymc-labs.io) consisting of world-class experts in Bayesian modeling.
Virtual | Keynote | All | All Levels
In this session we take a deep dive into Azure Machine Learning, a cloud service that you can use to track as you build, train, deploy, and manage models. We use the Azure Machine Learning Python SDK to manage the complete life cycle of a PyTorch model, from managing the data, to training the model, and finally running it in a production Kubernetes cluster…more details
Henk is a Cloud Advocate specializing in Artificial Intelligence and Azure with a background in application development. He is currently part of the AI cloud advocate team and based in the Netherlands. Before joining Microsoft, he was a Microsoft AI MVP and worked as a software developer and architect building lots of AI-powered platforms on Azure.
He loves to share his knowledge about topics such as DevOps, Azure and Artificial Intelligence by providing training courses and he is a regular speaker at user groups and international conferences.
In-person | Talk | Machine Learning | Deep Learning | Intermediate-Advanced
In this session, we will provide examples of when quantum computing is best applied to accelerate health care-specific applications and biomedical research. We will also provide an overview of how quantum computing works and a short overview of how to leverage open-source libraries, specifically Qiskit and Q#, to build, train, and evaluate a machine learning model for breast cancer prediction using an open dataset. We will also review how to build and run these models on local simulators and how these algorithms can be deployed on quantum hardware through cloud providers such as Azure…more details
Dr. Schulz is a physician scientist with a background in computational healthcare, molecular biology, and virology. Dr. Schulz has over 20 years’ experience in software development with a focus on enterprise system architecture and has research interests in the management of large biomedical data sets and the use of real-world data for predictive modeling. At Yale School of Medicine, he has led the deployment of the organization’s data science infrastructure, which consists of a composable computing infrastructure to support the development of biomedical AI applications. Dr. Schulz is also a co-founder of Refactor Health, a digital health startup focused on the development of AI-driven digital signatures and automated healthcare DataOps.
In-person | Talk | Data Engineering & Big Data | ALL | All Levels
During this session, we will discuss the enablers that organizations need to unlock productivity with analytics and the importance of optimized algorithmic performance in the cloud to reduce costs, so organizations can derive maximum value from their investments...more details
Spiros Potamitis is a data scientist and a global product marketing manager of forecasting and optimization at SAS. He has extensive experience in the development and implementation of advanced analytics solutions across different industries and provides subject matter expertise in the areas of forecasting, machine learning and AI. Prior to joining SAS, Spiros worked in and led advanced analytics teams in various sectors such as credit risk, customer insights and CRM.
In-person | Talk | Machine Learning for Finance | Intermediate-Advanced
In recent years, several machine learning innovations have been introduced to improve the robustness of asset allocation with hierarchical clustering and seriation-based approaches, to improve the transparency of these heuristics with explainable AI, and to generate synthetic correlations and correlated market returns to improve the coverage of backtests and scenario analysis beyond the historical paths. Together, these innovations offer a consistent pipeline for better understanding rule-based dynamic portfolio allocation strategies. This talk reviews recent developments and puts them into the context of the current market challenges…more details
Peter Schwendner leads the Institute of Wealth & Asset Management at Zurich University of Applied Sciences, School of Management and Law, Switzerland. His interests are financial markets, asset management and machine learning applications. With the European Stability Mechanism (ESM), he has been developing analytics for primary and secondary bond markets and tools for optimizing the issuance process. Currently, he is working on the BRIDGE Discovery project “Spatial sustainable finance: Satellite-based ratings of company footprints in biodiversity and water”. Within the European COST Action «Fintech and AI in Finance», he leads the working group «Transparency into Investment Product Performance for Clients».
In-person | Talk | Generative AI | Machine Learning, Deep Learning | All Levels
The session will cover the importance of explaining AI models and their limitations, building effective next-gen data products, evaluating audience and user needs, and the aspects of visualisation that will always require human input. It will focus on the practical implications of AI tools on the roles of data professionals – and look at how we can thrive in this exciting new era…more details
Alan Rutter is the founder of consultancy Fire Plus Algebra, and is a specialist in communicating complex subjects through data visualisation, writing and design. He has worked as a journalist, product owner and trainer for brands and organisations including Guardian Masterclasses, WIRED, Riskified, the Home Office, the Biotechnology and Biological Sciences Research Council and Liverpool School of Tropical Medicine.
In-person | Talk | ML for Finance | ML Safety (AI Safety) & ML Security | MLOps and Data Engineering | Intermediate
In this talk, I will explain what Zero Trust Architecture is, which problems in data science it solves, and how you could implement it in DataOps and MLOps processes. Furthermore, I will connect the concepts to the GDPR and the new/proposed AI Act and use concrete examples from my projects in cyber security, banking and retail…more details
Dr. Casper Rutjes is Chief Technology Officer (CTO) at ADC (Amsterdam Data Collective), a Data & AI Consultancy in Europe. Rutjes is responsible for R&D, (Tech) Partnerships, consulting quality & standardization and IT. He leads global teams of consulting specialists in the areas of strategy & innovation, data engineering and data science across our key industries, mainly healthcare/life science, public and finance. At clients he is a trusted advisor and senior project lead for challenges on the interface of regulation, IT and business.
In-person | Talk
Data Storytelling for Business delves into two indispensable elements of giving great presentations with and about data: compelling data visualizations and designing a coherent and persuasive narrative around data. These skills are relevant across all industries…more details
Isaac Reyes is a TEDx speaker, data scientist and international keynote presenter in data analytics, data visualization and data presentation. In 2018, his “Art of Data Storytelling” speaking tour visited 23 cities across 5 continents, impacting over 15,000 people with Data Storytelling skills. He is the Co-founder of StoryIQ, a data visualization training company with full-time speakers in New York City, Manila and Singapore. In previous roles, he was the Head of Data Science at Altis Consulting and lectured in statistical theory at the Australian National University. A participant experience focused trainer, he was a keynote speaker at the 2019 Open Data Science Conference in Brazil.
Virtual | Talk | Responsible AI | NLP | Deep Learning | GenAI | Machine Learning | All Levels
Western societies are marked by diverse and extensive biases and inequality that are unavoidably embedded in the data used to train machine learning. Algorithms trained on biased data will, without intervention, produce biased outcomes and increase the inequality experienced by historically disadvantaged groups…more details
Professor Sandra Wachter is Professor of Technology and Regulation at the Oxford Internet Institute at the University of Oxford where she researches the legal and ethical implications of AI, Big Data, and robotics as well as Internet and platform regulation. At the OII, Professor Sandra Wachter leads and coordinates the Governance of Emerging Technologies (GET) Research Programme that investigates legal, ethical, and technical aspects of AI, machine learning, and other emerging technologies.
Professor Wachter is also an affiliate and member at numerous institutions, such as the Berkman Klein Center for Internet & Society at Harvard University, World Economic Forum’s Global Futures Council on Values, Ethics and Innovation, the European Commission’s Expert Group on Autonomous Cars, the Law Committee of the IEEE, the World Bank’s Task Force on Access to Justice and Technology, the United Kingdom Police Ethics Guidance Group, the British Standards Institution, the Bonavero Institute of Human Rights at Oxford’s Law Faculty and the Oxford Martin School. Professor Wachter also serves as a policy advisor for governments, companies, and NGOs around the world on regulatory and ethical questions concerning emerging technologies.
Virtual | Talk | Machine Learning | Deep Learning | NLP | Beginner-Intermediate
In the field of healthcare, AI has been applied across the spectrum from diagnostics to prognostics. Many of these applications have been successfully commercialised yet only some are used in everyday patient care. This talk will introduce the audience to the science behind AI for disease detection (diagnosis) and prediction (prognosis) with a particular focus on musculoskeletal health. We will explore the link between big health data and AI, and finally highlight challenges and opportunities in reliable, representative, scalable and ethical uptake of AI technology in real-world clinical practice…more details
Sara is a Senior Research Associate in Biomedical Data Science and University Research Lecturer at the University of Oxford, where she is the Machine Learning Lead in the Centre for Statistics in Medicine. She has 12 years of experience in machine learning, signal processing, and intelligent remote monitoring research, with applications in biomedical and planetary health informatics. Sara has served on the NASA Frontier Development Lab Artificial Intelligence Panel and the NASA Climate Challenge Big Think. She is a National Geographic Society Explorer in Tracking Plastic Pollution with Remote Monitoring and Machine Learning. Sara is also a University of Oxford Ambassador for Women in Data Science.
In-person | Talk | Machine Learning | Intermediate
Pandas 2 brings new Arrow data types, faster calculations and better scalability. Dask scales Pandas across cores. Polars is a new competitor to Pandas designed around Arrow with native multicore support. Which should you choose for modern research workflows? We’ll solve a “just about fits in RAM” data task using the 3 solutions, talking about the pros and cons so you can make the best choice for your research workflow. You’ll leave with a clear idea of whether Pandas 2, Dask or Polars is the tool to invest in…more details
Ian is a Chief Data Scientist who has helped co-organise the annual PyDataLondon conference, raising $100k+ annually for the open source movement, along with the associated 12,000+ member monthly meetup. Using data science he’s helped clients find $2M in recoverable fraud, created the core IP which opened funding rounds for automated recruitment start-ups and diagnosed how major media companies can better supply recommendations to viewers. He gives conference talks internationally, often as keynote speaker, and is the author of the bestselling O’Reilly book High Performance Python (2nd edition). He has over 25 years of experience as a senior data science leader, trainer and team coach. For fun he’s walked by his high-energy Springer Spaniel, surfs the Cornish coast and drinks fine coffee. Past talks and articles can be found at:
https://notanumber.email/
https://github.com/ianozsvald/
https://fosstodon.org/@ianozsvald
https://www.linkedin.com/in/ianozsvald/
In-person | Talk | Generative AI | Machine Learning | Deep Learning | Beginner
ChatGPT is the fastest-growing user application in history. Still, this application only has access to information it saw during training and sometimes produces false information, called hallucinations. In this talk, we will show you how to bring your data to LLMs and how to evaluate LLMs for your use case using open-source technology…more details
Timo Möller is Co-Founder of deepset and Head of Solution Engineering. He works closely together with deepset’s clients to bring modern NLP into production. He is an open-source fan and a passionate NLP engineer. Currently, he works on retrieval augmented generation, auto-generating training data, and ways to detect hallucinations.
In-person | Talk | Machine Learning for Finance
In this talk, we will look at how deep learning techniques can be used for building fast option pricers. A large set of representative training data is generated by using the numerical pricers. Then deep neural networks are used to learn the non-linear pricing functions…more details
Chakri Cherukuri is a senior researcher in the Quantitative Financial Research Group at Bloomberg LP in NYC. His research interests include quantitative portfolio management, algorithmic trading strategies, and applied machine learning. He has extensive experience in scientific computing and software development. Previously, he built analytical tools for the trading desks at Goldman Sachs and Lehman Brothers. He holds an undergraduate degree in mechanical engineering from the Indian Institute of Technology (IIT) Madras, India, and an MS in computational finance from Carnegie Mellon University.
In-person | Talk | Machine Learning Safety and Security | Deep Learning | Intermediate
This lecture will describe progress with developing automated certification techniques for learnt software components to ensure safety and adversarial robustness of their decisions. I will discuss different dimensions of robustness, including to bounded perturbations and causal interventions, as well as the role of uncertainty and explainability…more details
Marta Kwiatkowska is Professor of Computing Systems and Fellow of Trinity College, University of Oxford. She is known for fundamental contributions to the theory and practice of model checking for probabilistic systems, and is currently focusing on safety, robustness and fairness of automated decision making in Artificial Intelligence. She led the development of the PRISM model checker (www.prismmodelchecker.org), which has been adopted in diverse fields, including wireless networks, security, robotics, healthcare and DNA computing, with genuine flaws found and corrected in real-world protocols. Her research has been supported by two ERC Advanced Grants, VERIWARE and FUN2MODEL, EPSRC Programme Grant on Mobile Autonomy and EPSRC Prosperity Partnership FAIR. Kwiatkowska won the Royal Society Milner Award, the BCS Lovelace Medal and the Van Wijngaarden Award, and received an honorary doctorate from KTH Royal Institute of Technology in Stockholm. She is a Fellow of the Royal Society, Fellow of ACM and Member of Academia Europaea.
In-person | Track Keynote | Data Engineering & Big Data | Deep Learning | Machine Learning | All Levels
The biggest challenges for developers of AI applications very often consist in building & delivering software to be used as a decision-making tool by operational staff. We will present how these challenges have been addressed using 2 successful projects: a cash flow prediction application (for one of Europe’s largest retailers) and a sales prediction app for a Quick Restaurant service…more details
Florian Jacta is a specialist in Taipy, a low-code open-source Python package enabling any Python developer to easily develop a production-ready AI application. He handles pre-sales and after-sales functions for the package. He is a Data Scientist for Groupe Les Mousquetaires (Intermarche) and ATOS. He developed several predictive models as part of strategic AI projects. Florian holds a master’s degree in Applied Mathematics from INSA, with a major in Data Science and Mathematical Optimization.
Marine has 5+ years of experience as a Data Scientist. She is skilled in Machine Learning techniques, Python, Rule-based models & AI. She has strong experience in Predictive and Descriptive Analytics and Fraud detection. She holds an MSc in Big Data Analytics for Business from IÉSEG School of Management and studied Accounting & Finance at McGill University, Hong Kong University of Science and Technology and Europe Business School.
Virtual | Talk | Deep Learning | Machine Learning | NLP | Beginner-Intermediate
This talk compares a cloud-native data streaming architecture to traditional batch and big data alternatives and explains benefits like the simplified architecture, the ability to reprocess events in the same order for training different models, and the possibility to build a scalable, mission-critical ML architecture for real-time predictions with far fewer headaches and problems…more details
Kai Waehner is Field CTO at Confluent. He works with customers and partners across the globe and with internal teams like engineering and marketing. Kai’s main area of expertise lies within the fields of Data Streaming, Analytics, Hybrid Cloud Architectures and Internet of Things. Kai is a regular speaker at international conferences, writes articles for professional journals, and shares his experiences with industry use cases and new technologies on his blog: www.kai-waehner.de. Contact: kai.waehner@confluent.io / @KaiWaehner / linkedin.com/in/kaiwaehner.
Virtual | Talk | Machine Learning for Finance | All Levels
Principal component analysis (PCA) is a staple statistical and unsupervised machine learning technique in finance. The application of PCA in a financial setting is associated with several difficulties, such as numerical instability and nonstationarity. We attempt to resolve them by proposing two new variants of PCA: an iterated principal component analysis (IPCA) and an exponentially weighted moving principal component analysis (EWMPCA). Both variants rely on the Ogita-Aishima iteration as a crucial step…more details
Bio Coming Soon!
Virtual | Talk | Deep Learning | Machine Learning | Intermediate
A central challenge to contemporary AI is to integrate learning and reasoning. The integration of learning and reasoning has been studied for decades already in the fields of statistical relational artificial intelligence and probabilistic programming. Statistical relational AI has focussed on unifying logic and probability, the two key frameworks for reasoning, and has extended these probabilistic logics with machine learning principles…more details
Luc De Raedt is a full professor at the Department of Computer Science, KU Leuven, and director of Leuven.AI, the newly founded KU Leuven Institute for AI. He is a guest professor at Örebro University in the Wallenberg AI, Autonomous Systems and Software Program. He received his PhD in Computer Science from KU Leuven (1991), and was full professor (C4) and Chair of Machine Learning at the Albert-Ludwigs-University Freiburg, Germany (1999-2006). His research interests are in Artificial Intelligence, Machine Learning and Data Mining, as well as their applications. He is well known for his contributions in the areas of learning and reasoning, in particular for his work on probabilistic and inductive programming. He co-chaired important conferences such as ECMLPKDD 2001 and ICML 2005 (the European and International Conferences on Machine Learning), ECAI 2012 and IJCAI-ECAI in 2022 (the European and international AI conferences). He is on the editorial board of Artificial Intelligence, Machine Learning and the Journal of Machine Learning Research. He is a EurAI and AAAI fellow, an IJCAI Trustee and received an ERC Advanced Grant in 2015.
Virtual | Talk | Generative AI | NLP | Deep Learning | Machine Learning
This session will provide an overview of the challenges and opportunities of pre-trained language models (PLMs) for text summarisation, using the biomedical domain as an example…more details
Sophia Ananiadou is Professor in Computer Science, Department of Computer Science, the University of Manchester. She is also Director of the National Centre for Text Mining (NaCTeM); Deputy Director of the University’s Institute of Data Science and AI (IDSAI); Distinguished Research Fellow at the AI Research Centre of the National Institute of Advanced Industrial Science and Technology, Japan; Alan Turing Institute Fellow; Honorary Professor, University of the Aegean and Member of European Laboratory for Learning and Intelligent Systems Society. Her research interests evolved from abstract work on fragments of linguistic theory and logic to exploration of how AI systems could acquire and exploit knowledge of language, particularly in specialised domains (biomedicine, chemistry, exposome, law, public health). Research contributions include neural information extraction, text summarisation and simplification, emotion detection, terminology, development of resources (lexica, terminologies and labelled data), annotation tools and interoperable platforms for NLP workflows. She has developed tools such as the RobotAnalyst to improve evidence-based decisions, cut costs and improve efficiency and robustness of key policy decisions in public health.
In-person | Talk | Generative AI | NLP
Language models are increasingly attracting interest from writers. However, such models lack long-range semantic coherence, limiting their usefulness for longform creative writing. We address this limitation by applying language models hierarchically, in a system we call Dramatron…more details
Dr. Piotr Mirowski is a Staff Research Scientist at DeepMind. His research on artificial intelligence covers the subjects of reinforcement learning, navigation, weather and climate forecasting, as well as a socio-technical systems approach to human-machine collaboration and to computational creativity. He is the author of over 60 papers that have been published in Nature, Genome Biology, Clinical Neurophysiology or at ICLR, AAAI and NeurIPS. Piotr studied computer science in France at ENSEEIHT Toulouse and obtained his PhD in computer science in 2011 at New York University, with a thesis supervised by Prof. Yann LeCun (Outstanding Dissertation Award, 2011). A trained actor himself, Piotr founded and directs Improbotics, a theatre company where human actors and robots improvise live comedy performances and investigate the use of AI for artistic human and machine-based co-creation. https://piotrmirowski.com
In-person | Talk | All Levels
Services like ChatGPT and others powered by Generative AI are fueling innovation and efficiency across industries. However, for enterprises these services do not come without their risks, as they raise critical questions regarding data privacy and contextual accuracy considerations. In this presentation, we delve into the deployment of open source LLMs within secure environments. We discuss the advantages of this approach for enterprises, including heightened data privacy, improved accuracy, and greater control over AI implementations in enterprise settings…more details
Jake currently holds the position of Principal Technical Evangelist at Cloudera, where he promotes the strengths of Cloudera’s Lakehouse for delivering trusted AI. His tenure at Cloudera began as a Senior Product Marketing Manager for Cloudera Machine Learning (CML).
Before Cloudera, Jake developed his ML expertise at ExxonMobil, starting as a Data Scientist and later transitioning to a Data Science and Analytics Solution Architect role. He also contributed significantly at FarmersEdge, taking on responsibilities as a Senior Data Scientist and subsequently as a Data Science Manager.
Jake earned both his bachelor’s and master’s degrees from Brigham Young University in Information Systems Management with an emphasis in Statistics.
Outside of work, Jake is passionate about outdoor activities. He enjoys skiing, golfing, rafting, and hiking. However, spending time with his family amidst the mountains remains his most rewarding pastime.
In-person | Talk | Machine Learning | Beginner-Intermediate
In this session, we confront the widely acknowledged limitation in traditional statistical analysis and machine learning: ‘correlation is not causation.’ We start by dissecting this concept, outlining the challenges it presents when trying to derive meaningful insights from data…more details
Bernardo is a Data & AI leader, passionate about powering data transformation in companies and promoting social good in society using data.
He specializes in Data Science, Machine Learning and AI, having won two awards in this field (Innovation in Big Data Award by Thomson Reuters and Machine Learning & Neural Computation Award by Imperial College London). His goal is to take any challenge, no matter how complex, and solve it using a fusion of art & science, business & technology capabilities, and data & analytics.
Bernardo has an MRes in Advanced Computing from Imperial College London, with a specialization in Machine Learning and a BSc in Electrical and Computer Engineering from Instituto Superior Técnico.
In-person | Talk | Machine Learning | Deep Learning | Intermediate
In this session, we’ll explore and discuss the following:
– Why and what is Ray
– How AIR, built atop Ray, allows you to program and scale your machine learning workloads easily
– AIR’s interoperability and easy integration points with other systems for storage and metadata needs
– AIR’s cutting-edge features for accelerating the machine learning lifecycle such as data preprocessing, last-mile data ingestion, tuning and training, and serving at scale…more details
Kai Fricke is a senior software engineer at Anyscale. As a core maintainer of the Ray AI Runtime he is building software for distributed machine learning training and tuning. During his postdoc at Cambridge he utilized reinforcement learning to optimize large graph structures and co-authored two open source reinforcement learning libraries.
In-person | Business Talk | Machine Learning | ML for Finance | MLOps and Data Engineering | Beginner-Intermediate
The successful deployment of machine learning (ML) models into production has traditionally been a complex and resource-intensive process that many organizations struggle with. With the rise of MLOps, a methodology that applies DevOps principles to ML, this process has become much more streamlined. At the Dutch fintech Mollie, we have fully embraced MLOps and implemented a cloud-based ML platform that supports both batch and real-time inference, as well as a suite of MLOps tools to facilitate the entire development cycle…more details
In-person | Talk | Generative AI | NLP Deep Learning | Machine Learning | Intermediate
Large Language Models (LLMs) such as GPT, LLaMA, etc. are everywhere these days. In this talk, we will see how to leverage LLMs when using the Julia Programming Language. We will discuss how to run inference using these models from Julia, how to fine-tune them, and even how to access third-party hosted models from Julia code. At the end of this session, a Julia developer will have all the tools needed to use LLMs when writing Julia code…more details
Avik Sengupta is the head of product development and software engineering at Julia Computing, contributor to open source Julia and maintainer of several Julia packages. Avik is the author of Julia High Performance, co-founder of two artificial intelligence start-ups in the financial services sector and creator of large complex trading systems for the world’s leading investment banks.
Virtual | Talk | Responsible AI | NLP | Deep Learning | GenAI | Machine Learning | Beginner
In recent years fairness in machine learning (ML) and artificial intelligence (AI) has emerged as a highly active area of research and development. Most define fairness in simple terms, where fairness means reducing gaps in performance or outcomes between demographic groups while preserving as much of the accuracy of the original system as possible. This oversimplification of equality through fairness measures is troubling. Many current fairness measures suffer from both fairness and performance degradation, or “levelling down,” where fairness is achieved by making every group worse off, or by bringing better performing groups down to the level of the worst off…more details
Professor Brent Mittelstadt is an Associate Professor, Senior Research Fellow, and Director of Research at the Oxford Internet Institute, University of Oxford. He leads the Governance of Emerging Technologies (GET) research programme which works across ethics, law, and emerging information technologies. He is a prominent data ethicist and philosopher specializing in AI ethics, algorithmic fairness and explainability, and technology law and policy. Prof. Mittelstadt is the author of foundational works addressing the ethics of algorithms, AI, and Big Data; fairness, accountability, and transparency in machine learning; data protection and non-discrimination law; group privacy; ethical auditing of automated systems; and digital epidemiology and public health ethics. His contributions in these areas are widely cited and have been implemented by researchers, policy-makers, and companies internationally, featuring in policy proposals and guidelines from the UK government, Information Commissioner’s Office, and European Commission, as well as products from Google, Amazon, and Microsoft.
Virtual | Talk | NLP | Machine Learning | Deep Learning | Generative AI | All Levels
In this talk, I will first introduce the field of semantics and the task of semantic analysis, a.k.a. semantic parsing, from a multilingual perspective. In particular, we will first discuss the layers of meaning, from morphology to pragmatics, and then define the scope of semantics as a field…more details
Dr. Gözde Gül Şahin is an Assistant Prof. at Koç University and has been a KUIS AI Fellow since February 2022. Previously, she was a postdoctoral researcher in the Ubiquitous Knowledge Processing (UKP) Lab at the Technical University of Darmstadt, Germany. Her research spans the fields of linguistics and machine learning, in particular semantics, multilingual representations and large language models. She completed her PhD studies at the Istanbul Technical University (İTÜ) Computer Engineering department in 2018. She was a visiting researcher at the Institute for Language, Cognition and Computation (ILCC) of the University of Edinburgh in 2017. Before her Ph.D., she received her Master's and Bachelor's degrees from Sabancı University in 2011 and İTÜ in 2009, respectively. She regularly serves as a PC member for *ACL conferences and is a co-organizer for the Workshop on Multilingual Representation Learning (MRL). Her research on NLP has been funded by the Tübitak 2232 and 2236 grant programs, which are awarded to outstanding young principal investigators.
Virtual | Talk | Responsible AI and Social Good | Machine Learning | Deep Learning | Intermediate
The presentation will provide information on value-alignment methods, will give insights on how to address the construction of morality in machines, and will discuss the importance of teaching techno-ethics in education…more details
Carles Sierra is Director of the Artificial Intelligence Research Institute (IIIA) of the Spanish National Research Council (CSIC) located in Barcelona. He is the President of EurAI, the European Association of Artificial Intelligence. He has been contributing to Artificial Intelligence research since 1985 in the areas of Knowledge Representation, Auctions, Electronic Institutions, Autonomous Agents, Multiagent Systems and Agreement Technologies. He is or has been a member of several editorial boards of journals, including AIJ and JAIR, two of the most prestigious generalist journals, and was the editor in chief of the JAAMAS journal, specialized in autonomous agents. He organized IJCAI, the most important international artificial intelligence conference, in 2011 in Barcelona and was the President of the IJCAI Program Committee in 2017 in Melbourne. He is a Fellow of the European Association of AI, EurAI, and recipient of the ACM/SIGAI Autonomous Agents Research Award 2019.
Virtual | Talk | Generative AI | Machine Learning | Deep Learning | Beginner
In this talk, I will describe the latest developments in methodologies that can be used to detect social biases in texts generated by GAI systems. In particular, I will describe methods that can be used to detect social biases expressed not only in English but other languages as well, with minimal human intervention…more details
Danushka Bollegala is a Professor in the Department of Computer Science, University of Liverpool, UK. He obtained his PhD from the University of Tokyo in 2009 and worked as an Assistant Professor before moving to the UK. He has worked on various problems related to Natural Language Processing and Machine Learning. He has received numerous awards for his research excellence, such as the IEEE Young Author Award and best paper awards at GECCO and PRICAI. His research has been supported by various research council and industrial grants such as EU, DSTL, Innovate UK, JSPS, Google and MSRA. He is an Amazon Scholar.
In-person | Talk | Machine Learning | Intermediate
The session will commence with an overview of the bottom-up and top-down modelling approaches, highlighting their respective strengths and limitations in various data science applications. Attendees will learn how bottom-up modelling focuses on individual components and their interactions, such as modelling individual customer demand in a supply chain, while top-down modelling emphasises the high-level relationships between components to provide a broader perspective, like analysing the overall market trends affecting the supply chain…more details
Gustavo is the esteemed Vice President of Research at Vortexa Ltd., where he has focused on applying statistical modelling and Machine Learning to the energy and freight markets. His research interests span computational neuroscience, medical imaging, and the development of innovative solutions for the energy sector.
Prior to his tenure at Vortexa, Gustavo amassed a wealth of experience in both the academic and professional realms. He has published his research in prestigious international journals and presented his findings at scientific conferences across the globe. Gustavo’s dedication to finding optimal solutions for complex business problems is evident in his work.
Gustavo holds an SB and MEng in Computer Science and Electrical Engineering from the Massachusetts Institute of Technology (MIT) and a PhD from the University of Tokyo. As an expert in his field, Gustavo brings a depth of knowledge and experience to ODSC, where attendees can expect to learn from his invaluable insights.
In-person | Talk | ML for Finance | Machine Learning | Deep Learning | Intermediate
This talk will focus on forecasting inflation with ML and ‘alternative data’. It will show the steps of building such models, the improvements over the traditional econometric models, and will describe the many hurdles a practical implementation of such an approach entails…more details
Alexander is a Quant & Data Scientist with 20 years of accumulated experience in both specialist and leadership positions in global financial institutions. Well versed in the main AI/ML techniques, he also specializes in, and has personally contributed to, the fields of Probabilistic Graphical Models, Causal AI and Alternative Data. Alexander has authored/co-authored 10+ papers and 3 books on these topics. He holds a degree in Mathematical Finance from the University of Oxford, where he is a Visiting Lecturer on Bayesian Risk Management and Alternative Data. Currently he is CEO of Turnleaf Analytics.
In-person | Talk | NLP | Machine Learning | Intermediate
In this talk, I will first give an overview of the built-in functionality available in spaCy, using pretrained models. I will showcase how linguistic information such as part-of-speech tags and dependency parses can help you identify interesting patterns or phrases in your documents and ultimately perform document classification or other information retrieval tasks…more details
Sofie is a machine learning and NLP engineer who firmly believes in the power of data to transform decision making in industry. She has a Master's in Computer Science (software engineering) and a PhD in Sciences (Bioinformatics), and more than 16 years of experience in Natural Language Processing and Machine Learning, including in the pharmaceutical industry and the food industry. In 2019, she joined Explosion to work on the open-source NLP library spaCy. She is currently leading the open-source team developing and maintaining spaCy, as well as various other open-source developer tools for data scientists.
In-person | Talk | Machine Learning | Deep Learning | NLP | Beginner-Intermediate
During the talk, we’ll show how Ludwig’s novel compositional model architecture referred to as encoder-combiner-decoder makes it possible to easily mix multiple modalities of data such as text, images, audio with structured data in a way that is consistently easy across tasks like regressions, classification, and even generation…more details
Dev is co-founder and Chief Product Officer for Predibase, a company looking to redefine how data scientists and engineers build models with a declarative approach. Prior to Predibase, he was an ML PM at Google working across products like Firebase, Google Research and the Google Assistant as well as Vertex AI. While there, Dev was also the first product manager for Kaggle – a data science and machine learning community with over 8 million users worldwide. Dev’s academic background is in computer science and statistics, and he holds a master's in computer science from Harvard University focused on machine learning.
In-person | Talk | Data Engineering | MLOps | Intermediate
In this talk, Leanne will take us through how the FT, already with a large number of models in production, are spearheading a journey to improve, iterate and upgrade the way they develop, deploy and monitor their ML and Data Science capabilities, all whilst keeping their current capabilities running. Leanne will highlight the key approaches and considerations when looking to improve your MLOps processes, and how you can expedite your ML in production activities, while ensuring you keep “the car on the road”…more details
Leanne is Director of Data Science at the Financial Times and is a passionate, experienced data leader having built and developed empowered data science and analytics teams for a variety of businesses; from startups to large organisations. Leanne is in her element when developing and implementing strategic, technical and cultural solutions to getting data & analytical capabilities into the operational ecosystem. She is an active part of the data and technology community, sharing innovation and insights to encourage best practice, from Manchester, UK to Austin, TX and is an Advisory Panel Board Member. Outside of all things data you can ask Leanne about her golf swing (it’s not good – yet), her passion for American Football (specifically the Cincinnati Bengals), her latest sewing project, and her love for good music, food and whisky.
In-person | Talk | MLOps & Data Engineering | Responsible AI | Beginner
From this talk you will learn:
– What ML Governance is meant to achieve
– How to get started with a template process
– The role of documentation (and especially Google Model Cards)
– Which roles have what responsibilities
– The relevance of a governance board
Ryan Dawson is a technologist passionate about data. Ryan works with clients on large-scale data and AI initiatives, helping organizations get more value from data. His work includes strategies to productionize machine learning, organizing the way data is captured and shared, selecting the right data technologies and optimal team structures, as well as writing the code to make it happen. He has over 15 years of experience and, as well as writing many widely read articles about MLOps, software design, and delivery, is author of the Thoughtworks Guide to Evaluating MLOps Platforms.
Meissane Chami serves ThoughtWorks, Inc. as a Senior ML Engineer, advising and developing innovative data science and machine learning solutions from proof of concept to production. She has gained expertise setting up innovation frameworks and conducting fast-cycle proofs of concept. Her primary areas of expertise are in Natural Language Processing, MLOps, DevOps, cloud computing, containerisation and Python. She holds an MSc degree in Machine Learning and Data Science from University College London School of Engineering.
Virtual | Talk | Generative AI | Deep Learning | NLP | All Levels
Deep learning, especially large language models, has recently been gaining a lot of traction in the research community. This talk builds some background in deep learning towards explaining the concepts of large language models. Afterward, the talk lists different popular large language models and briefly compares them in terms of techniques and accuracy results…more details
Hossam Amer joined Microsoft as a scientist in 2021. His research interests are Image/Video Compression, Computer Vision, and most recently Natural Language Processing. Hossam is contributing to many products including Microsoft Translator and Microsoft SwiftKey. Prior to joining Microsoft, Hossam was a Postdoctoral Fellow at the Multimedia Communications Lab at the University of Waterloo (UW), where he mentored several MSc and PhD students. He obtained his PhD from the same lab, where he received the prestigious annual UW teaching award based on students’ and instructors’ nominations as well as published papers in top venues. Hossam also acts as a reviewer in several IEEE conferences and journals and supervises students in research and teaching. In addition, Hossam was the Chair of the ECE Graduate Student Association at UW. Hossam is a strong believer in constantly transferring his knowledge in order to make a difference.
Virtual | Talk | Machine Learning | Deep Learning | Intermediate
A common problem in the cybersecurity industry is how to detect and track botnets when there are billions of daily attacks. Botnets are networks of internet-connected devices that perform repetitive tasks, such as Distributed Denial of Service (DDoS) attacks. In many cases, these consumer devices are infected with malware that is controlled by an external entity, often without the owner’s knowledge…more details
Ori Nakar is a principal cyber-security researcher, a data engineer, and a data scientist at Imperva Threat Research group. Ori has many years of experience as a software engineer and engineering manager, focused on cloud technologies and big data infrastructure. Ori also has an AWS Data Analytics certification. In the Threat Research group, Ori is responsible for the data infrastructure and involved in analytics projects, machine learning, and innovation projects.
In-person | Talk | Generative AI | Beginner-Intermediate
Text communication between customer service agents and clients is increasing rapidly day by day, and banking is no exception. In this talk we will explain how we experimented with generative NLP models to assist financial advisors in their daily interactions with clients. For this work we have used a seq2seq deep learning neural network architecture based on two LSTMs acting as encoder and decoder…more details
Clara is a senior data scientist at BBVA AI Factory. She has worked in the data science field for many years applying NLP techniques to different sectors such as media or banking. At the BBC in London she worked building recommender systems for BBC News and developed several tools to help editors understand audience feedback. In the banking sector at BBVA she has worked on building data products to help financial advisors better manage customer queries. She currently leads the collections data science team at BBVA AI Factory. Prior to her industry experience she carried out her PhD in artificial intelligence and bioinformatics and holds a degree in computer science. Clara advocates for a responsible use of technology and is actively involved in activities which encourage women and girls to pursue a career in technology and science to help bridge the gender gap in these disciplines.
María is a Senior Data Scientist and Data Product Owner at BBVA AI Factory with ten years of experience in the Data Science field. She was one of the first Data Scientists at BBVA, taking part in setting up the bank's Big Data ecosystems. A graduate in Mathematics and Computer Engineering, she holds an MSc in Computational Intelligence from Universidad Autónoma de Madrid (UAM), specialized in Aspect-based Sentiment Analysis and Item Recommendation.
She has worked in several analytical domains, ranging from Retail and Urban Analysis to Customer Intelligence. Now, she is trying to enhance the customers’ relationship with the bank through Natural Language Processing and Text Analytics. María focuses on understanding business challenges and developing the best analytical solution for each problem.
In-person | Business Talk | Deep Learning | Machine Learning | Beginner-Intermediate
Sam will lift the lid on the deep learning models used by Ocado Technology and how these have been adapted for the challenges faced in online grocery, showcasing the positive results achieved by the retailers who have adopted these forecasting solutions including 50% improvement in accuracy, drastically reduced waste, automation of replenishment decisions and big financial savings. Join this session to get a glimpse into a real life example of deep learning in production and how it is having an impressive impact in the ecommerce space…more details
Sam leads the supply chain machine learning team at Ocado Technology, responsible for the demand forecasting and replenishment optimisation algorithms used by Ocado’s international partners. Sam holds a DPhil in Condensed Matter Physics from the University of Oxford and volunteers as an ambassador for DataKindUK. Prior to joining Ocado he spent a number of years working in AI startups in the Netherlands.
In-person | Talk | MLOps | All Levels
If your models are doing great in experimentation but you are still trying to put all the production pieces together, this session might help you understand what’s going wrong and how to fix it. By working according to this methodology, data scientists can iterate rapidly, which is at the core of a successful ML project…more details
Yuval Fernbach is the Co-founder & CTO of Qwak, where he is focused on building next-generation ML Infrastructure for ML teams of various sizes. Before Qwak, Yuval was an ML Specialist at AWS, where he helped AWS Customers across EMEA with their ML challenges. Before that, he was the CTO of the IT department of the IDF (“Mamram”).
Virtual | Talk | NLP | LLM | Intermediate-Advanced
In this talk, I address the challenge of learning from limited data for a range of natural language understanding tasks and applications. I will present our work on few-shot learning approaches to NLP in both monolingual and cross-lingual settings and present findings in tasks such as word sense disambiguation, syntactic parsing and text classification. Finally, I will present recent research on approaches that can enable higher levels of data efficiency, and show how they can outperform much more computationally complex counterparts…more details
Helen Yannakoudakis is an Assistant Professor in Natural Language Processing (NLP) at the Department of Informatics, King’s College London, and a Visiting Researcher at the Department of Computer Science & Technology, University of Cambridge. She is also a co-founder and Chief Scientific Officer at Kinhub (formerly Kami), translating research outcomes to deployable real-world applications in health and wellbeing. Her research focuses on machine learning for NLP, and specifically on transfer learning, few-shot learning, lifelong learning, multilingual NLP, and societal and health applications, such as language assessment, abusive language detection, misinformation, emotion and mental health detection. Helen is a Fellow of the Higher Education Academy, has received funding awards from both industry and academia, has won international competitions such as the NeurIPS 2020 Hateful Memes Challenge, and currently serves as an Area Chair for NeurIPS 2023.
Virtual | Talk | MLOps & Data Engineering | Intermediate
The benefits of Real-Time Machine Learning are becoming increasingly apparent. Digital native companies have long proven that use cases like fraud detection, recommendation systems, and dynamic pricing all benefit from lower latencies. In a recent KDD paper*, Booking.com found that even a 30% increase in model serving latency caused a 0.5% decrease in user conversion, a significant cost to their business…more details
Dillon Bostwick is a Solutions Architect at Databricks, where he’s spent the last five years advising customers ranging from startups to Fortune 500 enterprises. He currently helps lead a team of field ambassadors for streaming products and is interested in improving industry awareness of effective streaming patterns for data integration and production machine learning. He previously worked as a product engineer in infrastructure automation.
Avinash Sooriyarachchi is a Senior Solutions Architect at Databricks. His current work involves working with large Retail and Consumer Packaged Goods organizations across the United States and enabling them to build Machine Learning based systems. His specific interests include streaming machine learning systems and building applications leveraging foundation models. Avi holds a Master’s degree in Mechanical Engineering and Applied Mechanics from the University of Pennsylvania.
Virtual | Talk | Responsible AI | Machine Learning | All Levels
AI-powered coding assistants, such as GitHub Copilot, are spreading rapidly in the software engineering community. Copilot was developed by Microsoft and OpenAI on top of Codex, a transformer-based Large Language Model, and surpassed 400,000 subscribers in the first month. It was praised by influential engineers, including Guido van Rossum, the inventor of the Python language…more details
Emanuele is an engineer by education, a data scientist by choice, and a researcher and lecturer by passion. During his PhD in ML, he was invited to EPFL Lausanne for a 6-month visit and published 9 papers in top journals. He is the co-founder of xtream, an AI boutique applying academic research to business. Contributing to the community is part of their mission: he was a speaker and track organizer at eRum, AMLD, and PyCon, and he has lectured at Italian, Swiss, and Polish universities.
Virtual | Talk | Deep Learning | NLP | Machine Learning | All Levels
This talk will demonstrate the power of compound sparsity for model compression and inference speedup for NLP and CV domains, with a special focus on the recently popular Large Language Models…more details
Damian is an engineer, roboticist, software developer, and problem solver. He has previous experience in autonomous driving (Argo AI), AI in industrial robotics (Arrival), and building machines that build machines (Tesla). He currently works at Neural Magic, focusing on the sparse future of AI computation. He works towards unlocking creative and economic potential with intelligent robotics while avoiding the uprising of sentient machines.
Konstantin Gulin is a Machine Learning Engineer at Neural Magic working on bringing sparse computation to the forefront of industry. With prior experience in applying machine learning to remote sensing (NASA) and space mission simulation (The Aerospace Corporation), he’s turned his focus to enabling effective model deployment in even the most constrained environments. He’s passionate about technology and ethical engineering and strives for the thoughtful advancement of AI.
In-person | Business Talk | AI for Transportation | All Levels
In this talk we will be focusing on the third point, showing how a digital strategy can be driven by information more than by data, while still relying on advanced algorithms to solve very large scale problems. We will draw from extensive expertise working with companies in the logistics and distribution industry, optimizing distribution networks and their operation: hub-and-spoke configuration, intermodal operation, truck scheduling, and driver fleet optimization. In all these cases, we will discuss how capturing and digitizing the right information in the form of constraints has been critical to producing realistic recommendations accepted by the operation teams on the ground. As a side yet non-negligible benefit, digitized information is knowledge that stays within the company instead of leaving when the expert employee changes job or retires. This results in more resilient companies, robust operations ready for scale, a proactive mindset instead of reactive, and a positive environmental impact in the form of supply chain decarbonization…more details
Tomasz M. Grzegorczyk is the founder and CEO of Teranalytics, an AI and optimization company specializing in large-scale logistics operations such as production, manufacturing, shipping, and distribution. Before creating Teranalytics, he was a Chief Scientist at BAE Systems and MIT Research Scientist where he worked on computational electromagnetics, scattering in complex media, optical forces, and wave propagation in metamaterials. Tomasz holds a PhD from the Swiss Federal Institute of Technology in Lausanne, an MBA from the Massachusetts Institute of Technology, and is a senior member of the IEEE. He served as editor and board member of two international peer-reviewed journals and one international conference, has authored more than a hundred publications and a book on metamaterials.
In-person | Talk | Machine Learning | Intermediate-Advanced
This session is designed for data practitioners who wish to maintain control and confidence over their projects even after deployment in production. We will explore two methods from the O’Reilly book “Fundamentals of Data Observability” that can be easily adopted to ensure the reliability of data pipelines throughout the whole process, from ingestion to analytics…more details
Andy Petrella is the CPO and founder of Kensu, a data observability solution that helps data teams trust what they deliver and create more value from data.
Andy is an entrepreneur with a background in data mining, data engineering, and data science. He is known as an early evangelist of Apache Spark and the Spark Notebook creator in the data community.
Since 2015, Andy has been an O’Reilly instructor and author. His work includes the first O’Reilly book about Data Observability, “Fundamentals of Data Observability”.
In-person | Talk | Deep Learning | MLOps | Intermediate
Deploying advanced Machine Learning technology to serve customers and/or business needs requires a rigorous approach and production-ready systems. This is especially true for maintaining and improving model performance over the lifetime of a production application. Unfortunately, the issues involved and approaches available are often poorly understood…more details
A data scientist and ML enthusiast, Robert has a passion for helping developers quickly learn what they need to be productive. Robert is currently the Senior Product Manager for TensorFlow Open-Source and MLOps at Google and helps ML teams meet the challenges of creating products and services with ML. Previously Robert led software engineering teams for both large and small companies, always focusing on moving fast to implement clean, elegant solutions to well-defined needs. You can find him on LinkedIn at robert-crowe.
Virtual | Talk | Machine Learning | All Levels
The talk is intended for graduate students, professionals, and MBA students seeking an introduction to forecasting methods without diving too deep into theoretical details. Participants will develop skills, mindsets, and behaviors sought after in the industry today…more details
Tanvir Ahmed Shaikh is a highly entrepreneurial and visionary data strategist with a passion for driving business growth through innovative data-driven solutions. With a track record of success in data science and digital transformation, Tanvir has been instrumental in developing and implementing strategies that improve efficiency, quality, and compliance. He possesses strong collaboration skills and effectively communicates technical concepts to non-technical stakeholders.
Currently serving as a Data Strategist (Director) at Genentech Inc., Tanvir leads the digital roadmap for the Global Pharma Manufacturing Quality organization. His expertise in prioritizing digital initiatives, building consensus, and driving change management has resulted in significant positive impacts on the organization.
Tanvir’s leadership abilities are exemplified through his role as the Founder and Digital Strategy Lead of the Roche Intrapreneur Network, a global network of more than 350 Roche technologists focused on executive capabilities and experiential learning. Through this network, he fosters a culture of entrepreneurship, product management, and storytelling, encouraging innovation and empowering individuals to think like CEOs of their products.
In his previous role as a Principal Data Scientist, Tanvir spearheaded cross-functional projects, driving operational excellence in forecasting, automation, and AI education. His contributions have led to substantial cost savings and increased efficiency within the organization.
Tanvir’s passion for education and continuous learning is evident in his role as an Adjunct Professor at Carnegie Mellon University. He teaches courses on Time Series Forecasting in Python, AI Product Management, and Storytelling with Data, inspiring students to think holistically and take an end-to-end view of problem-solving. He actively promotes a culture of continuous learning, inclusive community building, and inspirational storytelling.
Beyond his professional pursuits, Tanvir embraces a diverse range of interests. He finds joy in the culinary arts, experimenting with new recipes and creating culinary delights. Music also holds a special place in his heart, and he enjoys singing and playing the ukulele in his free time. Tanvir’s curiosity extends to the financial world, where he actively researches stocks and shares his knowledge, promoting personal finance education. Additionally, he stays active through the sport of tennis, both in competitive settings and for leisure.
Tanvir’s dedication to data-driven strategies, love for storytelling, and commitment to personal growth and education make him a versatile and accomplished professional. He embodies the values of continuous learning, community building, and innovative thinking, making a significant impact in the field of data science and beyond.
Virtual | Talk | Responsible AI | Machine Learning | All Levels
Statistical reasoning shapes our collective sense of what is true, what is best, and what should happen next. Even before we mechanized statistical prediction through machine learning, it was a habitual convention that was used as a marker of quality, rigorous science and democratic fairness…more details
Jutta Treviranus is the Director of the Inclusive Design Research Centre (IDRC) and a professor in the Faculty of Design at OCAD University in Toronto (http://idrc.ocadu.ca). Jutta established the IDRC in 1993 as the nexus of a growing global community that proactively works to ensure that our digitally transformed and globally connected society is designed inclusively. Dr. Treviranus also founded an innovative graduate program in inclusive design at OCAD University. Jutta is credited with developing an inclusive design methodology that has been adopted by large enterprise companies such as Microsoft, as well as public sector organizations internationally. In 2022 Jutta was recognized for her work in AI by Women in AI with the AI for Good – DEI AI Leader of the Year award.
Virtual | Talk | Machine Learning for Finance | Intermediate
The objective of this session is to make attendees familiar with the reasons why probabilistic machine learning is the next generation of AI in finance and investing…more details
Deepak Kanungo is the founder and CEO of Hedged Capital LLC, an AI-powered, proprietary trading and analytics firm built around probabilistic machine learning technologies. In 2005, long before machine learning was an industry buzzword, Deepak invented a probabilistic machine learning method and software system for managing the risks and returns of project portfolios. It is a unique framework that has been cited by IBM and Accenture, among others. Previously, Deepak was a financial advisor at Morgan Stanley, a Silicon Valley fintech entrepreneur, and a director in the Global Planning Department at Mastercard International. He was educated at Princeton University (astrophysics) and the London School of Economics (finance and information systems).
In-person | Talk | Machine Learning | MLOps and Data Engineering | Beginner
Machine learning has become an integral part of modern business operations, but the success of these projects depends on the quality of the underlying software. Unfortunately, many machine-learning prototypes fail to reach production systems because data science teams incur accidental and intentional technical debt faster than they get to their solution…more details
Yetunde Dada is the Director of Product Management at QuantumBlack, an AI-focused branch of McKinsey. She is instrumental in building products for Data Engineers and Data Scientists, including a notable Python library known as Kedro. Kedro is a distinguished product, marking the first open-source offering from McKinsey and QuantumBlack.
She holds an MBA from the Said Business School at the University of Oxford, earned in the 2017/2018 academic year. Her professional background is diverse and includes roles such as Data Engineer and Data Product Manager at Absa (formerly known as Barclays Africa Group Limited), Innovation Consultant at Engineers Without Borders South Africa, and Mechanical Engineer.
In-person | Business Talk | Machine Learning for Finance | Machine Learning | All Levels
The purpose of this talk is to explain which business skills are most needed by analytics professionals, illustrate why each is so critical, and help analytics leaders to foster these skills within their teams. We will progress through the skills roughly in the order they are needed—from skills for the first year out of university up through skills needed to run an entire analytics program. In this talk, Dr. Stephenson will draw on best practices, case studies, research, and personal anecdotes from his twenty years of hands-on analytic leadership of teams of analytics professionals spanning six continents, as well as several years helping design and teach executive programs as an adjunct at the Amsterdam Business School…more details
David Stephenson has over 20 years of experience leading analytics initiatives, including as Head of Global Business Analytics at eBay Classifieds Group. Since founding DSI Analytics in 2014, he has worked directly with dozens of companies across a wide range of industries (Adidas, Miro, Janssen Pharmaceuticals, ABN Amro, Sky Broadcasting, etc.). Dr. Stephenson also serves as part-time faculty at the University of Amsterdam Business School, has published two books, and has developed and delivered data science trainings for hundreds of analytics professionals around the globe.
In-person | Talk | Machine Learning | Machine Learning Safety and Security | Data Engineering & Big Data | Responsible AI | Intermediate
If you’ve ever asked one of the questions above, then this talk is for you! You’ll learn how the ability to interpret a model can identify poor model performance or, worse, bias that could ultimately impact the fairness of your machine learning applications. You’ll learn about some of the most common algorithms, how they work and how you can get started using them yourself…more details
Ed Shee is Head of Developer Relations at Seldon. Having previously led a tech team at IBM, Ed comes from a cloud computing background and is a strong believer in making deployments as easy as possible for developers. With an education in computational modelling and an enthusiasm for machine learning, Ed has blended his work in ML and cloud native computing together to cement himself firmly in the emerging field of MLOps.
In-person | Tutorial | Data Engineering & Big Data | Deep Learning | Machine Learning | All Levels
This workshop presents Taipy, a new low-code Python package that allows you to create complete Data Science applications, including graphical visualization and managing algorithms, pipelines, and scenarios…more details
Florian Jacta is a specialist in Taipy, a low-code open-source Python package that enables any Python developer to easily build a production-ready AI application. He handles pre-sales and after-sales functions for the package. He is a data scientist for Groupe Les Mousquetaires (Intermarche) and ATOS. He developed several predictive models as part of strategic AI projects. Florian holds a master’s degree in Applied Mathematics from INSA, with a major in Data Science and Mathematical Optimization.
Alexandre worked in Amazon Business Intelligence. He developed a graph-based interactive Python editor: Pyflow (1.2k stars!). He is skilled in MLOps, Data Engineering, and Python. He studied for a Master of Engineering at CentraleSupélec, University of Paris-Saclay.
In-person | Half-Day Training | Machine Learning for Finance | Intermediate
This half-day training session covers the most important Python topics and skills to apply AI and Machine Learning (ML) to Algorithmic Trading. The session shows how to make use of the Oanda trading API (via a demo account) to retrieve data, to stream data, to place orders, etc. Building on this, an ML-based trading strategy is formulated and backtested. Finally, the trading strategy is transformed into an online trading algorithm and is deployed for real-time trading on the Oanda trading platform…more details
Dr. Yves J. Hilpisch is founder and CEO of The Python Quants (http://tpq.io), a group focusing on the use of open source technologies for financial data science, artificial intelligence, algorithmic trading, and computational finance. He is also founder and CEO of The AI Machine (http://aimachine.io), a company focused on AI-powered algorithmic trading based on a proprietary strategy execution platform.
Yves has a Diploma in Business Administration, a Ph.D. in Mathematical Finance and is Adjunct Professor for Computational Finance at Miami Herbert Business School.
In-person | Tutorial | Generative AI | Machine Learning | Deep Learning | NLP | Intermediate-Advanced
In the first part of the talk I will provide an overview of the latest generative AI models and how they work. This will include discussing the various types of generative AI models, such as diffusion models for image generation and transformer (GPT-like) models for text generation and their underlying architectures and key concepts…more details
Heiko Hotz is a Senior Solutions Architect for AI & Machine Learning at AWS with a special focus on Natural Language Processing (NLP), Large Language Models (LLMs), and Generative AI. He is also the founder of the NLP London Meetup group, bringing together NLP enthusiasts and industry experts.
In-person | Workshop | Machine Learning | Deep Learning | Intermediate
In this workshop we will illustrate both approaches using a consistent single example. We will use TensorFlow in Colab notebooks, so all you need is a recent version of Chrome and a Google login. You will not need prior knowledge of TensorFlow, but you do need a good understanding of how training neural networks works as a prerequisite…more details
Oliver Zeigermann has been developing software with different approaches and programming languages for more than 3 decades. In the past decade, he has been focusing on Machine Learning and its interactions with humans.
In-person | Full-Day Training | NLP | Beginner
In this course we will go through Natural Language Processing fundamentals, such as pre-processing techniques, tf-idf, embeddings, and more. It will be followed by practical coding examples, in Python, to teach how to apply the theory to real use cases…more details
Leonardo De Marchi holds a Master’s in Artificial Intelligence and has worked as a Data Scientist in the sports world, with clients such as the New York Knicks. He now works at Thomson Reuters as VP of Labs, and also provides consultancy and training for small and large companies. His previous experience includes being Head of Data Science and Analytics at Bumble, the largest dating site with over 500 million users, heading the team through acquisition and an IPO.
Laura Skylaki is a Manager of Applied Research in Thomson Reuters Labs, where she leads advanced machine learning projects in the domain of Legal and Tax AI. With a career spanning more than a decade at the intersection of research and practical application, she has contributed technical expertise in diverse fields such as bioinformatics and stem cell biology, image processing and natural language processing. She holds a doctorate in stem cell bioinformatics from the University of Edinburgh, UK, and has been publishing on machine learning applications in leading academic journals since 2012.
In-person | Tutorial | Deep Learning | Machine Learning | NLP | Beginner
In this tutorial, we will illustrate the evolution of deep learning architectures and how KNIME Analytics Platform is naturally designed to keep up with these transformations. We will start off by introducing simple ANNs for a classification task. While easy to grasp, ANNs are not suitable to effectively work with sequential (e.g., texts and time series) or visual data (e.g., images and videos). Other, more complex architectures proved superior. We will zoom in on RNNs with LSTM units for text generation and time series forecasting; CNNs for image classification and styling; and GANs for synthetic image generation…more details
Roberto Cadili is a data scientist on the Evangelism team at KNIME. During his BSc in Economics, he developed a genuine interest in statistics and data analysis. At the University of Konstanz, he pursued an MSc in Social and Economic Data Science where he studied different machine learning algorithms and deep learning architectures with an emphasis on NLP and Computer Vision. As editor for Low Code for Data Science, he is helping the KNIME community shape successful data science stories, tutorials, and best practices that are worth sharing.
Emilio Silvestri is a Junior Data Scientist on the Evangelism Team at KNIME. He has a Master’s Degree in Computer Science from the University of Konstanz, with a special focus on Data Science and Artificial Intelligence. He is a certified KNIME Trainer and works for the KNIME Education Team to onboard and upskill people in their data science journey with courses and webinars.
In-person | Workshop | Intermediate
Real-Time Analytics is one of the new trends in the streaming space, but it can be hard to keep track of everything, especially as it seems like new products are being released every week. We’ll start off this session with a presentation that will give you a map to understand the space. This map will hopefully make it easier to understand where current and new tools fit into the space…more details
Mark Needham is an Apache Pinot advocate and developer relations engineer at StarTree. As a developer relations engineer, Mark helps users learn how to use Apache Pinot to build their real-time user-facing analytics applications. He also does developer experience, simplifying the getting started experience by making product tweaks and improvements to the documentation. Mark writes about his experiences working with Pinot at markhneedham.com. He tweets at @markhneedham.
In-person | Tutorial | NLP | Machine Learning & Deep Learning | Intermediate-Advanced
While deep learning has driven impressive progress, one of the toughest remaining challenges is generalization beyond the training distribution. Few-shot learning is an area of research that aims to address this, by striving to build models that can learn new concepts rapidly in a more “human-like” way. While many influential few-shot learning methods were based on meta-learning, recent progress has been made by simpler transfer learning algorithms, and it has been suggested in fact that few-shot learning might be an emergent property of large-scale models. In this talk, I will give an overview of the evolution of few-shot learning methods and benchmarks from my point of view, and discuss the evolving role of meta-learning for this problem. I will discuss lessons learned from using larger and more diverse benchmarks for evaluation and trade-offs between different approaches, closing with a discussion about open questions…more details
Eleni is a Research Scientist at Google DeepMind, based in London UK. She obtained her PhD from the University of Toronto, advised by Professors Richard Zemel and Raquel Urtasun. Her research is centered around creating methods that allow efficient and effective adaptation of deep neural networks to cope with distribution shifts, introduction of new concepts, or removal of outdated or harmful knowledge, falling in the areas of few-shot learning, meta-learning, domain adaptation and machine unlearning.
Virtual | Workshop | All | Beginner-Intermediate
The goal of this session is to get you familiarized with diffusion models, their inner workings, and different approaches to data generation. We’ll use Google Colab to build and train a simple diffusion model. You should be comfortable using Jupyter Notebooks, and training simple models in PyTorch…more details
Daniel has been teaching machine learning and distributed computing technologies at Data Science Retreat, the longest-running Berlin-based bootcamp, for more than three years, helping more than 150 students advance their careers.
He writes regularly for Towards Data Science. His blog post “Understanding PyTorch with an example: a step-by-step tutorial” has reached more than 220,000 views since it was published.
The positive feedback from the readers motivated him to write the book Deep Learning with PyTorch Step-by-Step, which covers a broader range of topics.
Daniel is also the main contributor to two Python packages: HandySpark and DeepReplay.
His professional background includes 20 years of experience working for companies in several industries: banking, government, fintech, retail and mobility.
Virtual | Workshop | Machine Learning | Intermediate-Advanced
In this tutorial we will dive into a particular space science / engineering domain: the calibration of space instruments. For this we take a dedicated look at calibration data from the so-called Cosmic Dust Analyzer (CDA) that was part of NASA’s Cassini mission in the Saturnian system. Together, we will see how the data has been generated, explore its features and limits, and determine how deep learning can help us to create new state-of-the-art calibration solutions for space missions…more details
Thomas is a Senior Machine Learning engineer, working in the automotive industry since 2019. Before joining the Research & Development department of a large manufacturer he was conducting research activities in space science. In parallel to his studies in Astro- and Geo-Physics and later PhD program, he participated in 2 major missions: ESA’s comet mission Rosetta/Philae and NASA’s & ESA’s Saturn spacecraft Cassini/Huygens; always with a special focus on cosmic dust. Additionally, he applies Machine Learning algorithms to analyse astronomy- and space-related data to derive new scientific insights or to create new methods for calibrating instruments. Besides his industry work, Thomas is a guest scientist at the Free University of Berlin, where he continues working on the Cassini-related datasets using Deep Learning. On his active YouTube channel Astroniz he shares his Python + Space Science + Machine Learning knowledge with a small community.
Virtual | Workshop | NLP | Machine Learning | Beginner-Intermediate
In this workshop, you’ll walk through a complete end-to-end example of using Hugging Face Transformers, involving both our open-source libraries and some of our commercial products. Starting from a dataset containing real-life product reviews from Amazon.com, you’ll train and deploy a text classification model predicting the star rating for similar reviews…more details
Julien is currently Chief Evangelist at Hugging Face. He’s recently spent 6 years at Amazon Web Services where he was the Global Technical Evangelist for AI & Machine Learning. Prior to joining AWS, Julien served for 10 years as CTO/VP Engineering in large-scale startups.
In-person | Tutorial | Machine Learning | Data Engineering & Big Data | All Levels
The workshop objective is to use the Yelp Dataset to create business recommendations for users by exploiting the network composed of reviews, users, friends, tips, and businesses. The workshop will start from the downloaded JSON files of the Yelp dataset, from which we will create CSV files for import into a Neo4j database…more details
Valerio Piccioni is an AI Engineer at LARUS who primarily focuses on Graph Neural Networks, but also likes to have a go with other deep learning fields like NLP and Computer Vision. He is also interested in MLOps, since getting machine learning models into production is harder than it seems. Currently he is working on a project regarding fraud detection with graphs.
In-person | Tutorial | Generative AI | NLP | Deep Learning | Machine Learning | All Levels
During this talk you will learn more about Transformer-based models and some best practices for optimizing domain-specific, fully open-source LLMs to be deployed and used in privately managed environments with constrained computational power. Attendees will learn about the critical importance of the Engineering more than Data Science behind LLMs management…more details
Guglielmo is a Biomedical Engineer and lifelong learner with an extensive background in Software Engineering and Data Science applied to different contexts, most recently Biotech Manufacturing, Healthcare and DevOps. As part of the Manufacturing IT Advanced Mathematics and Modelling Data Science Team he is currently busy unlocking business value through Deep Learning projects, mostly (but not only) in Computer Vision.
He has been recognized as DataOps Champion at the Streamsets DataOps Summit 2019 and awarded as one of the Top 50 Tech Visionaries at the 2019 Dubai Intercon Conference.
He is also an international speaker and author of the book Hands-on Deep Learning with Apache Spark (Packt): https://www.packtpub.com/big-data-and-business-intelligence/hands-deep-learning-apache-spark
In-person | Full-Day Training | Generative AI | Intermediate
This workshop is designed to explore how artificial intelligence can be used to generate creative outputs and to inspire technical audiences to use their skills in new and creative ways…more details
Leonardo De Marchi holds a Master’s in Artificial Intelligence and has worked as a Data Scientist in the sports world, with clients such as the New York Knicks. He now works at Thomson Reuters as VP of Labs, and also provides consultancy and training for small and large companies. His previous experience includes being Head of Data Science and Analytics at Bumble, the largest dating site with over 500 million users, heading the team through acquisition and an IPO.
Virtual | Tutorial | MLOps & Data Engineering | Machine Learning | Intermediate
In this talk, Gal (Senior Data Scientist, Fiverr) and Itai (CPO, Mona) share how Fiverr utilizes advanced tools, both home-grown and bought, to bridge the gap between data science and business, empower data scientists to understand the behavior of their models in production and make sure their AI solutions bring the value they’re expected to deliver…more details
With over 10 years of experience (Google, AI-focused startups) with big data and as the CPO and head of customer success at Mona, the leading AI monitoring intelligence company, Itai has a unique view of the AI industry. Working closely with data science and ML teams applying dozens of solutions in over 10 industries, Itai encounters a wide variety of business use-cases, organizational structures and cultures, and technologies used in today’s AI world.
Gal Naamani has been working as a data scientist for 4 years, with the past 3 years being at Fiverr. As the Senior Data Scientist, Gal works closely with developers, analysts, product managers, and business owners on growth opportunities and new ideas, from research to production. Gal currently has leading roles in projects that are focused around search engine ranking, promoted ads, online bidding optimization, exploration-exploitation problems, monitoring, and more.
In-person | Tutorial | Deep Learning | Machine Learning | All Levels
Why should we try to unify the ML frameworks? Won’t we just create a new incompatible standard and make the ML fragmentation even worse? I will argue that the answer to these sensible and important questions is no…more details
Daniel Lenton is the creator of Ivy, which is an open-source framework with an ambitious mission to unify all other ML frameworks. Prior to starting Ivy, Daniel was a PhD student at Imperial College London, where he published research in the areas of machine learning, robotics and computer vision.
In-person | Workshop | Deep Learning | Machine Learning | All Levels
In this session, we will consider how Delta Lake can power feature stores, model registries and model serving, with Databricks and MLFlow providing the environment. Delta streamlines the processes of capturing training datasets across experimentations and is the foundation for a flexible, agile lake…more details
Tori Tompkins is a Senior Data Science Consultant at Advancing Analytics. Specialising in MLOps, Tori has worked on many ML and data science projects with Azure, Databricks and graph technologies and all stages of the ML Lifecycle. She is a co-presenter of the Data & AI podcast, Totally Skewed, founder of Girls Code Too UK and regular contributor with Girls in Data.
Alex is a data scientist at Advancing Analytics with a love for all things machine learning and MLOps. He has worked as a machine learning engineer for four years in fields ranging from wearable devices to agritech. Outside of the world of data science he is an avid fan of board games and TTRPGs, running a number of D&D campaigns and a small Lincoln-based D&D community.
In-person | Workshop | Generative AI | Intermediate
This workshop will show you how a vector database can help to scale the power of modern deep-learning models and effortlessly combine them with your data. More specifically, you will learn about how vector databases can help you to work with large-scale vector embeddings, and integrate the power of large language models (LLMs)…more details
JP finds joy in technology and learning, as well as empowering others by helping to distill complex technologies into relatable concepts. He works at Weaviate as the Technical Curriculum Coordinator, facilitating education for vector databases and data science topics. When he’s not working, JP enjoys immersing himself in the worlds of games and sports. You might spot him working on his serve on the tennis court, or engaging in spirited board game sessions.
Virtual | Workshop
In this hands-on workshop, Data Scientist Felipe Adachi will introduce the concept of data logging and discuss how to validate data at scale by creating metric constraints and generating reports based on the data’s statistical profiles using the whylogs open-source package. He will also walk through steps that data scientists and ML engineers can take to tailor their set of validations to fit the specific needs of their business or project, take actions when their rules fail to be met, and debug and troubleshoot cases where data fails to behave as expected…more details
Felipe is a Data Scientist at WhyLabs. He is a core contributor to whylogs, an open-source data logging library, and focuses on writing technical content and expanding the whylogs library in order to make AI more accessible, robust, and responsible. Previously, Felipe was an AI Researcher at WEG, where he researched and deployed Natural Language Processing approaches to extract knowledge from textual information about electric machinery. He also holds a master’s degree in Electronic Systems Engineering from UFSC (Universidade Federal de Santa Catarina), with research focused on developing and deploying fault detection strategies based on machine learning for unmanned underwater vehicles. Felipe has published a series of blog articles about MLOps, Monitoring, and Natural Language Processing in publications such as Towards Data Science, Analytics Vidhya, and Google Cloud Community.
Virtual | Tutorial | Deep Learning | Machine Learning | NLP | Intermediate
This tutorial will be most useful for data scientists and ML practitioners who need to regularly train and tune large ML models on a time and compute budget. You will learn about basic modern tuning algorithms, and how to use the best method for your application. We will demonstrate how distributed tuning on AWS SageMaker can speed up finding the right model for your particular data. You will also learn about some advanced use cases of automated tuning…more details
Matthias W. Seeger is a principal applied scientist at Amazon. He received a Ph.D. from the School of Informatics, University of Edinburgh, UK, in 2003 (advisor Christopher Williams). He was a research fellow with Michael Jordan and Peter Bartlett, University of California at Berkeley, from 2003, and with Bernhard Schoelkopf, Max Planck Institute for Intelligent Systems, Tuebingen, Germany, from 2005. He led a research group at the University of Saarbruecken, Germany, from 2008, and was assistant professor at the Ecole Polytechnique Federale de Lausanne from fall 2010. He joined Amazon as a machine learning scientist in 2014. He received the ICML Test of Time Award in 2020.
His interests center around Bayesian learning and decision making with probabilistic models, from gaining understanding to making it work in large scale practice. He has been working on theory and practice of Gaussian processes and Bayesian optimization, scalable variational approximate inference algorithms, Bayesian compressed sensing, and active learning for medical imaging. More recently, he worked on demand forecasting, hyperparameter tuning (Bayesian optimization) applied to deep learning (NLP), and AutoML.
In-person | Full-Day Training | Data Analysis | Machine Learning | Beginner-Intermediate
Working with data can be challenging: it often doesn’t come in the best format for analysis, and understanding it well enough to extract insights requires both time and the skills to filter, aggregate, reshape, and visualize it. This session will equip you with the knowledge you need to effectively use pandas – a powerful library for data analysis in Python – to make this process easier…more details
Stefanie Molin is a software engineer and data scientist at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around data wrangling/visualization, building tools for gathering data, and knowledge sharing. She is also the author of “Hands-On Data Analysis with Pandas,” which is currently in its second edition. She holds a bachelor’s of science degree in operations research from Columbia University’s Fu Foundation School of Engineering and Applied Science, as well as a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.
Virtual | Bootcamp
Curious about Data Science? Self-taught on some aspects, but missing the big picture? Well, you’ve got to start somewhere and this session is the place to do it. This session will cover, at a layman’s level, some of the basic concepts of data science…more details
For more than 20 years, Todd has been highly respected as both a technologist and a trainer. As a tech, he has seen that world from many perspectives: “data guy” and developer; architect, analyst, and consultant. As a trainer, he has designed and covered subject matter from operating systems to databases to machine learning / AI to end-user applications, with an emphasis on data, programming, and results that matter. As a strong advocate for knowledge sharing, he combines his experience in technology and education to impart real-world use cases to students and users of analytics solutions across multiple industries. He has been a regular contributor to the community of analytics and technology user groups in the Boston area and beyond, writes and teaches on many topics, and looks forward to the next time he can strap on a dive mask and get wet.
In-person | Workshop | Machine Learning | Deep Learning | Generative AI | Beginner-Intermediate
In this two-part workshop series, we will step through how you can leverage AI in your current Data Analytics Plane. This is an interactive session and we expect that you will follow along as we go, but don’t worry: we have Git repos and notebooks at the ready. All you need to bring is your laptop and your favourite training data sets if you prefer not to use the ones we provide…more details
Shawn is passionate about harnessing the power of data strategy, engineering and analytics in order to help businesses uncover new opportunities. As an innovative technologist with over 15 years’ experience, Shawn removes technology as a barrier, and broadens the art of the possible for business and product leaders. His holistic view of technology and emphasis on developing and motivating strong engineering talent, with a focus on delivering outcomes whilst minimising outputs, is one of the characteristics which sets him apart from the crowd.
Shawn’s deep technical knowledge includes distributed computing, cloud architecture, data science, machine learning and engineering analytics platforms. He has years of experience working as a consultant practitioner for a variety of prestigious clients ranging from secret clearance level government organizations to Fortune 500 companies.
In-person | Tutorial | Deep Learning | NLP | Machine Learning | Intermediate
This tutorial is all about deep reinforcement learning. You might have heard about it in the media, from its use in generative language models (reinforcement learning from human feedback) or more directly in one of the many applications of this fascinating technology…more details
Dr. Phil Winder is a multidisciplinary engineer and data scientist. As the CEO of Winder.AI, an AI consultancy, he provides AI, ML, Data Science, and MLOps development and consulting services to businesses of all sizes. Previous clients include the likes of Google, Microsoft, Shell, Nestle, the UK Government and many more. More information is available on the website: https://Winder.AI.
Phil is also the author of the book “Reinforcement Learning: Industrial Applications of Intelligent Agents” published by O’Reilly (https://rl-book.com) and was an early champion of MLOps. Over the past decade, he has also trained thousands of data scientists and is a celebrated global speaker on AI topics.
Phil holds a Ph.D. and M.Eng. in electronic engineering from the University of Hull and lives in Yorkshire, U.K., with his brewing equipment and family.
Virtual | Workshop | Responsible AI | Machine Learning Safety and Security | Intermediate
Machine learning projects are rarely like a Kaggle competition. It is thrilling to see your name jump up on the leaderboard, which makes competitions exciting and, dare I say, addictive. However, predictive power in real life is much less important than Kaggle competitions would have you believe. Often it is just as important to understand why a model makes a certain prediction. The ‘why’ plays an important role during the model development phase as well as after deployment. Complex machine learning pipelines are difficult to debug and issues can go unnoticed. One way to help increase trust in the model during the development stage is to improve its interpretability…more details
Andras Zsom is an Assistant Professor of the Practice and Director of Graduate Studies at the Data Science Initiative at Brown University, Providence, RI. He teaches two mandatory courses in the data science master’s program, and helps the students navigate through their studies and curriculum. He also supervises interns on various research projects related to missing data, interpretability, and developing machine learning pipelines.
Virtual | Workshop | NLP | Intermediate
Leaving this workshop, you will understand each of these topics, and you will have gained the practical, hands-on expertise to start integrating modern NLP in your domain. Participants will fine-tune and prompt engineer state-of-the-art models like BART and XLM-Roberta, and they will peer behind the curtain of world shaking technologies like ChatGPT to understand their utility and architectures…more details
Daniel Whitenack (aka Data Dan) is a Ph.D. trained data scientist working with SIL International on NLP and speech technology for local languages in emerging markets. He has more than ten years of experience developing and deploying machine learning systems at scale. Daniel co-hosts the Practical AI podcast, has spoken at conferences around the world (Applied Machine Learning Days, O’Reilly AI, QCon AI, GopherCon, KubeCon, and more), and occasionally teaches data science/analytics at Purdue University.
Virtual | Tutorial | Responsible AI | Deep Learning | Machine Learning | NLP | Advanced
I will briefly recall why explaining the predictions of a complex machine learning model (for example a neural network) is important. Then I will cover the different categories of state-of-the-art papers on this subject with a brief overview of the most well-known methods (such as LIME or SHAP) in each category and a focus on the contributions made in my research team on the specific topic of time series classification…more details
Elisa Fromont has been a full professor at Université de Rennes, France, since 2017 and is a Junior member of the Institut Universitaire de France (IUF). She works at the IRISA research institute in the INRIA LACODAM (“Large Scale Collaborative Data Mining”) team. From 2008 until 2017, she was an associate professor at Université Jean Monnet in Saint-Etienne, France, where she worked at the Hubert Curien research institute in the Data Intelligence team. Elisa received her Research Habilitation (HDR) in December 2015 from the University of Saint-Etienne. Her research interests lie in (explainable) machine learning, data mining and, in particular, time series analysis.
Virtual | Workshop | Machine Learning | Deep Learning | GenAI | Intermediate
In this hands-on workshop, we show how to code and implement these methods to enhance feature engineering and drive even greater predictive accuracy…more details
Colin is a seasoned data scientist who has worked in the finance, healthcare, security, oil and gas, government, telecommunications, and marketing industries. He has a keen interest in exploring the relationship between humans and AI and has contributed to projects on AI ethics, governance, and the future of work. His work has gained global recognition from the World Economic Forum, and he has contributed to several important initiatives, including the Singapore government’s official AI strategy, PDPC AI Governance and Ethics Guidelines, and the Monetary Authority of Singapore Veritas Initiative. In addition to his professional work, Colin is a dedicated healthcare advocate who volunteers for cancer research.
In-person | Workshop | Generative AI | Intermediate
This workshop will show you how a vector database can help to scale the power of modern deep-learning models and effortlessly combine them with your data. More specifically, you will learn about how vector databases can help you to work with large-scale vector embeddings, and integrate the power of large language models (LLMs)…more details
JP finds joy in technology and learning, as well as empowering others by helping to distill complex technologies into relatable concepts. He works at Weaviate as the Technical Curriculum Coordinator, facilitating education for vector databases and data science topics. When he’s not working, JP enjoys immersing himself in the worlds of games and sports. You might spot him working on his serve on the tennis court, or engaging in spirited board game sessions.
In-person | Tutorial | Deep Learning | Machine Learning | All Levels
Imagine taking an aspirin for a pounding headache, only to have it cease right before you swallow the pill – was it the mere anticipation of relief or an arbitrary coincidence? Inferring causality is hard, and in this session we’ll explore recent developments in the science of causal discovery, as well as motivate why this tool should be part of every data scientist’s arsenal…more details
Andre joined causaLens from Goldman Sachs, where he was an executive director in the Model Risk Management group in Hong Kong and Frankfurt. Today he is working with industry-leading, global organisations to apply cutting-edge Causal AI research in production-level solutions that empower individuals and teams to make better decisions. Andre received his PhD in theoretical physics from the University of Munich, where he studied the interplay between quantum mechanics and general relativity in black holes.
In-person | Talk | Generative AI | Beginner-Intermediate
The rapid rise of Large Language Models is creating new industries, new categories of products, and forcing the tech ecosystem to rethink what is possible with these new systems. This talk explores a number of patterns for using LLMs as building blocks, beyond building shallow wrappers around commercial APIs or open source models. Learn how LLMs can be used to inject intelligence into software systems that do semantic search, text classification, retrieval-augmented generation…more details
Through his popular machine learning blog, Jay Alammar has helped millions of engineers visually understand machine learning tools and concepts from the basic (ending up in NumPy, pandas docs) to the cutting-edge (The Illustrated Transformer, BERT, GPT-3).
In-person | Workshop | Deep Learning | Machine Learning | MLOps and Data Engineering | Beginner
In this hands-on workshop I will show you an easy method for deploying your own generative AI in production with model checkpoints, open source libraries such as Hugging Face, and MLOps deployment pipelines…more details
Tim leads Graphcore’s Cloud Solutions product to help AI & ML software development teams build AI products and deploy ML capabilities in production. Tim has worn many hats in his career, from research engineer and data scientist to leader of MLOps teams. Along the way, he’s gained experience across all stages of the development lifecycle, taking AI applications from experimentation to deployment.
In-person | Half-Day Training | Machine Learning | Machine Learning for Finance | All Levels
In this tutorial, we will present sktime – a unified framework for machine learning with time series. sktime covers multiple time series learning problems, including time series transformation, classification and forecasting, among others. In addition, sktime allows you to easily apply an algorithm for one task to solve another (e.g. a scikit-learn regressor to solve a forecasting problem). In the tutorial, you will learn about how you can identify these problems, what their key differences are and how they are related…more details
Franz Kiraly is the founder and a core developer of the open source framework sktime. His research is focused on software engineering for open source and data science, machine learning for structured learning tasks such as time series tasks, and robust empirical and statistical evaluation of algorithms in deployment. Franz held a faculty position at University College London 2013-2020, before he moved to industry R&D in principal data scientist roles.
Marc Rovira is a data scientist at Electrolux Group in Stockholm, with a strong focus on forecasting and time series analysis. He actively contributes to the sktime community as a council member and user representative. Prior to his industry experience, Marc completed a Ph.D. that explored the intersection of computational fluid mechanics, chemical engineering, and machine learning, with the aim of mitigating air pollution. His educational background also includes a master’s degree in aerospace engineering.
Virtual | Tutorial | Deep Learning | NLP | Intermediate
This workshop aims to provide an introduction to topological data analysis (TDA), a rapidly evolving area of research that focuses on studying the shape of high-dimensional data. With the increasing availability of large and complex datasets in various domains, the need for sophisticated methods to analyze and understand these datasets has also grown. TDA strives to develop a more comprehensive understanding of data by analyzing its geometry and topology…more details
Christian is Machine Learning Technical Leader at Mercado Libre, the largest e-commerce/fintech company in Latin America, where he dedicates his efforts to creating tools for monitoring and quality of learning models. He is a Computer Engineer and Master in Science with a major in Astronomy from UNAM (Universidad Nacional Autonoma de Mexico). He is a “Xoogler” and has more than 15 years of experience in the field of machine learning. He has lectured in almost a dozen countries.
Virtual | Workshop | Machine Learning | Data Engineering & Big Data | NLP | Deep Learning | Intermediate
In this talk, we will explore how PyCaret 3, an open-source machine learning library in Python, can significantly accelerate machine learning workflows. PyCaret 3 offers a low-code approach to building, training, and deploying machine learning models, making it an ideal tool for data scientists and developers who want to focus on the business problem rather than the technical details…more details
Innovator, Technologist, and a Data Scientist turned Product Manager with a proven track record of building and scaling data products, platforms, and communities. Experienced in building and leading teams of data scientists, data engineers, and product managers. Strongly opinionated tech visionary and a thought partner to C-level leadership.
Moez Ali is an inventor and creator of PyCaret. PyCaret is an open-source, low-code, machine learning software. Ranked in top 1%, 8M+ downloads, 7K+ GitHub stars, 100+ contributors, and 1000+ citations.
Globally recognized personality for open-source work on PyCaret. Keynote speaker and top ten most-read writer in the field of artificial intelligence. Teaching AI and ML courses at Cornell, NY and Queens University, CA. Currently building the world’s first hyper-focused Data and ML Platform.
Virtual | Tutorial | Machine Learning Safety and Security | Machine Learning | Deep Learning | NLP | All Levels
Join Alexandra for a hands-on tutorial on synthetic data fundamentals to learn how to create synthetic data you can trust, assess its quality, and use it for privacy-preserving ML training. As a bonus, we’ll look into boosting your ML performance with smart upsampling…more details
Alexandra Ebert is a Responsible AI, synthetic data & privacy expert and serves as Chief Trust Officer at MOSTLY AI. As a member of the company’s senior leadership team, she is engaged in public policy issues in the emerging field of synthetic data and Ethical AI and is responsible for engaging with the privacy community, with regulators, the media, and with customers. She regularly speaks at international conferences on AI, privacy, and digital banking and hosts The Data Democratization Podcast, where she discusses emerging digital policy trends as well as Responsible AI and privacy best practices with regulators, policy experts and senior executives.
Apart from her work at MOSTLY AI, she serves as the chair of the IEEE Synthetic Data IC expert group and was pleased to be invited to join the group of AI experts for the #humanAIze initiative, which aims to make AI more inclusive and accessible to everyone.
Before joining the company, she researched GDPR’s impact on the deployment of artificial intelligence in Europe and its economic, societal, and technological consequences. Besides being an advocate for privacy protection, Alexandra is deeply passionate about Ethical AI and ensuring the fair and responsible use of machine learning algorithms. She is the co-author of an ICLR paper and a popular blog series on fairness in AI and fair synthetic data, which was featured in Forbes, IEEE Spectrum, and by distinguished AI expert Andrew Ng.
In-person | Bootcamp | Machine Learning | Beginner
In this workshop, you will get acquainted with the pandas library, which is the most widely used package for reading, analyzing and exporting datasets in Python. You will also learn how to visualize many kinds of tabular data using the plotnine package, along with some tips and tricks on how to make your visualizations stand out. Lastly, you will have the opportunity to make predictions and take decisions using data, based on basic statistical methods…more details
Leonidas (Leo) is a Senior Data Scientist at AstraZeneca. His work focuses on machine learning in oncology, including clinical and non-clinical applications. He is also enthusiastic about NLP applications in oncology and how this can be used to leverage patient treatment. He is a workshop facilitator at the European Leadership University (ELU), NL, and has been a data science educator at DataCamp. He holds a PhD in bioinformatics and ML from the University of Warwick, UK, an MSc in statistics from Imperial College London, UK, and a BSc in Statistics and Insurance Science from the University of Piraeus, GR.
Virtual | Bootcamp | Machine Learning | Beginner
The Introduction to Machine Learning Workshop will build upon the attendee’s foundation of math and coding knowledge to develop a basic understanding of the most popular machine learning algorithms used in industry today. We will answer such questions as: What are the different types of ML algorithms? What is overfitting and how can we avoid it? Why does XGBoost consistently outperform other algorithms?…more details
Julia Lintern currently works as a Director of Data Science at Gartner. Previously, she worked as a Data Scientist for the New York Times. Julia began her career as a structures engineer designing repairs for damaged aircraft. Julia holds an MA in applied math from Hunter College, where she focused on visualizations of various numerical methods and discovered a deep appreciation for the combination of mathematics and visualizations. During certain seasons of her career, she has also worked on creative side projects such as Lia Lintern, her own fashion label.
Virtual | Bootcamp | Beginner
In this class, students will install Anaconda Python and JupyterLab. Using the JupyterLab interface, I will cover the basics of Python programming. Topics will include built-in data structures, functions, looping, decisions, and importing other libraries…more details
Phil Tracton is an IC design engineer at Medtronic and an instructor at UCLA Extension. He has worked at Medtronic for over 20 years and has experience in implementing firmware, FPGAs, and custom ASICs. Many thousands of people have his work implanted in them. Most of these devices are focused on neuromodulation. He has recently joined an internal team focused on long-term research for implantable devices.
At UCLA he teaches multiple Python-based courses including Learning Python and Python on the Raspberry Pi.
He is interested in low power AI on edge devices.
He will be running the Fundamentals of Python training class. This is his second time teaching at an ODSC event.
In-person | Talk | AI Safety | Machine Learning | Deep Learning | Advanced
In this talk, I will cover several new and practical tools for improving evaluation in safety-critical settings that improve statistical guarantees of estimates, as well as provide more insights on how to perform robust evaluation in situations where traditional assumptions cannot be met. I will draw connections with topics from interpretability, causal inference and uncertainty estimation and discuss how these are all key for evaluation…more details
Sonali is an Assistant Professor and leader of the AI for Actionable Impact Group at Imperial College London. Her research focuses on decision-making in uncertainty, causal inference and building interpretable models to improve clinical care and deepen our understanding of human health, with applications in areas such as HIV and critical care. Prior to this, Sonali was a postdoctoral research fellow at Harvard. Her work has been published at a number of machine learning conferences (NeurIPS, AAAI, ICML, AISTATS) and medical journals (Nature Medicine, Nature Communications, AMIA, PLoS One, JAIDS). She was also a Swiss National Science Fellow and was named a Rising Star in AI in 2021. Sonali received her PhD (summa cum laude) in 2019 from the University of Basel, Switzerland, where she built intelligent models for understanding the interplay between host and virus in the fight against HIV. Apart from her research, Sonali is also passionate about encouraging more discussion about the role of ethics in developing machine learning technologies to improve society.
In-person | Track Keynote | Machine Learning | Intermediate
In this talk I will present two new open-source packages that make up a powerful and state-of-the-art marketing analytics toolbox. Specifically, PyMC-Marketing is a new library built on top of the popular Bayesian modeling library PyMC. PyMC-Marketing allows robust estimation of customer acquisition costs (via media mix modeling) as well as customer lifetime value…more details
Thomas Wiecki is co-creator of PyMC, the industry-standard tool for statistical data science in Python. To help businesses solve advanced analytical problems he founded PyMC Labs (www.pymc-labs.io) consisting of world-class experts in Bayesian modeling.
Virtual | Keynote | All | All Levels
In this session we take a deep dive into Azure Machine Learning, a cloud service that you can use to build, train, deploy, manage, and track models. We use the Azure Machine Learning Python SDK to manage the complete life cycle of a PyTorch model, from managing the data to training the model and finally running it in a production Kubernetes cluster…more details
Henk is a Cloud Advocate specializing in artificial intelligence and Azure with a background in application development. He is currently part of the AI cloud advocate team and based in the Netherlands. Before joining Microsoft, he was a Microsoft AI MVP and worked as a software developer and architect building many AI-powered platforms on Azure.
He loves to share his knowledge about topics such as DevOps, Azure and Artificial Intelligence by providing training courses and he is a regular speaker at user groups and international conferences.
In-person | Talk | Machine Learning | Deep Learning | Intermediate-Advanced
In this session, we will provide examples of when quantum computing is best applied to accelerate health care-specific applications and biomedical research. We will also provide an overview of how quantum computing works and a short overview of how to leverage open-source libraries, specifically Qiskit and Q#, to build, train, and evaluate a machine learning model for breast cancer prediction using an open dataset. We will also review how to build and run these models on local simulators and how these algorithms can be deployed on quantum hardware through cloud providers such as Azure…more details
Dr. Schulz is a physician scientist with a background in computational healthcare, molecular biology, and virology. Dr. Schulz has over 20 years’ experience in software development with a focus on enterprise system architecture and has research interests in the management of large biomedical data sets and the use of real-world data for predictive modeling. At Yale School of Medicine, he has led the deployment of the organization’s data science infrastructure, which consists of a composable computing infrastructure to support the development of biomedical AI applications. Dr. Schulz is also a co-founder of Refactor Health, a digital health startup focused on the development of AI-driven digital signatures and automated healthcare DataOps.
In-person | Talk | Data Engineering & Big Data | ALL | All Levels
During this session, we will discuss the enablers that organizations need to unlock productivity with analytics and the importance of optimized algorithmic performance in the cloud to reduce costs, so organizations can derive maximum value from their investments…more details
Spiros Potamitis is a data scientist and a global product marketing manager of forecasting and optimization at SAS. He has extensive experience in the development and implementation of advanced analytics solutions across different industries and provides subject matter expertise in the areas of forecasting, machine learning and AI. Prior to joining SAS, Spiros worked and led advanced analytics teams in various sectors such as credit risk, customer insights and CRM.
In-person | Talk | Machine Learning for Finance | Intermediate-Advanced
In recent years, several machine learning innovations have been introduced to improve the robustness of asset allocation with hierarchical clustering and seriation-based approaches, to improve the transparency of these heuristics with explainable AI and to generate synthetic correlations and correlated market returns to improve the coverage of backtests and scenario analysis beyond the historical paths. Together, these innovations offer a consistent pipeline for better understanding rule-based dynamic portfolio allocation strategies. This talk reviews recent developments and puts them into the context of the current market challenges…more details
Peter Schwendner leads the Institute of Wealth & Asset Management at Zurich University of Applied Sciences, School of Management and Law, Switzerland. His interests are financial markets, asset management and machine learning applications. With the European Stability Mechanism (ESM), he has been developing analytics for primary and secondary bond markets and tools for optimizing the issuance process. Currently, he is working on the BRIDGE Discovery project “Spatial sustainable finance: Satellite-based ratings of company footprints in biodiversity and water”. Within the European COST Action «Fintech and AI in Finance», he leads the working group «Transparency into Investment Product Performance for Clients».
In-person | Talk | Generative AI | Machine Learning | Deep Learning | All Levels
The session will cover the importance of explaining AI models and their limitations, building effective next-gen data products, evaluating audience and user needs, and the aspects of visualisation that will always require human input. It will focus on the practical implications of AI tools on the roles of data professionals – and look at how we can thrive in this exciting new era…more details
Alan Rutter is the founder of consultancy Fire Plus Algebra, and is a specialist in communicating complex subjects through data visualisation, writing and design. He has worked as a journalist, product owner and trainer for brands and organisations including Guardian Masterclasses, WIRED, Riskified, the Home Office, the Biotechnology and Biological Sciences Research Council and Liverpool School of Tropical Medicine.
In-person | Talk | ML for Finance | ML Safety (AI Safety) & ML Security | MLOps and Data Engineering | Intermediate
In this talk, I will explain what Zero Trust Architecture is, which problems in data science it solves and how you could implement it in DataOps and MLOps processes. Furthermore, I will connect the concepts to the GDPR and the proposed AI Act and use concrete examples from my projects in cyber security, banking and retail…more details
Dr. Casper Rutjes is Chief Technology Officer (CTO) at ADC (Amsterdam Data Collective), a Data & AI Consultancy in Europe. Rutjes is responsible for R&D, (Tech) Partnerships, consulting quality & standardization and IT. He leads global teams of consulting specialists in the areas of strategy & innovation, data engineering and data science across our key industries, mainly healthcare/life science, public and finance. At clients he is a trusted advisor and senior project lead for challenges on the interface of regulation, IT and business.
In-person | Talk
Data Storytelling for Business delves into two indispensable elements of giving great presentations with and about data: compelling data visualizations and designing a coherent and persuasive narrative around data. These skills are relevant across all industries…more details
Isaac Reyes is a TEDx speaker, data scientist and international keynote presenter in data analytics, data visualization and data presentation. In 2018, his “Art of Data Storytelling” speaking tour visited 23 cities across 5 continents, impacting over 15,000 people with Data Storytelling skills. He is the Co-founder of StoryIQ, a data visualization training company with full-time speakers in New York City, Manila and Singapore. In previous roles, he was the Head of Data Science at Altis Consulting and lectured in statistical theory at the Australian National University. A participant experience focused trainer, he was a keynote speaker at the 2019 Open Data Science Conference in Brazil.
Virtual | Talk | Responsible AI | NLP | Deep Learning | GenAI | Machine Learning | All Levels
Western societies are marked by diverse and extensive biases and inequality that are unavoidably embedded in the data used to train machine learning. Algorithms trained on biased data will, without intervention, produce biased outcomes and increase the inequality experienced by historically disadvantaged groups…more details
Professor Sandra Wachter is Professor of Technology and Regulation at the Oxford Internet Institute at the University of Oxford where she researches the legal and ethical implications of AI, Big Data, and robotics as well as Internet and platform regulation. At the OII, Professor Sandra Wachter leads and coordinates the Governance of Emerging Technologies (GET) Research Programme that investigates legal, ethical, and technical aspects of AI, machine learning, and other emerging technologies.
Professor Wachter is also an affiliate and member at numerous institutions, such as the Berkman Klein Center for Internet & Society at Harvard University, World Economic Forum’s Global Futures Council on Values, Ethics and Innovation, the European Commission’s Expert Group on Autonomous Cars, the Law Committee of the IEEE, the World Bank’s Task Force on Access to Justice and Technology, the United Kingdom Police Ethics Guidance Group, the British Standards Institution, the Bonavero Institute of Human Rights at Oxford’s Law Faculty and the Oxford Martin School. Professor Wachter also serves as a policy advisor for governments, companies, and NGOs around the world on regulatory and ethical questions concerning emerging technologies.
Virtual | Talk | Machine Learning | Deep Learning | NLP | Beginner-Intermediate
In the field of healthcare, AI has been applied across the spectrum from diagnostics to prognostics. Many of these applications have been successfully commercialised yet only some are used in everyday patient care. This talk will introduce the audience to the science behind AI for disease detection (diagnosis) and prediction (prognosis) with a particular focus on musculoskeletal health. We will explore the link between big health data and AI, and finally highlight challenges and opportunities in reliable, representative, scalable and ethical uptake of AI technology in real-world clinical practice…more details
Sara is a Senior Research Associate in Biomedical Data Science and University Research Lecturer at the University of Oxford, where she is the Machine Learning Lead in the Centre for Statistics in Medicine. She has 12 years of experience in machine learning, signal processing, and intelligent remote monitoring research, with applications in biomedical and planetary health informatics. Sara has served on the NASA Frontier Development Lab Artificial Intelligence Panel and the NASA Climate Challenge Big Think. She is a National Geographic Society Explorer in Tracking Plastic Pollution with Remote Monitoring and Machine Learning. Sara is also a University of Oxford Ambassador for Women in Data Science.
In-person | Talk | Machine Learning | Intermediate
Pandas 2 brings new Arrow data types, faster calculations and better scalability. Dask scales Pandas across cores. Polars is a new competitor to Pandas designed around Arrow with native multicore support. Which should you choose for modern research workflows? We’ll solve a “just about fits in ram” data task using the 3 solutions, talking about the pros and cons so you can make the best choice for your research workflow. You’ll leave with a clear idea of whether Pandas 2, Dask or Polars is the tool to invest in…more details
Ian is a Chief Data Scientist who has helped co-organise the annual PyDataLondon conference, raising $100k+ annually for the open source movement, along with the associated 12,000+ member monthly meetup. Using data science he’s helped clients find $2M in recoverable fraud, created the core IP which opened funding rounds for automated recruitment start-ups and diagnosed how major media companies can better supply recommendations to viewers. He gives conference talks internationally, often as keynote speaker, and is the author of the bestselling O’Reilly book High Performance Python (2nd edition). He has over 25 years of experience as a senior data science leader, trainer and team coach. For fun he’s walked by his high-energy Springer Spaniel, surfs the Cornish coast and drinks fine coffee. Past talks and articles can be found at:
https://notanumber.email/
https://github.com/ianozsvald/
https://fosstodon.org/@ianozsvald
https://www.linkedin.com/in/ianozsvald/
In-person | Talk | Generative AI | Machine Learning | Deep Learning | Beginner
ChatGPT is the fastest-growing user application in history. Still, this application only has access to information it saw during training and sometimes produces false information, called hallucinations. In this talk, we will show you how to bring your data to LLMs and how to evaluate LLMs for your use case using open-source technology…more details
Timo Möller is Co-Founder of deepset and Head of Solution Engineering. He works closely together with deepset’s clients to bring modern NLP into production. He is an open-source fan and a passionate NLP engineer. Currently, he works on retrieval augmented generation, auto-generating training data, and ways to detect hallucinations.
In-person | Talk | Machine Learning for Finance
In this talk, we will look at how deep learning techniques can be used for building fast option pricers. A large set of representative training data is generated by using the numerical pricers. Then deep neural networks are used to learn the non-linear pricing functions…more details
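As a rough illustration of the data-generation step described above, the sketch below builds a table of (parameters, price) pairs that a neural network could later be trained on. It is not the speaker's implementation: the closed-form Black-Scholes call formula stands in for the numerical pricers, and the function names, parameter ranges and sample count are assumptions chosen for illustration.

```python
import math
import random


def bs_call_price(spot, strike, maturity, rate, vol):
    """Black-Scholes price of a European call.

    Assumption: this closed form stands in for a slower numerical pricer.
    """
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t

    def norm_cdf(x):
        # Standard normal CDF expressed through the error function.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def make_training_set(n_samples, seed=0):
    """Sample pricing inputs uniformly and label each with its price.

    The ranges below are hypothetical; a real set-up would cover the
    parameter region the pricer is expected to serve.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        params = (
            rng.uniform(50.0, 150.0),  # spot
            rng.uniform(50.0, 150.0),  # strike
            rng.uniform(0.1, 2.0),     # maturity in years
            rng.uniform(0.0, 0.1),     # risk-free rate
            rng.uniform(0.05, 0.6),    # volatility
        )
        rows.append((params, bs_call_price(*params)))
    return rows


training_set = make_training_set(1000)
```

Each row pairs five inputs with one target price, which is the shape a regression network would consume. The network itself is omitted here.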
Chakri Cherukuri is a senior researcher in the Quantitative Financial Research Group at Bloomberg LP in NYC. His research interests include quantitative portfolio management, algorithmic trading strategies, and applied machine learning. He has extensive experience in scientific computing and software development. Previously, he built analytical tools for the trading desks at Goldman Sachs and Lehman Brothers. He holds an undergraduate degree in mechanical engineering from the Indian Institute of Technology (IIT) Madras, India, and an MS in computational finance from Carnegie Mellon University.
In-person | Talk | Machine Learning Safety and Security | Deep Learning | Intermediate
This lecture will describe progress with developing automated certification techniques for learnt software components to ensure safety and adversarial robustness of their decisions. I will discuss different dimensions of robustness, including to bounded perturbations and causal interventions, as well as the role of uncertainty and explainability…more details
Marta Kwiatkowska is Professor of Computing Systems and Fellow of Trinity College, University of Oxford. She is known for fundamental contributions to the theory and practice of model checking for probabilistic systems, and is currently focusing on safety, robustness and fairness of automated decision making in Artificial Intelligence. She led the development of the PRISM model checker (www.prismmodelchecker.org), which has been adopted in diverse fields, including wireless networks, security, robotics, healthcare and DNA computing, with genuine flaws found and corrected in real-world protocols. Her research has been supported by two ERC Advanced Grants, VERIWARE and FUN2MODEL, EPSRC Programme Grant on Mobile Autonomy and EPSRC Prosperity Partnership FAIR. Kwiatkowska won the Royal Society Milner Award, the BCS Lovelace Medal and the Van Wijngaarden Award, and received an honorary doctorate from KTH Royal Institute of Technology in Stockholm. She is a Fellow of the Royal Society, Fellow of ACM and Member of Academia Europaea.
In-person | Track Keynote | Data Engineering & Big Data | Deep Learning | Machine Learning | All Levels
The biggest challenges for developers of AI applications very often lie in building and delivering software that operational staff can use as a decision-making tool. We will present how these challenges were addressed in two successful projects: a cash flow prediction application (for one of Europe’s largest retailers) and a sales prediction app for a quick-service restaurant…more details
Florian Jacta is a specialist in Taipy, a low-code open-source Python package enabling any Python developer to easily develop a production-ready AI application. He handles package pre-sales and after-sales functions. He is a Data Scientist for Groupe Les Mousquetaires (Intermarche) and ATOS, and has developed several predictive models as part of strategic AI projects. Florian holds a master’s degree in Applied Mathematics from INSA, majoring in Data Science and Mathematical Optimization.
Marine has 5+ years of experience as a Data Scientist. She is skilled in Machine Learning techniques, Python, rule-based models and AI. She has strong experience in predictive and descriptive analytics and fraud detection. She holds an MSc in Big Data Analytics for Business from IÉSEG School of Management, and studied Accounting & Finance at McGill University, Hong Kong University of Science and Technology and Europe Business School.
Virtual | Talk | Deep Learning | Machine Learning | NLP | Beginner-Intermediate
This talk compares a cloud-native data streaming architecture to traditional batch and big data alternatives and explains benefits like the simplified architecture, the ability to reprocess events in the same order for training different models, and the possibility of building a scalable, mission-critical ML architecture for real-time predictions with far fewer headaches and problems…more details
Kai Waehner is Field CTO at Confluent. He works with customers and partners across the globe and with internal teams like engineering and marketing. Kai’s main area of expertise lies within the fields of Data Streaming, Analytics, Hybrid Cloud Architectures and Internet of Things. Kai is a regular speaker at international conferences, writes articles for professional journals, and shares his experiences with industry use cases and new technologies on his blog: www.kai-waehner.de. Contact: kai.waehner@confluent.io / @KaiWaehner / linkedin.com/in/kaiwaehner.
Virtual | Talk | Machine Learning for Finance | All Levels
Principal component analysis (PCA) is a staple statistical and unsupervised machine learning technique in finance. Applying PCA in a financial setting is associated with several difficulties, such as numerical instability and nonstationarity. We attempt to resolve them by proposing two new variants of PCA: an iterated principal component analysis (IPCA) and an exponentially weighted moving principal component analysis (EWMPCA). Both variants rely on the Ogita-Aishima iteration as a crucial step…more details
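To give a feel for the exponential-weighting idea, here is a minimal sketch that computes an exponentially weighted covariance matrix and extracts its leading principal component. It is an assumption-laden illustration, not the speakers' EWMPCA: the weighting scheme and the function names are hypothetical, and plain power iteration is swapped in where the abstract names the Ogita-Aishima iteration.

```python
import math


def ew_covariance(rows, decay):
    """Exponentially weighted covariance of observations (oldest first).

    Assumption: weights decay geometrically with age, so recent rows
    count more. This is one common scheme, not necessarily the talk's.
    """
    n, dim = len(rows), len(rows[0])
    weights = [decay ** (n - 1 - t) for t in range(n)]
    total = sum(weights)
    mean = [sum(w * r[j] for w, r in zip(weights, rows)) / total for j in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for w, r in zip(weights, rows):
        centred = [r[j] - mean[j] for j in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += w * centred[i] * centred[j] / total
    return cov


def leading_component(cov, iterations=200):
    """Leading eigenvector by plain power iteration.

    Note: this is NOT the Ogita-Aishima refinement; it is a simple
    stand-in to keep the sketch self-contained.
    """
    dim = len(cov)
    vec = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec


# Toy series in which the second variable is always twice the first.
observations = [(float(t), 2.0 * t) for t in range(10)]
component = leading_component(ew_covariance(observations, decay=0.9))
```

With this toy series the leading component points along the direction (1, 2), normalised to unit length. Recomputing it as new rows arrive gives a moving estimate that tracks nonstationary data.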
Bio Coming Soon!
Virtual | Talk | Deep Learning | Machine Learning | Intermediate
A central challenge to contemporary AI is to integrate learning and reasoning. The integration of learning and reasoning has been studied for decades already in the fields of statistical relational artificial intelligence and probabilistic programming. Statistical relational AI has focussed on unifying logic and probability, the two key frameworks for reasoning, and has extended these probabilistic logics with machine learning principles…more details
Luc De Raedt is a full professor at the Department of Computer Science, KU Leuven, and director of Leuven.AI, the newly founded KU Leuven Institute for AI. He is a guest professor at Örebro University in the Wallenberg AI, Autonomous Systems and Software Program. He received his PhD in Computer Science from KU Leuven (1991), and was full professor (C4) and Chair of Machine Learning at the Albert-Ludwigs-University Freiburg, Germany (1999-2006). His research interests are in Artificial Intelligence, Machine Learning and Data Mining, as well as their applications. He is well known for his contributions in the areas of learning and reasoning, in particular for his work on probabilistic and inductive programming. He co-chaired important conferences such as ECMLPKDD 2001 and ICML 2005 (the European and International Conferences on Machine Learning), ECAI 2012 and IJCAI-ECAI in 2022 (the European and international AI conferences). He is on the editorial board of Artificial Intelligence, Machine Learning and the Journal of Machine Learning Research. He is an EurAI and AAAI fellow, an IJCAI Trustee and received an ERC Advanced Grant in 2015.
Virtual | Talk | Generative AI | NLP | Deep Learning | Machine Learning
This session will provide an overview of the challenges and opportunities of pretrained language models (PLMs) for text summarisation, using the biomedical domain as an example…more details
Sophia Ananiadou is Professor in Computer Science, Department of Computer Science, the University of Manchester. She is also Director of the National Centre for Text Mining (NaCTeM); Deputy Director of the University’s Institute of Data Science and AI (IDSAI); Distinguished Research Fellow at the AI Research Centre of the National Institute of Advanced Industrial Science and Technology, Japan; Alan Turing Institute Fellow; Honorary Professor, University of the Aegean and Member of the European Laboratory for Learning and Intelligent Systems Society. Her research interests evolved from abstract work on fragments of linguistic theory and logic to exploration of how AI systems could acquire and exploit knowledge of language, particularly in specialised domains (biomedicine, chemistry, exposome, law, public health). Research contributions include neural information extraction, text summarisation and simplification, emotion detection, terminology, development of resources (lexica, terminologies and labelled data), annotation tools and interoperable platforms for NLP workflows. She has developed tools such as the RobotAnalyst to improve evidence-based decisions, cut costs and improve efficiency and robustness of key policy decisions in public health.
In-person | Talk | Generative AI | NLP
Language models are increasingly attracting interest from writers. However, such models lack long-range semantic coherence, limiting their usefulness for longform creative writing. We address this limitation by applying language models hierarchically, in a system we call Dramatron…more details
Dr. Piotr Mirowski is a Staff Research Scientist at DeepMind. His research on artificial intelligence covers the subjects of reinforcement learning, navigation, weather and climate forecasting, as well as a socio-technical systems approach to human-machine collaboration and to computational creativity. He is the author of over 60 papers that have been published in Nature, Genome Biology, Clinical Neurophysiology or at ICLR, AAAI and NeurIPS. Piotr studied computer science in France at ENSEEIHT Toulouse and obtained his PhD in computer science in 2011 at New York University, with a thesis supervised by Prof. Yann LeCun (Outstanding Dissertation Award, 2011). A trained actor himself, Piotr founded and directs Improbotics, a theatre company where human actors and robots improvise live comedy performances and investigate the use of AI for artistic human and machine-based co-creation. https://piotrmirowski.com
In-person | Talk | All Levels
Services like ChatGPT and others powered by Generative AI are fueling innovation and efficiency across industries. However, for enterprises these services do not come without risks, as they raise critical questions regarding data privacy and contextual accuracy. In this presentation, we delve into the deployment of open source LLMs within secure environments. We discuss the advantages of this approach for enterprises, including heightened data privacy, improved accuracy, and greater control over AI implementations in enterprise settings…more details
Jake currently holds the position of Principal Technical Evangelist at Cloudera, where he promotes the strengths of Cloudera’s Lakehouse for delivering trusted AI. His tenure at Cloudera began as a Senior Product Marketing Manager for Cloudera Machine Learning (CML).
Before Cloudera, Jake developed his ML expertise at ExxonMobil, starting as a Data Scientist and later transitioning to a Data Science and Analytics Solution Architect role. He also contributed significantly at FarmersEdge, taking on responsibilities as a Senior Data Scientist and subsequently as a Data Science Manager.
Jake earned both his bachelor’s and master’s degrees from Brigham Young University in Information Systems Management with an emphasis in Statistics.
Outside of work, Jake is passionate about outdoor activities. He enjoys skiing, golfing, rafting, and hiking. However, spending time with his family amidst the mountains remains his most rewarding pastime.
In-person | Talk | Machine Learning | Beginner-Intermediate
In this session, we confront the widely acknowledged limitation in traditional statistical analysis and machine learning: ‘correlation is not causation.’ We start by dissecting this concept, outlining the challenges it presents when trying to derive meaningful insights from data…more details
Bernardo is a Data & AI leader, passionate about powering data transformation in companies and promoting social good in society using data.
He specializes in Data Science, Machine Learning and AI, having won two awards in this field (Innovation in Big Data Award by Thomson Reuters and Machine Learning & Neural Computation Award by Imperial College London). His goal is to take on any challenge, no matter how complex, and solve it using a fusion of art & science, business & technology capabilities, and data & analytics.
Bernardo has an MRes in Advanced Computing from Imperial College London, with a specialization in Machine Learning and a BSc in Electrical and Computer Engineering from Instituto Superior Técnico.
In-person | Talk | Machine Learning | Deep Learning | Intermediate
In this session, we’ll explore and discuss the following:
– What Ray is and why to use it
– How AIR, built atop Ray, allows you to program and scale your machine learning workloads easily
– AIR’s interoperability and easy integration points with other systems for storage and metadata needs
– AIR’s cutting-edge features for accelerating the machine learning lifecycle such as data preprocessing, last-mile data ingestion, tuning and training, and serving at scale…more details
Kai Fricke is a senior software engineer at Anyscale. As a core maintainer of the Ray AI Runtime he is building software for distributed machine learning training and tuning. During his postdoc at Cambridge he utilized reinforcement learning to optimize large graph structures and co-authored two open source reinforcement learning libraries.
In-person | Business Talk | Machine Learning | ML for Finance | MLOps and Data Engineering | Beginner-Intermediate
The successful deployment of machine learning (ML) models into production has traditionally been a complex and resource-intensive process that many organizations struggle with. With the rise of MLOps, a methodology that applies DevOps principles to ML, this process has become much more streamlined. At the Dutch fintech Mollie, we have fully embraced MLOps and implemented a cloud-based ML platform that supports both batch and real-time inference, as well as a suite of MLOps tools to facilitate the entire development cycle…more details
In-person | Talk | Generative AI | NLP | Deep Learning | Machine Learning | Intermediate
Large Language Models (LLMs) such as GPT, LLaMA, etc. are everywhere these days. In this talk, we will see how to leverage LLMs when using the Julia Programming Language. We will discuss how to run inference using these models from Julia, how to fine-tune them, and even how to access third-party hosted models from Julia code. At the end of this session, a Julia developer will have all the tools needed to use LLMs when writing Julia code…more details
Avik Sengupta is the head of product development and software engineering at Julia Computing, contributor to open source Julia and maintainer of several Julia packages. Avik is the author of Julia High Performance, co-founder of two artificial intelligence start-ups in the financial services sector and creator of large complex trading systems for the world’s leading investment banks.
Virtual | Talk | Responsible AI | NLP | Deep Learning | GenAI | Machine Learning | Beginner
In recent years fairness in machine learning (ML) and artificial intelligence (AI) has emerged as a highly active area of research and development. Most define fairness in simple terms, where fairness means reducing gaps in performance or outcomes between demographic groups while preserving as much of the accuracy of the original system as possible. This oversimplification of equality through fairness measures is troubling. Many current fairness measures suffer from both fairness and performance degradation, or “levelling down,” where fairness is achieved by making every group worse off, or by bringing better performing groups down to the level of the worst off…more details
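The “levelling down” effect can be made concrete with a small numeric sketch. The per-group accuracies below are invented purely for illustration, and the function names are hypothetical. The example shows that a gap-based fairness measure alone cannot tell levelling down apart from a real improvement.

```python
def accuracy_gap(accuracy_by_group):
    """Gap between the best- and worst-served groups (smaller reads as 'fairer')."""
    values = accuracy_by_group.values()
    return max(values) - min(values)


def no_group_worse_off(before, after):
    """True when every group does at least as well after the intervention."""
    return all(after[group] >= before[group] for group in before)


# Hypothetical per-group accuracies, made up for this example.
baseline = {"group_a": 0.90, "group_b": 0.70}
levelled_down = {"group_a": 0.70, "group_b": 0.70}  # gap closed by harming group_a
levelled_up = {"group_a": 0.90, "group_b": 0.88}    # gap narrowed by helping group_b

for name, result in [("levelled_down", levelled_down), ("levelled_up", levelled_up)]:
    print(name, round(accuracy_gap(result), 2), no_group_worse_off(baseline, result))
```

The levelled-down system has the smallest gap yet makes one group strictly worse off. Checking only the gap would hide this, so the second check is needed as well.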
Professor Brent Mittelstadt is an Associate Professor, Senior Research Fellow, and Director of Research at the Oxford Internet Institute, University of Oxford. He leads the Governance of Emerging Technologies (GET) research programme which works across ethics, law, and emerging information technologies. He is a prominent data ethicist and philosopher specializing in AI ethics, algorithmic fairness and explainability, and technology law and policy. Prof. Mittelstadt is the author of foundational works addressing the ethics of algorithms, AI, and Big Data; fairness, accountability, and transparency in machine learning; data protection and non-discrimination law; group privacy; ethical auditing of automated systems; and digital epidemiology and public health ethics. His contributions in these areas are widely cited and have been implemented by researchers, policy-makers, and companies internationally, featuring in policy proposals and guidelines from the UK government, Information Commissioner’s Office, and European Commission, as well as products from Google, Amazon, and Microsoft.
Virtual | Talk | NLP | Machine Learning | Deep Learning | Generative AI | All Levels
In this talk, I will first introduce the field of semantics and the task of semantic analysis, a.k.a. semantic parsing, from a multilingual perspective. In particular, we will first discuss the layers of meaning, from morphology to pragmatics, and then define the scope of semantics as a field…more details
Dr. Gözde Gül Şahin is an Assistant Prof. at Koç University and a KUIS AI Fellow since February 2022. Previously, she was a postdoctoral researcher in the Ubiquitous Knowledge Processing (UKP) Lab at the Technical University of Darmstadt, Germany. Her research spans the fields of linguistics and machine learning, in particular semantics, multilingual representations and large language models. She completed her PhD studies at the Istanbul Technical University (İTÜ) Computer Engineering department in 2018. She was a visiting researcher at the Institute for Language, Cognition and Computation (ILCC) of the University of Edinburgh in 2017. Before her Ph.D., she received her Master’s and Bachelor’s degrees from Sabancı University in 2011 and İTÜ in 2009, respectively. She regularly serves as a PC member for *ACL conferences and is a co-organizer for the Workshop on Multilingual Representation Learning (MRL). Her research on NLP has been funded by the Tübitak 2232 and 2236 grant programs, which are awarded to outstanding young principal investigators.
Virtual | Talk | Responsible AI and Social Good | Machine Learning | Deep Learning | Intermediate
The presentation will provide information on value-alignment methods, will give insights on how to address the construction of morality in machines, and will discuss the importance of teaching techno-ethics in education…more details
Carles Sierra is Director of the Artificial Intelligence Research Institute (IIIA) of the Spanish National Research Council (CSIC) located in Barcelona. He is the President of EurAI, the European Association of Artificial Intelligence. He has been contributing to Artificial Intelligence research since 1985 in the areas of Knowledge Representation, Auctions, Electronic Institutions, Autonomous Agents, Multiagent Systems and Agreement Technologies. He is or has been a member of several editorial boards of journals, including AIJ and JAIR, two of the most prestigious generalist journals, and was the editor in chief of the JAAMAS journal, specialized in autonomous agents. He organized IJCAI, the most important international artificial intelligence conference, in 2011 in Barcelona and was the President of the IJCAI Program Committee in 2017 in Melbourne. He is a Fellow of the European Association of AI, EurAI, and recipient of the ACM/SIGAI Autonomous Agents Research Award 2019.
Virtual | Talk | Generative AI | Machine Learning | Deep Learning | Beginner
In this talk, I will describe the latest developments in methodologies that can be used to detect social biases in texts generated by GAI systems. In particular, I will describe methods that can be used to detect social biases expressed not only in English but other languages as well, with minimal human intervention…more details
Danushka Bollegala is a Professor in the Department of Computer Science, University of Liverpool, UK. He obtained his PhD from the University of Tokyo in 2009 and worked as an Assistant Professor before moving to the UK. He has worked on various problems related to Natural Language Processing and Machine Learning. He has received numerous awards for his research excellence, such as the IEEE Young Author Award and best paper awards at GECCO and PRICAI. His research has been supported by various research council and industrial grants such as EU, DSTL, Innovate UK, JSPS, Google and MSRA. He is an Amazon Scholar.
In-person | Talk | Machine Learning | Intermediate
The session will commence with an overview of the bottom-up and top-down modelling approaches, highlighting their respective strengths and limitations in various data science applications. Attendees will learn how bottom-up modelling focuses on individual components and their interactions, such as modelling individual customer demand in a supply chain, while top-down modelling emphasises the high-level relationships between components to provide a broader perspective, like analysing the overall market trends affecting the supply chain…more details
Gustavo is the esteemed Vice President of Research at Vortexa Ltd., where he has focused on applying statistical modelling and Machine Learning to the energy and freight markets. His research interests span computational neuroscience, medical imaging, and the development of innovative solutions for the energy sector.
Prior to his tenure at Vortexa, Gustavo amassed a wealth of experience in both the academic and professional realms. He has published his research in prestigious international journals and presented his findings at scientific conferences across the globe. Gustavo’s dedication to finding optimal solutions for complex business problems is evident in his work.
Gustavo holds an SB and MEng in Computer Science and Electrical Engineering from the Massachusetts Institute of Technology (MIT) and a PhD from the University of Tokyo. As an expert in his field, Gustavo brings a depth of knowledge and experience to ODSC, where attendees can expect to learn from his invaluable insights.
In-person | Talk | ML for Finance | Machine Learning | Deep Learning | Intermediate
This talk will focus on forecasting inflation with ML and ‘alternative data’. It will show the steps of building such models, the improvements over the traditional econometric models, and will describe the many hurdles a practical implementation of such an approach entails…more details
Alexander is a Quant & Data Scientist with 20 years of accumulated experience in both specialist and leadership positions in global financial institutions. Alongside mastery of the main AI/ML techniques, he specializes in and has personally contributed to the fields of Probabilistic Graphical Models, Causal AI and Alternative Data. Alexander has authored/co-authored 10+ papers and 3 books on these topics. He holds a degree in Mathematical Finance from the University of Oxford, where he is a Visiting Lecturer on Bayesian Risk Management and Alternative Data. Currently he is CEO of Turnleaf Analytics.
In-person | Talk | NLP | Machine Learning | Intermediate
In this talk, I will first give an overview of the built-in functionality available in spaCy, using pretrained models. I will showcase how linguistic information such as part-of-speech tags and dependency parses can help you identify interesting patterns or phrases in your documents and ultimately perform document classification or other information retrieval tasks…more details
Sofie is a machine learning and NLP engineer who firmly believes in the power of data to transform decision making in industry. She has a Master in Computer Science (software engineering) and a PhD in Sciences (Bioinformatics), and more than 16 years of experience in Natural Language Processing and Machine Learning, including in the pharmaceutical industry and the food industry. In 2019, she joined Explosion to work on the open-source NLP library spaCy. She is currently leading the open-source team developing and maintaining spaCy, as well as various other open-source developer tools for data scientists.
In-person | Talk | Machine Learning | Deep Learning | NLP | Beginner-Intermediate
During the talk, we’ll show how Ludwig’s novel compositional model architecture, referred to as encoder-combiner-decoder, makes it possible to easily mix multiple modalities of data such as text, images and audio with structured data in a way that is consistently easy across tasks like regression, classification, and even generation…more details
Dev is co-founder and Chief Product Officer for Predibase, a company looking to redefine how data scientists and engineers build models with a declarative approach. Prior to Predibase, he was an ML PM at Google working across products like Firebase, Google Research and the Google Assistant as well as Vertex AI. While there, Dev was also the first product manager for Kaggle – a data science and machine learning community with over 8 million users worldwide. Dev’s academic background is in computer science and statistics, and he holds a master’s in computer science from Harvard University focused on machine learning.
In-person | Talk | Data Engineering | MLOps | Intermediate
In this talk, Leanne will take us through how the FT, already with a large number of models in production, are spearheading a journey to improve, iterate and upgrade the way they develop, deploy and monitor their ML and Data Science capabilities, all whilst keeping their current capabilities running. Leanne will highlight the key approaches and considerations when looking to improve your MLOps processes, and how you can expedite your ML in production activities, while ensuring you keep “the car on the road”…more details
Leanne is Director of Data Science at the Financial Times and is a passionate, experienced data leader having built and developed empowered data science and analytics teams for a variety of businesses; from startups to large organisations. Leanne is in her element when developing and implementing strategic, technical and cultural solutions to getting data & analytical capabilities into the operational ecosystem. She is an active part of the data and technology community, sharing innovation and insights to encourage best practice, from Manchester, UK to Austin, TX and is an Advisory Panel Board Member. Outside of all things data you can ask Leanne about her golf swing (it’s not good – yet), her passion for American Football (specifically the Cincinnati Bengals), her latest sewing project, and her love for good music, food and whisky.
In-person | Talk | MLOps & Data Engineering | Responsible AI | Beginner
From this talk you will learn:
– What ML Governance is meant to achieve
– How to get started with a template process
– The role of documentation (and especially Google Model Cards)
– Which roles have what responsibilities
– The relevance of a governance board
Ryan Dawson is a technologist passionate about data. Ryan works with clients on large-scale data and AI initiatives, helping organizations get more value from data. His work includes strategies to productionize machine learning, organizing the way data is captured and shared, selecting the right data technologies and optimal team structures, as well as writing the code to make it happen. He has over 15 years of experience and, as well as writing many widely read articles about MLOps, software design, and delivery, is the author of the Thoughtworks Guide to Evaluating MLOps Platforms.
Meissane Chami serves ThoughtWorks, Inc. as a Senior ML Engineer, advising and developing innovative data science and machine learning solutions from proof of concept to production. She has gained expertise setting up innovation frameworks and conducting fast-cycle proofs of concept. Her primary areas of expertise are in Natural Language Processing, MLOps, DevOps, cloud computing, containerisation and Python. She holds an MSc degree in Machine Learning and Data Science from University College London School of Engineering.
Virtual | Talk | Generative AI | Deep Learning | NLP | All Levels
Deep learning, especially large language models, has recently gained a lot of traction in the research community. This talk builds some background in deep learning towards explaining the concepts of large language models. Afterward, it lists popular large language models and briefly compares them in terms of techniques and accuracy results…more details
Hossam Amer joined Microsoft as a scientist in 2021. His research interests are Image/Video Compression, Computer Vision, and most recently Natural Language Processing. Hossam is contributing to many products including Microsoft Translator and Microsoft SwiftKey. Prior to joining Microsoft, Hossam was a Postdoctoral Fellow at the Multimedia Communications Lab at the University of Waterloo (UW), where he mentored several MSc and PhD students. He obtained his PhD from the same lab, where he received the prestigious annual UW teaching award based on students’ and instructors’ nominations and published papers in top venues. Hossam also acts as a reviewer in several IEEE conferences and journals and supervises students in research and teaching. In addition, Hossam was the Chair of the ECE Graduate Student Association at UW. Hossam is a strong believer in constantly transferring his knowledge in order to make a difference.
Virtual | Talk | Machine Learning | Deep Learning | Intermediate
A common problem in the cybersecurity industry is how to detect and track botnets when there are billions of daily attacks. Botnets are networks of internet-connected devices that perform repetitive tasks, such as Distributed Denial of Service (DDoS) attacks. In many cases, these consumer devices are infected with malware that is controlled by an external entity, often without the owner’s knowledge…more details
Ori Nakar is a principal cyber-security researcher, a data engineer, and a data scientist at Imperva Threat Research group. Ori has many years of experience as a software engineer and engineering manager, focused on cloud technologies and big data infrastructure. Ori also has an AWS Data Analytics certification. In the Threat Research group, Ori is responsible for the data infrastructure and involved in analytics projects, machine learning, and innovation projects.
In-person | Talk | Generative AI | Beginner-Intermediate
Text communication between customer service agents and clients is increasing rapidly, and banking is no exception. In this talk we will explain how we experimented with generative NLP models to assist financial advisors in their daily interactions with clients. For this work we used a seq2seq deep learning architecture based on two LSTMs acting as encoder and decoder…more details
Clara is a senior data scientist at BBVA AI Factory. She has worked in the data science field for many years, applying NLP techniques to different sectors such as media and banking. At the BBC in London she built recommender systems for BBC News and developed several tools to help editors understand audience feedback. In the banking sector at BBVA she has worked on building data products to help financial advisors better manage customer queries. She currently leads the collections data science team at BBVA AI Factory. Prior to her industry experience she completed her PhD in artificial intelligence and bioinformatics, and she holds a degree in computer science. Clara advocates for a responsible use of technology and is actively involved in activities which encourage women and girls to pursue a career in technology and science to help bridge the gender gap in these disciplines.
María is a Senior Data Scientist and Data Product Owner at BBVA AI Factory with ten years of experience in the Data Science field. She was one of the first Data Scientists at BBVA, taking part in setting up the bank’s Big Data ecosystems. A graduate in Mathematics and Computer Engineering, she holds an MSc in Computational Intelligence from Universidad Autónoma de Madrid (UAM), specializing in Aspect-based Sentiment Analysis and Item Recommendation.
She has worked in several analytical domains, ranging from Retail and Urban Analysis to Customer Intelligence. Now, she is trying to enhance the customers’ relationship with the bank through Natural Language Processing and Text Analytics. María focuses on understanding business challenges and developing the best analytical solution for each problem.
In-person | Business Talk | Deep Learning | Machine Learning | Beginner-Intermediate
Sam will lift the lid on the deep learning models used by Ocado Technology and how these have been adapted for the challenges faced in online grocery, showcasing the positive results achieved by the retailers who have adopted these forecasting solutions, including a 50% improvement in accuracy, drastically reduced waste, automation of replenishment decisions and big financial savings. Join this session to get a glimpse into a real-life example of deep learning in production and how it is having an impressive impact in the ecommerce space…more details
Sam leads the supply chain machine learning team at Ocado Technology, responsible for the demand forecasting and replenishment optimisation algorithms used by Ocado’s international partners. Sam holds a DPhil in Condensed Matter Physics from the University of Oxford and volunteers as an ambassador for DataKindUK. Prior to joining Ocado he spent a number of years working in AI startups in the Netherlands.
In-person | Talk | MLOps | All Levels
If your models are doing great in experimentation but you are still trying to put all the production pieces together, this session might help you understand what’s going wrong and how to fix it. By working according to this methodology, data scientists can iterate rapidly, which is at the core of a successful ML project…more details
Yuval Fernbach is the Co-founder & CTO of Qwak, where he is focused on building next-generation ML Infrastructure for ML teams of various sizes. Before Qwak, Yuval was an ML Specialist at AWS, where he helped AWS customers across EMEA with their ML challenges. Prior to that, he was the CTO of the IT department of the IDF (“Mamram”).
Virtual | Talk | NLP | LLM | Intermediate-Advanced
In this talk, I address the challenge of learning from limited data for a range of natural language understanding tasks and applications. I will present our work on few-shot learning approaches to NLP in both monolingual and cross-lingual settings and present findings in tasks such as word sense disambiguation, syntactic parsing and text classification. Finally, I will present recent research on approaches that can enable higher levels of data efficiency, and show how they can outperform much more computationally complex counterparts…more details
Helen Yannakoudakis is an Assistant Professor in Natural Language Processing (NLP) at the Department of Informatics, King’s College London, and a Visiting Researcher at the Department of Computer Science & Technology, University of Cambridge. She is also a co-founder and Chief Scientific Officer at Kinhub (formerly Kami), translating research outcomes to deployable real-world applications in health and wellbeing. Her research focuses on machine learning for NLP, and specifically on transfer learning, few-shot learning, lifelong learning, multilingual NLP, and societal and health applications, such as language assessment, abusive language detection, misinformation, emotion and mental health detection. Helen is a Fellow of the Higher Education Academy, has received funding awards from both industry and academia, has won international competitions such as the NeurIPS 2020 Hateful Memes Challenge, and currently serves as an Area Chair for NeurIPS 2023.
Virtual | Talk | MLOps & Data Engineering | Intermediate
The benefits of Real-Time Machine Learning are becoming increasingly apparent. Digital native companies have long proven that use cases like fraud detection, recommendation systems, and dynamic pricing all benefit from lower latencies. In a recent KDD paper*, Booking.com found that even a 30% increase in model serving latency caused a 0.5% decrease in user conversion, a significant cost to their business…more details
Dillon Bostwick is a Solutions Architect at Databricks, where he’s spent the last five years advising customers ranging from startups to Fortune 500 enterprises. He currently helps lead a team of field ambassadors for streaming products and is interested in improving industry awareness of effective streaming patterns for data integration and production machine learning. He previously worked as a product engineer in infrastructure automation.
Avinash Sooriyarachchi is a Senior Solutions Architect at Databricks. His current work involves working with large Retail and Consumer Packaged Goods organizations across the United States and enabling them to build Machine Learning based systems. His specific interests include streaming machine learning systems and building applications leveraging foundation models. Avi holds a Master’s degree in Mechanical Engineering and Applied Mechanics from the University of Pennsylvania.
Virtual | Talk | Responsible AI | Machine Learning | All Levels
AI-powered coding assistants, such as GitHub Copilot, are spreading rapidly in the software engineering community. Copilot was developed by Microsoft and OpenAI on top of Codex, a transformer-based Large Language Model, and surpassed 400,000 subscribers in the first month. It was praised by influential engineers, including Guido van Rossum, the inventor of the Python language…more details
Emanuele is an engineer by education, a data scientist by choice, and a researcher and lecturer by passion. During his PhD in ML, he was invited to EPFL Lausanne for a 6-month visit and published 9 papers in top journals. He is the co-founder of xtream, an AI boutique applying academic research to business. Contributing to the community is part of its mission: he has been a speaker and track organizer at eRum, AMLD, and PyCon, and he has lectured at Italian, Swiss, and Polish universities.
Virtual | Talk | Deep Learning | NLP | Machine Learning | All Levels
This talk will demonstrate the power of compound sparsity for model compression and inference speedup for NLP and CV domains, with a special focus on the recently popular Large Language Models…more details
Damian is an engineer, roboticist, software developer, and problem solver. He has previous experience in autonomous driving (Argo AI), AI in industrial robotics (Arrival), and building machines that build machines (Tesla). He currently works at Neural Magic, focusing on the sparse future of AI computation. He works towards unlocking creative and economic potential with intelligent robotics while avoiding the uprising of sentient machines.
Konstantin Gulin is a Machine Learning Engineer at Neural Magic working on bringing sparse computation to the forefront of industry. With prior experience in applying machine learning to remote sensing (NASA) and space mission simulation (The Aerospace Corporation), he’s turned his focus to enabling effective model deployment in even the most constrained environments. He’s passionate about technology and ethical engineering and strives for the thoughtful advancement of AI.
In-person | Business Talk | AI for Transportation | All Levels
In this talk we will be focusing on the third point, showing how a digital strategy can be driven by information more than by data, while still relying on advanced algorithms to solve very large scale problems. We will draw from extensive expertise working with companies in the logistics and distribution industry, optimizing distribution networks and their operation: hub-and-spoke configuration, intermodal operation, truck scheduling, and driver fleet optimization. In all these cases, we will discuss how capturing and digitizing the right information in the form of constraints has been critical to producing realistic recommendations accepted by the operation teams on the ground. As a side yet non-negligible benefit, digitized information is knowledge that stays within the company instead of leaving when the expert employee changes jobs or retires. This results in more resilient companies, robust operations ready for scale, a proactive rather than reactive mindset, and a positive environmental impact in the form of supply chain decarbonization…more details
Tomasz M. Grzegorczyk is the founder and CEO of Teranalytics, an AI and optimization company specializing in large-scale logistics operations such as production, manufacturing, shipping, and distribution. Before creating Teranalytics, he was a Chief Scientist at BAE Systems and a Research Scientist at MIT, where he worked on computational electromagnetics, scattering in complex media, optical forces, and wave propagation in metamaterials. Tomasz holds a PhD from the Swiss Federal Institute of Technology in Lausanne and an MBA from the Massachusetts Institute of Technology, and is a senior member of the IEEE. He has served as editor and board member of two international peer-reviewed journals and one international conference, and has authored more than a hundred publications and a book on metamaterials.
In-person | Talk | Machine Learning | Intermediate-Advanced
This session is designed for data practitioners who wish to maintain control and confidence over their projects even after deployment in production. We will explore two methods from the O’Reilly book “Fundamentals of Data Observability” that can be easily adopted to ensure the reliability of data pipelines throughout the whole process, from ingestion to analytics…more details
Andy Petrella is the CPO and founder of Kensu, a data observability solution that helps data teams trust what they deliver and create more value from data.
Andy is an entrepreneur with a background in data mining, data engineering, and data science. He is known as an early evangelist of Apache Spark and the Spark Notebook creator in the data community.
Since 2015, Andy has been an O’Reilly instructor and author. His work includes the first O’Reilly book about Data Observability, “Fundamentals of Data Observability”.
In-person | Talk | Deep Learning | MLOps | Intermediate
Deploying advanced Machine Learning technology to serve customers and/or business needs requires a rigorous approach and production-ready systems. This is especially true for maintaining and improving model performance over the lifetime of a production application. Unfortunately, the issues involved and approaches available are often poorly understood…more details
A data scientist and ML enthusiast, Robert has a passion for helping developers quickly learn what they need to be productive. Robert is currently the Senior Product Manager for TensorFlow Open-Source and MLOps at Google and helps ML teams meet the challenges of creating products and services with ML. Previously Robert led software engineering teams for both large and small companies, always focusing on moving fast to implement clean, elegant solutions to well-defined needs. You can find him on LinkedIn at robert-crowe.
Virtual | Talk | Machine Learning | All Levels
The talk is intended for graduate students, professionals, and MBA students seeking an introduction to forecasting methods without diving too deep into theoretical details. Participants will develop skills, mindsets, and behaviors sought after in the industry today…more details
Tanvir Ahmed Shaikh is a highly entrepreneurial and visionary data strategist with a passion for driving business growth through innovative data-driven solutions. With a track record of success in data science and digital transformation, Tanvir has been instrumental in developing and implementing strategies that improve efficiency, quality, and compliance. He possesses strong collaboration skills and effectively communicates technical concepts to non-technical stakeholders.
Currently serving as a Data Strategist (Director) at Genentech Inc., Tanvir leads the digital roadmap for the Global Pharma Manufacturing Quality organization. His expertise in prioritizing digital initiatives, building consensus, and driving change management has resulted in significant positive impacts on the organization.
Tanvir’s leadership abilities are exemplified through his role as the Founder and Digital Strategy Lead of the Roche Intrapreneur Network, a global network of over 350 Roche technologists focused on executive capabilities and experiential learning. Through this network, he fosters a culture of entrepreneurship, product management, and storytelling, encouraging innovation and empowering individuals to think like CEOs of their products.
In his previous role as a Principal Data Scientist, Tanvir spearheaded cross-functional projects, driving operational excellence in forecasting, automation, and AI education. His contributions have led to substantial cost savings and increased efficiency within the organization. Tanvir’s passion for education and continuous learning is evident in his role as an Adjunct Professor at Carnegie Mellon University. He teaches courses on Time Series Forecasting in Python, AI Product Management, and Storytelling with Data, inspiring students to think holistically and take an end-to-end view of problem-solving. He actively promotes a culture of continuous learning, inclusive community building, and inspirational storytelling.
Beyond his professional pursuits, Tanvir embraces a diverse range of interests. He finds joy in the culinary arts, experimenting with new recipes and creating culinary delights. Music also holds a special place in his heart, and he enjoys singing and playing the ukulele in his free time. Tanvir’s curiosity extends to the financial world, where he actively researches stocks and shares his knowledge, promoting personal finance education. Additionally, he stays active through the sport of tennis, both in competitive settings and for leisure.
Tanvir’s dedication to data-driven strategies, love for storytelling, and commitment to personal growth and education make him a versatile and accomplished professional. He embodies the values of continuous learning, community building, and innovative thinking, making a significant impact in the field of data science and beyond.
Virtual | Talk | Responsible AI | Machine Learning | All Levels
Statistical reasoning shapes our collective sense of what is true, what is best, and what should happen next. Even before we mechanized statistical prediction through machine learning, it was a habitual convention that was used as a marker of quality, rigorous science and democratic fairness…more details
Jutta Treviranus is the Director of the Inclusive Design Research Centre (IDRC) and a professor in the Faculty of Design at OCAD University in Toronto (http://idrc.ocadu.ca). Jutta established the IDRC in 1993 as the nexus of a growing global community that proactively works to ensure that our digitally transformed and globally connected society is designed inclusively. Dr. Treviranus also founded an innovative graduate program in inclusive design at OCAD University. Jutta is credited with developing an inclusive design methodology that has been adopted by large enterprise companies such as Microsoft, as well as public sector organizations internationally. In 2022 Jutta was recognized for her work in AI by Women in AI with the AI for Good – DEI AI Leader of the Year award.
Virtual | Talk | Machine Learning for Finance | Intermediate
The objective of this session is to make attendees familiar with the reasons why probabilistic machine learning is the next generation of AI in finance and investing…more details
Deepak Kanungo is the founder and CEO of Hedged Capital LLC, an AI-powered, proprietary trading and analytics firm built around probabilistic machine learning technologies. In 2005, long before machine learning was an industry buzzword, Deepak invented a probabilistic machine learning method and software system for managing the risks and returns of project portfolios. It is a unique framework that has been cited by IBM and Accenture, among others. Previously, Deepak was a financial advisor at Morgan Stanley, a Silicon Valley fintech entrepreneur, and a director in the Global Planning Department at Mastercard International. He was educated at Princeton University (astrophysics) and the London School of Economics (finance and information systems).
In-person | Talk | Machine Learning | MLOps and Data Engineering | Beginner
Machine learning has become an integral part of modern business operations, but the success of these projects depends on the quality of the underlying software. Unfortunately, many machine-learning prototypes fail to reach production systems because data science teams incur accidental and intentional technical debt faster than they get to their solution…more details
Yetunde Dada is the Director of Product Management at QuantumBlack, an AI-focused branch of McKinsey. She is instrumental in building products for Data Engineers and Data Scientists, including a notable Python library known as Kedro. Kedro is a distinguished product, marking the first open-source offering from McKinsey and QuantumBlack.
She holds an MBA from the Said Business School at the University of Oxford, earned in the 2017/2018 academic year. Her professional background is diverse and includes roles such as Data Engineer and Data Product Manager at Absa (formerly known as Barclays Africa Group Limited), Innovation Consultant at Engineers Without Borders South Africa, and a Mechanical Engineer.
In-person | Business Talk | Machine Learning for Finance | Machine Learning | All Levels
The purpose of this talk is to explain which business skills are most needed by analytics professionals, illustrate why each is so critical, and help analytics leaders to foster these skills within their teams. We will progress through the skills roughly in the order they are needed—from skills for the first year out of university up through skills needed to run an entire analytics program. In this talk, Dr. Stephenson will draw on best practices, case studies, research, and personal anecdotes from his twenty years of hands-on analytic leadership of teams of analytics professionals spanning six continents, as well as several years helping design and teach executive programs as an adjunct at the Amsterdam Business School…more details
David Stephenson has over 20 years of experience leading analytics initiatives, including as Head of Global Business Analytics at eBay Classifieds Group. Since founding DSI Analytics in 2014, he has worked directly with dozens of companies across a wide range of industries (Adidas, Miro, Janssen Pharmaceuticals, ABN Amro, Sky Broadcasting, etc.). Dr. Stephenson also serves as part-time faculty at the University of Amsterdam Business School, has published two books, and has developed and delivered data science trainings for hundreds of analytics professionals around the globe.
In-person | Talk | Machine Learning | Machine Learning Safety and Security | Data Engineering & Big Data | Responsible AI | Intermediate
If you’ve ever asked one of the questions above, then this talk is for you! You’ll learn how the ability to interpret a model can identify poor model performance or, worse, bias that could ultimately impact the fairness of your machine learning applications. You’ll learn about some of the most common algorithms, how they work and how you can get started using them yourself…more details
Ed Shee is Head of Developer Relations at Seldon. Having previously led a tech team at IBM, Ed comes from a cloud computing background and is a strong believer in making deployments as easy as possible for developers. With an education in computational modelling and an enthusiasm for machine learning, Ed has blended his work in ML and cloud native computing together to cement himself firmly in the emerging field of MLOps.
In-person | Tutorial | Data Engineering & Big Data | Deep Learning | Machine Learning | All Levels
This workshop presents Taipy, a new low-code Python package that allows you to create complete Data Science applications, including graphical visualization and managing algorithms, pipelines, and scenarios…more details
Florian Jacta is a specialist in Taipy, a low-code open-source Python package enabling any Python developer to easily develop a production-ready AI application. He handles pre-sales and after-sales functions for the package. He is a data scientist for Groupe Les Mousquetaires (Intermarche) and ATOS, and has developed several predictive models as part of strategic AI projects. Florian holds a master’s degree in Applied Mathematics from INSA, with a major in Data Science and Mathematical Optimization.
Alexandre worked in Amazon Business Intelligence. He developed a graph-based interactive Python editor, Pyflow (1.2k stars!). He is skilled in MLOps, Data Engineering, and Python. He studied for a Master of Engineering at CentraleSupélec, University of Paris-Saclay.
In-person | Half-Day Training | Machine Learning for Finance | Intermediate
This half-day training session covers the most important Python topics and skills needed to apply AI and Machine Learning (ML) to Algorithmic Trading. The session shows how to use the Oanda trading API (via a demo account) to retrieve data, stream data, place orders, etc. Building on this, an ML-based trading strategy is formulated and backtested. Finally, the trading strategy is transformed into an online trading algorithm and deployed for real-time trading on the Oanda trading platform…more details
Dr. Yves J. Hilpisch is founder and CEO of The Python Quants (http://tpq.io), a group focusing on the use of open source technologies for financial data science, artificial intelligence, algorithmic trading, and computational finance. He is also founder and CEO of The AI Machine (http://aimachine.io), a company focused on AI-powered algorithmic trading based on a proprietary strategy execution platform.
Yves has a Diploma in Business Administration, a Ph.D. in Mathematical Finance and is Adjunct Professor for Computational Finance at Miami Herbert Business School.
In-person | Tutorial | Generative AI | Machine Learning | Deep Learning | NLP | Intermediate-Advanced
In the first part of the talk I will provide an overview of the latest generative AI models and how they work. This will include discussing the various types of generative AI models, such as diffusion models for image generation and transformer (GPT-like) models for text generation and their underlying architectures and key concepts…more details
Heiko Hotz is a Senior Solutions Architect for AI & Machine Learning at AWS with a special focus on Natural Language Processing (NLP), Large Language Models (LLMs), and Generative AI. He is also the founder of the NLP London Meetup group, bringing together NLP enthusiasts and industry experts.
In-person | Workshop | Machine Learning | Deep Learning | Intermediate
In this workshop we will illustrate both approaches using a single consistent example. We will use TensorFlow in Colab notebooks, so all you need is a recent version of Chrome and a Google login. You will not need prior knowledge of TensorFlow, but a good understanding of how training neural networks works is a prerequisite…more details
Oliver Zeigermann has been developing software with different approaches and programming languages for more than 3 decades. In the past decade, he has been focusing on Machine Learning and its interactions with humans.
In-person | Full-Day Training | NLP | Beginner
In this course we will go through Natural Language Processing fundamentals, such as pre-processing techniques, tf-idf, embeddings, and more. This will be followed by practical coding examples in Python that show how to apply the theory to real use cases…more details
Leonardo De Marchi holds a Master’s in Artificial Intelligence and has worked as a Data Scientist in the sports world, with clients such as the New York Knicks. He now works at Thomson Reuters as VP of Labs, and also provides consultancy and training for small and large companies. His previous experience includes being Head of Data Science and Analytics at Bumble, the largest dating site with over 500 million users, heading the team through acquisition and an IPO.
Laura Skylaki is a Manager of Applied Research in Thomson Reuters Labs, where she leads advanced machine learning projects in the domain of Legal and Tax AI. With a career spanning more than a decade at the intersection of research and practical application, she has contributed technical expertise in diverse fields such as bioinformatics and stem cell biology, image processing and natural language processing. She holds a doctorate in stem cell bioinformatics from the University of Edinburgh, UK, and has been publishing on machine learning applications in leading academic journals since 2012.
In-person | Tutorial | Deep Learning | Machine Learning | NLP | Beginner
In this tutorial, we will illustrate the evolution of deep learning architectures and how KNIME Analytics Platform is naturally designed to keep up with these transformations. We will start off by introducing simple ANNs for a classification task. While easy to grasp, ANNs are not well suited to working with sequential data (e.g., texts and time series) or visual data (e.g., images and videos). Other, more complex architectures have proved superior. We will zoom in on RNNs with LSTM units for text generation and time series forecasting; CNNs for image classification and styling; and GANs for synthetic image generation…more details
Roberto Cadili is a data scientist on the Evangelism team at KNIME. During his BSc in Economics, he developed a genuine interest in statistics and data analysis. At the University of Konstanz, he pursued an MSc in Social and Economic Data Science, where he studied different machine learning algorithms and deep learning architectures with an emphasis on NLP and Computer Vision. As editor for Low Code for Data Science, he is helping the KNIME community shape successful data science stories, tutorials, and best practices that are worth sharing.
Emilio Silvestri is a Junior Data Scientist on the Evangelism Team at KNIME. He has a Master’s Degree in Computer Science from the University of Konstanz, with a special focus on Data Science and Artificial Intelligence. He is a certified KNIME Trainer and works for the KNIME Education Team to onboard and upskill people in their data science journey with courses and webinars.
In-person | Workshop | Intermediate
Real-Time Analytics is one of the new trends in the streaming space, but it can be hard to keep track of everything, especially as it seems like new products are being released every week. We’ll start off this session with a presentation that will give you a map to understand the space. This map will hopefully make it easier to understand where current and new tools fit into the space…more details
Mark Needham is an Apache Pinot advocate and developer relations engineer at StarTree. As a developer relations engineer, Mark helps users learn how to use Apache Pinot to build their real-time user-facing analytics applications. He also does developer experience, simplifying the getting started experience by making product tweaks and improvements to the documentation. Mark writes about his experiences working with Pinot at markhneedham.com. He tweets at @markhneedham.
In-person | Tutorial | NLP | Machine Learning & Deep Learning | Intermediate-Advanced
While deep learning has driven impressive progress, one of the toughest remaining challenges is generalization beyond the training distribution. Few-shot learning is an area of research that aims to address this, by striving to build models that can learn new concepts rapidly in a more “human-like” way. While many influential few-shot learning methods were based on meta-learning, recent progress has been made by simpler transfer learning algorithms, and it has been suggested in fact that few-shot learning might be an emergent property of large-scale models. In this talk, I will give an overview of the evolution of few-shot learning methods and benchmarks from my point of view, and discuss the evolving role of meta-learning for this problem. I will discuss lessons learned from using larger and more diverse benchmarks for evaluation and trade-offs between different approaches, closing with a discussion about open questions…more details
Eleni is a Research Scientist at Google DeepMind, based in London UK. She obtained her PhD from the University of Toronto, advised by Professors Richard Zemel and Raquel Urtasun. Her research is centered around creating methods that allow efficient and effective adaptation of deep neural networks to cope with distribution shifts, introduction of new concepts, or removal of outdated or harmful knowledge, falling in the areas of few-shot learning, meta-learning, domain adaptation and machine unlearning.
Virtual | Workshop | All | Beginner-Intermediate
The goal of this session is to get you familiarized with diffusion models, their inner workings, and different approaches to data generation. We’ll use Google Colab to build and train a simple diffusion model. You should be comfortable using Jupyter Notebooks, and training simple models in PyTorch…more details
Daniel has been teaching machine learning and distributed computing technologies at Data Science Retreat, the longest-running Berlin-based bootcamp, for more than three years, helping more than 150 students advance their careers.
He writes regularly for Towards Data Science. His blog post “Understanding PyTorch with an example: a step-by-step tutorial” has reached more than 220,000 views since it was published.
The positive feedback from the readers motivated him to write the book Deep Learning with PyTorch Step-by-Step, which covers a broader range of topics.
Daniel is also the main contributor to two Python packages: HandySpark and DeepReplay.
His professional background includes 20 years of experience working for companies in several industries: banking, government, fintech, retail and mobility.
Virtual | Workshop | Machine Learning | Intermediate-Advanced
In this tutorial, we will dive into a particular space science and engineering domain: the calibration of space instruments. We will take a dedicated look at calibration data from the Cosmic Dust Analyzer (CDA), which was part of NASA’s Cassini mission in the Saturnian system. Together, we will see how the data were generated, explore their features and limits, and determine how deep learning can help us create new state-of-the-art calibration solutions for space missions…more details
Thomas is a Senior Machine Learning Engineer who has worked in the automotive industry since 2019. Before joining the Research & Development department of a large manufacturer, he conducted research in space science. In parallel with his studies in astrophysics and geophysics and his later PhD program, he participated in two major missions: ESA’s comet mission Rosetta/Philae and NASA’s and ESA’s Saturn spacecraft Cassini/Huygens, always with a special focus on cosmic dust. Additionally, he applies machine learning algorithms to analyse astronomy- and space-related data to derive new scientific insights or to create new methods for calibrating instruments. Besides his industry work, Thomas is a guest scientist at the Free University of Berlin, where he continues working on the Cassini-related datasets using deep learning. On his active YouTube channel Astroniz, he shares his Python, space science, and machine learning knowledge with a small community.
Virtual | Workshop | NLP | Machine Learning | Beginner-Intermediate
In this workshop, you’ll walk through a complete end-to-end example of using Hugging Face Transformers, involving both our open-source libraries and some of our commercial products. Starting from a dataset containing real-life product reviews from Amazon.com, you’ll train and deploy a text classification model predicting the star rating for similar reviews…more details
Julien is currently Chief Evangelist at Hugging Face. He recently spent 6 years at Amazon Web Services, where he was the Global Technical Evangelist for AI & Machine Learning. Prior to joining AWS, Julien served for 10 years as CTO/VP Engineering in large-scale startups.
In-person | Tutorial | Machine Learning | Data Engineering & Big Data | All Levels
The objective of this workshop is to use the Yelp Dataset to create business recommendations for users by exploiting the network of reviews, users, friends, tips, and businesses. We will start from the downloaded JSON files of the Yelp dataset, from which we will create CSV files for import into a Neo4j database…more details
Valerio Piccioni is an AI Engineer at LARUS who primarily focuses on Graph Neural Networks, but also likes to explore other deep learning fields such as NLP and Computer Vision. He is also interested in MLOps, since getting machine learning models into production is harder than it seems. He is currently working on a fraud detection project using graphs.
In-person | Tutorial | Generative AI | NLP | Deep Learning | Machine Learning | All Levels
During this talk you will learn more about Transformer-based model