Boston | April 13th – April 17th, 2020
Natural Language Processing Track
Learn the latest models, advancements, and trends from the top practitioners and researchers behind NLP
NLP has seen rapid advances in recent years. With some of the best and brightest minds in data science presenting, you can get the latest insights, natural language processing training, trends, and discoveries in data science languages, tools, topics, and beyond.
Connect with some of the most innovative people and ideas in the world of data science, while learning first-hand from core practitioners and contributors. Learn about the latest advancements and trends in NLP, including pre-trained models, with use cases focusing on deep learning, speech-to-text, and semantic search.
Some of our Current NLP Speakers

Thomas Wolf, PhD
Thomas leads the Science Team at Huggingface Inc., a Brooklyn-based startup working on Natural Language Generation and Natural Language Understanding.
After graduating from Ecole Polytechnique (Paris, France), he worked on laser-plasma interactions at the BELLA Center of the Lawrence Berkeley National Laboratory (Berkeley, CA). He was accepted for a PhD at MIT (Cambridge, MA) but ended up doing his PhD in Statistical/Quantum physics at Sorbonne University and ESPCI (Paris, France), working on superconducting materials for the French DARPA (DGA) and Thales.
Thomas is interested in Natural Language Processing, Deep Learning, and Computational Linguistics. Much of his research is about Natural Language Generation (mostly) and Natural Language Understanding (as a tool for better generation).
An Introduction to Transfer Learning in NLP and HuggingFace Tools (Workshop)

Kimberly Fessel, PhD
Kimberly Fessel is a Senior Data Scientist at Metis, the industry’s only accredited, full-time, immersive data science bootcamp. Prior to joining Metis as an instructor, Kimberly worked in digital advertising at MRM//McCann where she focused on helping clients understand their customers by leveraging unstructured data with modern NLP techniques. She holds a Ph.D. in applied mathematics from Rensselaer Polytechnic Institute and completed an NSF-funded postdoctoral fellowship in math biology at the Ohio State University. She is passionate about data visualization and about harnessing the power of language to tell compelling data stories.

Joan Xiao, PhD
Joan Xiao is a Principal Data Scientist at Linc Global, a commerce-specialized customer care automation company. In her role, she applies novel natural language processing and machine learning techniques to improve customer experience. Previously she led machine learning and data science teams at various companies ranging from startups to the Fortune 100. Joan received her Ph.D. in Mathematics and MS in Computer Science from the University of Pennsylvania.
Transfer Learning in NLP (Talk)

Dr. Anju Kambadur
Dr. Prabhanjan (Anju) Kambadur heads the AI Engineering group at Bloomberg. Anju leads a group of 100+ researchers and engineers who build solutions for Bloomberg clients in the areas of machine learning, natural language processing (NLP) and natural language understanding, information extraction, knowledge graphs, question answering, and table understanding. Previously, Anju was a research staff member in the Business Analytics and Mathematical Sciences Department at IBM Research’s Thomas J. Watson Research Center, where he worked on problems in machine learning, such as matrix sketching, genome-wide association studies, temporal causal modeling, and high-performance computing. He received his PhD from Indiana University. Anju has published peer-reviewed articles in the fields of high-performance computing, machine learning, and natural language processing.
See our full speaker list
2020 Speakers
Sample Talk, Workshop, and Training Sessions
Workshop | NLP | Beginner-Intermediate
In this session, I’ll start by introducing the recent breakthroughs in NLP that resulted from the combination of Transfer Learning and Transformer architectures. Then, we’ll learn to use the open-source tools released by HuggingFace like the Transformers and Tokenizers libraries and the distilled models.
Learning outcomes: understanding Transfer Learning in NLP, how the Transformers and Tokenizers libraries are organized and how to use them for downstream tasks like text classification, NER and text generation...
Thomas leads the Science Team at Huggingface Inc., a Brooklyn-based startup working on Natural Language Generation and Natural Language Understanding.
After graduating from Ecole Polytechnique (Paris, France), he worked on laser-plasma interactions at the BELLA Center of the Lawrence Berkeley National Laboratory (Berkeley, CA). He was accepted for a PhD at MIT (Cambridge, MA) but ended up doing his PhD in Statistical/Quantum physics at Sorbonne University and ESPCI (Paris, France), working on superconducting materials for the French DARPA (DGA) and Thales.
Thomas is interested in Natural Language Processing, Deep Learning, and Computational Linguistics. Much of his research is about Natural Language Generation (mostly) and Natural Language Understanding (as a tool for better generation).
Workshop
Veysel is a well known thought leader in healthcare NLP and works as a Lead Data Scientist and ML Engineer at John Snow Labs, improving the Spark NLP for the Healthcare library and delivering hands-on projects in Healthcare and Life Science. He is a seasoned data scientist with a strong background in every aspect of data science including NLP, machine learning, deep learning, and big data with over ten years of experience. He’s also pursuing his Ph.D. in ML at Leiden University, Netherlands, and delivers graduate-level lectures in Auto ML and Distributed Data Processing. He also has broad consulting experience in Statistics, Data Science, Software Architecture, MLOps, Machine Learning, and AI to several start-ups, boot camps, and companies around the globe. He also speaks at Data Science & AI events, conferences and workshops, and has delivered more than a hundred talks at international as well as national conferences and meetups.
Workshop | NLP | Machine Learning | Intermediate-Advanced
This workshop teaches you how to use transformer neural networks and their incarnations (BERT, RoBERTa, GPT-2) to solve real-world natural language use cases. NLP has advanced tremendously over the last few years, and BERT is at the forefront of this success, having achieved state-of-the-art results on 11 different NLP tasks. For businesses, BERT has unlocked new NLP use cases that were previously unattainable.
This workshop will teach you what transformers and systems like BERT and GPT-2 are and how to use and modify them for your needs. Organizations have a wealth of unstructured text sources in every line of business, such as employee feedback in human resources, purchase orders and legal documents in contracting and procurement, communication records throughout the org, and many more. Making sense of this information and organizing it into knowledge and actionable insights to improve business outcomes is a key function every data scientist should be aware of…
Niels Kasch, PhD is a Founding Partner at Miner & Kasch, a leading AI and Data Science firm. He combines extensive experience in machine learning, analytics, and business knowledge to solve his customers’ key business challenges. An expert in natural language processing, he has developed large-scale, state-of-the-art semantic analysis technologies for combating financial fraud in the finance industry, and text analytics solutions for mitigating risk in the construction industry. He has delivered high quality analytics and data products to Fortune 500 clients and startups alike, covering industries from automotive to utilities. His leadership in framing strategic analytics visions for enterprises has led to the building of several innovative data science teams for his customers.
Talk | NLP | Open-source | Intermediate
Natural language processing has exploded in popularity during the last decade. No longer confined to academia, NLP is now seen by many companies as a critical part of their business intelligence, with the NLP market size expected to double again in the next two years. Traditional NLP approaches like sentiment analysis and topic modeling provide undeniably meaningful insights, but what other techniques can be leveraged to mine information from text?
This talk focuses on lesser-known NLP methods that can help unearth novel observations and make analyses more memorable. After a brief introduction to the topic, attendees will learn about various open-source Python packages they can apply to enhance their NLP workflows. Example use cases will also be discussed to show how each technique may be applied to existing data. Attendees of this talk will discover several unconventional NLP tools such as:
– Scattertext for comparing word usage between two populations
– spaCy’s linguistic features to parse sentences by syntax
– DeepMoji for assigning emoji labels to short text…
Kimberly Fessel is a Senior Data Scientist at Metis, the industry’s only accredited, full-time, immersive data science bootcamp. Prior to joining Metis as an instructor, Kimberly worked in digital advertising at MRM//McCann where she focused on helping clients understand their customers by leveraging unstructured data with modern NLP techniques. She holds a Ph.D. in applied mathematics from Rensselaer Polytechnic Institute and completed an NSF-funded postdoctoral fellowship in math biology at the Ohio State University. She is passionate about data visualization and about harnessing the power of language to tell compelling data stories.
Talk | NLP | ML for Programmers | Intermediate
Machine learning has become a core technology underlying many modern applications, especially those utilizing natural language processing, where the techniques provide powerful methods for analyzing large data sets, such as contracts, electronic health records, social interactions, and other unstructured text data. With recent powerful techniques able to retain meaning, search, and perform machine translation at high fidelity, alongside many open source traditional and hybrid methods, transforming unstructured content into structured insights, events, and relationships is within reach. Organizations are looking to leverage these emerging technologies and close capability gaps to ingest, monitor, error-check, automate, or improve their capabilities in processing and understanding hundreds of millions of documents. While certain tasks are well addressed by existing systems, organizations often still struggle with implementation, with identifying the correct methods and algorithms, and with properly scaling their models to solve open challenges within their terminology. In this session, we examine the data strategy and technical use cases involving natural language processing and the algorithms appropriate for certain project objectives, and discuss the development and deployment of these solutions…
Talk | NLP | Deep Learning | Intermediate
Accelerating progress in personalized healthcare requires learning the causal relationships between diseases, genes, treatments, medications, labs, and other clinical information – at scale over a large population and time range. More than half of the clinically relevant data in oncology is only found in free-text pathology reports, radiology reports, sequencing reports, and progress notes.
Extracting and normalizing these facts from clinical documents requires training oncology-specific models that can accurately extract them from a variety of documents. This talk describes results and lessons learned from a real-world project doing this at scale…
David Talby is the Chief Technology Officer at John Snow Labs, helping companies apply artificial intelligence to solve real-world problems in healthcare and life science. David is the creator of Spark NLP – the world’s most widely used natural language processing library in the enterprise.
He has extensive experience building and running web-scale software platforms and teams – in startups, for Microsoft’s Bing in the US and Europe, and to scale Amazon’s financial systems in Seattle and the UK.
David holds a Ph.D. in Computer Science and Master’s degrees in both Computer Science and Business Administration. He was named USA CTO of the Year by the Global 100 Awards and GameChangers Awards in 2022.
Tutorial | NLP | Machine Learning | Intermediate
David Talby presents the open-source Spark NLP package for training distributed custom natural language machine-learned pipelines on Apache Spark. The library natively extends Spark ML and includes state-of-the-art deep learning models, language models, and 30+ pre-trained NLP models. The talk walks through the library’s goals, design, and APIs, using Jupyter notebooks that will be made publicly available after the talk. Best practices and industry use cases where the library has been applied will be discussed as well…
See all our talks and hands-on workshop and training sessions
See all sessions
What You'll Learn
Talks & Workshops on these topics:
Topics
Natural Language Processing
NLP Transformers
Pre-trained Models
Text Analytics
Natural Language Understanding
Sentiment Analysis
Natural Language Generation
Speech Recognition
Named Entity Extraction
Models
BERT
XLNet
GPT-2
Transformers
Word2Vec
Deep Learning Models
RNN & LSTM
Machine Learning Models
ULMFiT
Transfer Learning
Tools
TensorFlow 2.0
Hugging Face Transformers
PyTorch
Theano
spaCy
NLTK
AllenNLP
Stanford CoreNLP
Keras
FLAIR
You Will Meet
Some of the world’s best data science speakers
The brains and authors behind today’s most popular open data science tools, topics and languages
Hundreds of attendees focused on data science
Chief Data Scientists
Thought leaders working in data science
Data Scientists and Analysts
Software Developers
CEOs, CTOs, CIOs
Data Visualization professionals
Venture Capitalists and Investors
Startup Founders and Executives
Attendees from Healthcare, Finance, Education, Business, Intelligence, and other industries
Big data and data science innovators
Why Attend?
Several of the best minds and biggest names in data science will be presenting
Network with attendees from leading data science companies to learn how others are tackling similar problems
Gain quality training and cutting-edge insights in the hottest data science topics, tools, and languages
Learn the latest in data science from industry leaders without having to make room in the budget — tickets are surprisingly inexpensive