Tired of Cleaning Your Data? Have Confidence in Data with Feature Types


Data scientists can spend 60 to 80% of their time exploring and cleaning data. When they receive an updated data set, they should repeat this process, but often they don't. The result can be a model that poorly describes the system it represents. However, there is something you can do about this.

The "feature type" system in OCI Data Science’s Accelerated Data Science (ADS) SDK classifies data based on what they represent, not how they're stored in memory. It also gives you the tools to compute custom statistics, create visualizations, validate values, raise warnings, and select columns based on their feature types.
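To make the idea concrete, here is a minimal sketch in plain Python. It does not use the ADS SDK and does not reproduce its API: the `FeatureType` class, the `select_by_type` and `invalid_share` helpers, and the sample data are all hypothetical names invented for illustration. It only shows the general pattern of attaching a semantic type to a column and reusing that type's validator, warnings, and statistics whenever the data changes.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class FeatureType:
    """Hypothetical feature type: a semantic label with a validator and warnings."""
    name: str
    validator: Callable[[object], bool]
    # Maps a warning message to a check that returns True when the warning applies.
    warnings: Dict[str, Callable[[List[object]], bool]] = field(default_factory=dict)

    def validate(self, values: List[object]) -> List[bool]:
        """Return one True/False flag per value in the column."""
        return [self.validator(v) for v in values]

    def warn(self, values: List[object]) -> List[str]:
        """Return the messages of all warnings triggered by the column."""
        return [msg for msg, check in self.warnings.items() if check(values)]


def is_zip_code(value: object) -> bool:
    """A value is a valid ZIP code here if it is a string of exactly five digits."""
    return isinstance(value, str) and re.fullmatch(r"\d{5}", value) is not None


def has_missing(values: List[object]) -> bool:
    return any(v is None for v in values)


# Illustrative feature types: what the data represent, not how they are stored.
ZIP_CODE = FeatureType("zip_code", is_zip_code, {"contains missing values": has_missing})
AGE = FeatureType(
    "age",
    lambda v: isinstance(v, int) and 0 <= v <= 120,
    {"contains missing values": has_missing},
)

# Made-up sample data: columns stored as plain lists.
data = {
    "home_zip": ["02142", "9021", None],
    "work_zip": ["02139", "10001", "94105"],
    "age": [34, 51, 29],
}
feature_types = {"home_zip": ZIP_CODE, "work_zip": ZIP_CODE, "age": AGE}


def select_by_type(types: Dict[str, FeatureType], type_name: str) -> List[str]:
    """Select column names by feature type rather than by storage dtype."""
    return [col for col, ft in types.items() if ft.name == type_name]


def invalid_share(ft: FeatureType, values: List[object]) -> float:
    """A simple custom statistic: the fraction of values that fail validation."""
    flags = ft.validate(values)
    return flags.count(False) / len(flags)


# Rerunning this loop on an updated data set repeats the same checks.
for column in select_by_type(feature_types, "zip_code"):
    ft = feature_types[column]
    print(column, invalid_share(ft, data[column]), ft.warn(data[column]))
```

Because the checks live with the type rather than with a particular data set, the same validation can be rerun unchanged each time new data arrives, which is what makes the cleaning step reproducible.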

Session Outline
Attend this presentation to:
- Learn how to speed up your exploratory data analysis (EDA).
- Create custom feature types.
- Make your data cleaning and validation process reproducible.
- Develop the skills to have confidence in the quality of your data.


A modern polymath, John holds advanced degrees in mechanical engineering, kinesiology, and data science, with a focus on solving novel and ambiguous problems. As a senior applied data scientist at Amazon, John worked closely with engineering to create machine learning models for chatbot skill arbitration, entity resolution, search, and personalization.

As a principal data scientist for Oracle Cloud Infrastructure, he is now defining tooling for data science at scale. John frequently gives talks on best practices and reproducible research. To that end, he has developed an approach to improve validation and reliability by using data unit tests and has pioneered Data Science Design Thinking. He also coordinates SoCal RUG, the largest R meetup group in Southern California.

Open Data Science




