Solving Problems with both Text and Numerical Data Using Gradient Boosting

Abstract: 

Gradient boosting is a powerful machine-learning technique that achieves state-of-the-art results in a variety of practical tasks. For a number of years, it has remained the primary method for learning problems with heterogeneous features, noisy data, and complex dependencies: web search, recommendation systems, weather forecasting, and many others.

Some problems involve several types of data at once: numerical, categorical, and text. In this case the best solution is either to build new numerical features from the text and categories and pass them to gradient boosting, or to use a library that supports these feature types out of the box.
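As a rough illustration of the first option, the sketch below turns one categorical column and one text column into plain numbers by hand (one-hot encoding plus bag-of-words counts). The sample rows, the column names `price`, `category` and `text`, and the helper names `build_vocabulary` and `encode_row` are all hypothetical and exist only for this example; real pipelines usually rely on more careful tokenization and encoding.

```python
from collections import Counter


def build_vocabulary(texts, max_size=1000):
    """Return the most frequent lowercase tokens, ties broken alphabetically."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return ordered[:max_size]


def encode_row(row, vocabulary, categories):
    """Build one numerical feature vector: [price] + one-hot category + token counts."""
    token_counts = Counter(row["text"].lower().split())
    one_hot = [1.0 if row["category"] == c else 0.0 for c in categories]
    bag_of_words = [float(token_counts[tok]) for tok in vocabulary]
    return [float(row["price"])] + one_hot + bag_of_words


# Hypothetical toy data with numerical, categorical, and text columns.
rows = [
    {"price": 9.5, "category": "book", "text": "great read great value"},
    {"price": 120.0, "category": "phone", "text": "battery died fast"},
    {"price": 15.0, "category": "book", "text": "boring read"},
]

vocabulary = build_vocabulary(r["text"] for r in rows)
categories = sorted({r["category"] for r in rows})
features = [encode_row(r, vocabulary, categories) for r in rows]

for vector in features:
    print(vector)
```

The resulting `features` matrix contains only numbers, so it could be passed to any gradient boosting implementation. The cost is that the vocabulary and category list have to be stored and reapplied identically at prediction time, which is the bookkeeping that built-in support for text and categorical features is meant to remove.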

CatBoost is the first gradient boosting library to support text features out of the box. This talk will walk you through the main features of the library, including the way it works with text.

CatBoost (http://catboost.ai) is a popular open-source gradient boosting library with a whole set of advantages:

1. CatBoost is able to incorporate categorical features and text features in your data with no additional preprocessing.

2. CatBoost has the fastest GPU and multi-GPU training implementations of all openly available gradient boosting libraries.

3. CatBoost predictions are 20-60 times faster than those of other open-source gradient boosting libraries, which makes it possible to use CatBoost for latency-critical tasks.

4. CatBoost has a variety of tools to analyze your model.

This workshop will feature a comprehensive tutorial on using the CatBoost library.

We will walk you through all the steps of building a good predictive model for data with text and numerical features.

Bio: 

Stanislav Kirillov is a lead developer in the ML platforms group at Yandex, where he develops machine learning tools and builds and maintains the infrastructure for them.

Open Data Science