Abstract: RAPIDS is an open-source initiative to accelerate the end-to-end data science ecosystem with GPUs. It consists of several projects that expose familiar interfaces, making it easy to accelerate the entire data science pipeline, from ETL and data wrangling to feature engineering, statistical modeling, machine learning, and graph analysis.
This presentation targets data scientists familiar with the Python data science ecosystem, including pandas, NumPy, and scikit-learn. It opens with a brief overview of the RAPIDS ecosystem, followed by an in-depth look at cuML, the RAPIDS machine learning library.
Novice data scientists who are new to the RAPIDS ecosystem will benefit from an introduction to the ease with which cuML can accelerate their existing scikit-learn workflows. Intermediate and advanced data scientists will gain a better understanding of cuML’s flexible architecture, including how it can be used to scale machine learning workloads across multiple GPUs and multiple nodes.
We will walk attendees through a real-world example step by step, showcasing the simplicity of cuML while demonstrating its ability to scale across multiple GPUs and multiple nodes. The talk will conclude with an analysis of cuML benchmarks, which support the claim that machine learning algorithms can gain significant speedups from GPU acceleration.
Bio: Coming Soon!