
Abstract: Dask is a widely used framework for parallel and distributed computing in Python. It is used in many ways, including as scalable versions of pandas, NumPy, and other libraries, as well as a general-purpose toolkit for lower-level task parallelism. Dask handles deployment, network communication, resilience, and load balancing, so that you don't have to.
However, like any widely used open source framework (pandas, NumPy, Python itself), Dask also has warts that get in the way of an optimal experience. What have we learned over the last several years of scaling Python, and what could we do better?
This talk discusses Dask's strengths, its weaknesses, and the developer community's plans moving forward.
Bio: Matthew Rocklin is the CEO and founder of Coiled and the initial author of Coiled's underlying technology, Dask. He developed Dask while working at Anaconda, to help people solve challenging distributed computing problems. While he is primarily known for his work on Dask, he also coordinates and maintains several dozen libraries within Python's numeric computing ecosystem, with a substantial focus on efficient and scalable computing.
Matthew is a frequent speaker at technical, academic, and industry events, such as PyData, SciPy, Google Next, O'Reilly's Strata, AGU, AMS, and ICML.
He holds a doctorate in Computer Science from the University of Chicago and a bachelor's degree in Physics, Mathematics, and Astronomy from the University of California.