Abstract: Hyperparameter optimisation is one of the toughest challenges in Machine Learning, especially for complex models that operate on large datasets. In this workshop we cover several techniques for optimising hyperparameters: grid search, random search, Bayesian optimisation, and evolutionary algorithms. We cover the theory behind each approach and discuss its pros and cons. Participants will get access to individual, pre-configured compute environments (Docker containers):
• Each participant will have a dedicated container and code repository with assets needed for the workshop
• The container will let them choose JupyterLab, Zeppelin, or VS Code as their IDE, whichever they prefer
• The container will include all Python libraries (e.g. scikit-optimize) and datasets needed for the workshop
• The containers will be hosted in an AWS account owned by Domino Data Lab and will be available to the participants free of charge. All materials from the workshop will also be made publicly available so people will be able to download them and do further experimentation after the event.
Participants will then complete a set of tasks built around the various optimisation techniques and observe the outcomes. The tasks will include hyperparameter optimisation for a deep neural network and for an ensemble model (Random Forest). Additional topics that we will cover in the workshop include:
• Implications of running hyperparameter optimisation on large models/datasets, and how to address the resulting challenges with sampling, focused intervals, a reduced number of folds, etc.
• Scalable hyperparameter tuning (via Ray Tune)
• Best practices and advanced techniques (e.g. dynamic termination, warm starting evaluations and optimisation etc.)
• Some open challenges in hyperparameter optimisation
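As a rough illustration of the two simplest techniques named above, the following sketch compares grid search and random search under the same evaluation budget. It is not taken from the workshop materials: the objective `toy_validation_loss`, the hyperparameter names, and the search ranges are hypothetical stand-ins for a real cross-validated model score, and only the Python standard library is used.

```python
import itertools
import random


def toy_validation_loss(learning_rate, max_depth):
    """Hypothetical stand-in for a cross-validated loss (lower is better)."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (max_depth - 6) ** 2


def grid_search(objective, grid):
    """Evaluate every combination in the grid; return (best_loss, best_params)."""
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        loss = objective(**params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best


def random_search(objective, samplers, n_trials, seed=0):
    """Draw n_trials independent configurations; return (best_loss, best_params)."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {name: sample(rng) for name, sample in samplers.items()}
        loss = objective(**params)
        if best is None or loss < best[0]:
            best = (loss, params)
    return best


# Assumed search space: 4 x 4 grid points, i.e. 16 evaluations.
grid = {"learning_rate": [0.001, 0.01, 0.1, 1.0], "max_depth": [2, 4, 6, 8]}

# Random search over the same ranges, with the same budget of 16 evaluations.
samplers = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-3, 0),  # log-uniform draw
    "max_depth": lambda rng: rng.randint(2, 8),  # inclusive integer range
}

grid_loss, grid_params = grid_search(toy_validation_loss, grid)
rand_loss, rand_params = random_search(toy_validation_loss, samplers, n_trials=16)
print("grid  :", grid_params, grid_loss)
print("random:", rand_params, rand_loss)
```

In practice the objective would train and validate a model, so each call is expensive. That cost is what motivates the sampling, reduced-fold, and early-termination strategies listed above.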
Participants should be comfortable with Python and have basic machine learning knowledge.
Bio: Nikolay is an experienced Data Science professional who currently leads the EMEA Data Science team at Domino Data Lab. He holds an MSc in Software Technologies and an MSc in Data Science, and is currently undertaking postgraduate research at King's College London. His areas of expertise are Statistics, Mathematics, and Data Science in general, and his research interests are in Neural Networks with emphasis on biological plausibility. He writes articles and blog posts regularly and speaks at various European conferences (ODSC, Big Data Spain, Strata, Big Data London, etc.) to build awareness of data science and artificial intelligence. He is also the organizer of the London Data Science and Machine Learning meetup and the recipient of several technical mastery awards, such as the Oracle ACE Award and the IBM Outstanding Technical Achievement Award.