
Abstract: Despite being first developed in the 1970s, SQL remains one of the most important data science skills in 2021!
In this workshop you will learn about:
Why SQL is still relevant for modern data science
How to tune SQL queries for optimal performance
How to translate between Python Pandas syntax and SQL operations
What NoSQL is and why it matters for data scientists
Session Outline
Session 1: Why SQL is still relevant for modern data science
Learn how pervasive SQL is throughout the modern data stack, and explore specific use cases where data engineering for machine learning processes is driven by SQL-based tools, with a focus on Cloud SQL.
Session 2: How to tune SQL queries for optimal performance
A high-level guide to query optimisation, with rules of thumb to abide by and example scenarios to build your intuition for similar problems in the wild.
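As a taste of the kind of scenario this session covers, here is a minimal sketch of one common rule of thumb: index the columns you filter on, then check the query plan. It uses SQLite through Python's built-in sqlite3 module purely for illustration; the `orders` table, its columns, the index name and the `query_plan` helper are all hypothetical and are not taken from the workshop material.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


# Hypothetical in-memory table, made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

sql = "SELECT order_id, amount FROM orders WHERE customer_id = ?"

# Without an index on customer_id, SQLite has to scan the whole table.
plan_before = query_plan(conn, sql, (42,))

# After adding an index on the filtered column, it can search the index.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan differs between SQLite versions and between database engines, so treat the printed text as a hint about the access path rather than a stable output format.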
Session 3: How to translate between Python Pandas syntax and SQL operations
This quick-fire session will feature a Jupyter Notebook approach to translating between SQL queries and basic Python Pandas operations. Expect to learn something new and compare the different implementations to help you weave both into your regular data science toolkit.
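To illustrate the kind of side-by-side translation described here, the sketch below runs the same filter, group and sort once as a SQL query and once as a chain of Pandas operations. It assumes pandas is installed and uses an in-memory SQLite database; the `sales` table and its values are invented for this example and do not come from the workshop notebooks.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data, made up for this sketch.
sales = pd.DataFrame(
    {
        "region": ["north", "south", "north", "east", "south"],
        "amount": [100, 250, 50, 300, 150],
    }
)

# Load the same rows into an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
sales.to_sql("sales", conn, index=False)

# SQL version: WHERE, GROUP BY, aggregate, ORDER BY.
sql_result = pd.read_sql_query(
    """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount >= 100
    GROUP BY region
    ORDER BY total DESC
    """,
    conn,
)

# Pandas version: boolean mask, groupby, named aggregation, sort_values.
pandas_result = (
    sales[sales["amount"] >= 100]
    .groupby("region", as_index=False)
    .agg(total=("amount", "sum"))
    .sort_values("total", ascending=False)
    .reset_index(drop=True)
)

print(sql_result)
print(pandas_result)
```

Reading the two versions clause by clause gives a rough mapping: WHERE corresponds to the boolean mask, GROUP BY with SUM to groupby plus agg, and ORDER BY to sort_values.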
Session 4: What is NoSQL and why does it matter for data scientists?
Find out about the differences between traditional SQL and NoSQL, as well as their unique use cases. Learn what you should consider when choosing one technology over the other for data science projects through example case studies.
Background Knowledge
Basic SQL and Python
Bio: Danny is the Founder and CEO of Sydney Data Science and has over 10 years of experience in the data industry. He has held almost every role in the data ecosystem from data entry to campaign analyst, data scientist, data engineer and machine learning engineer. His core expertise is in data analytics, supervised ML algorithms, data architecture and designing digital data systems for retail, banking and financial markets.
Danny’s passion is to guide businesses and individuals on their data & machine learning journey. He currently runs the Data With Danny community with over 8,000 aspiring data professionals and is working on his vision of creating a scalable virtual data apprenticeship program to empower others to kickstart their career in data.