Abstract: Sound Classification and Detection with STFT and CNNs
Applications of audio-based machine learning include virtual assistants, automatic speech recognition, speech-to-text, firearm locators, early detection of vehicle accidents, wildlife monitoring, audio anomaly detection, denoising, and music classification.
After completing this workshop, you will be able to use a short-time Fourier transform to convert audio into features suitable for machine learning models, and to apply these features to sound classification and sound detection tasks. You will also be able to develop your own feature generation pipeline for audio data, and to implement and adapt published sound detection and classification models for your particular use case.
In part 1, we will discuss the characteristics of sound waveforms, import sample audio, and produce spectrograms using a short-time Fourier transform (STFT). We will also investigate how the choice of window function (rectangular, triangular, or Hann) affects the resulting STFT.
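To make the effect of the window choice concrete, here is a minimal sketch of an STFT written with NumPy only. It is not the workshop's code: the function names, the frame and hop lengths, the 450 Hz test tone, and the "leakage" measure are all assumptions chosen for illustration. The tone is deliberately placed between two frequency bins so that spectral leakage is visible.

```python
import numpy as np

# Window labels are names chosen for this sketch, mapped to NumPy generators.
WINDOWS = {
    "rectangular": np.ones,      # no tapering at all
    "triangular": np.bartlett,   # linear taper to zero at both ends
    "hann": np.hanning,          # raised-cosine taper
}

def stft_magnitude(signal, frame_length=256, hop_length=128, window="hann"):
    """Return an STFT magnitude array of shape (freq_bins, frames)."""
    win = WINDOWS[window](frame_length)
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    # Cut the signal into overlapping frames, one row per frame.
    frames = np.stack([
        signal[i * hop_length : i * hop_length + frame_length]
        for i in range(n_frames)
    ])
    # Taper each frame with the window, then take the real-input FFT.
    return np.abs(np.fft.rfft(frames * win, axis=1)).T

def leakage_fraction(spec, guard_bins=3):
    """Fraction of average power lying outside a few bins around the peak."""
    power = (spec ** 2).mean(axis=1)
    peak = int(np.argmax(power))
    lo, hi = max(peak - guard_bins, 0), peak + guard_bins + 1
    return 1.0 - power[lo:hi].sum() / power.sum()

# Assumed test signal: one second of a 450 Hz sine sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 450.0 * t)

for name in WINDOWS:
    spec = stft_magnitude(tone, window=name)
    print(name, spec.shape, leakage_fraction(spec))
```

In this sketch the rectangular window should spread noticeably more energy away from the peak than the Hann window does, at the cost of a narrower main lobe. In the workshop environment, `librosa.stft` exposes a `window` parameter that serves the same purpose.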
In part 2, we will work through a wildlife monitoring use case: we will use the STFT to transform audio recordings of rainforest sounds into spectrograms, cut these spectrograms into time slices, and classify the slices by species. We will then extend the classification task to a sound detection task and create bounding boxes around the time periods during which species calls are present.
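The slicing and bounding-box steps can be sketched as follows. This is an illustration under stated assumptions, not the workshop's pipeline: `slice_spectrogram` and `merge_detections` are hypothetical helpers, the spectrogram is synthetic, and a simple energy threshold stands in for the trained CNN classifier. The frame duration assumes a hop of 512 samples at 22050 Hz.

```python
import numpy as np

def slice_spectrogram(spec, slice_frames, hop_frames):
    """Cut a (freq, time) spectrogram into overlapping time slices.

    Returns an array of shape (n_slices, freq, slice_frames) and the
    starting frame index of each slice.
    """
    starts = list(range(0, spec.shape[1] - slice_frames + 1, hop_frames))
    slices = np.stack([spec[:, s:s + slice_frames] for s in starts])
    return slices, starts

def merge_detections(flags, starts, slice_frames):
    """Merge touching or overlapping positive slices into frame intervals."""
    intervals = []
    for flag, start in zip(flags, starts):
        if not flag:
            continue
        end = start + slice_frames
        if intervals and start <= intervals[-1][1]:
            # Extend the previous interval instead of opening a new one.
            intervals[-1] = (intervals[-1][0], max(intervals[-1][1], end))
        else:
            intervals.append((start, end))
    return intervals

# Synthetic spectrogram: 64 frequency bins, 100 frames, a "call" in frames 30-49.
spec = np.zeros((64, 100))
spec[:, 30:50] = 1.0

slices, starts = slice_spectrogram(spec, slice_frames=10, hop_frames=5)

# Placeholder for the classifier: a real model would predict a species per slice.
flags = slices.mean(axis=(1, 2)) > 0.6

intervals = merge_detections(flags, starts, slice_frames=10)

# Convert frame indices to seconds (assumed hop of 512 samples at 22050 Hz).
seconds_per_frame = 512 / 22050
boxes_seconds = [(a * seconds_per_frame, b * seconds_per_frame) for a, b in intervals]
print(intervals, boxes_seconds)
```

Merging in frame indices and converting to seconds only at the end keeps the interval logic free of floating-point comparisons. A per-species version would run the same merge once for each predicted class.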
This workshop will use JupyterLab running inside a Docker container preloaded with the required packages, TensorFlow and Librosa. Some familiarity with TensorFlow and/or audio data will help with understanding, but is not required.
After feature generation, the neural network training shares some similarity with image tasks, so this workshop may also be informative for those seeking to learn more about image classification and object detection.
Bio: Ryan Kasichainula is a data science instructor at Galvanize, Inc., an industry leader in technology education that offers immersive bootcamps in data science and software engineering. They are also an independent data consultant with experience in the technology, agriculture, energy, and pharmaceutical industries. Ryan enjoys applying data science techniques to a wide variety of domains, and they always have at least one side project in the works, usually in the realm of natural language generation.