Abstract: Anomaly detection is quickly becoming a critical feature of many monitoring products; the “incorporates anomaly detection” checkbox is one that every product manager/user wants to tick. The SolarWinds Data Science Team has been tasked with enhancing anomaly detection across the company's product portfolio.
In its search for the best anomaly detection implementation, the team considered a range of approaches, from big-name standards to cutting-edge specialist methods. Even the brightest data scientists from Twitter®, Netflix®, Facebook®, Yahoo®, and Etsy® disagree on how to approach anomaly detection, while specialised companies such as Numenta® adopt an automatic, domain-agnostic, neural-network approach. Anomaly detection is now even offered as a service through APIs from Microsoft® Azure® and Anodot.
This talk will answer the following questions: What is an anomaly when you’re not a domain expert and don't know what you're looking for? What is an anomaly when the inputs are multiple time series with partially known and complex relationships? Is there really one algorithm/solution for all datasets/products? In the context of creating solutions-based products, how much better is the latest and greatest anomaly detection algorithm than ones based on easy-to-implement time series models? Should the implementation be fully automated, or is there value in enabling user customisation? Once an anomaly is identified, is it possible to find causality in associated data?
Bio: Coming from a background in Mathematics, I completed my PhD in Statistical Genetics, then worked as a Statistical Researcher and Consultant in the pharmaceutical industry. I joined SolarWinds as a Data Scientist two years ago.