Abstract: Detecting anomalous behavior is key to efficiently determining the health of large, complex systems, because anomalies in system data can indicate a failure or an impending failure. The goal is to detect anomalous behavior before it escalates to severe service degradation or a service-impacting outage.
In this talk, we apply multivariate change detection algorithms and visual analytics methods to sequential multivariate system performance data in order to detect and diagnose anomalous behavior with low latency in a large networking system. A brief overview of anomaly detection concepts will also be presented.
Multivariate non-parametric change detection algorithms are applied to the data to detect anomalies and to present diagnostic information at fine time granularity. We identify whether a change point is a single time stamp (pointwise anomaly) or a collection of time stamps (collective anomaly) that does not conform to the general pattern of the data.
Two unsupervised change point detection methods are used: the Bayesian approach and the distance-based approach. For the Bayesian approach, we use the R packages changepoint.mv and anomaly. For the distance-based approach, we use the R package ecp. An advantage of the changepoint.mv package is that it also provides diagnostic capability: it explicitly identifies both the change point location and the variables associated with the change point.
The R packages used for change detection will be described in terms of their capabilities and characteristics, and the R code used for the analysis will be shared. In addition, the use of self-organizing maps (using the R kohonen package) for visual analytics will be presented. We demonstrate our methods with real data.
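As a rough illustration of the distance-based idea mentioned above, the following sketch is written in Python rather than R and is not the ecp implementation or the code used in the talk. It assumes an energy-distance style statistic, a single change point, and synthetic two-variable data. The function names (energy_statistic, best_split) and the min_size and alpha parameters are hypothetical choices for this example, and the sketch omits the permutation test that a full procedure would use to decide whether a split is statistically significant.

```python
import math
import random


def _mean_between(a, b, alpha):
    """Mean of ||x - y||^alpha over all pairs with x in a and y in b."""
    total = sum(math.dist(x, y) ** alpha for x in a for y in b)
    return total / (len(a) * len(b))


def _mean_within(a, alpha):
    """Mean of ||x - x'||^alpha over all distinct pairs inside a."""
    n = len(a)
    total = sum(
        math.dist(a[i], a[k]) ** alpha
        for i in range(n)
        for k in range(i + 1, n)
    )
    return total / (n * (n - 1) / 2)


def energy_statistic(left, right, alpha=1.0):
    """Scaled energy-distance divergence between two multivariate segments."""
    n, m = len(left), len(right)
    divergence = (
        2.0 * _mean_between(left, right, alpha)
        - _mean_within(left, alpha)
        - _mean_within(right, alpha)
    )
    return (n * m / (n + m)) * divergence


def best_split(series, min_size=5, alpha=1.0):
    """Return (tau, statistic) for the split series[:tau] / series[tau:]
    that maximizes the statistic. Each segment keeps at least min_size points."""
    if min_size < 2:
        raise ValueError("min_size must be at least 2")
    best_tau, best_q = None, float("-inf")
    for tau in range(min_size, len(series) - min_size + 1):
        q = energy_statistic(series[:tau], series[tau:], alpha)
        if q > best_q:
            best_tau, best_q = tau, q
    return best_tau, best_q


if __name__ == "__main__":
    # Synthetic data (an assumption for illustration): two variables whose
    # means shift from 0 to 3 after the first 30 observations.
    rng = random.Random(0)
    before = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
    after = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(30)]
    tau, q = best_split(before + after)
    print("estimated change point index:", tau, "statistic:", round(q, 2))
```

Because the statistic depends only on pairwise distances between observations, it makes no assumption about the underlying distribution, which is what makes the approach non-parametric. The brute-force search is cubic in the series length, so this version is suitable only for short series.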
Bio: Bio Coming Soon!