
Abstract: The typical cybersecurity application involves safeguards and mechanisms to protect against outside compromise. However, insiders are increasingly becoming one of the largest threats to an organization due to their detailed knowledge of system operations and security practices and their legitimate physical and electronic access to critical systems. Insider threats can be difficult to detect because a compromised employee acts within their permissions and therefore breaks no explicit rules, yet still creates a pattern of suspicious behavior.
Pandata partnered with FirstEnergy’s Security Operations Center to develop an AI solution for insider threat detection that creates a Holistic Risk Profile for every employee based on physical and digital behavior. This approach builds on user behavior analytics: profiling behavior to determine first what is normal, then what is abnormal, and finally what is malicious, all in the absence of labeled threat data. To accomplish this task, we first developed custom streaming data pipelines and a pre-processing profiler that enriches and stores several internal operational and security data sources in a Hadoop Data Lake. The profiler component aggregates an employee’s several “identities” across the organization into a single curated “employee security profile,” to which varying levels of criticality are assigned by rules-based associations for location, job role, and asset access. This strategy lets us quickly process near-real-time activity data on the order of hundreds of thousands of events per second.
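To make the profiler step concrete, the following is a minimal Python sketch of merging several identities into one employee security profile and assigning criticality by rules. It is an illustration under stated assumptions, not the production profiler: the rule tables, their weights, the record fields, the identity names, and the choice to sum the highest weight per category are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical rule tables. Attribute names and weights are illustrative only.
LOCATION_RULES = {"control_center": 3, "substation": 2, "office": 1}
ROLE_RULES = {"system_operator": 3, "engineer": 2, "analyst": 1}
ASSET_RULES = {"scada": 3, "badge_system": 2, "email": 1}


@dataclass
class EmployeeSecurityProfile:
    """One curated profile that merges every identity tied to an employee."""
    employee_id: str
    identities: set = field(default_factory=set)
    locations: set = field(default_factory=set)
    roles: set = field(default_factory=set)
    assets: set = field(default_factory=set)

    def criticality(self) -> int:
        """Sum of the highest matching rule weight in each category (assumed scheme)."""
        def top(values, rules):
            return max((rules.get(v, 0) for v in values), default=0)

        return (
            top(self.locations, LOCATION_RULES)
            + top(self.roles, ROLE_RULES)
            + top(self.assets, ASSET_RULES)
        )


def build_profiles(records, identity_map):
    """Group source records by employee, using a lookup from identity to employee ID."""
    profiles = {}
    for rec in records:
        employee_id = identity_map.get(rec["identity"])
        if employee_id is None:
            continue  # identity not tied to a known employee; skipped in this sketch
        profile = profiles.setdefault(employee_id, EmployeeSecurityProfile(employee_id))
        profile.identities.add(rec["identity"])
        if "location" in rec:
            profile.locations.add(rec["location"])
        if "role" in rec:
            profile.roles.add(rec["role"])
        if "asset" in rec:
            profile.assets.add(rec["asset"])
    return profiles


# Hypothetical sample data: a network login and a badge ID map to the same employee.
identity_map = {"jdoe": "E100", "badge-4471": "E100", "asmith": "E200"}
records = [
    {"identity": "jdoe", "role": "system_operator", "asset": "email"},
    {"identity": "badge-4471", "location": "control_center", "asset": "scada"},
    {"identity": "asmith", "location": "office", "role": "analyst", "asset": "email"},
    {"identity": "unknown-svc", "asset": "scada"},
]
profiles = build_profiles(records, identity_map)
for employee_id, profile in sorted(profiles.items()):
    print(employee_id, sorted(profile.identities), profile.criticality())
```

In this sketch the criticality is computed from the merged profile rather than from any single record, so an employee whose badge and network identities touch different high-value assets is scored on the combined picture.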
Through a partnership with cybersecurity analysts, we incorporated human expertise alongside an ensemble machine learning model with a human-in-the-loop retraining component to develop a model that both detects abnormality and attributes risk to patterns of behavior. While the work is still ongoing, this AI solution, built with Python and PySpark along with several big data technologies available through the Cloudera Data Hub Platform, gives cybersecurity analysts insight into long-term patterns of behavior, and it both prioritizes and reduces the activity to investigate, down from tens of thousands of events. This solution not only takes the necessary steps to relieve “security analyst burnout,” a syndrome that commonly plagues the cybersecurity industry and often results from analysts having to manually respond to hundreds of false-positive threats daily, but also empowers them to focus on the infrequent yet noteworthy events that would otherwise have been missed.
Bio: Danielle Aring is an IT Security Data Engineer IV with the Transmission Security Operations Center (TSOC) at FirstEnergy. In her role, she is responsible for the design, development, implementation, and maintenance of IT security equipment and software. Danielle holds a master’s in Computer Information Science from Cleveland State University. With an extensive background in software engineering and expertise in machine learning, Danielle is guiding the transition of the TSOC away from reactive, rules-based threat detection to preventative, predictive, threat-hunting approaches. She built her organization’s security data lake in Hadoop from the ground up and developed several large-scale data pipelines for near-real-time security log ingestion, along with alerting, monitoring, and metrics. Danielle is passionate about cybersecurity educational awareness and innovative applications of AI/ML to the changing threat landscape.