OpenSOC was a project that always interested me. In December 2015 it became Apache Metron when it was accepted into Apache Incubation. The Metron team builds an extensible, open architecture to account for the variety of tools used in customer environments (thousands of firewalls, thousands of domains and a multitude of Intrusion Detection Systems). In April 2016, Metron 0.1 was released.
To quote the site, "Apache Metron (Incubating) provides a scalable advanced security analytics framework built with the Hadoop Community evolving from the Cisco OpenSOC Project. A cyber security application framework that provides organizations the ability to detect cyber anomalies and enable organizations to rapidly respond to identified anomalies."
Look at the site:
Basic idea for data correlation and enrichment:
Lately I have been working with researchers on security tool development. I found the following breakdown of the benefits an interesting contrast:
Metron Benefits. SOC Analyst & Investigator Perspective
I am going to take directly from the Metron documentation:
Apache Metron provides key capabilities not found in traditional security tools:
- Looking through Alerts
- Centralized Alerts Console - Having a centralized dashboard for alerts and the telemetry events associated with the alert across all security data sources in your enterprise is a powerful feature within Metron that prevents the Analyst from jumping from one console to another.
- Meta Alerts - The long term vision of Metron is to provide a suite of analytical models and packs including Alerts Relevancy Engine and Meta-Alerts. Meta Alerts are generated by groupings or analytics models and provide a mechanism to shield the end user from being inundated with 1000s of granular alerts.
- Alerts labeled with threat intel data - Viewing alerts labeled with threat intel from third party feeds allows the analyst to decipher more quickly which alerts are legitimate vs false positives.
- Collecting Contextual data
- Fully enriched messages - Analysts spend a lot of time manually enriching the raw alerts or events. With Metron, analysts work with the fully enriched message.
- Single Pane of Glass UI - Single pane of glass that not only has all alerts across different security data sources but also the same view that provides the enriched data
- Centralized real-time search - All alerts and telemetry events are indexed in real-time. Hence, the analyst has immediate access to search for all events.
- All logs in one place - All events with the enrichments and labels are stored in a single repository.
- Granular access to PCAP - After identifying a legitimate threat, more advanced SOC investigators want the ability to download the raw packet data that caused the alert. Metron provides this capability.
- Replay old PCAP against new signatures - Metron can be configured to store raw pcap data in Hadoop for a configurable period of time. This corpus of pcap data can then be replayed to test new analytical models and new signatures.
- Tag Behavior for modeling by data scientists
- Raw messages used as evidentiary store
- Asset inventory and User Identity as enrichment sources.
Note that the above 3 steps in the analyst workflow make up approximately 70% of the time. Metron will drastically decrease the time spent on the analyst workflow because everything the SOC analyst needs to know is in a single place.
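To make the enrichment and threat intel labeling idea concrete for myself, here is a minimal Python sketch. It is not Metron code or the Metron API: the lookup tables, field names (`ip_src_addr`, `ip_dst_addr`, `is_alert`, `asset.*`, `threat_intel.source`) and sample values are all assumptions chosen for illustration.

```python
# Hypothetical lookup tables. A real deployment would back these with
# external stores and third party feeds, not in-memory dicts.
ASSET_INVENTORY = {
    "10.0.0.5": {"hostname": "hr-laptop-17", "owner": "jdoe"},
}
THREAT_INTEL = {
    "203.0.113.9": "example-feed: known C2 server",
}


def enrich(event):
    """Return a copy of the event with asset and threat intel fields added."""
    enriched = dict(event)  # never mutate the raw event

    # Asset inventory / user identity enrichment, keyed on the source address.
    asset = ASSET_INVENTORY.get(event.get("ip_src_addr"))
    if asset:
        enriched["asset.hostname"] = asset["hostname"]
        enriched["asset.owner"] = asset["owner"]

    # Threat intel labeling, keyed on the destination address.
    hit = THREAT_INTEL.get(event.get("ip_dst_addr"))
    enriched["is_alert"] = hit is not None
    if hit:
        enriched["threat_intel.source"] = hit

    return enriched


if __name__ == "__main__":
    raw = {"ip_src_addr": "10.0.0.5", "ip_dst_addr": "203.0.113.9"}
    print(enrich(raw))
```

The point of the sketch is that the analyst sees one message carrying the raw fields, the asset context and the threat intel label together, instead of looking each of them up by hand.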
From the researcher's POV:
Data Scientist Perspective
To quote once more from the Metron documentation on the key capabilities the project offers:
- Finding the Data
- All my data is in the same place - One of the biggest challenges faced by security data scientists is to find the data required to train and evaluate the score models. Metron provides a single repository where the enterprise’s security telemetry data are stored.
- Data exposed through a variety of APIs - The Metron security vault/repository provides different engines to access and work with the data including SQL, scripting languages, in-memory, java, scala, key-value columnar, REST APIs, User Portals, etc..
- Standard Access Control Policies - All data stored in the Metron security vault is secured via Apache Ranger through access policies at a file system level (HDFS) and at processing engine level (Spark, Hive, HBase, Solr, etc..)
- Cleaning the data
- Metron normalizes telemetry events - As discussed in the first blog where we traced an event being processed by the platform, Metron normalizes all telemetry data into at least a standard 7 tuple json structure allowing data scientists to find and correlate data together more easily.
- Partial schema validation on ingest - The Metron framework will validate data on ingest and will filter out bad data automatically, which is something that data scientists traditionally spend a lot of time doing.
- Munging Data
- Automatic data enrichment - Typically data scientists have to manually enrich data to create and test features or have to work with the data/platform team to do so. With Metron, events are enriched in real-time as they come in and the enriched event is stored in the Metron security vault.
- Automatic application of class labels - Different types of metadata (threat intel information, etc…) is tagged on to the event which allows the data scientists to create feature matrixes for models more easily.
- Massively parallel computation framework - All the cleaning and munging of the data is using distributed technologies that allows the processing of these high velocity/ large volumes to be performant and scalable.
- Visualizing Data
- Real-time search + UI - Metron indexes all events and alerts and provides UI dashboard to perform real-time search.
- Apache Zeppelin Dashboards - Out of the box Zeppelin dashboards will be available that can be used by SOC analysts. With Zeppelin you can share the dashboards, substitute variables, and can quickly change graph types. An example of a dashboard would be to show all HTTP calls that resulted in 404 errors, visualized as a bar graph ordered by the number of failures.
- Integration with Jupyter - Jupyter notebooks will be provided to data scientists for common tasks such as exploration, visualization, plotting, evaluating features, etc..
Note that the above 4 steps in the data science workflow make up approximately 80% of the time. Metron will drastically reduce the time from hypothesis to model for the data scientist.
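Here is a rough Python sketch of two of the ideas above: partial schema validation on ingest, and the example dashboard query that counts HTTP 404 errors ordered by number of failures. This is my own illustration, not Metron's implementation. The seven field names in `REQUIRED_FIELDS` are my assumption of what a "standard 7 tuple" could look like, and `status_code` and both function names are hypothetical.

```python
from collections import Counter

# Assumed 7-tuple of normalized fields; the quoted text does not list them.
REQUIRED_FIELDS = (
    "ip_src_addr", "ip_dst_addr", "ip_src_port", "ip_dst_port",
    "protocol", "timestamp", "original_string",
)


def is_valid(event):
    """Partial validation: every required field must be present and non-null."""
    return all(event.get(field) is not None for field in REQUIRED_FIELDS)


def count_404s_by_host(events):
    """Return (destination, count) pairs for valid HTTP 404 events,
    ordered from most failures to fewest."""
    counts = Counter(
        event["ip_dst_addr"]
        for event in events
        if is_valid(event) and event.get("status_code") == 404
    )
    return counts.most_common()
```

The list returned by `count_404s_by_host` is already in the order a bar graph would want, and events that fail validation never reach the count.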
A final image of the overall architecture:
There is much work that needs to be done, but it is interesting to look through and keep an eye on its development.
If you have access to Safari books online, take a look at “Achieving Real-time Ingestion and Analysis of Security Events through Kafka and Metron” by Kevin Mao:
Kevin works for Capital One and presented at Strata+Hadoop World. He describes how Capital One has about 40 data feeds producing about 5 billion events a day, with a peak of 75k events/sec and about 5T a day. Capital One initially went with commercial vendors, but had issues scaling past 30 – 60 days. They needed something more scalable. Kevin mentioned they tried Splunk as part of the SIEM solution. He says that Splunk has a great UI experience and that a lot of their analysts loved it, but that it was prohibitively expensive to store that much data. I am hearing that a lot lately. They created Purplerain. The name came about because executives were referring to the work as “the project formerly known as” … Here are the major components:
Jumping forward to Kevin's slide on Metron:
Capital One's more cost effective solution was using the ELK stack. Do check out Kevin's talk. It was interesting to hear from a company that has started using Metron.
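As a quick sanity check on the Capital One figures from the talk, here is some back-of-the-envelope arithmetic in Python. I am assuming "5T a day" means 5 terabytes in decimal units (10^12 bytes); the talk notes above do not say so explicitly.

```python
SECONDS_PER_DAY = 24 * 60 * 60

events_per_day = 5_000_000_000      # "about 5 billion events a day"
peak_events_per_sec = 75_000        # "peak 75k events/sec"
bytes_per_day = 5 * 10**12          # assumption: "5T" = 5 TB, decimal

# Average ingest rate over a full day.
avg_events_per_sec = events_per_day / SECONDS_PER_DAY

# How much higher the peak is than the daily average.
peak_to_avg = peak_events_per_sec / avg_events_per_sec

# Average size of a single event.
bytes_per_event = bytes_per_day / events_per_day

print(f"average rate: {avg_events_per_sec:,.0f} events/sec")
print(f"peak / average: {peak_to_avg:.2f}")
print(f"average event size: {bytes_per_event:,.0f} bytes")
```

Under that assumption the numbers hang together: roughly 58k events/sec on average, a peak about 1.3 times the average, and around 1 KB per event.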