Austin Leahy edited this page Dec 14, 2016 · 1 revision

Problem Statement

Work done on Spot has accomplished incredible things, but operationalizing the process is still a daunting task. Most products in the information security space share this struggle. Bluntly, no solution in the marketplace today provides fast access to stores of InfoSec data at scale. Security data sources lean strongly toward unstructured document or log formats. This creates a natural operational barrier, because incident response is about joining disparate data points, an area where document and log stores tend to fall down.

Operators in the field might argue that products like Splunk or ArcSight overcome these challenges well through their linking and search capabilities. Such arguments commit the fallacy of "argument from incredulity": being unable to imagine something better does not mean nothing better is possible. Incident response teams, even the best ones, live in a technological era that often requires submitting queries that take a long time to complete. Just as the field of statistics has moved over the last 10 years from platforms like SPSS to programming languages like Python and R for manipulating and analyzing data, InfoSec is on the cusp of a paradigm shift.

[talk about ODM]

The answer to this challenge is graph. From a security standpoint, graph databases provide the best of both worlds. Each node or edge in the graph is simply a stored document that keeps the flexibility of a NoSQL-like object, while the schema and edge connections provide clear, traversable relationships.
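To make the "document plus traversable relationship" idea concrete, here is a minimal in-memory sketch. It is an illustration only, not Spot's design or any real graph database's API: the PropertyGraph class, its method names, the node ids, and the edge labels are all hypothetical, and the sample records are made up.

```python
from collections import defaultdict, deque


class PropertyGraph:
    """Toy property graph (hypothetical): nodes and edges are free-form documents."""

    def __init__(self):
        self.nodes = {}                     # node id -> document (plain dict)
        self.out_edges = defaultdict(list)  # node id -> [(label, target id, edge document)]

    def add_node(self, node_id, **doc):
        # Any fields may be stored; there is no fixed column layout.
        self.nodes[node_id] = doc

    def add_edge(self, src, label, dst, **doc):
        # The edge carries its own document and links two nodes explicitly.
        self.out_edges[src].append((label, dst, doc))

    def reachable(self, start, max_hops):
        """Breadth-first traversal: ids reachable from start within max_hops edges."""
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node_id, hops = queue.popleft()
            if hops == max_hops:
                continue
            for _label, dst, _doc in self.out_edges.get(node_id, ()):
                if dst not in seen:
                    seen.add(dst)
                    queue.append((dst, hops + 1))
        seen.discard(start)
        return seen


# Made-up sample data: a host, a domain it queried, and the IP the domain resolved to.
g = PropertyGraph()
g.add_node("host:10.0.0.5", kind="host", os="linux")
g.add_node("domain:evil.example", kind="domain", first_seen="2016-12-01")
g.add_node("ip:203.0.113.7", kind="ip", asn=64500)
g.add_edge("host:10.0.0.5", "queried", "domain:evil.example", count=42)
g.add_edge("domain:evil.example", "resolved_to", "ip:203.0.113.7")

one_hop = g.reachable("host:10.0.0.5", max_hops=1)
two_hops = g.reachable("host:10.0.0.5", max_hops=2)
```

Each node keeps whatever fields its source provided, while the question "what is this host connected to within two hops" becomes a traversal rather than a join across separate log stores. A sketch like this says nothing about scale, which is the problem the rest of this proposal addresses.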

None of this is new or shocking. Graph databases have existed in a niche corner of the market for years and have ultimately seen limited adoption. The problem with most graph platforms is that they simply cannot perform at petabyte scale.

This proposal argues that the conditions and timing are now right to create a petabyte-scale, open source graph platform built on Hadoop that serves the operational needs of Apache Spot and the cybersecurity industry as a whole.
