One of the best outcomes of my recent experiences at BlackHat and DEFCON was becoming more aware of the greater data-driven community and their opinions on, and experiences with, handling data.
As such, I was exposed to the ideas of some of our risk luminaries, and have been catching up on a lot of very good blog posts. Two recent posts in particular caught my eye, both of them seemingly inspired by a very long discussion on the SIRA mailing list, which I was not a part of until after the discussion had ended.
Both posts deal with the role of different kinds of data in the greater analysis picture of Information Security, and my personal belief lies somewhere in the intersection of the two.
Alex’s post approaches it in a more econometric and philosophical way, layering the kinds of data by their level of strategic usefulness, from packet data to log data, information risk data and finally operational risk data, in a fair comparison to micro and macro economics.
Allison’s is much more pragmatic. It sets out the differences (and demonstrates the similarities) between big data storage and processing technologies and our more traditional SIEM relational database infrastructure, and shows how the same techniques, or at the very least the same data-driven concepts, could be applied to both, in a fair comparison to errr… fruit salad.
And both reach the exact same conclusion in very different ways: it is all data, and it can help your security objectives through data analysis and probabilistic techniques.
I have made this point over and over again in face-to-face meetings and in my talks: few individuals and organizations in InfoSec are embracing the new capabilities made available in our era of almost infinite storage and incredible computing power. We are always very late to the party on technological and procedural advances because of a misguided belief that “InfoSec is different”, and then we are dragged kicking and screaming to the new reality by sub-optimal vendor interpretations of what we supposedly need.
The main reason I chose to begin the development of MLSec Project on “simple” SIEM and Log Management data is that every single organization has a lot of this data lying around. If quick wins can be demonstrated with this kind of data, maybe we can awaken the curiosity and the appetite of those organizations so that bigger questions can then be asked and answered satisfactorily.
The truth is that as much as I would love to tackle the more strategic and broad problems first, the necessary data is just not there. We are still taking such a beating at the lower levels of tactical, reaction-based security that there is sadly no time to build a shelter from the rain and attempt more noble and holistic work.
Maybe we can help turn this tide a bit. Let me know what you think in the comments.