The use of Big Data combined with security analytics has become a key weapon in the fight against cybercrime. Conventional detection mechanisms have failed due to the increase in the number of ‘unknown’ attacks. Let us look at some examples to understand how Big Data combined with analytical models can help solve this challenge.
Detecting the unknown
An easy way to understand the need for Big Data and analytics is to look at the evolution of viruses and malware, and of the methods used to detect them. Until APT-related malware became the cornerstone of serious attacks, malware detection was based on signatures. At that time most threats were ‘knowns,’ so solutions could detect and identify existing patterns using a signature. Since the advent of APTs (advanced persistent threats) and the use of sophisticated malware to penetrate large enterprises and government organisations, the ‘unknowns’ have become dominant.
Today, malware and other complex attacks are so sophisticated that trying to detect them through known patterns no longer works. The availability of crypters and packers to generate custom code, along with the growth of memory-based attacks, has made signature-based detection obsolete. When signatures failed, alternative approaches emerged: analysts employed heuristic technologies to detect malware using the “sandbox” testing method, in which a suspect file is executed in an isolated environment and its behaviour is observed. This approach was quite successful until malware creators started writing code that recognises a sandbox and stops the malware from executing in it.
The next step is to utilise Big Data. As an example, infected end points could be sending beaconing traffic back to command and control servers that are not yet on any known blacklist. Evidence of this traffic is available in proxy logs, but it is difficult to detect using signatures, since well-written malware does not beacon in a known pattern that a signature can capture. Given the sheer volume of proxy logs, it can instead be detected using clustering techniques running on a Big Data platform.
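To make the beaconing idea concrete, here is a minimal single-machine sketch in Python. It assumes, purely for illustration, that proxy log entries have already been reduced to (timestamp in seconds, client, destination) tuples. The function names, thresholds and sample hosts are all hypothetical, and the clustering used is a simple one-dimensional single-linkage grouping of the gaps between requests, chosen for brevity rather than as the technique any particular product uses. A pair is flagged when most of its request gaps fall into one tight cluster, which is what regular check-ins to a command and control server tend to look like.

```python
from collections import defaultdict


def cluster_1d(values, tolerance):
    """Single-linkage clustering in one dimension: sorted values stay in the
    same cluster while neighbouring values differ by at most `tolerance`."""
    clusters = []
    for v in sorted(values):
        if clusters and v - clusters[-1][-1] <= tolerance:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters


def find_beacons(events, tolerance=5.0, min_requests=6, min_share=0.8):
    """Return {(client, destination): typical_interval_seconds} for pairs
    whose request gaps mostly fall into a single tight cluster.

    `events` is an iterable of (timestamp_seconds, client, destination).
    All thresholds are illustrative assumptions, not tuned values.
    """
    times = defaultdict(list)
    for ts, client, dest in events:
        times[(client, dest)].append(ts)

    suspects = {}
    for pair, stamps in times.items():
        if len(stamps) < min_requests:
            continue  # too few requests to judge regularity
        stamps.sort()
        intervals = [b - a for a, b in zip(stamps, stamps[1:])]
        largest = max(cluster_1d(intervals, tolerance), key=len)
        if len(largest) / len(intervals) >= min_share:
            suspects[pair] = sum(largest) / len(largest)
    return suspects


# Hypothetical sample: one host checking in roughly every 300 seconds,
# and one host with irregular, human-looking browsing.
sample_events = [(ts, "10.0.0.7", "cdn-sync.example")
                 for ts in (0, 301, 599, 900, 1202, 1500, 1799)]
sample_events += [(ts, "10.0.0.9", "news.example")
                  for ts in (10, 25, 400, 410, 2000, 2600, 2630)]

suspects = find_beacons(sample_events)
for (client, dest), period in suspects.items():
    print(f"{client} -> {dest}: about every {period:.0f} s")
```

In this sample only the first host is flagged. At enterprise scale the same grouping and interval logic would have to run on a distributed platform, and legitimate periodic traffic such as software update checks would need to be filtered out before an analyst sees the results.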
Another example is to detect advanced malware by analysing process activity and other events observed on end points. This kind of data set is voluminous in any enterprise environment. In this approach, the behaviour of each end point is compared with that of its peers, and deviations from the peer group point to possible malware. This method calls for quick analysis of huge volumes of data, as well as analytical models that baseline normal patterns on end points so that unusual patterns stand out. The use of Big Data technologies combined with end point analytics enables the quick identification of any deviation and therefore of affected systems.
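The peer comparison can be illustrated with a small Python sketch. It assumes a single hypothetical metric per end point (here, the number of previously unseen processes launched in a day) and scores each host against its peer group using the median and the median absolute deviation, a robust alternative to the mean and standard deviation. The metric, host names and threshold are assumptions made for the example; a real deployment would combine many such features.

```python
from statistics import median


def peer_outliers(metric_by_host, threshold=3.5):
    """Return {host: score} for hosts whose metric deviates strongly from
    the peer group, using a robust z-score based on the median and the
    median absolute deviation (MAD). The threshold is an assumption."""
    values = list(metric_by_host.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return {}  # peers are near-identical; this simple score cannot rank them

    outliers = {}
    for host, value in metric_by_host.items():
        score = 0.6745 * (value - med) / mad
        if abs(score) > threshold:
            outliers[host] = score
    return outliers


# Hypothetical peer group of similar workstations.
new_processes_per_day = {
    "ws-01": 12, "ws-02": 14, "ws-03": 11,
    "ws-04": 13, "ws-05": 12, "ws-06": 95,
}

outliers = peer_outliers(new_processes_per_day)
for host, score in outliers.items():
    print(f"{host} deviates from its peers (robust z-score {score:.1f})")
```

Only ws-06 is reported in this sample. The median-based score is used because a single badly infected machine would otherwise inflate the mean and standard deviation of the group and partly hide itself.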
Lateral movement of attackers or malware is another use case that requires analytics on large volumes of data. Once an organisation has been infiltrated, the malware moves laterally from system to system until it reaches the crown jewels. Once the attackers get what they want, data exfiltration starts from inside the organisation to external cybercrime syndicates. To detect lateral movement, analysts therefore have to pick out hundreds or thousands of lateral movement probes from among billions of other events. This is inherently difficult, and it differs from conventional attack detection, which relies on identifying a high volume of attacks against a small number of data centre assets. It requires processing large volumes of events and identifying specific behaviour anomalies in the environment, which in turn means baselining normal user and machine access and detecting deviations from that baseline. This is an area that requires not only Big Data but also machine learning applied over it.
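The baselining step can be sketched in a few lines of Python. This example is deliberately simpler than the machine learning approach described above: it just records which destinations each source machine has accessed historically, then flags sources that reach several never-before-seen destinations in a new time window. The event format, host names and the `min_new` threshold are hypothetical.

```python
from collections import defaultdict


def build_baseline(history):
    """Map each source to the set of destinations it has accessed before.
    `history` is an iterable of (source, destination) access events."""
    baseline = defaultdict(set)
    for src, dst in history:
        baseline[src].add(dst)
    return baseline


def new_destinations(baseline, window_events, min_new=3):
    """Return {source: [new destinations]} for sources that reached at
    least `min_new` destinations absent from their baseline."""
    novel = defaultdict(set)
    for src, dst in window_events:
        if dst not in baseline.get(src, set()):
            novel[src].add(dst)
    return {src: sorted(dsts) for src, dsts in novel.items()
            if len(dsts) >= min_new}


# Hypothetical access history and a later observation window.
history = [
    ("ws-17", "file-01"), ("ws-17", "mail-01"),
    ("ws-22", "file-01"), ("ws-22", "print-01"),
]
window = [
    ("ws-17", "file-01"), ("ws-17", "db-03"), ("ws-17", "db-04"),
    ("ws-17", "hr-01"), ("ws-17", "dc-01"),
    ("ws-22", "file-01"), ("ws-22", "mail-01"),
]

alerts = new_destinations(build_baseline(history), window)
for src, dsts in alerts.items():
    print(f"{src} accessed {len(dsts)} new systems: {', '.join(dsts)}")
```

Here ws-17 is flagged for fanning out to four new systems, while ws-22 is not, because a single new destination is treated as ordinary drift. A learned model would replace the fixed threshold with one that accounts for role, time of day and how quickly each user's access pattern normally changes.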
These are only a few examples. Many other use cases require analytics applied to large data sets to improve the detection of unknown attacks.
Moving to a new paradigm with Big Data and security analytics
The last two to three years have seen profound changes in the attack methodology of cybercrime syndicates. They have moved from directly targeting high value assets to indirect stealth attacks that are inherently difficult to detect. The information security industry is also evolving to detect and respond to such high impact attacks. Big Data combined with security analytics models now plays a key role in the detection of such attacks. The emphasis is no longer on the prevention of attacks so much as on quick detection and containment of the breach. In this new paradigm, Big Data security analytics leads the way in prediction of attacks, proactive detection of breaches, and quick containment.