Vetting Threat Intelligence
If your organization has ever gotten an alert for “facebook.com” because someone didn’t vet indicators of compromise (IOCs) properly somewhere along the chain between threat intelligence generation by a third party and consumption by your security infrastructure, you’re not alone.
Facebook might be an extreme example, but it’s common for organizations to alert on, say, web hosting providers because they host malicious phishing sites, only to receive false positives when employees visit legitimate blogs on those same services.
At the time of writing, Pulsedive was showing 824 phishing sites hosted on sites.google.com, but unless you want a ton of false positives, you should absolutely not alert on the sites.google.com domain.
Ever seen alerts for a Chinese ISP because some malware activity was observed coming from there a few months ago? Or for a Russian social networking site like vk.com because someone shared a link to a malicious file in a spam email? Or for a Pakistani news site because it’s hosted in another country? All of these are bad IOCs, and they may well have come from threat intelligence feeds or mailing lists your organization is paying for.
Below are some practical suggestions for vetting IOCs from existing threat intelligence sources. I’ll mainly use network IOCs as examples, but these points should apply to all types of threat intelligence.
Vetting Process
As a general rule, if your organization has the resources, all of your threat intelligence should be vetted internally so your analysts don’t waste time responding to bad alerts later; both paid and free threat intelligence feeds have inaccuracies that will trigger false positives. Pulsedive is useful in that it performs this vetting automatically upon ingestion, but if you’re stuck with entirely manual processes or solutions that don’t vet threat intelligence, you can do the following.
- Script a pass-through of IOC lists against a whitelist and filter known-clean entries out (see the sketches after this list). You can use the first 1,000 or so domains from the Alexa/Cisco Umbrella top million sites for domains, and the NIST National Software Reference Library (NSRL) for hashes. It’s also wise to filter out reserved IPs.
- Search lists of new IOCs against your traffic for the past week or so. Investigate common hits and remove those that turn out to be normal traffic. If you spot malicious hits among normal activity, replace the existing IOC with one that isolates the malicious activity, such as a query string, URL path, or file hash.
- Run test alerts with the new IOCs or push them into production, depending on how your process works; in either case, routinely investigate false positives and add them to a whitelist to prevent wasted time.
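To make the first step concrete, here’s a minimal sketch of a whitelist pass-through in Python. The file names (`top-1m.csv`, `iocs.txt`) and their formats (rank,domain CSV; one IOC per line) are assumptions, so substitute whatever your feeds actually deliver.

```python
import csv
import ipaddress

def load_top_domains(path="top-1m.csv", limit=1000):
    """Load the first `limit` domains from a rank,domain CSV (Alexa/Umbrella format)."""
    with open(path, newline="") as f:
        return {row[1].strip().lower() for row, _ in zip(csv.reader(f), range(limit))}

def is_reserved_ip(ioc):
    """True if the IOC parses as an IP in a private/reserved/special-use range."""
    try:
        ip = ipaddress.ip_address(ioc)
    except ValueError:
        return False  # not an IP, so nothing to filter here
    return (ip.is_private or ip.is_reserved or ip.is_loopback
            or ip.is_link_local or ip.is_multicast)

def vet_iocs(raw_iocs, top_domains):
    """Yield only the IOCs that survive the whitelist and reserved-IP filters."""
    for ioc in raw_iocs:
        ioc = ioc.strip().lower()
        if not ioc or ioc in top_domains or is_reserved_ip(ioc):
            continue  # known clean or reserved: drop it
        yield ioc

if __name__ == "__main__":
    top = load_top_domains()
    with open("iocs.txt") as f:  # assumed: one IOC per line
        for ioc in vet_iocs(f, top):
            print(ioc)
```

The `ipaddress` checks cover private, loopback, link-local, multicast, and otherwise reserved ranges, none of which should ever appear as network IOCs.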
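And for the second step, a toy version of the traffic search, assuming a plain-text proxy log; in practice most teams would run the equivalent query in a SIEM, but the idea is the same.

```python
from collections import defaultdict

def find_hits(iocs, log_path="proxy.log"):
    """Return {ioc: [matching log lines]} for a week's worth of traffic logs."""
    hits = defaultdict(list)
    with open(log_path, errors="replace") as log:
        for line in log:
            for ioc in iocs:
                if ioc in line:  # naive substring match; fine for a first pass
                    hits[ioc].append(line.rstrip())
    return hits

# IOCs that hit very often usually turn out to be normal traffic: investigate those first.
hits = find_hits(["198.51.100.7", "bad-domain.example"])  # placeholder IOCs
for ioc, lines in sorted(hits.items(), key=lambda kv: -len(kv[1])):
    print(f"{ioc}: {len(lines)} hits")
```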
I should note that whitelisting with the Alexa/Cisco Umbrella top million list can be a little tricky and controversial. If your IOC is a domain, using the list is probably fine. If the IOC is a URL and its domain is on the list, keep the full URL as the IOC, not just the domain.
There is also the question of compromised sites on the list; in 2013, for instance, php.net was compromised, and the domain appears at #44691 on the Cisco Umbrella list. In my opinion, a high-profile, popular site that was temporarily compromised does not make a good IOC; a downloaded file hash or redirect URL should be used instead.
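That domain-versus-URL decision is easy to encode. The sketch below is one way to do it; the helper name, toy whitelist, and redirect URL are mine, purely for illustration.

```python
from urllib.parse import urlparse

def keep_as_ioc(ioc, top_domains):
    """Drop bare domains that appear on the top-sites list, but keep full URLs
    even when their domain is whitelisted (e.g. a phishing page on php.net)."""
    parsed = urlparse(ioc if "://" in ioc else f"//{ioc}")
    if parsed.path and parsed.path != "/":
        return True  # it's a URL: the path makes it specific enough to keep
    return (parsed.hostname or "").lower() not in top_domains

top_domains = {"php.net", "sites.google.com"}  # toy whitelist
print(keep_as_ioc("php.net", top_domains))                            # False: whitelisted domain
print(keep_as_ioc("http://php.net/some-redirect.html", top_domains))  # True: full URL stays an IOC
```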
If you’re interested in a solution that automatically vets open source threat intelligence and provides a downloadable export of high- and critical-risk IOCs, Pulsedive can deliver.
Good IOCs
Unique to malicious activity
A good IOC triggers only on malicious activity. In organizations that require documentation for incidents created from alerts, false positives can be a huge time waster. If an IOC triggers on both normal and malicious activity, try to create an IOC that isolates the malicious activity, as mentioned above.
Less specific
It’s much more practical to add a malicious domain than it is to add 30 malicious URLs, given the option. The more specific the IOC, the greater the chance you’ll miss something and the more threat intelligence you’ll have to maintain.
Not too general either
Blocking an entire ISP, top-level domain, or country is probably a bad idea. One of your employees could have innocent ties to a foreign country, or servers for clean domains could be hosted there. An employee might also run a side project from home, want to check in on their home security system, or RDP into a home server to retrieve a file, so I would not recommend alerting on traffic to ISP customer IP ranges either.
Blocking a netblock or small IP range where malicious activity has been observed is preferable, as botnets and other threats may rotate through IPs within the same range.
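Python’s standard `ipaddress` module makes such range checks trivial. The /28 below is carved from a documentation range (TEST-NET-2) purely as a placeholder, not a real indicator.

```python
import ipaddress

# Example netblock where malicious activity was observed (placeholder range).
BAD_RANGE = ipaddress.ip_network("198.51.100.0/28")

def in_bad_range(ip_string):
    """Check whether an observed IP falls inside the flagged netblock."""
    try:
        return ipaddress.ip_address(ip_string) in BAD_RANGE
    except ValueError:
        return False  # not a valid IP

print(in_bad_range("198.51.100.7"))   # True: flagged even if never individually reported
print(in_bad_range("198.51.100.99"))  # False: outside the /28
```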
Regarding host-based threat intelligence, you may want to avoid adding generic filenames and instead store hashes or other IOCs specific to the file in question. ssdeep (fuzzy) hashes, where available, can be used to compare similar files, and may thus remain effective against different versions of the same strain of malware.
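If the `ssdeep` Python bindings are available (they wrap the fuzzy-hashing C library), a comparison might look like the sketch below; the file names and the 70-point threshold are placeholders.

```python
import ssdeep  # pip install ssdeep (requires the fuzzy hashing C library)

# Placeholder paths for two versions of the same malware sample.
h1 = ssdeep.hash_from_file("sample_v1.bin")
h2 = ssdeep.hash_from_file("sample_v2.bin")

# compare() returns a 0-100 similarity score; a high score suggests the files
# are variants of one another even though their exact hashes (MD5/SHA-256) differ.
score = ssdeep.compare(h1, h2)
if score >= 70:  # threshold is a judgment call; tune it for your environment
    print(f"Likely variants of the same malware (ssdeep score: {score})")
```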
Relevant for longer
IPs and domains can change very often (see: domain generation algorithms and fast flux), as can file hashes for different versions of self-updating malware. That’s not to say these types of IOCs aren’t useful; phishing sites, for example, need to stay online longer to gather more visitors and thus potential victims.
When given the option, though, prefer IOCs that remain relevant for longer. HTTP headers, for example, can stay the same across changing domains or IPs, as the back-end technologies of the threat infrastructure are unlikely to change as often. The WHOIS registrant may also be shared among several domains and can help identify a threat actor or group. Likewise, the same SSL certificate may be present on several malicious hosts within a single threat infrastructure.
All of these are searchable in Pulsedive and can help correlate threats or IOCs. If we take a look at Zeus activity, we can see that the X-Adblock-Key header and value are shared by several indicators and could be used to block IOCs that haven’t even been observed yet. Searching for that header in Pulsedive reveals that 22 IOCs have that HTTP header and value, but only 5 are linked with Zeus through open source feeds.
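As a hypothetical sketch, here’s how you might check a live host’s response headers against a shared value like that, using Python’s `requests` library; the value shown is a placeholder, not a real Zeus signature.

```python
import requests

HEADER = "X-Adblock-Key"
KNOWN_BAD_VALUES = {"PLACEHOLDER_SHARED_KEY"}  # placeholder, not a real indicator

def shares_bad_header(url):
    """Flag hosts whose responses carry a header value tied to known-bad infrastructure."""
    try:
        resp = requests.get(url, timeout=5)
    except requests.RequestException:
        return False  # unreachable hosts are simply skipped
    return resp.headers.get(HEADER) in KNOWN_BAD_VALUES

if shares_bad_header("http://suspicious.example"):
    print("Host shares an X-Adblock-Key value with known Zeus infrastructure")
```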
Conclusion
Faulty threat intelligence can be frustrating for any analyst, but proper vetting can significantly reduce false positives and, by extension, save time for your organization. Pulsedive can help with vetting, and the techniques above can be implemented in any organization to help strengthen its security posture.
If you have suggestions or feedback, please let me know! You can tweet at @pulsedive, email me at dan@pulsedive.com, or send feedback through the feedback form on our website.