What is threat hunting, and how to hunt cybercriminals correctly
Threat hunting (TH) is the proactive search for signs of compromise or malware activity that standard security tools have not detected. This article covers how the process works, which tools you can use to search for threats, and what to keep in mind when forming and testing hypotheses.
What is threat hunting and why is it needed
In threat hunting, the analyst does not wait for security sensors to fire but deliberately searches for signs of compromise. To do this, the analyst forms and tests assumptions about how attackers could have penetrated the network. Such checks should be systematic and regular.
A proper implementation of the process follows these principles:
- Assume that the system has already been compromised. The main goal is to find signs of the intrusion.
- To search, you need a hypothesis about exactly how the system was compromised.
- Search iteratively: after testing one hypothesis, the analyst puts forward a new one and continues the search.
Traditional automated security tools often miss complex targeted attacks. One reason is that such attacks are often spread out over time, so security tools cannot correlate their separate phases. At the same time, attackers carefully plan their penetration vectors and work out how they will act inside the infrastructure. This lets them avoid actions that would expose them and pass off their activity as legitimate. Attackers are also constantly improving their skills and buying or developing new tools.
Detecting targeted attacks is especially relevant for organizations that have been compromised before. According to the FireEye M-Trends report, 64% of previously compromised organizations were attacked again. In other words, more than half of the organizations that have been hacked remain at risk. Measures for early detection of compromise are therefore needed, and TH is one way to provide them.
Threat hunting helps security professionals reduce the time it takes to detect a compromise and keep their knowledge of the protected infrastructure up to date. TH also pairs well with threat intelligence (TI), especially when TI indicators are used to form hypotheses.
How to formulate hypotheses for testing
Because TH assumes from the outset that the attacker has already penetrated the infrastructure, the first task is to narrow down where to look for traces of the compromise. You can do this by forming a hypothesis about how the penetration occurred and what evidence of it should exist in the infrastructure. Having formulated a hypothesis, the analyst tests it. If the hypothesis is not confirmed, the analyst moves on to developing and testing a new one. If testing reveals traces of compromise or the presence of malware, an investigation begins.
Figure 2. Schematic of the threat hunting process
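The cycle of forming a hypothesis, testing it, and either moving on or starting an investigation can be sketched in a few lines of Python. This is only an illustration: the `Hypothesis` class and the `hunt` function are hypothetical names, and in practice each check would be a query against a SIEM or traffic analysis system rather than a local function.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class Hypothesis:
    """A hunting hypothesis plus a way to test it (hypothetical structure)."""
    description: str
    # check() returns the evidence found; an empty list means "not confirmed".
    check: Callable[[], List[str]]


def hunt(hypotheses: Iterable[Hypothesis]) -> Optional[Tuple[str, List[str]]]:
    """Test hypotheses one by one and stop at the first confirmed one.

    Returns (description, evidence) so that an investigation can start,
    or None if no hypothesis was confirmed in this iteration.
    """
    for hypothesis in hypotheses:
        evidence = hypothesis.check()
        if evidence:
            return hypothesis.description, evidence
    return None
```

A result of `None` does not mean the network is clean. It only means that the next round needs new hypotheses.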
A hypothesis can come from the analyst's personal experience, but there are other sources as well, for example:
- Threat intelligence indicators (TI indicators). The simplest hypothesis has the form: an attacker uses a new modification of utility X, which has MD5 hash Y.
- Attacker tactics, techniques, and procedures (TTPs). Information about the TTPs of modern attackers can be found in the MITRE ATT&CK knowledge base. Example hypothesis: an attacker has compromised a user workstation and is trying to brute-force the password of a privileged account.
- Analytics from automated infrastructure data processing tools. Their data helps reveal anomalies. For example, an asset management system can show that a new node appeared on the network without the administrators' knowledge. A sharp increase in traffic on a network node can also be a reason to examine that node more closely.
- Information discovered during verification of previous hypotheses.
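The simplest TI-based hypothesis from the list above ("utility X has MD5 hash Y") can be tested with a short script. The sketch below assumes that you have a directory to scan and a set of MD5 indicators from a TI feed. The function names `md5_of_file` and `find_ioc_matches` are made up for this example.

```python
import hashlib
from pathlib import Path
from typing import List, Set


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_ioc_matches(root: Path, ioc_md5: Set[str]) -> List[Path]:
    """Return files under root whose MD5 is in the indicator set."""
    wanted = {value.lower() for value in ioc_md5}
    matches = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            if md5_of_file(path) in wanted:
                matches.append(path)
        except OSError:
            # Unreadable files are skipped in this sketch.
            continue
    return matches
```

An empty result only rules out this exact hash. A recompiled or repacked build of the same utility would have a different one, which is why TTP-based hypotheses are also needed.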
Threat hunting tools
After formulating a hypothesis, you need to determine which data sources may contain the information needed to test it. Such sources often contain far too much data, and the relevant part has to be picked out. The TH process therefore comes down to exploring, filtering, and analyzing a huge amount of data about what is happening in the infrastructure. The following sources can hold information for testing a hunting hypothesis:
Figure 3. Classification of sources of information for TH
Most of the relevant information is found in logs and network traffic. SIEM (security information and event management) and NTA (network traffic analysis) products help analyze them. External sources, such as TI feeds, should also be included in the analysis.
How it works in practice
The main goal of TH is to detect a compromise that automated security tools missed.
As an example, let us walk through testing two hypotheses and show in practice how traffic analysis and log analysis systems complement each other.
Hypothesis No. 1: an attacker entered the network through a workstation and is trying to gain control over other nodes, using command execution via WMI to move further through the network.
Suppose the intruders have obtained the credentials of a privileged user. They then try to gain control over other nodes in the network in order to reach a host with valuable data. One way to run programs on a remote system is Windows Management Instrumentation (WMI). It provides centralized management and monitoring of the various parts of the computer infrastructure. Its creators made it possible to apply this approach to the components and resources of not only the local host but also a remote computer. For this purpose, commands and responses are transferred over the DCERPC protocol.
Therefore, to test the hypothesis, you need to examine DCERPC requests. We show how this can be done using traffic analysis and a SIEM system. Fig. 4 shows all network interactions filtered by the DCERPC protocol. As an example, we chose the time interval from 06:58 to 12:58.
Figure 4. Filtered DCERPC Sessions
Fig. 4 shows two dashboards. On the left are the nodes that initiated DCERPC connections. On the right are the nodes that the clients connected to. The figure shows that all clients on the network access only the domain controller. This is legitimate activity, because hosts joined to an Active Directory domain use DCERPC to contact the domain controller for synchronization. The same kind of communication between user hosts would be considered suspicious.
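The rule behind this check (DCERPC sessions involving a domain controller are expected, DCERPC sessions between other hosts deserve a look) can be written as a simple filter. The sketch below assumes that session records have been exported from the traffic analysis system as source, destination, and protocol. The `Session` tuple and the `suspicious_dcerpc` function are hypothetical and not part of any product API.

```python
from typing import Iterable, List, NamedTuple


class Session(NamedTuple):
    """One network session (hypothetical export format)."""
    src: str
    dst: str
    proto: str


def suspicious_dcerpc(sessions: Iterable[Session],
                      domain_controllers: Iterable[str]) -> List[Session]:
    """Return DCERPC sessions in which neither endpoint is a domain controller."""
    controllers = set(domain_controllers)
    return [
        session for session in sessions
        if session.proto.lower() == "dcerpc"
        and session.src not in controllers
        and session.dst not in controllers
    ]
```

The output of such a filter is a list of candidates, not a verdict. Each remaining session still has to be checked against host logs.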
Since nothing suspicious was found in the selected period, we move along the timeline and select the next interval of about four hours, from 12:59 to 16:46. Here we noticed a strange change in the list of destination hosts (see Fig. 5).
Figure 5. After changing the time interval, two new nodes appeared in the list of servers
The destination host list contains two new nodes. Consider first the one that has no DNS name (10.125.4.16).
Figure 6. Refining the filter to find out who connected to 10.125.4.16
As Fig. 6 shows, it is accessed by the domain controller 10.125.2.36 (see Fig. 4), which means that this interaction is legitimate.
Next, you need to analyze who connected to the second new node in Fig. 5, win-admin-01.ptlab.ru (10.125.3.10). The node name suggests that this is an administrator's computer. After refining the filter, only two session source nodes remain.
Figure 7. Refining the filter to find out who connected to win-admin-01
As in the previous case, one of the initiators was the domain controller. Such sessions are not suspicious, since they are common in an Active Directory environment. However, the second node (w-user-01.ptlab.ru), judging by its name, is a user computer, and such connections are anomalous. If you go to the Sessions tab with this filter, you can download the traffic and examine the details in Wireshark.
Figure 8. Download relevant sessions
In the traffic, you can see a call to the IWbemServices interface, which indicates the use of a WMI connection.
Figure 9. Accessing the IWbemServices interface (Wireshark)
Moreover, the transferred calls are encrypted, so the specific commands are unknown.
Figure 10. DCERPC traffic is encrypted, so the transmitted command is not visible (Wireshark)
To finally confirm that this interaction is illegitimate, you need to check the host logs. You can log in to the host and view the system logs locally, but it is more convenient to use a SIEM system.
In the SIEM interface, we entered a filter condition that left only the logs of the target node at the time the DCERPC connection was established, and saw the following picture:
Figure 11. System logs of win-admin-01 at the moment the DCERPC connection was established
In the logs, we saw an exact match with the start time of the first session (see Fig. 9); the initiator of the connection is the w-user-01 host. Further analysis of the logs shows that the connection was made under the PTLAB\Admin account and that a command was run (see Fig. 12) to create the user john with the password password!!!: net user john password!!! /add.
Figure 12. The executed command during the connection
We found out that someone on the w-user-01.ptlab.ru host used WMI, acting under the PTLAB\Admin account, to add a new user on the win-admin-01.ptlab.ru host (10.125.3.10). In real TH, the next step is to find out whether this was administrative activity. To do this, contact the owner of the PTLAB\Admin account and ask whether they performed the described actions. Since this example is synthetic, we assume that the activity is illegitimate. In real TH, if illegitimate use of an account is detected, an incident must be opened and a detailed investigation carried out.
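Once a finding like this is confirmed, it is natural to turn it into a repeatable check over process creation logs. The sketch below searches command lines for account creation with `net user ... /add`. The regular expression and the `created_account` function are illustrative assumptions that cover only the common form of the command, with the user name placed before the `/add` switch.

```python
import re
from typing import Optional

# Matches e.g. "net user john password!!! /add" (also net.exe and net1.exe).
NET_USER_ADD = re.compile(
    r"\bnet1?(?:\.exe)?\s+user\s+(\S+)\b.*?/add\b",
    re.IGNORECASE,
)


def created_account(command_line: str) -> Optional[str]:
    """Return the account name if the command line creates a user, else None."""
    match = NET_USER_ADD.search(command_line)
    return match.group(1) if match else None
```

For example, `created_account("net user john password!!! /add")` returns `"john"`. A command line that only lists users, such as `net user`, returns `None`.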
Hypothesis No. 2: an attacker has entered the network and is at the data exfiltration stage, using traffic tunneling to get the data out.
Traffic tunneling means organizing a channel so that packets of one network protocol (possibly in modified form) are carried inside the fields of another network protocol. A standard example of tunneling is building encrypted channels, such as SSH. Encrypted channels ensure the confidentiality of transmitted information and are widespread in modern corporate networks. However, there are also exotic options, such as ICMP or DNS tunnels. Attackers use such tunnels to disguise their activity as legitimate.
Let's start with the most common way to tunnel traffic, the SSH protocol. To do this, we filter for all sessions that use SSH:
Figure 13. Searching for SSH session traffic
The figure shows that there is no SSH traffic in the infrastructure, so we need to pick the next protocol that could be used for tunneling. Since DNS traffic is almost always allowed on corporate networks, we consider it next.
If you filter the traffic by DNS, you can see that one of the nodes makes an abnormally large number of DNS queries.
Figure 14. Widget with statistics of DNS client sessions
By filtering the sessions by the source of the requests, we found out where this abnormal volume of traffic goes and how it is distributed among the destination nodes. Fig. 15 shows that part of the traffic goes to the domain controller, which acts as the local DNS server. However, a large share of the requests goes to an unknown host. In a corporate network built on Active Directory, user computers should not bypass the corporate server and use an external DNS server to resolve names. If such activity is detected, you need to find out what is being transmitted in the traffic and where all these requests are going.
Figure 15. Searching for DNS session traffic
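The same check (which clients send DNS queries past the corporate server, and how many) is easy to express in code if the DNS sessions can be exported as pairs of client and resolver addresses. The sketch below assumes such an export. The function name `external_dns_usage` is hypothetical, and the external address in the usage note comes from a documentation-only range.

```python
from collections import Counter
from typing import Iterable, Tuple


def external_dns_usage(queries: Iterable[Tuple[str, str]],
                       corporate_resolvers: Iterable[str]) -> Counter:
    """Count DNS queries per (client, resolver) pair for non-corporate resolvers.

    queries: (client_ip, resolver_ip) pairs, one per DNS query.
    """
    allowed = set(corporate_resolvers)
    return Counter(
        (client, resolver)
        for client, resolver in queries
        if resolver not in allowed
    )
```

With the corporate resolver set to the domain controller (10.125.2.36), queries from 10.125.3.10 to a made-up external address such as 203.0.113.5 would be counted, and `most_common()` on the result would list the noisiest pairs first.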
If you go to the Sessions tab, you can see what is transmitted in the requests to the suspicious server. The interval between requests is very short, and there are a great many sessions. Such parameters are not typical of legitimate DNS traffic.
Figure 16. DNS traffic parameters
Opening any session card, we see a detailed description of the requests and responses. The server's responses contain no errors, but the requested records look very suspicious, because hosts usually have shorter and more meaningful DNS names.
Figure 17. Suspicious DNS record request
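The observation that tunnel traffic uses long, meaningless names can be turned into a rough heuristic: flag queries whose longest label is both unusually long and close to random. The sketch below uses Shannon entropy for the randomness estimate. The thresholds (30 characters, 3.5 bits per character) are assumptions chosen for illustration and would need tuning on real traffic.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of a string in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunnel(qname: str,
                      max_label_len: int = 30,
                      min_entropy: float = 3.5) -> bool:
    """Heuristic: the longest label is very long and looks random."""
    labels = [label for label in qname.rstrip(".").split(".") if label]
    if not labels:
        return False
    longest = max(labels, key=len)
    return (len(longest) > max_label_len
            and shannon_entropy(longest.lower()) >= min_entropy)
```

An ordinary host name such as win-admin-01.ptlab.ru does not trigger the heuristic, because its longest label has only 12 characters. Legitimate services that encode data in DNS names can produce false positives, so a match is a reason to look closer, not proof of a tunnel.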
Traffic analysis showed that the suspicious DNS queries originate from the win-admin-01 host. It is time to analyze the logs of that node, the source of the activity. To do this, we switch to the SIEM.
You need to find the system logs of win-admin-01 and see what happened around 17:06. They show that a suspicious PowerShell script was running at that time.
Figure 18. Running PowerShell at the same time as sending suspicious requests
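Matching a network anomaly to host events is a correlation by time: take the moment the suspicious traffic was seen and select the log events that fall close to it. The sketch below assumes that events are available as (timestamp, message) pairs. The `events_near` function and the five-minute window are illustrative choices.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

Event = Tuple[datetime, str]


def events_near(events: Iterable[Event],
                moment: datetime,
                window: timedelta = timedelta(minutes=5)) -> List[Event]:
    """Return events within +/- window of the given moment, sorted by time."""
    selected = [event for event in events if abs(event[0] - moment) <= window]
    return sorted(selected, key=lambda event: event[0])
```

This only works if the clocks of the traffic sensor and the host are synchronized, or if the known offset between them is added to the window.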
The logs indicate which script was executed.
Figure 19. Logging the name of the running script
The name of the executed script, admin_script.ps1, hints at legitimacy, but administrators usually name scripts after their specific function, and here the name is generic. Moreover, the script is located in the folder for temporary files. An important administrative script is unlikely to be kept in a folder that can be cleaned out at any time.
Among the events, we found the creation of an unusual cryptographic class from the Logos.Utility library. This library is rare and is no longer supported by its developer, so the creation of its classes is unusual. Let's try to find projects that use it.
Figure 20. Creating a custom cryptographic class
A web search for the class name turns up, at the second link, a utility that sets up a DNS tunnel and uses this class.
Figure 21. Searching for script information by class name
To make sure this is the utility we are looking for, we look for additional signs in the logs. Two pieces of evidence turned up. The first is the launch of the nslookup utility by the script.
Figure 22. The script launching the nslookup utility
The nslookup.exe utility is used for network diagnostics and is rarely run by ordinary users. Its launch can be seen in the source code of the tunneling utility.
Figure 23. Code that launches nslookup (GitHub)
The second piece of evidence is a rather distinctive line that generates random values.
Figure 24. Random value generation in the script
A search of the utility's source code finds exactly this line.
Figure 25. Code for generating a random value
The tunnel hypothesis was confirmed, but the purpose of the actions remained unclear. Further analysis of the logs revealed two process launches.
Figure 26. Search for office documents for further exfiltration
The command lines of these processes point to a search for documents to exfiltrate. Thus, the hypothesis was fully confirmed: the attackers really did use traffic tunneling to exfiltrate data.
As recent analytical reports show, attackers still remain in compromised infrastructure for a long time on average. So do not wait for signals from automated security tools; act proactively. Study your infrastructure and modern attack methods, and use the research published by TI teams (FireEye, Cisco, PT Expert Security Center).
I am not calling for automated protection to be abandoned. However, installing and correctly configuring such a system should not be treated as the finish line. It is only the first necessary step. After that, you need to monitor how the controlled network environment develops and operates, and keep your finger on the pulse.
The following tips will help you with this:
- Learn your infrastructure. Choose a convenient approach to managing network assets. At any moment you should be ready to say what function a particular node performs and to provide information about it.
- Identify the most important risks and periodically test hypotheses against them. Networks come in many sizes; in large, distributed infrastructures it is very important to single out the critical segments.
- Follow the latest trends in information security. In particular, be prepared to respond to recent vulnerabilities and new attack methods. Periodically test your defenses against new threats. If a threat goes undetected, form a TH hypothesis based on that attack and keep testing it until the automated defenses begin to detect it.
- Automate routine tasks so that more time is left for creative approaches and for testing non-standard solutions.
- Simplify the analysis of large amounts of data. It helps to use tools that let the analyst see what is happening on the network and on its nodes as a single picture. Such tools include a TI indicator exchange platform, a traffic analysis system, and a SIEM system.
Author: Anton Kutepov, Specialist, PT Expert Security Center, Positive Technologies.
The entire analysis was conducted in the PT Network Attack Discovery traffic analysis system and the MaxPatrol SIEM security event management system.