Written by Asaf Nadler and Avi Aminov
Spyware is malicious software (malware) used to gather information about a person or organization without their consent. In a typical setting, a remote server acting as a command and control (C&C) server waits for an incoming connection from the spyware that contains the gathered information. Statistics reported by Avast estimate that over 100M types of spyware are currently active worldwide.
In the presence of network security products (e.g., firewalls, secure web gateways, and antivirus software), spyware must communicate with its C&C server over a covert channel to prolong its operation. Among commonly used covert channels, the domain name system (DNS) protocol stands out.
Data Exchange over the DNS Protocol
The DNS protocol is a core component of the Internet protocol (IP) suite; its main goal is the translation of hostnames to IP addresses. The number of domains on the Internet today exceeds the storage capabilities of a single database server, so the DNS was designed as a distributed database. Each hostname resolution corresponds to a single server within the distributed database, known as an authoritative name server (AuthNS). Upon a request for a hostname resolution, a DNS client iterates over authoritative name servers until it reaches the correct one. Once the correct AuthNS is reached, it replies with an answer for the requested hostname. One can think of the hostname as incoming data for the AuthNS, as displayed in Figure 1 (e.g., when "passw0rd.exfiltration.com" is requested, the AuthNS for "exfiltration.com" receives the input "passw0rd").
Figure 1 - The scheme of data exchange over the DNS.
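
To make the scheme concrete, here is a minimal sketch of the server side: how an AuthNS for a zone could read the data carried in a query name. The function name `extract_payload` is hypothetical and chosen for illustration only; a real name server would do this while parsing DNS packets.

```python
from typing import Optional


def extract_payload(query_name: str, zone: str) -> Optional[str]:
    """Return the labels of query_name that sit below zone.

    Returns None when the query is not a subdomain of the zone.
    """
    # DNS names are case-insensitive and may carry a trailing root dot.
    name = query_name.rstrip(".").lower()
    suffix = "." + zone.rstrip(".").lower()
    if not name.endswith(suffix):
        return None
    # Everything left of the zone is attacker-controlled input.
    return name[: -len(suffix)]


print(extract_payload("passw0rd.exfiltration.com", "exfiltration.com"))
```

With the example from the text, the call returns the string "passw0rd", which is the data the client smuggled out inside an ordinary-looking resolution request.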
As a channel for data exchange, the DNS protocol is far from optimal with regard to efficiency and reliability. The DNS protocol restricts queries (i.e., outbound messages) to 255 bytes of letters, digits, and hyphens. Also, since the DNS protocol is used mostly over the User Datagram Protocol (UDP), there is no guarantee that queries will be delivered or answered in the order in which they were sent.
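
The following sketch illustrates how these limits shape the client side. It assumes base32 encoding (our choice for illustration, since its alphabet contains only letters and digits) and one payload label per query. It also relies on the 63-byte per-label limit from RFC 1035, which is not discussed above. The names `to_label` and `encode_queries` are hypothetical.

```python
import base64

MAX_LABEL = 63       # per-label limit (RFC 1035)
MAX_WIRE_NAME = 255  # overall limit on an encoded name, in bytes


def to_label(chunk: bytes) -> str:
    """Base32-encode a chunk and drop '=' padding, which is not a legal character."""
    return base64.b32encode(chunk).decode("ascii").rstrip("=").lower()


def encode_queries(data: bytes, zone: str, chunk_size: int = 39) -> list:
    """Split data into query names of the form <base32 chunk>.<zone>.

    39 raw bytes encode to exactly 63 base32 characters, the largest
    chunk that still fits in a single label.
    """
    queries = []
    for i in range(0, len(data), chunk_size):
        label = to_label(data[i:i + chunk_size])
        name = f"{label}.{zone}"
        # On the wire a name takes its text length plus 2 bytes
        # (a length byte for the first label and the terminating zero byte).
        if len(label) > MAX_LABEL or len(name) + 2 > MAX_WIRE_NAME:
            raise ValueError("query name exceeds DNS length limits")
        queries.append(name)
    return queries


for query in encode_queries(b"x" * 100, "exfiltration.com"):
    print(len(query), query)
```

Under these assumptions, 100 bytes of data already require three separate queries, which hints at why bulk transfers over the DNS produce an unusual amount of traffic.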
Nevertheless, from a security standpoint, the DNS protocol is an excellent covert channel. Due to its crucial role in the Internet, misconfiguration of the DNS can lead to network disconnects, and it is therefore rarely restricted with security policies (e.g., allowing resolutions only to specific domain names). In addition, the DNS protocol is often monitored less than other Internet protocols (e.g., HTTP, FTP, and mail transfer protocols) because it is perceived as posing a lesser risk. Accordingly, the DNS protocol has served as a covert channel in previous cyber campaigns, including the theft of 56M credit cards from Home Depot in 2014 and the theft of 25k credit cards from Sally Beauty.
During the last decade, several open-source tools, as well as spyware, have made use of the DNS protocol for data exchange. While the scheme for data exchange (as described above) remains the same, the communication pattern varies, and as a result the detection techniques vary as well. In the next sections, we introduce two classes of data exchange over the DNS protocol: (1) high throughput DNS tunneling and (2) low throughput exfiltration malware, and review existing techniques for their detection.
High Throughput DNS Tunneling
High throughput DNS tunneling (DNS tunneling, for short) is a family of freely available software for data exchange over the DNS protocol. The family includes software such as Iodine, Dns2tcp, and DNSCat. Most of these tools are general purpose, allowing various types of data exchange (e.g., web browsing, file transfer, and remote desktop control).
Although a commonly known and non-malicious use of DNS tunneling is bypassing Wi-Fi payment by setting up a DNS tunnel for web browsing, it may also be used as a communication channel between malware and its C&C server. Therefore, there is a clear motivation for the security community to detect DNS tunneling.
Before discussing the detection of DNS tunneling, its unique characteristics should be addressed. Because the DNS protocol runs mostly over UDP, there is no guarantee that messages arrive in the order in which they were sent. DNS tunneling tools handle this either by enforcing TCP communication over the DNS or by sending constant ping messages between requests to ensure the correct order. Applying these methods for the sake of integrity increases the rate of messages over the DNS protocol. Also, when a DNS tunneling tool is used for either web browsing or file transfer, the volume and length of messages increase in comparison to normal DNS traffic.
For these reasons, we expect the presence of DNS tunneling to cause a significant change in DNS traffic with regard to: (1) volume, (2) message length, and (3) mean time between messages, which becomes shorter (see Figure 2).
_Figure 2 - PCAP recording of the DNS traffic while a tunneling session is active. The mean time between queries is less than 1 second due to keep-alive messages, responses to the same primary domain vary enough to indicate the existence of a bi-directional channel, and the average query length exceeds 100 characters. All of these are unusual compared to normal DNS traffic._
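
As a rough sketch, the three indicators (volume, message length, and time between messages) can be computed per primary domain from a log of queries. The input format, a list of `(timestamp_seconds, query_name)` pairs, and the function names are assumptions made for illustration. Taking the last two labels as the primary domain is a simplification that breaks for suffixes such as "co.uk".

```python
from collections import defaultdict
from statistics import mean


def primary_domain(name: str) -> str:
    """Naive primary domain: the last two labels of the query name (assumption)."""
    return ".".join(name.rstrip(".").lower().split(".")[-2:])


def traffic_features(records):
    """Compute per-domain volume, mean query length, and mean gap between queries.

    records: iterable of (timestamp_seconds, query_name) pairs.
    """
    by_domain = defaultdict(list)
    for ts, name in records:
        by_domain[primary_domain(name)].append((ts, name))

    features = {}
    for domain, items in by_domain.items():
        times = sorted(ts for ts, _ in items)
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        features[domain] = {
            "volume": len(items),
            "mean_query_length": mean([len(name) for _, name in items]),
            # A single query has no gap to measure.
            "mean_gap_seconds": mean(gaps) if gaps else None,
        }
    return features


log = [
    (0.0, "aaaa.t.example.com"),
    (0.5, "bbbb.t.example.com"),
    (1.0, "cc.t.example.com"),
    (10.0, "www.other.org"),
]
print(traffic_features(log))
```

A domain whose row shows a high volume, long queries, and sub-second gaps matches the pattern described for an active tunneling session.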
Based on this distinguishable behavior, current solutions detect DNS tunneling by relying on the volume and variety of requests that these tools generate. The obvious solution is rate control, which is offered by security vendors. More sophisticated solutions rely on statistical models. Among such models are supervised learning models trained on the traffic of tunneling versus non-tunneling users, and anomaly detection models that trigger upon a significant change in DNS traffic as a whole. Such models have proven highly effective with regard to recall (i.e., rate of detection) and false positive rates.
While the problem of DNS tunneling detection is important and has been studied thoroughly, an entire class of low throughput DNS exfiltration malware has remained overlooked. This class, containing at least nine malware families seen over the last seven years (see Figure 3), is discussed in the next part.
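
To illustrate the idea of triggering on a significant change in traffic, here is a deliberately simple sketch based on a z-score over per-window query counts. It is not the model used by any particular vendor or study. The window size, the threshold of 3.0, and the function name `is_anomalous` are assumptions.

```python
from statistics import mean, pstdev


def is_anomalous(baseline_counts, current_count, z_threshold=3.0):
    """Flag a window whose query count is far above the baseline.

    baseline_counts: query counts from earlier windows (e.g., per minute).
    """
    mu = mean(baseline_counts)
    sigma = pstdev(baseline_counts)
    if sigma == 0:
        # A flat baseline: any increase at all is a change.
        return current_count > mu
    return (current_count - mu) / sigma > z_threshold


baseline = [10, 12, 11, 9, 13, 10]  # hypothetical queries per minute
print(is_anomalous(baseline, 600))  # burst typical of a tunneling session
print(is_anomalous(baseline, 12))   # ordinary minute
```

The burst of 600 queries is flagged while the ordinary minute is not. Note that a detector of this kind only reacts to volume, so traffic that stays within the baseline passes unnoticed.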
Low Throughput DNS Exfiltration Malware
In the case of malware, data exchange over the DNS can avoid a TCP tunnel and constant pings. Instead, short messages are delivered sporadically. For example, malware may "wake up" once an hour and send a poll message to its C&C server asking for instructions, or it may detect a credit card swipe and send the card data to the C&C server without waiting for a response.
The short and sporadic message exchange of the malware listed in Figure 3 is designed to evade DNS tunneling detection solutions that rely on high message volume, lengthy queries, and dense traffic.
Figure 3 - List of low throughput DNS exfiltration malware and when each was first seen
To the best of our knowledge, the problem of detecting low throughput DNS exfiltration malware has not been studied. In an upcoming blog post, we will elaborate further on the matter and unveil a novel solution aimed at the challenging case of such malware.
Asaf Nadler is a senior security researcher at Akamai Technologies. He is currently pursuing a PhD in Information Systems Engineering at Ben Gurion University of the Negev. His research interests span both machine learning and computer network security.