It is impossible to protect APIs unless you take a deep dive into the protocols implemented on top of standard HTTP. Most security tools do not protect data where it’s most vulnerable: inside the XML itself. Many application platforms let these encoding attacks through unflagged, despite relying on top-tier security tools and techniques that they believe protect their APIs. The problem lies in the encoding, and in the relative inability of those tools to notice what malicious payload may be hiding inside it at a fundamental level.
You may already know that whitehat researchers have used data protocol-level encodings to avoid WAF filtering. This bypass technique is no secret. Attackers can use JSON Unicode escape sequences to bypass WAFs, so most WAFs leave companies open to stealthy attacks that hide in API data. Unicode escape sequences can slip past simple RegExp-based signatures, bypassing JSON API protection. But it’s not just a problem for JSON.
Now, we want to describe different encoding techniques for SOAP/XML APIs as a continuation of that research. We can show why you absolutely cannot have API protections without dealing with the specific data encoding protocols.
Like JSON, XML allows data to be encoded in different ways, for example with built-in entities. This can leave APIs vulnerable to attack. Here are three ways XML encoding variations can obscure a payload so that it goes unmonitored, even though it may seem covered. The first two are comparatively easy to protect against. The third mechanism is more sophisticated and harder to protect against.
Let’s consider the following statement, as an example:
union select password from users--a-
If successfully executed, this statement would allow a hacker to gain access to the passwords of all the users on the system. Naturally, WAFs and other protection systems would try to block it.
A signature for this kind of attack would look for the string “union select password” in the XML-encoded data stream.
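To make the starting point concrete, here is a minimal sketch of such a signature applied to the raw request body. The pattern, the function name `raw_signature_hit` and the sample request are illustrative assumptions, not the rule set of any particular WAF.

```python
import re

# Hypothetical signature: the literal phrase from the text, matched case-insensitively.
SIGNATURE = re.compile(r"union select password", re.IGNORECASE)


def raw_signature_hit(body: str) -> bool:
    """Return True if the signature appears in the unparsed request body."""
    return SIGNATURE.search(body) is not None


# Unencoded payload inside an XML element: the signature sees it directly.
plain_request = "<a>' union select password from users--a-</a>"
print(raw_signature_hit(plain_request))  # True
```

The check works only because the payload appears byte-for-byte in the body; nothing here decodes the XML before matching.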
Now, let’s see how that can be circumvented.
An entity reference can be used to replace some characters in strings inside XML tags. For example, you could use “&nbsp;” instead of a space, like this:
' union&nbsp;select password from users--a-
In this case, the XML parser replaces the named entity “&nbsp;” with a (non-breaking) space character before the application sees the value. However, a signature that looks for “union” followed by “select” in the raw data will not match the encoded form.
The second encoding option is to use hexadecimal character references (HTML-style hex codes), such as “&#x55;” instead of the “U” character, like this:
'&#x55;nion select password from users--a-
This is nearly equivalent to the \u0055 encoding for JSON previously described in _Bypassing WAFS with JSON Unicode Escape Sequences_ or in our article on why WAFs miss attacks hiding in API data.
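The effect of a hexadecimal character reference such as `&#x55;` (the code point of “U”) can be sketched as follows. This is an illustration under assumptions: the signature is the hypothetical one from the text, and Python's standard `xml.etree.ElementTree` stands in for whatever parser the backend application uses.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical signature, matched case-insensitively.
SIGNATURE = re.compile(r"union select password", re.IGNORECASE)

# "&#x55;" is the hexadecimal character reference for "U".
body = "<a>'&#x55;nion select password from users--a-</a>"

# 1. Matching on the raw bytes: the word "union" never appears, so no hit.
raw_hit = SIGNATURE.search(body) is not None

# 2. Matching after parsing: the parser has already decoded the reference.
decoded_text = ET.fromstring(body).text
decoded_hit = SIGNATURE.search(decoded_text) is not None

print(raw_hit, decoded_hit)
```

The backend sees the decoded string, so a filter that inspects only the raw body is looking at different data than the application executes.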
The third obfuscation mechanism is the most complicated and dangerous one. Unlike the other two encoding options, it can only be caught by a WAF that is fully compatible with XML data encoding. This evasion technique is based on entities defined in the XML document itself, which behave much like variables with defined values. It makes it possible to build any string inside XML tags by referencing entities defined earlier in the document. Take a look:
<!DOCTYPE a [<!ENTITY u "uni">]><a>&u;on select password from users--a-</a>
In this example, a piece of the payload is defined at the beginning of the XML document as the entity “u”, which is then referenced inside the <a> XML tag.
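The custom-entity evasion can be reproduced in a few lines. This is a sketch under assumptions: the signature is hypothetical, and Python's standard `xml.etree.ElementTree` is used only because it expands entities declared in the internal DTD subset. The Python documentation warns that this module is not hardened against maliciously constructed XML, so it should not be read as a recommendation for production filtering.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical signature, matched case-insensitively.
SIGNATURE = re.compile(r"union select password", re.IGNORECASE)

# The entity "u" is declared in the DOCTYPE and referenced inside <a>.
body = '<!DOCTYPE a [<!ENTITY u "uni">]><a>&u;on select password from users--a-</a>'

# The raw document contains "uni" and "on select" in different places,
# so the contiguous phrase never appears.
raw_hit = SIGNATURE.search(body) is not None

# After parsing, the entity reference has been expanded in place.
parsed_text = ET.fromstring(body).text
parsed_hit = SIGNATURE.search(parsed_text) is not None

print(raw_hit, parsed_hit)
```

Because the entity name and its value are chosen by the attacker, the payload can be cut at any point, which is why a fixed pattern over the raw text cannot anticipate every split.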
A regular expression applied to the raw document cannot reliably detect the substituted payload. In terms of the Chomsky hierarchy, XML syntax is inherently not regular.
We explained how grammar-based detection can take API parsing to a new level in a recent whitepaper, Evolution of Real-Time Attack Detection.
Obfuscation mechanisms can be incredibly sophisticated, planting seeds that go unnoticed until a dangerous security problem or breach surfaces. Most security tools are not set up to detect the hidden payload and block XML encoding attacks. Wallarm implements a fully XML-compatible parser _inside_ to block XML encoding attacks. This allows it to protect SOAP, XML-RPC and other XML-based APIs out of the box.
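The general idea of parsing first and inspecting second can be sketched as below. This is not Wallarm's implementation, which the article does not describe in detail; `scan_xml`, `SIGNATURES` and the SOAP-like sample are hypothetical names and data chosen for illustration, and the standard-library parser is used only for brevity.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical signature list; a real product would carry many more rules.
SIGNATURES = [re.compile(r"union select password", re.IGNORECASE)]


def scan_xml(body: str) -> List[str]:
    """Parse the body, then return every decoded value that matches a signature."""
    root = ET.fromstring(body)  # raises ET.ParseError on malformed XML
    hits = []
    for elem in root.iter():
        # Candidate values: element text, trailing text, and attribute values.
        values = [elem.text, elem.tail, *elem.attrib.values()]
        for value in values:
            if value and any(sig.search(value) for sig in SIGNATURES):
                hits.append(value)
    return hits


# One payload hidden with a character reference (attribute),
# one hidden with a custom entity (element text).
soap_like = (
    '<!DOCTYPE Envelope [<!ENTITY u "uni">]>'
    "<Envelope><Body>"
    '<login user="&#x75;nion select password from users--a-">'
    "&u;on select password from users--a-"
    "</login>"
    "</Body></Envelope>"
)
print(scan_xml(soap_like))
```

Matching on decoded values means the signature is compared against the same strings the application will receive, regardless of which encoding was used on the wire. A payload split across several elements would still need additional handling.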
[Screenshot: how blocking an encoding attack works]
[Screenshot: Wallarm handling the custom entity-based evasion technique]
Cyberattackers can turn your own data formats against you, infiltrating and changing code completely undetected while bypassing standard API protections. Going deeper, by building parsers into the protection layer, can strengthen the areas that more superficial or traditional security tools presently leave unpoliced.
If you enjoyed this article, take a look at our previous XML-related research in _XXE That Can Bypass WAF Protection_. Find more information about different XML encodings for entire documents.
Follow Wallarm Labs’ research on Medium.
Latest Bypassing Techniques Beats SOAP/XML API Protection was originally published in Wallarm on Medium, where people are continuing the conversation by highlighting and responding to this story.