Section 09

Cloud & AI/ML Risk Assessment

Cloud-native architectures and AI/ML systems introduce risk dimensions absent from traditional TRA. This section covers shared responsibility model gaps, multi-cloud inconsistencies, MITRE ATLAS for AI threat modeling, LLM-specific risks, and the NIST AI Risk Management Framework.

Related Sections

See Offensive AI for adversarial ML attack techniques. See Cloud Pentesting for hands-on cloud security testing.

Shared Responsibility Gaps

The shared responsibility model defines who secures what — but the gaps between provider and customer responsibilities are where most cloud incidents occur.

Layer IaaS PaaS SaaS Common Gaps
Data Customer Customer Shared Encryption key management, data classification in cloud
Identity & Access Customer Shared Shared Service account sprawl, cross-account access, federation trust
Application Customer Shared Provider Serverless function permissions, API gateway config
Network Shared Provider Provider VPC peering assumptions, private endpoint coverage
Infrastructure Provider Provider Provider Hypervisor vulnerabilities, hardware supply chain

The Shared Responsibility Gap

The most common cloud security failures occur not where responsibility is clear, but in the gray zones — where both parties assume the other is handling security. Your TRA must explicitly document who owns each control and verify implementation.

Cloud-Native TRA Patterns

Serverless / FaaS

  • Ephemeral execution — traditional network scanning doesn't apply
  • • Overpermissioned function IAM roles (most common risk)
  • • Event injection via untrusted triggers (S3, SQS, API Gateway)
  • • Cold start secrets in environment variables
  • • Shared execution environment side-channel risks
  • • Lack of runtime monitoring visibility

Container / Kubernetes

  • • Base image vulnerabilities propagating to all pods
  • • RBAC misconfiguration (cluster-admin to default SA)
  • • Pod-to-pod network policy gaps
  • • Secrets in etcd (encrypted at rest?)
  • • Container escape via privileged mode
  • • Supply chain: Helm chart and operator trust

Multi-Cloud Risk

Control Plane Inconsistencies

IAM models differ fundamentally between AWS (policy-based), Azure (RBAC-based), and GCP (project-based). A "least privilege" policy in AWS may not translate equivalently to Azure. Your TRA must assess each provider's control implementation independently — not assume parity.

Data Sovereignty & Residency

Multi-cloud deployments may replicate data across jurisdictions. Assess: Where does data physically reside? Which laws apply (GDPR, CCPA, data localization requirements)? Can the provider compel data disclosure under foreign law (e.g., US CLOUD Act)?

Identity Federation Gaps

Cross-cloud workload identity federation (AWS roles ↔ GCP service accounts ↔ Azure managed identities) creates complex trust chains. A misconfigured federation trust in one cloud can provide lateral movement paths across your entire multi-cloud estate.

AI/ML Threat Modeling with MITRE ATLAS

MITRE ATLAS (Adversarial Threat Landscape for AI Systems) extends ATT&CK to AI/ML systems. Use it to systematically identify threats specific to machine learning pipelines.

AI/ML Attack Surface

flowchart LR subgraph train["Training Phase"] TD["Training Data"] --> DP["Data Pipeline"] DP --> MT["Model Training"] MT --> MV["Model Validation"] end subgraph deploy["Deployment Phase"] MV --> MS["Model Serving"] MS --> API2["Inference API"] API2 --> OUT["Output"] end ATK1["Data Poisoning"] -.->|"corrupt"| TD ATK2["Supply Chain\n(backdoored model)"] -.->|"inject"| MT ATK3["Model Theft\n(extraction)"] -.->|"steal"| API2 ATK4["Adversarial\nExamples"] -.->|"evade"| API2 ATK5["Prompt\nInjection"] -.->|"manipulate"| API2 style ATK1 fill:#ec4899,stroke:#000,color:#000 style ATK2 fill:#ec4899,stroke:#000,color:#000 style ATK3 fill:#ec4899,stroke:#000,color:#000 style ATK4 fill:#ec4899,stroke:#000,color:#000 style ATK5 fill:#ec4899,stroke:#000,color:#000 style TD fill:#ff8800,stroke:#000,color:#000 style MS fill:#22d3ee,stroke:#000,color:#000 style API2 fill:#a855f7,stroke:#000,color:#000
ATLAS Tactic Techniques Impact Mitigations
Reconnaissance Model card analysis, API probing, published research Attacker learns model architecture and capabilities Limit model metadata exposure, rate limit API
ML Attack Staging Training data poisoning, backdoor insertion Compromised model integrity, hidden behaviors Data provenance, model validation, anomaly detection
Evasion Adversarial examples, input perturbation Model produces incorrect outputs Adversarial training, input validation, ensemble models
Exfiltration Model extraction, membership inference, inversion IP theft, privacy violations, training data exposure Differential privacy, output perturbation, watermarking

LLM-Specific Risks

Large Language Models introduce unique risk categories that traditional threat modeling doesn't address. The OWASP Top 10 for LLM Applications provides a starting taxonomy.

Input Risks

  • Prompt injection: Direct and indirect injection to override system instructions
  • Jailbreaking: Techniques to bypass safety filters and content policies
  • Data extraction: Crafted prompts to leak training data or system prompts
  • Context manipulation: RAG poisoning via manipulated source documents

Output Risks

  • Hallucination: Confident but false outputs leading to incorrect decisions
  • Sensitive data exposure: Model reproducing PII from training data
  • Insecure code generation: LLM-generated code with vulnerabilities
  • Bias amplification: Discriminatory outputs in high-stakes decisions

Integration Risks

  • Excessive agency: LLM agents with overprivileged tool access
  • Insecure plugin design: Unauthenticated or unvalidated tool calling
  • Supply chain: Compromised fine-tuning data or model weights
  • Model denial of service: Resource exhaustion via complex queries

Governance Risks

  • Accountability: Who is liable for LLM-generated decisions?
  • Explainability: Can the model explain why it produced an output?
  • Regulatory compliance: EU AI Act high-risk classification
  • Intellectual property: Training data copyright and fair use

NIST AI RMF (100-1)

The NIST AI Risk Management Framework provides a structured approach for managing AI risks. Map its four functions to your TRA process.

GOVERN — Establish AI Risk Culture

Define AI risk policies, roles, and accountability. Establish acceptable use policies, model governance boards, and ethical guidelines. Map to your TRA's organizational context phase.

MAP — Contextualize AI Risks

Identify and categorize AI risks specific to your system — what data does it process? What decisions does it influence? Who is affected? Map to your TRA's scoping and threat identification phases.

MEASURE — Assess and Track AI Risks

Quantify risks across trustworthiness characteristics: accuracy, fairness, privacy, security, explainability, resilience. Use metrics and benchmarks. Map to risk quantification (FAIR).

MANAGE — Treat and Monitor AI Risks

Implement controls, monitoring, and incident response for AI-specific risks. Establish model monitoring for drift, bias, and adversarial inputs. Map to risk treatment phase.

EU AI Act Risk Classification

Risk Level Examples Requirements TRA Implication
Unacceptable Social scoring, real-time biometric mass surveillance Prohibited TRA must flag for immediate cessation
High Risk Credit scoring, hiring, medical devices, law enforcement Full conformity assessment, logging, human oversight Comprehensive TRA with bias testing and explainability
Limited Risk Chatbots, emotion detection, deepfake generation Transparency obligations (inform users it's AI) Standard TRA plus transparency controls
Minimal Risk AI-powered spam filters, game NPCs, autocomplete No additional requirements Standard TRA is sufficient

Section Summary

Key Takeaways

  • • Shared responsibility gaps are the #1 source of cloud security incidents
  • • Multi-cloud adds identity federation and control parity risks
  • • MITRE ATLAS extends ATT&CK to AI/ML threat modeling
  • • LLMs introduce prompt injection, hallucination, and excessive agency risks
  • • NIST AI RMF maps to traditional TRA phases (Govern/Map/Measure/Manage)

Next Steps