Model Gateway

Intermediate

T1190 | Exploit Public-Facing Application

Model API & Gateway Security

Model gateways concentrate risk: prompts, completions, embeddings, files, tenant routing, provider fallback, logs, billing, abuse controls, and model versions all pass through one service boundary.

What makes model APIs different

Traditional API tests still matter, but model APIs add high-volume unstructured input, probabilistic output, hidden prompt assembly, expensive abuse paths, sensitive prompt logs, and provider-side retention choices.

Model Gateway Abuse Flow

Model APIs add routing, tenancy, cost, retention, and version drift to normal API testing.

01 API: Request Enters

Tenant, role, API key, prompt class, file, embedding, or model route.

02 Gateway: Policy Routes

Provider selection, regional rules, fallback, cache, quota, and abuse throttles.

03 Inference: Model Responds

Output, safety metadata, token cost, tool schema, and prompt/version context.

04 Logs: Logs Persist

Prompt/completion retention, support access, redaction, deletion, and incident trace.

A gateway issue is reportable when routing, isolation, retention, or versioning has a concrete impact on data, cost, or control.

Gateway Test Matrix

Area	Test	Evidence
Tenant isolation	Verify API keys, org IDs, project IDs, files, embeddings, and logs cannot cross tenants.	Denied requests, trace IDs, role matrix.
Prompt routing	Confirm sensitive prompts never fall back to unapproved providers or regions.	Routing policy, fallback test, provider logs.
Version pinning	Track model, system prompt, tool schema, guardrail, and retrieval config versions.	Response metadata, release record, rollback path.
Abuse controls	Test quota, burst limits, cost caps, oversized context, file upload limits, and repetitive unsafe attempts.	Rate-limit responses, alerts, billing guardrail.
Retention	Verify prompt/completion/file retention, training opt-out, deletion, and redaction settings.	Provider config, retention screenshots, deletion proof.
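The abuse-controls row is easier to test when the expected guard behavior is written down first. The following is a minimal sketch of a per-tenant budget guard, assuming a gateway that knows the prompt token count and an estimated cost before dispatch. The class name, field names, and denial strings are hypothetical and not taken from any specific gateway product.

```python
from dataclasses import dataclass


@dataclass
class TenantBudget:
    """Hypothetical per-tenant guard: a context-size limit plus a spend cap."""
    max_tokens_per_request: int
    max_spend_usd: float
    spent_usd: float = 0.0

    def check(self, prompt_tokens: int, est_cost_usd: float) -> str:
        """Return 'allow' or a denial reason; spend is recorded only when allowed."""
        if prompt_tokens > self.max_tokens_per_request:
            return "deny:oversized_context"
        if self.spent_usd + est_cost_usd > self.max_spend_usd:
            return "deny:cost_cap"
        self.spent_usd += est_cost_usd
        return "allow"


# Illustrative values only.
budget = TenantBudget(max_tokens_per_request=1000, max_spend_usd=1.0)
print(budget.check(prompt_tokens=500, est_cost_usd=0.6))
print(budget.check(prompt_tokens=5000, est_cost_usd=0.1))
```

A test against a real gateway would look for the equivalent signals in the Evidence column: a rate-limit or denial response, an alert, and an unchanged billing total after the denied request.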

High-Value Findings

Unapproved Provider Fallback

Sensitive prompts route to a public provider when the preferred model is unavailable.
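One way to reason about this finding is to compare the gateway's behavior with a fail-closed routing policy. The sketch below assumes prompts carry a classification label and that the policy is a simple allow-list per class. The provider names, the `APPROVED` table, and `pick_provider` are hypothetical.

```python
from typing import Optional

# Hypothetical policy: prompt class -> providers allowed to receive it, in priority order.
APPROVED = {
    "sensitive": ["private-llm"],
    "general": ["private-llm", "public-llm"],
}


def pick_provider(prompt_class: str, available: set) -> Optional[str]:
    """Return the first approved provider that is available, or None (fail closed)."""
    for provider in APPROVED.get(prompt_class, []):
        if provider in available:
            return provider
    return None


# Primary outage: only the public provider is reachable.
print(pick_provider("sensitive", {"public-llm"}))
print(pick_provider("general", {"public-llm"}))
```

Under this policy a sensitive prompt is rejected during an outage rather than silently rerouted. A gateway that returns a completion in the same situation is a candidate for this finding.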

Prompt Log Overexposure

Operators, support roles, or other tenants can access raw prompts containing confidential data.
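A privacy-preserving log view can serve as the baseline when assessing this finding. The sketch below assumes log records are dictionaries with `tenant_id` and `prompt` fields, that viewers are bound to a single tenant, and that only one role may read raw prompt text. The role name and field names are hypothetical.

```python
import hashlib
from typing import Optional

# Hypothetical set of roles allowed to read raw prompt text.
RAW_PROMPT_ROLES = {"incident_responder"}


def log_view(record: dict, viewer_role: str, viewer_tenant: str) -> Optional[dict]:
    """Return the view of a log record this viewer may see, or None if denied."""
    if record["tenant_id"] != viewer_tenant:
        return None  # cross-tenant access is always denied in this sketch
    view = {k: v for k, v in record.items() if k != "prompt"}
    if viewer_role in RAW_PROMPT_ROLES:
        view["prompt"] = record["prompt"]
    else:
        # Other roles get a digest and a length, enough to correlate but not to read.
        digest = hashlib.sha256(record["prompt"].encode("utf-8")).hexdigest()
        view["prompt_sha256"] = digest
        view["prompt_chars"] = len(record["prompt"])
    return view
```

If a support account in the target environment can read the canary text itself, rather than a digest or a redacted form, that is evidence of overexposure.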

Model Version Drift

Security behavior changes because model or prompt versions are not pinned, logged, or regression-tested.
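Drift is easier to detect when the pinned configuration is reduced to a fingerprint and compared with what the gateway reports at request time. The sketch below assumes the release record and the live response metadata can both be expressed as flat dictionaries. The field names and version strings are hypothetical.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Stable SHA-256 over a config dict, independent of key order."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drifted_fields(pinned: dict, live: dict) -> list:
    """Names of fields whose live value differs from the pinned release record."""
    keys = pinned.keys() | live.keys()
    return sorted(k for k in keys if pinned.get(k) != live.get(k))


# Illustrative values only.
pinned = {"model": "m-2024-01", "system_prompt_version": "v3", "guardrail_version": "g1"}
live = dict(pinned, model="m-latest")
print(drifted_fields(pinned, live))
```

Logging the fingerprint with every response gives a regression suite something concrete to assert on before and after a release.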

Best practice

Treat the model gateway like a security product: strict tenant isolation, explicit routing policy, versioned prompts, privacy-preserving logs, cost controls, and eval gates before release.

AI App Testing

Operator Playbook

Test model APIs and gateways for tenant isolation, provider routing, prompt/version drift, logging exposure, cost abuse, and sensitive-output handling.

Authorized use only

Offensive Focus

Treat the model gateway like a security-critical API with AI-specific metadata and policy decisions.
Probe provider fallback, rate limits, cache behavior, prompt logging, and tenant boundaries.
Validate that every route logs enough context to reproduce security failures.

Evidence To Capture

Written scope and allowed test classes
Timestamped prompts, retrieved context, tool calls, and response artifacts
Request IDs, model/provider/version, policy decisions, and tenant or user role
Screenshots or exported logs that reproduce the finding without exposing client secrets

Offensive Test Cases

Provider fallback abuse check

Objective: Verify whether sensitive prompts route to unapproved providers during errors, quotas, or model unavailability.
Authorized setup: Use staging routes and synthetic sensitive markers.
Evidence: Request ID, route decision, provider/model, fallback reason, prompt classification, and logs.
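Once the fallback test has run, the evidence can be checked mechanically. The sketch below assumes route decisions are exported as a list of dictionaries containing the evidence fields listed above. The key names and the approved-provider set are hypothetical and would need to match the real log export.

```python
def find_fallback_violations(route_logs, approved_providers):
    """Return (request_id, provider, fallback_reason) for sensitive prompts sent off-policy."""
    violations = []
    for entry in route_logs:
        if entry["prompt_class"] != "sensitive":
            continue
        if entry["provider"] not in approved_providers:
            violations.append(
                (entry["request_id"], entry["provider"], entry.get("fallback_reason"))
            )
    return violations


# Synthetic staging logs; the second request fell back during a simulated outage.
logs = [
    {"request_id": "r1", "prompt_class": "sensitive", "provider": "private-llm"},
    {"request_id": "r2", "prompt_class": "sensitive", "provider": "public-llm",
     "fallback_reason": "primary_unavailable"},
    {"request_id": "r3", "prompt_class": "general", "provider": "public-llm"},
]
print(find_fallback_violations(logs, approved_providers={"private-llm"}))
```

Each returned tuple maps directly to a reproducible finding: the request ID locates the trace, and the fallback reason shows which failure condition triggered the reroute.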

Tenant isolation and cache probe

Objective: Check whether prompts, completions, embeddings, or cache entries leak across tenants or roles.
Authorized setup: Use two lab tenants with canary prompts and non-sensitive outputs.
Evidence: Tenant IDs, cache keys, request/response pairs, logs, and isolation decision.
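The cache part of this probe rests on one question: is the tenant part of the cache key? The sketch below shows a tenant-scoped key and a two-tenant canary check against an in-memory cache. `cache_key` and `CompletionCache` are hypothetical stand-ins for illustration, not the design of any particular gateway.

```python
import hashlib
import json


def cache_key(tenant_id: str, model: str, prompt: str) -> str:
    """Tenant-scoped key: the same prompt from two tenants yields different keys."""
    # JSON-encoding the list keeps field boundaries unambiguous.
    material = json.dumps([tenant_id, model, prompt])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


class CompletionCache:
    """Minimal in-memory completion cache used only for this lab sketch."""

    def __init__(self):
        self._store = {}

    def put(self, tenant_id, model, prompt, completion):
        self._store[cache_key(tenant_id, model, prompt)] = completion

    def get(self, tenant_id, model, prompt):
        return self._store.get(cache_key(tenant_id, model, prompt))


# Canary probe: tenant A seeds the cache, tenant B sends the identical prompt.
cache = CompletionCache()
cache.put("tenant-a", "m1", "What is our canary?", "CANARY-A")
print(cache.get("tenant-b", "m1", "What is our canary?"))
```

Against a real gateway the same probe is run over the API: if tenant B receives tenant A's canary completion, or a response fast enough to indicate a shared cache hit, record both request/response pairs and the cache keys if they are exposed.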

Common Findings

Gateway fallback routes bypass data residency or approved-provider requirements.
Prompt and completion logs are visible to broad support or operator roles.
Model and prompt versions are not pinned, so security behavior drifts silently.

Lab Ideas

Create two lab tenants and test cache isolation with canary prompts.
Simulate a primary model failure and inspect provider fallback behavior.
Write a gateway log schema that supports incident reconstruction.
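For the log schema lab, a starting point is a single record type covering the evidence fields used throughout this playbook. The following is one possible shape, not a standard. Every field name is an assumption, and the prompt is stored as a digest rather than raw text to limit exposure.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class GatewayLogRecord:
    """Hypothetical per-request record aimed at incident reconstruction."""
    request_id: str
    timestamp_utc: str            # ISO 8601 timestamp in UTC
    tenant_id: str
    caller_role: str
    prompt_class: str             # e.g. "sensitive" or "general"
    provider: str
    model_version: str
    prompt_template_version: str
    route_decision: str           # e.g. "primary", "fallback", "denied"
    fallback_reason: Optional[str]
    prompt_sha256: str            # digest only; raw text is stored elsewhere, if at all
    prompt_tokens: int
    completion_tokens: int


def to_json_line(record: GatewayLogRecord) -> str:
    """Serialize one record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```

A useful exercise is to take one of the test cases above and confirm that a single record answers who sent the request, where it was routed, why, and under which model and prompt versions.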
