Model Gateway
Intermediate
T1190

Model API & Gateway Security

Model gateways concentrate risk: prompts, completions, embeddings, files, tenant routing, provider fallback, logs, billing, abuse controls, and model versions all pass through one service boundary.

What makes model APIs different

Traditional API tests still matter, but model APIs add high-volume unstructured input, probabilistic output, hidden prompt assembly, expensive abuse paths, sensitive prompt logs, and provider-side retention choices.

Model Gateway Abuse Flow

Model APIs add routing, tenancy, cost, retention, and version drift to normal API testing.

1. API: Request Enters. Tenant, role, API key, prompt class, file, embedding, or model route.

2. Gateway: Policy Routes. Provider selection, regional rules, fallback, cache, quota, and abuse throttles.

3. Inference: Model Responds. Output, safety metadata, token cost, tool schema, and prompt/version context.

4. Logs: Logs Persist. Prompt/completion retention, support access, redaction, deletion, and incident trace.

A gateway issue is reportable when routing, isolation, retention, or versioning creates concrete data, cost, or control impact.

Gateway Test Matrix

| Area | Test | Evidence |
|---|---|---|
| Tenant isolation | Verify API keys, org IDs, project IDs, files, embeddings, and logs cannot cross tenants. | Denied requests, trace IDs, role matrix. |
| Prompt routing | Confirm sensitive prompts never fall back to unapproved providers or regions. | Routing policy, fallback test, provider logs. |
| Version pinning | Track model, system prompt, tool schema, guardrail, and retrieval config versions. | Response metadata, release record, rollback path. |
| Abuse controls | Test quota, burst limits, cost caps, oversized context, file upload limits, and repetitive unsafe attempts. | Rate-limit responses, alerts, billing guardrail. |
| Retention | Verify prompt/completion/file retention, training opt-out, deletion, and redaction settings. | Provider config, retention screenshots, deletion proof. |
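
The abuse-controls row can be made concrete with a small admission check. The following is a minimal sketch, not a real gateway API: `TenantBudget`, `admit`, and the limit values are hypothetical names chosen for illustration, and a production gateway would also need burst limits and atomic accounting.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TenantBudget:
    """Hypothetical per-tenant limits: a context-size cap and a spend cap."""
    max_tokens_per_request: int
    max_cost_usd: float
    spent_usd: float = 0.0


def admit(budget: TenantBudget, prompt_tokens: int, est_cost_usd: float) -> Tuple[bool, str]:
    """Decide whether a request may proceed and return a reason code."""
    # Reject oversized context before any money is spent.
    if prompt_tokens > budget.max_tokens_per_request:
        return False, "oversized_context"
    # Reject requests that would push the tenant past its cost cap.
    if budget.spent_usd + est_cost_usd > budget.max_cost_usd:
        return False, "cost_cap_exceeded"
    # Only admitted requests count against the budget.
    budget.spent_usd += est_cost_usd
    return True, "ok"
```

When testing, the reason code returned for each denied request is the kind of rate-limit evidence the matrix asks for.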

High-Value Findings

Unapproved Provider Fallback

Sensitive prompts route to a public provider when the preferred model is unavailable.
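
One way to reason about this finding is a fail-closed routing rule: if no approved provider is available for a prompt class, the gateway returns an error instead of falling back. The sketch below assumes a hypothetical `APPROVED` policy table and invented provider names; it is not taken from any specific gateway product.

```python
from typing import Optional, Set

# Hypothetical policy: ordered list of providers approved for each prompt class.
APPROVED = {
    "sensitive": ["private-primary", "private-secondary"],
    "general": ["private-primary", "public-fallback"],
}


def choose_provider(prompt_class: str, available: Set[str]) -> Optional[str]:
    """Return the first approved provider that is available, else None.

    None means "fail closed": the caller should return an error rather than
    route the prompt to a provider outside the approved list.
    """
    for provider in APPROVED.get(prompt_class, []):
        if provider in available:
            return provider
    return None
```

A fallback test then simulates an outage by shrinking `available` and checks that sensitive prompts get `None` rather than a public provider. Unknown prompt classes also fail closed because they have no approved list.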

Prompt Log Overexposure

Operators, support roles, or other tenants can access raw prompts containing confidential data.
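
A privacy-preserving log record can keep enough to correlate incidents without storing the raw prompt for every role. This is a minimal sketch under stated assumptions: the role name `incident_responder`, the field names, and the email-only redaction pattern are illustrative, and real redaction would need to cover far more than email addresses.

```python
import hashlib
import re

# Illustrative pattern only; real deployments need broader detectors.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def log_record(prompt: str, role: str) -> dict:
    """Build a log entry whose detail depends on the viewer's role."""
    record = {
        # A hash lets investigators match identical prompts without reading them.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "prompt_chars": len(prompt),
    }
    # Only a narrowly scoped role (hypothetical name) sees redacted text.
    if role == "incident_responder":
        record["prompt_redacted"] = EMAIL.sub("[EMAIL]", prompt)
    return record
```

During a test, requesting the same log entry under each role and comparing the returned fields shows whether support or operator roles see more than intended.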

Model Version Drift

Security behavior changes because model or prompt versions are not pinned, logged, or regression-tested.
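
Drift becomes testable once the expected versions are written down and compared against response metadata. The sketch below assumes a hypothetical pinned manifest; the field names and version strings are placeholders, not identifiers from any real provider.

```python
# Hypothetical pinned release manifest; all identifiers are placeholders.
PINNED = {
    "model": "example-model-2024-01",
    "system_prompt": "sp-v12",
    "tool_schema": "tools-v4",
    "guardrail": "guard-v7",
    "retrieval_config": "rag-v3",
}


def find_drift(pinned: dict, observed: dict) -> dict:
    """Map each drifted field to (pinned value, observed value or None)."""
    return {
        field: (expected, observed.get(field))
        for field, expected in pinned.items()
        if observed.get(field) != expected
    }
```

A field missing from the observed metadata is reported as drift with `None`, since a response that does not disclose its version cannot be regression-tested against the pin.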

Best practice

Treat the model gateway like a security product: strict tenant isolation, explicit routing policy, versioned prompts, privacy-preserving logs, cost controls, and eval gates before release.

AI App Testing

Operator Playbook

Test model APIs and gateways for tenant isolation, provider routing, prompt/version drift, logging exposure, cost abuse, and sensitive-output handling.

Authorized use only

Offensive Focus

  • Treat the model gateway like a security-critical API with AI-specific metadata and policy decisions.
  • Probe provider fallback, rate limits, cache behavior, prompt logging, and tenant boundaries.
  • Validate that every route logs enough context to reproduce security failures.

Evidence To Capture

  • Written scope and allowed test classes
  • Timestamped prompts, retrieved context, tool calls, and response artifacts
  • Request IDs, model/provider/version, policy decisions, and tenant or user role
  • Screenshots or exported logs that reproduce the finding without exposing client secrets

Offensive Test Cases

Provider fallback abuse check

Objective: Verify whether sensitive prompts route to unapproved providers during errors, quotas, or model unavailability.
Authorized setup: Use staging routes and synthetic sensitive markers.
Evidence: Request ID, route decision, provider/model, fallback reason, prompt classification, and logs.

Tenant isolation and cache probe

Objective: Check whether prompts, completions, embeddings, or cache entries leak across tenants or roles.
Authorized setup: Use two lab tenants with canary prompts and non-sensitive outputs.
Evidence: Tenant IDs, cache keys, request/response pairs, logs, and isolation decision.
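
The cache probe is easier to interpret with a reference for what a tenant-scoped cache key looks like. The sketch below is an assumption about one reasonable design, not a description of any particular gateway: `cache_key` is a hypothetical helper that binds tenant, role, model, and prompt into a single hash.

```python
import hashlib


def cache_key(tenant_id: str, role: str, model: str, prompt: str) -> str:
    """Derive a cache key that is unique per tenant, role, model, and prompt."""
    h = hashlib.sha256()
    for part in (tenant_id, role, model, prompt):
        data = part.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") cannot collide.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


# Probe: tenant A caches a canary response, then tenant B sends the same prompt.
cache = {}
cache[cache_key("tenant-a", "user", "example-model", "CANARY-123")] = "response-for-a"
leaked = cache.get(cache_key("tenant-b", "user", "example-model", "CANARY-123"))
```

In this design `leaked` is `None`. A real gateway that returns tenant A's canary response to tenant B is likely keying its cache on the prompt alone.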

Common Findings

  • Gateway fallback routes bypass data residency or approved-provider requirements.
  • Prompt and completion logs are visible to broad support or operator roles.
  • Model and prompt versions are not pinned, so security behavior drifts silently.

Lab Ideas

  • Create two lab tenants and test cache isolation with canary prompts.
  • Simulate a primary model failure and inspect provider fallback behavior.
  • Write a gateway log schema that supports incident reconstruction.
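
As a starting point for the log schema lab, the sketch below collects the fields named in the evidence lists above into one record. The class name `GatewayLogEvent` and the exact field names are hypothetical, and the schema stores a prompt hash rather than raw text as one possible privacy choice.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class GatewayLogEvent:
    """Hypothetical per-request record for incident reconstruction."""
    request_id: str
    timestamp_utc: str          # ISO 8601, e.g. "2024-01-01T00:00:00Z"
    tenant_id: str
    role: str
    prompt_class: str           # e.g. "sensitive" or "general"
    provider: str
    model_version: str
    prompt_version: str
    route_decision: str         # e.g. "primary", "fallback", "denied"
    fallback_reason: Optional[str]
    prompt_sha256: str          # hash instead of raw prompt text
    total_tokens: int


def to_json_line(event: GatewayLogEvent) -> str:
    """Serialize one event as a single JSON line with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)
```

A useful exercise is to take a finding from the test cases and check whether these fields alone are enough to answer who sent what class of prompt, where it was routed, and why.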