Methodology
Advanced
AML.T0051

AI Application Pentest Methodology

Treat AI features like distributed systems with probabilistic decision points. The test target is not only the model: it is the prompt chain, retrieval layer, permission model, gateway, observability stack, and the business process that trusts the output.

Use the planner first

Start with the planner below to create a scoped test plan, then refine it against the client's architecture and rules of engagement.
AI engagement planner

Build a scoped AI test plan in seconds

Select the system shape and generate a practitioner checklist mapped to OWASP LLM, MITRE ATLAS, evidence, and report deliverables.

Checks
12
Evidence
Required
Output
Markdown
01 Scoping

Document model/provider, user roles, data classes, tool permissions, allowed tests, and explicit no-go actions before testing begins.

Rules of engagement addendum, architecture sketch, permission inventory.

OWASP LLM01OWASP LLM07
02 Prompt injection

Test direct and indirect instruction conflicts across user input, retrieved content, files, tickets, web pages, and third-party connectors.

Prompt, source content, model response, guardrail behavior, impact notes.

OWASP LLM01MITRE ATLAS AML.T0051
03 Abuse-case execution

Run at least one end-to-end authorized abuse path that starts with adversarial input and reaches an unsafe answer, data exposure, permission decision, or attempted tool action.

Abuse-case objective, fixture data, model context, response, downstream effect, and business-impact statement.

OWASP LLM01OWASP LLM05OWASP LLM06
04 Regression pack

Convert accepted findings into repeatable evals with expected safe behavior, owner, severity, and rerun instructions.

Prompt/eval file, fixture IDs, scoring rule, model version, prompt version, pass/fail output.

OWASP LLM09OWASP LLM10
05 Output handling

Verify that generated output cannot trigger unsafe rendering, hidden links, credential disclosure, or automated downstream actions.

Rendered output capture, sanitizer behavior, blocked content examples.

OWASP LLM02OWASP LLM05
06 Monitoring

Confirm prompts, retrieval hits, tool calls, approvals, and high-risk denials are logged with enough context for incident response.

Log samples, alert rules, retention policy, privacy notes.

OWASP LLM10
07 Retrieval boundaries

Test cross-tenant retrieval, document-level ACLs, stale index entries, poisoned chunks, and citation integrity.

Corpus sample, retrieval trace, denied-document test, citation diff.

OWASP LLM02OWASP LLM06OWASP LLM08
08 RAG abuse paths

Seed harmless poisoned documents, cross-tenant canaries, stale index records, and citation-laundering fixtures to prove retrieval trust-boundary failures.

Document ID, chunk ID, retrieval score, source ACL, answer text, citation comparison, and canary result.

OWASP LLM01OWASP LLM02OWASP LLM08
09 Knowledge ingestion

Review upload, crawl, sync, and connector paths for malicious instructions, hidden text, metadata injection, and unsafe file parsing.

Ingestion path map, sample poisoned document, sanitizer output.

OWASP LLM03MITRE ATLAS AML.T0046
10 Read scope

Confirm read tools cannot access secrets, unrelated tenants, local credential stores, or out-of-scope repositories.

Allowlist, denied read tests, data-boundary notes.

OWASP LLM06
11 Confidentiality

Prefer local or approved enterprise endpoints, redact secrets before prompts, and record exactly where client data is processed.

Data-flow record, redaction examples, approved endpoint list.

OWASP LLM06OWASP LLM10
12 Hybrid routing

Verify sensitive prompts route only to approved local/private endpoints and fallback behavior does not leak data to public providers.

Routing policy, fallback test, denied external call.

OWASP LLM06OWASP LLM07

Assessment Phases

1. Scope and data rules

Document AI features, model/provider, data classes, allowed test types, prohibited actions, and evidence handling.

2. Architecture mapping

Map users, prompts, retrieval, memory, tools, model gateways, output sinks, logging, and approval gates.

3. Abuse case design

Convert OWASP LLM, MITRE ATLAS, business flows, and tool permissions into testable hypotheses.

4. Controlled execution

Run tests in low-risk order: prompt conflicts, retrieval boundaries, tool abuse, tenant isolation, output handling, and monitoring.

5. Evidence and retest

Preserve prompts, traces, screenshots, logs, control behavior, business impact, and fix validation criteria.

Minimum Test Matrix

AreaWhat to testProof to collect
Prompt layerDirect, indirect, multi-turn, role conflict, and hidden instruction handling.Prompt, context, response, refusal or bypass trace.
RetrievalChunk poisoning, ACL filters, stale index entries, source ranking, and citation integrity.Retrieved chunks, scores, source IDs, user role.
Tools and agentsTool allowlists, argument validation, approval gates, memory persistence, sandbox escape paths.Tool manifest, approval logs, blocked calls, sandbox policy.
Model gatewayTenant isolation, abuse throttling, provider fallback, prompt/version pinning, sensitive-output controls.Gateway config, request IDs, rate-limit results, version record.
OperationsLogging, alerting, incident response, privacy retention, data minimization, and human handoff.Log samples, alert rule, retention setting, escalation workflow.

Client Deliverables

AI RoE addendum

Allowed models, providers, test classes, tool-use boundaries, evidence restrictions, and emergency stop conditions.

AI attack-surface map

Trust boundaries for prompts, retrieval, tools, memory, gateways, human approval, logs, and output consumers.

Regression pack

A small set of reproducible prompts, fixtures, and expected outcomes that validate fixes after model or prompt changes.

Quality bar

A strong AI finding includes the exact user role, source content or fixture, model/prompt version, retrieved context, tool-call trace if applicable, observed business impact, and the control that should have stopped it.

AI App Testing

Operator Playbook

Run an end-to-end authorized AI application pentest that covers prompts, retrieval, tools, gateways, users, logs, and downstream impact.

Authorized use only

Offensive Focus

  • Convert scope into abuse cases before testing individual prompts.
  • Prioritize tests that can cross trust boundaries, expose data, invoke tools, alter decisions, or create reportable business impact.
  • Turn every accepted finding into a regression fixture.

Evidence To Capture

  • Written scope and allowed test classes
  • Timestamped prompts, retrieved context, tool calls, and response artifacts
  • Request IDs, model/provider/version, policy decisions, and tenant or user role
  • Screenshots or exported logs that reproduce the finding without exposing client secrets

Offensive Test Cases

AI RoE addendum review

Objective
Identify gaps in scope around models, providers, prompt logs, vector stores, and tool actions.
Authorized setup
Review written authorization with the client before active testing.
Evidence
Approved test classes, exclusions, provider rules, emergency contacts, and stop conditions.

Abuse-case execution sprint

Objective
Execute a small prioritized set of AI-specific abuse cases and document impact.
Authorized setup
Use test accounts, fixtures, and approved provider routes.
Evidence
Test case, inputs, model context, outputs, logs, side effects, and remediation owner.

Common Findings

  • AI scope is bolted onto a web pentest without model, retrieval, or tool-use authorization.
  • Testing focuses on jailbreak prompts instead of business-impact abuse cases.
  • No one owns regression testing after model, prompt, or connector changes.

Lab Ideas

  • Write an AI pentest RoE addendum for a sample SaaS support bot.
  • Build a risk-ranked abuse-case backlog for RAG, agents, and model APIs.
  • Turn one finding into a Promptfoo regression test.