Offensive AI

Beginner

T1592 | Gather Victim Host Information T1588.005 | Obtain Capabilities: Exploits

Introduction to Offensive AI

Offensive AI combines the reasoning power of Large Language Models with traditional security tooling through standardized protocols like MCP. In 2026, AI agents don't just assist pentesting — they orchestrate multi-tool workflows, correlate findings across attack surfaces, and generate human-quality reports in minutes.

Foundation Page

This is the starting point for the Offensive AI section. Every subsequent module builds on the concepts, tooling, and threat models introduced here.

What is Offensive AI?

Offensive AI refers to the application of artificial intelligence and machine learning to penetration testing, vulnerability research, exploit development, and security assessments. Unlike traditional automated scanners that follow rigid rule sets, AI agents can:

Understand context — adapt testing strategies based on target architecture and observed responses
Correlate findings — link results across nmap, nuclei, Burp, and custom scripts into unified attack narratives
Generate custom payloads — craft exploits tailored to specific software versions and configurations
Explain and report — produce human-readable executive summaries and technical remediation guidance
Chain actions autonomously — plan multi-step attack paths and execute them with human-in-the-loop approval
Learn from engagements — improve subsequent tests using contextual memory and retrieval-augmented generation (RAG)

The Model Context Protocol (MCP)

The Model Context Protocol (MCP) is an open standard that defines how AI models interact with external tools and data sources. Originally developed by Anthropic and now widely adopted, MCP gives LLMs a structured interface to invoke security tools, read scan results, and trigger workflows — all without bespoke integrations.

graph LR subgraph AI Client A[Claude / Copilot / Cursor] end subgraph MCP Layer B[MCP Server] C[Tool Registry] D[Resource Provider] E[Prompt Templates] end subgraph Security Tools F[nmap] G[nuclei] H[sqlmap] I[Burp Suite] J[Custom Scripts] end A -- "JSON-RPC" --> B B --> C B --> D B --> E C --> F C --> G C --> H C --> I C --> J

Key MCP Concepts

Tools

Functions the AI can invoke to perform actions — run nmap scans, execute sqlmap, generate nuclei templates, or call custom scripts.

Resources

Data sources the AI can read — scan results, configuration files, vulnerability databases, and engagement notes.

Prompts

Pre-defined templates for common security workflows — recon checklists, exploit chains, and report generation pipelines.

Sampling

Server-initiated requests for AI decision-making — the server asks the model to classify a finding or choose a next action.

Example MCP configuration for Claude Desktop with a security toolkit and local Ollama bridge:

claude_desktop_config.json

json

// claude_desktop_config.json — MCP server configuration
// Location: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
//           %APPDATA%\Claude\claude_desktop_config.json (Windows)
{
  "mcpServers": {
    "security-toolkit": {
      // Point this at a real security MCP server you run locally —
      // e.g. HexStrike AI (see /offensive-ai/02-hexstrike/) or secops-mcp
      "command": "python3",
      "args": ["/opt/mcp-servers/security_server.py"],
      "env": {
        "NMAP_PATH": "/usr/bin/nmap",
        "NUCLEI_PATH": "/usr/bin/nuclei"
      }
    },
    "ollama-bridge": {
      "command": "python3",
      "args": ["/opt/mcp-servers/ollama_bridge.py"],
      "env": {
        "OLLAMA_HOST": "http://localhost:11434",
        "MODEL": "llama3.3:70b"
      }
    }
  }
}
// claude_desktop_config.json — MCP server configuration
// Location: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
//           %APPDATA%\Claude\claude_desktop_config.json (Windows)
{
  "mcpServers": {
    "security-toolkit": {
      // Point this at a real security MCP server you run locally —
      // e.g. HexStrike AI (see /offensive-ai/02-hexstrike/) or secops-mcp
      "command": "python3",
      "args": ["/opt/mcp-servers/security_server.py"],
      "env": {
        "NMAP_PATH": "/usr/bin/nmap",
        "NUCLEI_PATH": "/usr/bin/nuclei"
      }
    },
    "ollama-bridge": {
      "command": "python3",
      "args": ["/opt/mcp-servers/ollama_bridge.py"],
      "env": {
        "OLLAMA_HOST": "http://localhost:11434",
        "MODEL": "llama3.3:70b"
      }
    }
  }
}

Types of AI Security Tools

The 2026 AI security ecosystem falls into four primary categories, each with distinct strengths and integration patterns.

1. MCP Platforms

Full-stack platforms that expose security tools via MCP servers, letting any compatible AI client orchestrate complex engagements.

HexStrike AI — 150+ tools, 12+ specialized agents, full MCP server
MCP Reference Servers — Official modelcontextprotocol/servers collection, extendable with security tools
secops-mcp — Open-source all-in-one security testing toolbox behind a single MCP interface

2. AI Copilots

IDE and tool extensions that add AI-powered analysis, code generation, and vulnerability detection to existing workflows.

GitHub Copilot — Code-aware security analysis within VS Code
ReconAIzer — Burp Suite AI-powered traffic analysis
Nuclei AI — Automatic vulnerability template generation
BurpGPT — LLM-driven HTTP request/response analysis

3. Agent Frameworks

Autonomous and semi-autonomous systems that plan, execute, and iterate on multi-step security tasks.

LangChain Agents — Composable tool-calling agents with memory and planning
CrewAI — Multi-agent orchestration for complex engagements
AutoGen — Microsoft's multi-agent conversation framework
PentestGPT — Interactive penetration testing assistant

4. Security LLMs

Language models fine-tuned or optimized for security tasks — vulnerability analysis, exploit generation, and threat intelligence.

WhiteRabbitNeo — local security-focused LLM for offensive research
HackerGPT — Bug bounty and vulnerability research assistant
DeepSeek-R1 — Reasoning model with strong code/security capabilities
Llama 3.3 (70B) — Open-weight model suitable for local security tasks

Supported AI Clients

MCP-based security tools integrate with a growing ecosystem of AI clients. Choose based on your workflow — desktop app for interactive sessions, IDE for code-centric testing, or API for pipeline automation.

Claude Desktop

Native MCP support, best-in-class tool calling

VS Code Copilot

MCP agent mode, inline security analysis

Cursor

MCP integration with codebase-aware context

Windsurf

Cascade AI with MCP tool orchestration

Cline

VS Code extension with autonomous MCP agent

Roo Code

Multi-model MCP client with custom modes

Use Cases: AI vs. Traditional

Illustrative time comparisons for AI-assisted versus manual workflows. Actual results vary widely by target, tooling, and operator skill — treat these as rough directional estimates, not measured benchmarks.

Use Case	Traditional	With AI (2026)	Speedup	Quality Note
Subdomain Enumeration	2–4 hours	3–8 minutes	~30x	AI cross-references passive + active sources
Vulnerability Scanning + Triage	4–8 hours	10–20 minutes	~20x	Auto-deduplicates and ranks by exploitability
Web Application Testing	1–3 days	2–6 hours	~8x	Covers auth, logic, and injection classes
Report Generation	4–12 hours	5–15 minutes	~50x	Draft quality; human review still required
Exploit Development	1–5 days	1–8 hours	~10x	Works best for known CVE patterns
Attack Surface Mapping	1–2 days	15–45 minutes	~40x	Aggregates DNS, certs, ports, cloud metadata

AI Limitations

AI agents are powerful force-multipliers but not replacements for human expertise. Always verify AI-generated findings, review exploit code before execution, validate business logic flaws manually, and maintain proper oversight of autonomous agents. False positives and hallucinated CVEs remain common failure modes.

MITRE ATLAS Framework

ATLAS (Adversarial Threat Landscape for AI Systems) is MITRE's threat framework specifically designed for adversarial attacks on machine learning systems. While ATT&CK covers traditional infrastructure, ATLAS maps tactics and techniques for attacking, evading, and exploiting AI/ML models — directly relevant to both offensive AI usage and defending against it.

Offensive Relevance

ATLAS techniques you'll encounter when attacking AI-integrated targets:

• AML.T0051 — LLM Prompt Injection (direct and indirect)
• AML.T0054 — LLM Jailbreak (bypassing model guardrails)
• AML.T0043 — Craft Adversarial Data (evasion samples)
• AML.T0020 — Poison Training Data (training data manipulation)
• AML.T0024 — Exfiltration via AI Inference API (data leakage through outputs)

Defensive Mapping

Understanding ATLAS also helps you assess AI defenses in target environments:

• Map ML model deployment surfaces during reconnaissance
• Identify prompt injection vectors in AI-powered applications
• Test robustness of AI-based WAFs and anomaly detectors
• Evaluate data pipeline integrity for poisoning opportunities
• Assess API rate limits and model access controls

Query the ATLAS matrix programmatically to map adversarial ML techniques:

atlas_navigator.py

python

# MITRE ATLAS Navigator — list adversarial ML techniques
# Prerequisites: pip install requests pyyaml
# Docs: https://atlas.mitre.org/
import requests, yaml

# Fetch the ATLAS matrix (adversarial ML tactics and techniques)
ATLAS_URL = "https://raw.githubusercontent.com/mitre-atlas/atlas-data/main/dist/ATLAS.yaml"
data = yaml.safe_load(requests.get(ATLAS_URL).text)

# Techniques live under the ATLAS matrix
techniques = data["matrices"][0]["techniques"]
for t in techniques[:10]:
    print(f"  {t['id']}: {t['name']}")

# --- Selected LLM/GenAI techniques (verify current IDs at atlas.mitre.org) ---
# AML.T0051: LLM Prompt Injection
# AML.T0054: LLM Jailbreak
# AML.T0043: Craft Adversarial Data
# AML.T0020: Poison Training Data
# AML.T0015: Evade AI Model
# AML.T0024: Exfiltration via AI Inference API
# AML.T0057: LLM Data Leakage
# AML.T0044: Full AI Model Access
# MITRE ATLAS Navigator — list adversarial ML techniques
# Prerequisites: pip install requests pyyaml
# Docs: https://atlas.mitre.org/
import requests, yaml

# Fetch the ATLAS matrix (adversarial ML tactics and techniques)
ATLAS_URL = "https://raw.githubusercontent.com/mitre-atlas/atlas-data/main/dist/ATLAS.yaml"
data = yaml.safe_load(requests.get(ATLAS_URL).text)

# Techniques live under the ATLAS matrix
techniques = data["matrices"][0]["techniques"]
for t in techniques[:10]:
    print(f"  {t['id']}: {t['name']}")

# --- Selected LLM/GenAI techniques (verify current IDs at atlas.mitre.org) ---
# AML.T0051: LLM Prompt Injection
# AML.T0054: LLM Jailbreak
# AML.T0043: Craft Adversarial Data
# AML.T0020: Poison Training Data
# AML.T0015: Evade AI Model
# AML.T0024: Exfiltration via AI Inference API
# AML.T0057: LLM Data Leakage
# AML.T0044: Full AI Model Access

ATLAS vs ATT&CK

ATLAS complements ATT&CK — it doesn't replace it. When pentesting AI-integrated targets, use ATT&CK for infrastructure and network technique mapping, and ATLAS for ML model-specific attack vectors. Many engagements in 2026 require both.

Getting Started Checklist

AI Client — Install Claude Desktop, VS Code with Copilot, Cursor, or Windsurf
Local LLM — Install Ollama and pull llama3.3:70b or deepseek-r1:32b
Python 3.11+ — Required for MCP SDK, LangChain, and agent frameworks
Isolated lab — VM or container network for safe testing (Kali, ParrotOS)
Security tools — nmap, nuclei, gobuster, sqlmap, ffuf, Burp Suite Community
MCP server — Clone and configure an MCP security platform (e.g., HexStrike)
Authorization — Obtain written permission before testing any target

Set up a local Ollama instance as your offline security LLM:

ollama-setup.sh

bash

# Prerequisites: curl, Docker (optional)
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull a security-focused model
ollama pull llama3.3:70b
ollama pull deepseek-r1:32b

# Verify the model is running
ollama list
# NAME                  ID            SIZE    MODIFIED
# llama3.3:70b          a]6b4c8df...  39 GB   2 minutes ago
# deepseek-r1:32b       f1e2d3c4...   18 GB   5 minutes ago

# Run an interactive security query
ollama run llama3.3:70b "Explain the OWASP Top 10 2025 changes"

# Start the Ollama API server (default: http://localhost:11434)
ollama serve

# Test API connectivity
curl http://localhost:11434/api/tags | jq '.models[].name'
# Prerequisites: curl, Docker (optional)
# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull a security-focused model
ollama pull llama3.3:70b
ollama pull deepseek-r1:32b

# Verify the model is running
ollama list
# NAME                  ID            SIZE    MODIFIED
# llama3.3:70b          a]6b4c8df...  39 GB   2 minutes ago
# deepseek-r1:32b       f1e2d3c4...   18 GB   5 minutes ago

# Run an interactive security query
ollama run llama3.3:70b "Explain the OWASP Top 10 2025 changes"

# Start the Ollama API server (default: http://localhost:11434)
ollama serve

# Test API connectivity
curl http://localhost:11434/api/tags | jq '.models[].name'

Runs untrusted remote code

curl … | sh executes whatever the server returns, unreviewed and with your privileges. Fetch the script first (curl -fsSL https://ollama.com/install.sh -o install.sh), read it, then run it — or use a pinned package/container image. The same applies to any code sample here that pulls a live URL at runtime.

Minimal MCP-aware agent using the Python SDK:

mcp_agent_example.py

python

# Prerequisites: pip install langchain openai mcp
# Example: minimal MCP-aware agent using LangChain + MCP
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def run_security_agent():
    """Connect to an MCP security server and invoke a scan tool."""
    server_params = StdioServerParameters(
        command="python3",
        args=["/opt/mcp-servers/security_server.py"]
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # List available security tools
            tools = await session.list_tools()
            for tool in tools.tools:
                print(f"  Tool: {tool.name} — {tool.description}")

            # Call a reconnaissance tool
            result = await session.call_tool(
                "nmap_scan",
                arguments={"target": "192.168.1.0/24", "flags": "-sV --top-ports 100"}
            )
            print(result.content[0].text)

asyncio.run(run_security_agent())
# Prerequisites: pip install langchain openai mcp
# Example: minimal MCP-aware agent using LangChain + MCP
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def run_security_agent():
    """Connect to an MCP security server and invoke a scan tool."""
    server_params = StdioServerParameters(
        command="python3",
        args=["/opt/mcp-servers/security_server.py"]
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # List available security tools
            tools = await session.list_tools()
            for tool in tools.tools:
                print(f"  Tool: {tool.name} — {tool.description}")

            # Call a reconnaissance tool
            result = await session.call_tool(
                "nmap_scan",
                arguments={"target": "192.168.1.0/24", "flags": "-sV --top-ports 100"}
            )
            print(result.content[0].text)

asyncio.run(run_security_agent())

Offensive AI Introduction Labs

Hands-on exercises to build your AI-assisted pentesting foundation.

AI Red Teaming TryHackMe easy

Prompt injection fundamentalsLLM jailbreak techniquesAI-assisted reconnaissanceEvaluating AI-generated exploits

Open Lab

HTB Academy: AI Red Teamer Path Hack The Box medium

ML model exploitationAI API abuseAdversarial input craftingAI-powered CTF problem solving

Open Lab

Need help setting up? Check our Lab Setup Guide →

Foundations

Operator Playbook

Establish the offensive mental model for AI systems: where instructions enter, where trust is assigned, and where impact can be proven.

Authorized use only

Offensive Focus

Model the system as input channels, context sources, decision points, tools, outputs, and logs.
Translate AI concepts into testable abuse hypotheses rather than generic AI terminology.
Use MITRE ATLAS, OWASP LLM/GenAI, and ATT&CK as finding taxonomies, not as substitutes for evidence.

Evidence To Capture

Written scope and allowed test classes
Timestamped prompts, retrieved context, tool calls, and response artifacts
Request IDs, model/provider/version, policy decisions, and tenant or user role
Screenshots or exported logs that reproduce the finding without exposing client secrets

Offensive Test Cases

Trust-boundary sketch

Objective: Draw where user prompts, system prompts, retrieved data, tool schemas, memory, and output consumers intersect.
Authorized setup: Work from architecture docs, approved interviews, and staging observations.
Evidence: Boundary diagram plus list of untrusted-to-trusted transitions for later testing.

First abuse hypothesis

Objective: Write three testable hypotheses for prompt injection, retrieval abuse, or unsafe tool use.
Authorized setup: Use only permitted fixtures and known test accounts.
Evidence: Hypothesis, expected control, observed behavior, and next test decision.

Common Findings

Teams cannot identify which AI inputs are trusted, untrusted, or mixed-trust.
System prompts and tool schemas are treated as secret controls instead of defense-in-depth context.
No owner can explain how model outputs are validated before downstream use.

Lab Ideas

Threat-model a simple chatbot with retrieval and one ticketing tool.
Map one AI feature to ATLAS and OWASP LLM categories.
Create a harmless canary value and trace where it appears in prompts, logs, and outputs.

Related Guides

HexStrike AI Platform

Deep dive into the HexStrike AI platform and its MCP-based toolkit.

AI Agent Workflows

Building multi-step autonomous security assessment workflows.

OSINT Techniques

Open-source intelligence collection paired with AI analysis.

Counter-Surveillance

Adversarial ML and evasion techniques against AI-powered surveillance.