Malware & Evasion
💀 Expert
T1027 T1055

AI Malware & Evasion

AI is fundamentally transforming the malware development lifecycle — from initial payload generation and polymorphic mutation to real-time evasion of endpoint detection. Understanding these techniques is critical for red teams simulating advanced adversaries and blue teams building next-generation defences.

Authorised Operations Only

The techniques in this chapter are exclusively for use in authorised red team engagements, sanctioned security research, and controlled lab environments. Developing, deploying, or distributing malware outside of explicit written authorisation is illegal under the Computer Fraud and Abuse Act (US), Computer Misuse Act (UK), and equivalent legislation worldwide. All code samples are conceptual or pseudocode — they demonstrate patterns and architecture, not working weaponised tools. If you are unsure whether your engagement scope permits these techniques, stop and consult your legal team.

1. Overview

Where adversaries once spent weeks hand-crafting evasive payloads, tuning shellcode, and testing against specific EDR products, large language models now accelerate every phase of the malware development lifecycle — from initial payload generation and polymorphic mutation to real-time evasion of endpoint detection.

This chapter covers the full adversary workflow: generating offensive payloads with uncensored LLMs, building polymorphic mutation engines, defeating EDR/AV through AI-guided evasion, conceptual LLM-based C2 architectures, and obfuscation pipelines. Every section pairs offensive technique with defensive countermeasure, because the purpose of studying these methods is to build better defences.

AI-Assisted Malware Development Lifecycle

mermaid
graph TB
    subgraph Research["Research Phase"]
        TARGETS[Target Environment Analysis]
        EDR_PROFILE[EDR/AV Profiling]
        DETECTIONS[Detection Signature Research]
    end
    subgraph Generation["AI Generation Phase"]
        LLM[LLM Payload Engine]
        POLY[Polymorphic Mutation]
        OBF[Obfuscation Layer]
    end
    subgraph Delivery["Delivery Phase"]
        STAGE[Staged Payload]
        LOADER[Custom Loader]
        C2[C2 Channel]
    end
    subgraph Evasion["Evasion Layer"]
        AMSI[AMSI Bypass]
        ETW[ETW Patching]
        UNHOOK[Ntdll Unhooking]
        SYSCALL[Direct Syscalls]
    end
    subgraph Execution["Post-Exploitation"]
        INJECT[Process Injection]
        PERSIST[Persistence]
        EXFIL[Data Exfiltration]
    end
    TARGETS --> LLM
    EDR_PROFILE --> LLM
    DETECTIONS --> LLM
    LLM --> POLY
    POLY --> OBF
    OBF --> STAGE
    STAGE --> LOADER
    LOADER --> AMSI
    AMSI --> ETW
    ETW --> UNHOOK
    UNHOOK --> SYSCALL
    SYSCALL --> INJECT
    INJECT --> C2
    C2 --> PERSIST
    C2 --> EXFIL
    style Research fill:#1a1a2e,stroke:#ff4444,color:#fff
    style Generation fill:#16213e,stroke:#ff8c00,color:#fff
    style Delivery fill:#0f3460,stroke:#e94560,color:#fff
    style Evasion fill:#1a1a2e,stroke:#0ff,color:#fff
    style Execution fill:#16213e,stroke:#00ff41,color:#fff

2. AI-Generated Payloads

Large language models have dramatically lowered the barrier to offensive code generation. While frontier models from OpenAI and Anthropic implement safety filters that refuse overtly malicious requests, the open-source ecosystem includes uncensored fine-tunes specifically designed for security research — models that will discuss exploitation techniques, generate offensive tooling, and explain evasion methods without refusal.

Security-Focused Models

Several model families are commonly used in offensive security research:

  • WhiteRabbitNeo — purpose-built for offensive and defensive cybersecurity, trained on security datasets. Available in 13B and 33B parameter variants. Discusses exploit development, evasion techniques, and payload generation without refusal.
  • Dolphin (Mistral/Llama fine-tunes) — "uncensored" fine-tunes that remove alignment restrictions. The most popular general-purpose models for unrestricted security research.
  • DeepSeek-Coder / CodeLlama — strong code generation models that, when run locally via Ollama, can generate offensive code with appropriate prompting.

Frontier Model Limitations

GPT-4, Claude, and Gemini will typically refuse to generate working malware, reverse shells, or exploitation code. However, they remain valuable for understanding techniques, analysing detection patterns, and reviewing defensive strategies. The refusal boundary is not absolute — creative prompt engineering can sometimes elicit partial technical content, which is itself a security research finding worth reporting.

Prompt Engineering for Payload Generation

Effective offensive prompting follows specific patterns that maximise model output quality:

  • Role framing — instruct the model to act as a "senior red team operator" or "malware analyst" to activate security-domain knowledge.
  • Specificity — provide exact target details: OS version, EDR product, architecture (x64/ARM), and language requirements.
  • Iterative refinement — start with a basic payload concept, then iterate: "Now modify this to bypass AMSI", "Add string encryption", "Replace direct API calls with dynamic resolution".
  • Constraint specification — define operational constraints: "Must be under 10KB", "Cannot use commonly signatured APIs", "Must work in PowerShell Constrained Language Mode".
ai_payload_research.py
python
# Educational: querying a local uncensored model for security research
# This demonstrates the API pattern — NOT a working exploit generator
import requests
import json

OLLAMA_URL = "http://localhost:11434/api/generate"

def query_security_model(prompt: str, model: str = "dolphin-mistral") -> str:
    """
    Query a locally-hosted uncensored model for security research.
    These models lack the safety filters of frontier models (GPT-4, Claude)
    and will discuss offensive techniques more freely.
    
    IMPORTANT: Only use in authorized red team engagements.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {
            "temperature": 0.7,
            "num_predict": 2048
        }
    }
    
    response = requests.post(OLLAMA_URL, json=payload, timeout=120)
    response.raise_for_status()
    return response.json()["response"]

# Example: analyzing evasion techniques (educational research)
research_prompt = """
As a security researcher, explain the general approach to 
Windows AMSI (Antimalware Scan Interface) bypass techniques.
Focus on the detection mechanisms and why certain bypass 
approaches work from a technical perspective.
Categorize by: memory patching, reflection, and COM hijacking.
"""

result = query_security_model(research_prompt)
print(result)

# Comparing model capabilities for security research
MODELS_TO_TEST = [
    "dolphin-mistral",      # Uncensored Mistral fine-tune
    "whiterabbitneo:13b",   # Security-focused model
    "llama3:8b",            # Base model (will often refuse)
]

def compare_model_responses(prompt: str):
    """Compare how different models handle security-related prompts."""
    for model in MODELS_TO_TEST:
        print(f"\n{'='*60}")
        print(f"Model: {model}")
        print(f"{'='*60}")
        try:
            response = query_security_model(prompt, model)
            # Classify response type
            refusal_keywords = ["I cannot", "I can't", "not appropriate", 
                                "I'm unable", "against my guidelines"]
            is_refusal = any(kw.lower() in response.lower() 
                           for kw in refusal_keywords)
            print(f"Response type: {'REFUSAL' if is_refusal else 'ENGAGED'}")
            print(f"Length: {len(response)} chars")
            print(f"Preview: {response[:200]}...")
        except Exception as e:
            print(f"Error: {e}")

Model Output Verification

LLM-generated offensive code frequently contains errors — incorrect API signatures, flawed logic, non-functional evasion techniques. Never trust model output without manual review and testing in an isolated lab environment. Models may also generate code that is technically correct but trivially detected by modern EDR products.
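That review step can itself be partially automated with differential testing: run the original and the generated variant against the same inputs in isolated interpreter processes and compare outputs. The harness below is a minimal sketch of this idea — the function name `f` and the two code snippets are illustrative, and a real lab setup would run inside a sandboxed VM rather than a bare subprocess.

```python
# Hypothetical review harness: never trust an LLM-mutated variant until
# it demonstrably behaves like the original on test inputs.
import os
import subprocess
import sys
import tempfile

def run_isolated(source: str, func: str, arg: int) -> str:
    """Execute untrusted generated code in a separate interpreter
    process with a timeout, returning its stdout."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source + f"\nprint({func}({arg}))\n")
        path = f.name
    try:
        out = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=5
        )
        return out.stdout.strip()
    finally:
        os.unlink(path)

def functionally_equivalent(original: str, variant: str,
                            func: str, test_inputs: list[int]) -> bool:
    """Differential test: same inputs must produce same outputs."""
    return all(
        run_isolated(original, func, x) == run_isolated(variant, func, x)
        for x in test_inputs
    )

original = "def f(n):\n    return sum(range(n))\n"
variant = (
    "def f(n):\n"
    "    total = 0\n"
    "    for i in range(n):\n"
    "        total += i\n"
    "    return total\n"
)

print(functionally_equivalent(original, variant, "f", [0, 1, 10, 100]))  # True
```

A disagreement on any input means the mutation broke the payload — exactly the class of silent LLM error this check exists to catch.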

3. Polymorphic Code Generation

Polymorphic malware mutates its own code on each execution or deployment while preserving core functionality. Traditional polymorphic engines use algorithmic transformations — XOR key rotation, register substitution, instruction reordering. AI-driven polymorphism is fundamentally more powerful because the LLM understands semantics, enabling mutations that are structurally novel rather than mechanically derived.

AI-Driven Polymorphic Mutation Engine

mermaid
graph LR
    subgraph Input["Original Payload"]
        SRC[Source Code]
        FUNC[Core Functions]
    end
    subgraph Mutation["AI Mutation Engine"]
        VAR[Variable Renaming]
        FLOW[Control Flow Changes]
        DEAD[Dead Code Insertion]
        ENC[String Encryption]
        API[API Call Substitution]
    end
    subgraph Output["Unique Variants"]
        V1[Variant A]
        V2[Variant B]
        V3[Variant C]
        VN[Variant N]
    end
    SRC --> VAR
    FUNC --> FLOW
    SRC --> DEAD
    FUNC --> ENC
    SRC --> API
    VAR --> V1
    FLOW --> V2
    DEAD --> V3
    ENC --> VN
    API --> V1
    style Input fill:#1a1a2e,stroke:#ff4444,color:#fff
    style Mutation fill:#16213e,stroke:#ff8c00,color:#fff
    style Output fill:#0f3460,stroke:#00ff41,color:#fff

LLM-Driven Code Mutation

An AI polymorphic engine works by feeding source code to an LLM with instructions to rewrite it in a functionally equivalent but structurally different form. Unlike traditional engines limited to predefined transformations, the LLM can apply transformations such as:

  • Semantic variable renaming — not just random strings, but contextually plausible names that defeat heuristic analysis looking for random identifiers.
  • Algorithm substitution — replace a sorting algorithm with a different one, use alternative data structures, rewrite loops as recursion.
  • Dead code injection — insert plausible-looking but non-functional code paths that increase complexity for static analysis.
  • API call variation — substitute equivalent Windows API calls (e.g., VirtualAllocEx vs NtAllocateVirtualMemory).
  • Control flow transformation — flatten control flow, add opaque predicates, convert if-else chains to switch dispatchers.

Per-Target Unique Payloads

The most significant advantage of AI polymorphism is generating unique payloads per engagement target. When every payload deployed against every target is structurally unique, signature-based detection becomes fundamentally ineffective. The defender must rely entirely on behavioural analysis, which the AI can also help circumvent.

polymorphic_engine_concept.py
python
# Conceptual: AI-driven polymorphic code mutation
# This is a RESEARCH DEMONSTRATION — not functional malware
import hashlib
import random
import string
import re
from dataclasses import dataclass

@dataclass
class MutationResult:
    original_hash: str
    mutated_hash: str
    mutation_ops: list[str]
    functionally_equivalent: bool

class PolymorphicEngine:
    """
    Demonstrates how AI can drive code mutation to defeat
    signature-based detection. Each generated variant is
    functionally identical but structurally unique.
    
    In real-world red team ops, this concept is applied to
    loaders, shellcode wrappers, and C2 implants.
    """
    
    def __init__(self, llm_endpoint: str):
        self.llm = llm_endpoint
        self.mutation_log = []
    
    # ── Mutation Primitives ─────────────────────────────
    
    @staticmethod
    def rename_variables(code: str) -> str:
        """Replace variable names with random alternatives."""
        # Identify variable assignments (simplified regex;
        # the (?!=) lookahead avoids matching == comparisons)
        var_pattern = r'\b([a-z_][a-z0-9_]*)\s*=(?!=)'
        variables = set(re.findall(var_pattern, code))
        
        # Exclude Python keywords and builtins
        reserved = {'if', 'else', 'for', 'while', 'def', 'class',
                     'return', 'import', 'from', 'True', 'False', 'None',
                     'self', 'print', 'range', 'len', 'str', 'int'}
        variables -= reserved
        
        mapping = {}
        for var in variables:
            new_name = '_' + ''.join(
                random.choices(string.ascii_lowercase, k=random.randint(6, 12))
            )
            mapping[var] = new_name
        
        mutated = code
        for old, new in mapping.items():
            mutated = re.sub(rf'\b{old}\b', new, mutated)
        
        return mutated
    
    @staticmethod
    def insert_dead_code(code: str) -> str:
        """Insert non-functional code that does not affect execution."""
        # Single-line snippets only — inserting multi-line statements at
        # arbitrary points would break Python's indentation
        dead_snippets = [
            "_ = [x**2 for x in range(random.randint(1,5))]",
            "if False: print(''.join(chr(i) for i in range(65,91)))",
            "__ = hashlib.md5(str(random.random()).encode()).hexdigest()",
        ]
        lines = code.split('\n')
        if len(lines) < 2:
            return code  # Nothing to safely insert between
        insert_points = sorted(
            random.sample(range(1, len(lines)),
                          min(3, len(lines) - 1)),
            reverse=True
        )
        for idx in insert_points:
            indent = len(lines[idx]) - len(lines[idx].lstrip())
            dead = ' ' * indent + random.choice(dead_snippets)
            lines.insert(idx, dead)
        return '\n'.join(lines)
    
    @staticmethod
    def reorder_functions(code: str) -> str:
        """Reorder independent function definitions."""
        # Split on function boundaries, shuffle, rejoin
        # (simplified — real implementation uses AST)
        return code  # Placeholder for AST-based reordering
    
    def mutate_with_llm(self, code: str, instruction: str) -> str:
        """Use an LLM to perform semantic-preserving mutations."""
        prompt = f"""Rewrite the following code to be functionally identical 
but structurally different. {instruction}
Do NOT change what the code does — only HOW it is written.

Code:
{code}"""
        # In practice, this calls the local Ollama API
        # response = requests.post(self.llm, json={...})
        return code  # Placeholder
    
    # ── Main Pipeline ───────────────────────────────────
    
    def generate_variant(self, source: str) -> MutationResult:
        """Generate a unique variant of the source code."""
        original_hash = hashlib.sha256(source.encode()).hexdigest()
        
        # Apply mutation chain
        ops = []
        mutated = source
        
        mutated = self.rename_variables(mutated)
        ops.append("variable_rename")
        
        mutated = self.insert_dead_code(mutated)
        ops.append("dead_code_insert")
        
        mutated = self.reorder_functions(mutated)
        ops.append("function_reorder")
        
        mutated_hash = hashlib.sha256(mutated.encode()).hexdigest()
        
        return MutationResult(
            original_hash=original_hash,
            mutated_hash=mutated_hash,
            mutation_ops=ops,
            functionally_equivalent=True  # In practice, verified by a test harness
        )

# Usage demonstration
engine = PolymorphicEngine("http://localhost:11434/api/generate")

sample_code = """
def callback(host, port):
    import socket
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((host, port))
    return s
"""

for i in range(5):
    result = engine.generate_variant(sample_code)
    print(f"Variant {i+1}: {result.mutated_hash[:16]}... "
          f"ops={result.mutation_ops}")

Metamorphic Engines

Metamorphic engines go beyond polymorphism — instead of encrypting the payload and mutating the decryptor, they rewrite the entire codebase while preserving functionality. With LLMs, a metamorphic engine can request a complete rewrite of the implant on each deployment cycle, producing variants that share zero static signatures with previous versions. The only consistent element is the behavioural profile — the sequence of actions the implant performs — which is where defensive detection must focus.

Defensive Takeaway

AI-driven polymorphism makes signature-based detection effectively obsolete for sophisticated threats. Red team reports should explicitly highlight when polymorphic techniques successfully evaded client EDR, and recommend behavioural detection rules that trigger on action sequences (allocate memory, write payload, change permissions, create thread) rather than static byte patterns.
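The action-sequence rule recommended above can be sketched as a toy matcher over a stream of normalised API call names. The event format and API names below are illustrative, not a real EDR schema — production rules would also carry process context, timing, and argument values.

```python
# Toy behavioural rule: flag the classic process-injection sequence
# regardless of how the code that produced it was written.
INJECTION_SEQUENCE = [
    "VirtualAllocEx",      # allocate memory in remote process
    "WriteProcessMemory",  # write payload
    "VirtualProtectEx",    # change page permissions to executable
    "CreateRemoteThread",  # execute it
]

def matches_sequence(events: list[str], pattern: list[str]) -> bool:
    """True if pattern occurs as an ordered (not necessarily
    contiguous) subsequence of the observed API calls."""
    it = iter(events)
    return all(any(call == step for call in it) for step in pattern)

benign = ["CreateFileW", "ReadFile", "CloseHandle"]
suspicious = ["OpenProcess", "VirtualAllocEx", "WriteProcessMemory",
              "Sleep", "VirtualProtectEx", "CreateRemoteThread"]

print(matches_sequence(benign, INJECTION_SEQUENCE))      # False
print(matches_sequence(suspicious, INJECTION_SEQUENCE))  # True
```

Because the rule keys on what the payload does rather than what its bytes look like, every polymorphic variant that performs the same injection chain still trips it.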

4. EDR/AV Evasion with AI

Endpoint Detection and Response (EDR) products use layered detection: static signatures, YARA rules, heuristic analysis, behavioural monitoring, memory scanning, and machine learning classifiers. AI assists red teams in systematically analysing and bypassing each layer, turning evasion from an art into an engineered process.

AMSI Bypass Generation

The Antimalware Scan Interface (AMSI) is Microsoft's framework for runtime content scanning. PowerShell, VBA, JScript, and .NET all submit content to AMSI before execution, allowing AV/EDR to inspect scripts in memory. AMSI bypass is typically the first evasion step in any Windows engagement.

AI assists AMSI bypass research by analysing detection patterns across EDR products, suggesting novel bypass approaches based on published research, and generating variant implementations that avoid known signatures. The key bypass categories include:

  • Memory patching — overwriting AmsiScanBuffer in amsi.dll to force a clean return value. Detected by memory integrity monitoring.
  • Reflection-based — using .NET reflection to set the amsiInitFailed flag, disabling AMSI for the current process. Detected by script block logging.
  • COM hijacking — redirecting the AMSI COM server CLSID to an attacker-controlled DLL. Detected by registry monitoring.
  • Hardware breakpoints — using debug registers to intercept and modify AMSI function calls. Harder to detect but more complex to implement.
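From the blue-team side, the simplest (and weakest) countermeasure to the categories above is scanning script content for known bypass artifacts. The indicator strings below are illustrative examples only — they are trivially defeated by obfuscation, which is precisely why this check must be paired with the behavioural and memory-integrity detections listed per category.

```python
# Blue-team sketch: flag PowerShell content containing artifacts of
# the AMSI bypass categories above. Indicator patterns are illustrative.
import re

AMSI_BYPASS_INDICATORS = {
    "memory_patching": [r"AmsiScanBuffer", r"VirtualProtect"],
    "reflection":      [r"amsiInitFailed", r"\[Ref\]\.Assembly"],
    "com_hijacking":   [r"InprocServer32", r"Software\\Classes\\CLSID"],
}

def scan_script(content: str) -> list[str]:
    """Return the bypass categories whose indicators appear."""
    hits = []
    for category, patterns in AMSI_BYPASS_INDICATORS.items():
        if any(re.search(p, content, re.IGNORECASE) for p in patterns):
            hits.append(category)
    return hits

sample = "[Ref].Assembly.GetType(...).GetField('amsiInitFailed', ...)"
print(scan_script(sample))  # ['reflection']
```

Any hit here is a high-signal alert: legitimate scripts have no reason to reference `amsiInitFailed` or to patch `AmsiScanBuffer`.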

ETW Patching

Event Tracing for Windows (ETW) is the telemetry backbone that EDR products rely on for visibility into process behaviour, .NET assembly loading, network connections, and more. Patching ETW disables the telemetry stream before malicious actions occur, effectively blinding the EDR.

AI helps red teams understand ETW provider relationships, identify which providers a specific EDR monitors, and generate targeted patches that disable only the relevant telemetry without triggering tamper detection on the broader ETW infrastructure.
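A common ETW patch simply overwrites the entry point of `ntdll!EtwEventWrite` so it returns immediately and emits nothing — which is also what a tamper check can look for. The sketch below uses synthetic stand-in bytes rather than the real function prologue; a real check would read the live prologue from the mapped module.

```python
# Sketch of ETW tamper detection: the classic patch writes "ret" (0xC3)
# or "xor eax, eax; ret" over EtwEventWrite. Bytes here are synthetic.
CLEAN_ETW_PROLOGUE = bytes([0x4C, 0x8B, 0xDC, 0x48, 0x83, 0xEC, 0x58])
PATCHED_RET        = bytes([0xC3])
PATCHED_XOR_RET    = bytes([0x33, 0xC0, 0xC3])

def etw_looks_patched(prologue: bytes) -> bool:
    """Flag prologues that immediately return — a legitimate
    EtwEventWrite implementation never starts with ret."""
    return prologue[:1] == b"\xc3" or prologue[:3] == b"\x33\xc0\xc3"

print(etw_looks_patched(PATCHED_RET))         # True
print(etw_looks_patched(PATCHED_XOR_RET))     # True
print(etw_looks_patched(CLEAN_ETW_PROLOGUE))  # False
```

This is the same cat-and-mouse dynamic as AMSI: a crude patch is easy to spot, so AI-assisted red teams generate subtler, provider-specific patches, and defenders respond with periodic prologue integrity checks.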

Ntdll Unhooking

EDR products hook ntdll.dll functions to intercept system calls and inspect their arguments. Unhooking restores the original ntdll.dll code, removing the EDR's inline hooks. Approaches include:

  • Reading a clean copy of ntdll.dll from disk and remapping it over the hooked version.
  • Reading ntdll from a suspended child process (which has a fresh, unhooked copy).
  • Using direct or indirect syscalls to bypass the hooked user-mode layer entirely.
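All of these approaches exploit the same observable: an inline hook changes a function's first bytes so they diverge from the on-disk copy, typically to an unconditional jump into the EDR's DLL. A minimal detection sketch, using synthetic byte strings in place of a real mapped `ntdll.dll` (which would require parsing PE exports on Windows):

```python
# Sketch of inline-hook detection — the inverse of unhooking.
# Byte strings are synthetic; a real tool maps a clean ntdll.dll
# from disk and compares each exported function's prologue.

# Typical clean x64 Nt* stub prologue: mov r10, rcx; mov eax, SSN
CLEAN_STUB  = bytes([0x4C, 0x8B, 0xD1, 0xB8, 0x26, 0x00, 0x00, 0x00])
# Hooked version: EDR wrote "jmp rel32" (0xE9) over the prologue
HOOKED_STUB = bytes([0xE9, 0x10, 0x32, 0x54, 0x76, 0x00, 0x00, 0x00])

def is_hooked(memory_bytes: bytes, disk_bytes: bytes) -> bool:
    """A function is hooked if its in-memory prologue diverges from
    the on-disk copy, or begins with an unconditional jump (0xE9)."""
    return memory_bytes != disk_bytes or memory_bytes[0] == 0xE9

print(is_hooked(HOOKED_STUB, CLEAN_STUB))  # True
print(is_hooked(CLEAN_STUB, CLEAN_STUB))   # False
```

Red teams run this comparison to decide whether unhooking is needed at all; blue teams run the same comparison in reverse, alerting when a process's ntdll prologues have been restored to the clean state behind the EDR's back.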

Direct and Indirect Syscalls

The most robust evasion technique is executing direct syscalls — calling kernel system call stubs directly rather than through ntdll.dll. This bypasses all user-mode hooks. Indirect syscalls jump into the legitimate ntdll.dll syscall instruction to avoid detection of syscall instructions outside of ntdll's memory range.

AI-Guided Detection Analysis

The most valuable application of AI in EDR evasion is systematically analysing detection coverage. Rather than blind trial-and-error, red teams can use LLMs to map the detection surface of a target EDR, identify the weakest bypass category, and focus development effort where it matters most.

amsi_detection_analysis.py
python
# Educational: analyzing AMSI bypass techniques with AI
# This does NOT implement a bypass — it analyzes detection patterns
import json

# ── Known AMSI Bypass Categories (public research) ──────
AMSI_BYPASS_CATEGORIES = {
    "memory_patching": {
        "description": "Overwrite AmsiScanBuffer in memory to force benign results",
        "detection_vectors": [
            "Monitoring writes to amsi.dll memory pages",
            "Integrity checking of AmsiScanBuffer prologue bytes",
            "ETW events for memory protection changes (VirtualProtect)",
            "Kernel callbacks for image load notifications"
        ],
        "edr_coverage": {
            "crowdstrike": "Detected via memory write monitoring",
            "sentinelone": "Behavioral detection on amsi.dll patching",
            "defender_atp": "AMSI tamper protection alerts",
            "elastic": "Memory protection change events"
        },
        "public_references": [
            "Rasta Mouse - AMSI bypass (2018)",
            "Context Information Security - AMSI research"
        ]
    },
    "reflection_bypass": {
        "description": "Use .NET reflection to set amsiInitFailed flag",
        "detection_vectors": [
            "Script block logging captures reflection calls",
            "Monitoring System.Management.Automation assembly access",
            "CLR ETW events for reflection API usage",
            ".NET assembly load monitoring"
        ],
        "edr_coverage": {
            "crowdstrike": "Script content inspection",
            "sentinelone": "PowerShell deep visibility",
            "defender_atp": "Script block logging + ML",
            "elastic": "PowerShell script block events"
        },
        "public_references": [
            "Matt Graeber - original amsiInitFailed technique",
            "Various CTF writeups and red team blogs"
        ]
    },
    "com_hijacking": {
        "description": "Redirect AMSI COM server to attacker-controlled DLL",
        "detection_vectors": [
            "Registry monitoring for AMSI CLSID changes",
            "DLL load path validation",
            "COM registration audit events",
            "Sysmon Event ID 12/13 for registry modifications"
        ],
        "edr_coverage": {
            "crowdstrike": "Registry tampering detection",
            "sentinelone": "COM hijack behavioral rule",
            "defender_atp": "Registry persistence monitoring",
            "elastic": "Registry modification events"
        },
        "public_references": [
            "Various security researchers (2019-2024)"
        ]
    },
    "hardware_breakpoint": {
        "description": "Use hardware breakpoints to intercept AMSI calls",
        "detection_vectors": [
            "Debug register monitoring",
            "NtSetContextThread API monitoring",
            "Exception handler chain analysis",
            "Thread context inspection"
        ],
        "edr_coverage": {
            "crowdstrike": "Advanced — partial detection",
            "sentinelone": "Hardware BP hooking detection",
            "defender_atp": "Limited visibility",
            "elastic": "Debug API monitoring"
        },
        "public_references": [
            "CCob - SilentMoonwalk / hardware BP research",
            "Elastic Security Labs research"
        ]
    }
}

def ai_analyze_bypass_coverage(categories: dict) -> str:
    """
    Use an LLM to analyze detection gaps across EDR platforms.
    Helps red teams understand which techniques are most likely
    to succeed (or fail) against a specific EDR stack.
    """
    prompt = f"""You are a detection engineering analyst. Given the following 
AMSI bypass categories and their EDR detection coverage, analyze:

1. Which category has the weakest overall detection coverage?
2. Which EDR platform has the most comprehensive AMSI protection?
3. What detection gaps exist that red teams should be aware of?
4. Recommend detection improvements for blue teams.

Data:
{json.dumps(categories, indent=2)}

Provide a structured analysis with specific recommendations."""
    
    # This would call the LLM API in practice
    # response = query_llm(prompt)
    print("[*] Analysis prompt prepared — send to local LLM for assessment")
    print(f"[*] Analyzing {len(categories)} bypass categories")
    platforms = set().union(
        *(c["edr_coverage"].keys() for c in categories.values())
    )
    print(f"[*] Covering {len(platforms)} EDR platforms")
    return prompt

# Generate detection gap analysis
analysis = ai_analyze_bypass_coverage(AMSI_BYPASS_CATEGORIES)

# ── Per-engagement EDR profiling ─────────────────────────
def profile_target_edr(edr_name: str, version: str = "latest"):
    """Build an EDR-specific evasion profile using AI analysis."""
    profile = {
        "edr": edr_name,
        "version": version,
        "bypass_viability": {},
    }
    
    for category, data in AMSI_BYPASS_CATEGORIES.items():
        coverage = data["edr_coverage"].get(edr_name.lower(), "Unknown")
        profile["bypass_viability"][category] = {
            "detection_level": coverage,
            "recommended": "weak" in coverage.lower() or 
                          "limited" in coverage.lower() or
                          "partial" in coverage.lower()
        }
    
    print(f"\n[+] EDR Profile: {edr_name} {version}")
    for cat, info in profile["bypass_viability"].items():
        status = "VIABLE" if info["recommended"] else "RISKY"
        print(f"    [{status}] {cat}: {info['detection_level']}")
    
    return profile

# Example: profile a target running CrowdStrike
profile_target_edr("CrowdStrike", "v7.x")
profile_target_edr("Elastic", "v8.x")

Modifying Known Tooling

AI accelerates customisation of known C2 frameworks. Rather than using default configurations of Cobalt Strike, Sliver, or Havoc (which have extensive signature coverage), red teams use LLMs to:

  • Analyse the framework source code and identify signatured components.
  • Generate custom loaders that deploy framework payloads through novel execution chains.
  • Modify communication protocols to avoid known network signatures.
  • Create unique sleep obfuscation and process injection routines that defeat behavioural detection.
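
As a sketch of the first bullet, an operator might prepare the source-review prompt programmatically before sending it to a local model. `build_signature_audit_prompt` is a hypothetical helper that mirrors the prompt-preparation pattern used elsewhere in this chapter; the framework name and excerpt are placeholders:

```python
# Hypothetical sketch: prepare an LLM prompt that asks which parts of
# a C2 framework component are likely covered by public signatures.
def build_signature_audit_prompt(framework: str, source_excerpt: str) -> str:
    """Build a source-review prompt for a local LLM (prompt only —
    this does not call any model)."""
    return f"""You are a detection engineering analyst reviewing {framework}.
For the source excerpt below, identify:
1. Strings, constants, and API sequences likely covered by YARA/EDR signatures
2. Default configuration values (ports, URIs, named pipes) that fingerprint the framework
3. Which components would need replacement in a custom loader

Source:
{source_excerpt}

Output structured JSON with fields: component, signature_risk, rationale."""

prompt = build_signature_audit_prompt("Sliver", "// <framework source excerpt>")
print(f"[*] Audit prompt prepared ({len(prompt)} chars) — send to local LLM")
```

As with the AMSI analysis earlier in the chapter, the helper only prepares the prompt; the actual model call and human review of its output happen outside this sketch.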

Budget Your Evasion

Not every engagement requires syscall-level evasion. Match your investment to the target's defensive maturity. Against organisations running basic AV, AMSI bypass and simple payload obfuscation may suffice. Reserve advanced techniques (unhooking, direct syscalls, custom loaders) for environments with mature EDR deployments and active SOC monitoring.

5. LLM-Based Command & Control

An emerging area of offensive research explores using LLM API endpoints as covert command and control channels. The fundamental insight is that HTTPS traffic to api.openai.com, api.anthropic.com, or local Ollama instances looks identical to legitimate AI usage — a pattern increasingly common in enterprise environments. This creates a high-bandwidth, encrypted, cloud-based C2 channel that blends perfectly with normal business traffic.

Threat Model Only

The following section describes a conceptual threat model to help defenders understand and prepare for this emerging attack vector. No functional C2 code is provided. The goal is to inform detection engineering and network security architecture.

LLM-Based C2 Channel — Conceptual Architecture

graph TB subgraph Operator["Operator Side"] OP[Red Team Operator] PROMPT[Natural Language Prompt] end subgraph Channel["LLM API Channel"] API[Public LLM API] ENCODE[Encoded Instructions] RESPONSE[Encoded Response] end subgraph Target["Target Side"] IMPLANT[Implant Process] DECODE[Command Decoder] EXEC[Task Execution] RESULT[Result Encoder] end OP --> PROMPT PROMPT --> API API --> ENCODE ENCODE --> IMPLANT IMPLANT --> DECODE DECODE --> EXEC EXEC --> RESULT RESULT --> API API --> RESPONSE RESPONSE --> OP style Operator fill:#1a1a2e,stroke:#00ff41,color:#fff style Channel fill:#16213e,stroke:#0ff,color:#fff style Target fill:#0f3460,stroke:#e94560,color:#fff

Why LLM C2 Channels Are Concerning

  • Traffic blending — LLM API calls use standard HTTPS, are routed to major cloud providers (Azure, AWS, GCP), and follow the same request/response pattern as legitimate AI usage.
  • Encryption by default — all traffic is TLS-encrypted, making deep packet inspection difficult without TLS interception.
  • High bandwidth — LLM context windows support 128K+ tokens, allowing large data exfiltration in single requests.
  • Natural language encoding — commands can be embedded in natural-sounding prompts, and responses can encode data in plausible-looking text.
  • Adaptive evasion — the implant can ask the LLM to dynamically generate evasion techniques based on the detected security environment.

Conceptual Architecture

An LLM C2 channel conceptually operates by encoding operator commands into API requests and decoding implant responses from API completions. The encoding scheme can range from simple (structured JSON in prompts) to sophisticated (steganographic encoding in natural language). The implant polls the LLM API on a configurable interval, retrieves encoded commands, executes them, and returns results through subsequent API calls.

llm_c2_threat_model.py
python
# CONCEPTUAL ONLY: LLM-based C2 channel architecture
# This is a DESIGN DOCUMENT — not functional C2 code
# Purpose: understand the threat model for defensive planning

"""
LLM-Based C2 Channel — Threat Model Analysis

Concept: adversaries may abuse public LLM APIs as covert C2 channels.
The traffic appears as normal API calls to services like OpenAI,
Anthropic, or local Ollama instances, making network-level detection
difficult.

This analysis helps defenders understand the threat and design
appropriate detection strategies.
"""

from dataclasses import dataclass
from enum import Enum

class C2Technique(Enum):
    """Categories of LLM-based C2 approaches."""
    STEGANOGRAPHIC = "Commands hidden in natural language prompts"
    SEMANTIC = "Commands encoded as plausible conversation"  
    STRUCTURED = "Commands in structured prompt templates"
    MULTI_MODEL = "Distributed across multiple LLM providers"

@dataclass
class ThreatModel:
    technique: C2Technique
    network_signature: str
    detection_difficulty: str
    defensive_controls: list[str]

# ── Threat Models ────────────────────────────────────────

THREAT_MODELS = [
    ThreatModel(
        technique=C2Technique.STEGANOGRAPHIC,
        network_signature="Standard HTTPS to LLM API endpoints",
        detection_difficulty="HIGH — traffic looks identical to normal API usage",
        defensive_controls=[
            "Monitor API key usage patterns and anomalies",
            "Analyze prompt/response payload sizes for C2 patterns",
            "Implement LLM API gateway with content inspection",
            "Rate-limit and log all outbound LLM API calls",
            "Deploy DLP on LLM API request/response bodies"
        ]
    ),
    ThreatModel(
        technique=C2Technique.SEMANTIC,
        network_signature="Regular chat-style API calls",
        detection_difficulty="VERY HIGH — conversation appears natural",
        defensive_controls=[
            "Behavioral analysis of API call timing patterns",
            "ML-based anomaly detection on API usage",
            "Whitelist approved LLM API endpoints",
            "Monitor for unauthorized ollama/vllm processes",
            "Network segmentation for LLM API access"
        ]
    ),
    ThreatModel(
        technique=C2Technique.STRUCTURED,
        network_signature="JSON payloads to /v1/chat/completions",
        detection_difficulty="MEDIUM — structured patterns may be detectable",
        defensive_controls=[
            "Deep packet inspection of API payloads",
            "Prompt content analysis for encoded commands",
            "Response parsing for structured data patterns",
            "TLS inspection at network boundary"
        ]
    ),
    ThreatModel(
        technique=C2Technique.MULTI_MODEL,
        network_signature="Distributed across multiple API endpoints",
        detection_difficulty="HIGH — spread across multiple services",
        defensive_controls=[
            "Aggregate logging across all LLM API calls",
            "Correlate requests to multiple AI providers",
            "Monitor for new/unusual AI service endpoints",
            "Centralized AI API management platform"
        ]
    )
]

# ── Conceptual Architecture (pseudocode) ─────────────────

class ConceptualLLMC2:
    """
    PSEUDOCODE architecture showing how an adversary MIGHT
    structure an LLM-based C2 channel. Understanding this
    helps defenders design appropriate countermeasures.
    
    This class is intentionally non-functional.
    """
    
    def encode_command(self, command: str) -> str:
        """
        Concept: embed a C2 command within a natural language 
        prompt that appears to be a normal LLM interaction.
        
        Example encoding approaches:
        - First letter of each sentence spells the command
        - Specific word positions carry encoded bytes
        - Semantic meaning maps to predefined command set
        """
        # PSEUDOCODE — not implemented
        raise NotImplementedError("Conceptual only")
    
    def decode_response(self, response: str) -> dict:
        """
        Concept: extract structured data from LLM response
        that contains encoded results from implant execution.
        
        Detection opportunity: responses with unusual entropy 
        or structure compared to normal LLM outputs.
        """
        # PSEUDOCODE — not implemented
        raise NotImplementedError("Conceptual only")
    
    def adaptive_evasion(self, detected_controls: list[str]) -> str:
        """
        Concept: the implant queries the LLM to dynamically
        generate evasion techniques based on the security
        controls it has detected in the target environment.
        
        This is the most concerning capability — the AI can
        reason about defenses and suggest novel bypasses.
        
        Detection opportunity: monitor for prompts that describe
        security products or ask for evasion techniques.
        """
        # PSEUDOCODE — not implemented
        raise NotImplementedError("Conceptual only")

# ── Defensive Recommendations ────────────────────────────

def print_defensive_report():
    """Generate a defensive report for SOC teams."""
    print("=" * 60)
    print("LLM-Based C2 — Defensive Report")
    print("=" * 60)
    
    for model in THREAT_MODELS:
        print(f"\nTechnique: {model.technique.value}")
        print(f"Detection Difficulty: {model.detection_difficulty}")
        print(f"Network Signature: {model.network_signature}")
        print("Defensive Controls:")
        for control in model.defensive_controls:
            print(f"  - {control}")
    
    print("\n" + "=" * 60)
    print("Priority Actions:")
    print("  1. Inventory all LLM API usage in your environment")
    print("  2. Implement centralized AI API gateway")
    print("  3. Deploy behavioral analytics on API call patterns") 
    print("  4. Add LLM API endpoints to network monitoring")
    print("  5. Establish baseline for normal LLM API usage")
    print("=" * 60)

print_defensive_report()

Defensive Controls

Defending against LLM-based C2 requires a layered approach:

  • AI API Gateway — route all LLM API traffic through a centralised gateway that inspects prompts and responses for suspicious patterns.
  • Baseline normal usage — establish behavioural baselines for LLM API call frequency, timing, payload sizes, and endpoints per user and application.
  • Anomaly detection — flag deviations from baseline: unusual call times, unexpected endpoints, abnormal token usage, periodic polling patterns.
  • Endpoint monitoring — detect unauthorised LLM clients (e.g., Ollama processes) on endpoints that should not run local AI.
  • Network segmentation — restrict LLM API access to approved applications and users, blocking direct API calls from servers and endpoints.
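
The anomaly-detection bullet can be prototyped in a few lines. This sketch flags beacon-like regularity in request timing for a single (user, endpoint) pair, assuming timestamps are extracted from proxy or gateway logs; the 0.15 jitter threshold is an illustrative starting point, not a tuned value:

```python
# Sketch: flag beacon-like regularity in LLM API request timing.
# Input: sorted request timestamps (seconds) for one (user, endpoint)
# pair, e.g. parsed from proxy or API-gateway logs.
import statistics

def looks_like_polling(timestamps: list[float], max_jitter: float = 0.15) -> bool:
    """True when inter-arrival times are suspiciously regular. Human
    chat usage is bursty; implant check-ins are periodic, so a low
    coefficient of variation (stdev / mean) suggests beaconing."""
    if len(timestamps) < 5:
        return False                      # too few calls to judge
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean = statistics.mean(gaps)
    if mean <= 0:
        return False
    return statistics.stdev(gaps) / mean < max_jitter

beacon = [0, 60.2, 119.8, 180.1, 240.3, 299.9]   # ~60s check-in + jitter
chatty = [0, 4, 5, 90, 92, 600]                  # bursty interactive use
print(looks_like_polling(beacon))  # -> True
print(looks_like_polling(chatty))  # -> False
```

In production this heuristic would feed the behavioural baseline described above rather than alert on its own — implants can randomise sleep intervals specifically to defeat it.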

6. AI-Assisted Payload Obfuscation

Obfuscation is the process of transforming code to resist analysis while preserving its functionality. AI dramatically accelerates the obfuscation pipeline by automating technique selection, generating novel encoding schemes, and verifying that obfuscated payloads remain functionally correct. Modern obfuscation pipelines typically apply transformations in stages, each targeting a different analysis technique.

Multi-Stage Obfuscation Pipeline

graph TB subgraph Source["Raw Payload"] RAW[Unobfuscated Code] end subgraph Stage1["Stage 1: String Layer"] SENC[String Encryption] B64[Base64 Encoding] XOR[XOR Key Rotation] end subgraph Stage2["Stage 2: Control Flow"] FLAT[Control Flow Flattening] OPAQUE[Opaque Predicates] DISPATCH[Dispatcher Pattern] end subgraph Stage3["Stage 3: API Layer"] DYNAMIC[Dynamic API Resolution] HASH[API Hashing] INDIRECT[Indirect Calls] end subgraph Stage4["Stage 4: Final"] DEAD_CODE[Dead Code Insertion] PACK[Payload Packaging] SIGN[Optional Code Signing] end RAW --> SENC SENC --> B64 B64 --> XOR XOR --> FLAT FLAT --> OPAQUE OPAQUE --> DISPATCH DISPATCH --> DYNAMIC DYNAMIC --> HASH HASH --> INDIRECT INDIRECT --> DEAD_CODE DEAD_CODE --> PACK PACK --> SIGN style Source fill:#1a1a2e,stroke:#ff4444,color:#fff style Stage1 fill:#16213e,stroke:#ff8c00,color:#fff style Stage2 fill:#0f3460,stroke:#0ff,color:#fff style Stage3 fill:#1a1a2e,stroke:#e94560,color:#fff style Stage4 fill:#16213e,stroke:#00ff41,color:#fff

String Encryption and Encoding

Strings are the easiest static detection target — function names, URLs, registry paths, and command strings create immediate signatures. Obfuscation encrypts all strings at compile time and decrypts them at runtime only when needed, minimising the window of exposure in memory.

  • XOR with rotating keys — simple but effective against basic signature scanning. AI can generate unique key schedules per variant.
  • AES-CBC encryption — stronger encryption for high-value strings. Key derived from environment data (hostname, username) for environment-locked payloads.
  • Stack strings — construct strings character-by-character on the stack rather than storing them as contiguous data. Defeats string extraction tools.
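
A minimal sketch of the environment-locked key idea from the second bullet, using only standard-library primitives. Rotating-key XOR stands in for AES-CBC to keep the example short, and the salt and username fallback are placeholders:

```python
# Sketch: derive the string-decryption key from host attributes so the
# payload only decrypts on the intended target (environment keying).
import hashlib
import os
import socket

def derive_env_key(salt: bytes = b"engagement-salt") -> bytes:
    """Bind a 32-byte key to hostname + username via PBKDF2-HMAC-SHA256.
    Off-target, the derived key is wrong and decryption yields garbage —
    which doubles as a sandbox-evasion property defenders should note."""
    user = os.getenv("USER") or os.getenv("USERNAME") or "unknown"
    material = f"{socket.gethostname()}:{user}".encode()
    return hashlib.pbkdf2_hmac("sha256", material, salt, 100_000)

def xor_with_key(data: bytes, key: bytes) -> bytes:
    """Rotating-key XOR (stand-in for AES-CBC here); symmetric, so the
    same call both encrypts and decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = derive_env_key()
blob = xor_with_key(b"https://c2.example.invalid/beacon", key)
assert xor_with_key(blob, key) == b"https://c2.example.invalid/beacon"
print(f"[*] env-locked key derived; encrypted blob: {blob.hex()[:16]}...")
```

The defensive corollary: detonation sandboxes that randomise hostname and username per run will never derive the right key, so environment-keyed samples appear inert — analysts should replicate target host attributes when detonating.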

Control Flow Obfuscation

Control flow obfuscation reorganises the program's execution path to confuse static and dynamic analysis:

  • Control flow flattening — replaces structured code with a dispatcher loop and state machine. Massively increases analysis complexity.
  • Opaque predicates — conditional branches whose outcome is known at compile time but difficult for analysers to determine statically.
  • Bogus control flow — inserts unreachable code paths that appear valid to static analysers, wasting analyst time.
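
The dispatcher pattern behind control flow flattening is easiest to see in a toy example — the state numbers here are arbitrary, chosen only to illustrate the technique:

```python
# Sketch: control flow flattening. Three sequential steps become a
# state machine driven by a dispatcher loop, so the linear structure
# that static analysis recovers from a CFG is destroyed.
def flattened(x: int) -> int:
    state = 7              # opaque initial state
    while True:
        if state == 7:     # original step 1: double
            x = x * 2
            state = 3
        elif state == 3:   # original step 2: add offset
            x = x + 10
            state = 9
        elif state == 9:   # original step 3: return
            return x

# Structured equivalent for comparison: x * 2 + 10
print(flattened(5))  # -> 20
```

Real obfuscators also encrypt the state variable and compute successor states at runtime, which is why the detection signatures are structural (dispatcher loops, abnormal basic-block counts) rather than value-based.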

API Call Obfuscation

Windows API calls create strong behavioural signatures. Obfuscation techniques include:

  • Dynamic API resolution — resolve functions at runtime via GetProcAddress rather than compile-time imports.
  • API hashing — store hash values of function names and resolve by iterating export tables. Defeats import table analysis.
  • Indirect calls — call functions through pointers stored in dynamically allocated memory, breaking static call graph analysis.

Dead Code Insertion

AI excels at generating contextually plausible dead code — non-functional code paths that look legitimate to human analysts and automated tools. Unlike random junk code (which is easily identified), LLM-generated dead code uses proper API calls, realistic variable names, and plausible control flow, dramatically increasing the analyst's workload.
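
As a sketch, the dead-code request itself is just a constrained prompt — the template below is illustrative, and its output would require manual review in any lab workflow:

```python
# Sketch: prompt template asking an LLM for plausible (but inert)
# dead code to pad a lab payload. Constraints are illustrative only.
def dead_code_prompt(language: str, context: str) -> str:
    """Build a prompt for generating never-called, benign-looking code."""
    return f"""Generate one {language} function that will never be called but
must look production-quality: realistic identifier names, plausible error
handling, and benign API usage consistent with this context:

{context}

Constraints: no side effects if executed, no network or file writes,
compiles cleanly alongside the surrounding code."""

prompt = dead_code_prompt("C++", "registry configuration reader")
print(f"[*] Dead-code prompt prepared ({len(prompt)} chars)")
```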

obfuscation_analysis.py
python
# Educational: AI-assisted obfuscation pipeline concepts
# Demonstrates techniques red teams analyze — NOT a weaponized tool
import base64
import hashlib
import os

class ObfuscationAnalyzer:
    """
    Analyzes common obfuscation techniques used by malware authors.
    Understanding these patterns helps both offensive and defensive teams:
    - Red teams: verify payloads evade basic signature detection
    - Blue teams: develop deobfuscation and detection rules
    """
    
    # ── Stage 1: String Obfuscation ─────────────────────
    
    @staticmethod
    def xor_encode(data: bytes, key: bytes) -> bytes:
        """XOR encoding with rotating key — classic malware technique."""
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))
    
    @staticmethod
    def generate_xor_stub(key_hex: str) -> str:
        """
        Generate pseudocode for an XOR decoder stub.
        Real malware uses this to decrypt payloads at runtime.
        
        Detection: look for XOR loops with fixed key patterns,
        high-entropy encrypted blobs adjacent to small decoder stubs.
        """
        return f"""
# Pseudocode: XOR decoder stub pattern
# Defenders should flag this pattern in behavioral analysis
key = bytes.fromhex("{key_hex}")
encrypted_payload = <read_from_resource_section>
decrypted = bytes(b ^ key[i % len(key)] 
                   for i, b in enumerate(encrypted_payload))
exec_mem = allocate_executable_memory(len(decrypted))
copy_to_memory(exec_mem, decrypted)
execute(exec_mem)
"""
    
    @staticmethod
    def demonstrate_encoding_layers(plaintext: str) -> dict:
        """Show how malware stacks encoding layers."""
        stages = {"original": plaintext}
        
        # Layer 1: UTF-8 encode
        raw = plaintext.encode('utf-8')
        
        # Layer 2: XOR with random key
        key = os.urandom(16)
        xored = bytes(b ^ key[i % len(key)] for i, b in enumerate(raw))
        stages["xor_key"] = key.hex()
        stages["after_xor"] = xored.hex()[:64] + "..."
        
        # Layer 3: Base64 encode
        b64 = base64.b64encode(xored).decode()
        stages["after_b64"] = b64[:64] + "..."
        
        # Layer 4: Reverse
        reversed_str = b64[::-1]
        stages["after_reverse"] = reversed_str[:64] + "..."
        
        # Entropy analysis
        stages["original_entropy"] = calculate_entropy(plaintext.encode())
        stages["final_entropy"] = calculate_entropy(reversed_str.encode())
        
        return stages
    
    # ── Stage 2: Control Flow Obfuscation ────────────────
    
    @staticmethod
    def control_flow_flattening_concept() -> str:
        """
        Conceptual: control flow flattening transforms structured
        code into a state machine with a dispatcher loop.
        
        Original:      Flattened:
        func():         func():
          step1()         state = 0
          step2()         while True:
          step3()           if state == 7: step1(); state = 3
                            if state == 3: step2(); state = 9
                            if state == 9: step3(); break
        
        Detection: high cyclomatic complexity, switch/dispatch 
        patterns, unusual basic block structure in CFG analysis.
        """
        return "See docstring for conceptual explanation"
    
    # ── Stage 3: API Obfuscation ─────────────────────────
    
    @staticmethod
    def api_hashing_concept() -> dict:
        """
        Demonstrate API hashing — malware resolves Windows APIs
        by hash at runtime instead of using direct imports.
        
        This defeats static analysis tools that check import tables.
        Detection: Look for GetProcAddress/LdrGetProcedureAddress 
        calls with computed (non-literal) arguments.
        """
        # Example API hash table (truncated MD5 here, purely illustrative)
        api_hashes = {
            "VirtualAlloc": hashlib.md5(b"VirtualAlloc").hexdigest()[:8],
            "VirtualProtect": hashlib.md5(b"VirtualProtect").hexdigest()[:8],
            "CreateThread": hashlib.md5(b"CreateThread").hexdigest()[:8],
            "WriteProcessMemory": hashlib.md5(b"WriteProcessMemory").hexdigest()[:8],
        }
        
        print("[*] API Hash Table (educational — real malware uses CRC32/DJB2):")
        for api, hash_val in api_hashes.items():
            print(f"    {hash_val} -> {api}")
        
        return api_hashes

    # ── AI-Assisted Analysis ─────────────────────────────
    
    @staticmethod
    def ai_obfuscation_prompt(code_sample: str) -> str:
        """
        Generate LLM prompt for analyzing obfuscation in a sample.
        Used by red teams to understand detection surface, and by
        blue teams for deobfuscation assistance.
        """
        return f"""Analyze the following code sample for obfuscation techniques:

{code_sample}

For each technique identified:
1. Name the obfuscation category
2. Explain how it works
3. Describe the detection signature
4. Suggest deobfuscation approach
5. Rate detection difficulty (1-10)

Output as structured JSON."""


def calculate_entropy(data: bytes) -> float:
    """Shannon entropy — high entropy suggests encryption/compression."""
    import math
    if not data:
        return 0.0
    freq = {}
    for byte in data:
        freq[byte] = freq.get(byte, 0) + 1
    length = len(data)
    entropy = 0.0
    for count in freq.values():
        p = count / length
        entropy -= p * math.log2(p)
    return round(entropy, 4)

# Demonstration
analyzer = ObfuscationAnalyzer()

# Show encoding layers
print("[*] String Encoding Layer Analysis")
result = analyzer.demonstrate_encoding_layers("This is a test payload string")
for stage, value in result.items():
    print(f"    {stage}: {value}")

# Show API hashing concept
print("\n[*] API Hashing Analysis")
analyzer.api_hashing_concept()

# Generate analysis prompt
print("\n[*] AI analysis prompt generated for obfuscation review")

7. Defensive Perspective

Understanding AI-assisted malware techniques is only valuable when paired with effective defensive strategies. This section covers how to detect AI-generated code, tools for AI malware analysis, and recommendations for red teams reporting AI-assisted findings.

Detecting AI-Generated Code

While no single indicator definitively identifies AI-generated malware, several signals — especially in combination — raise confidence:

  • Polymorphic variant clustering — multiple samples with identical behaviour but different surface structure strongly suggest automated mutation.
  • Embedding similarity — code embedding models (CodeBERT, StarCoder) can identify semantic similarity between structurally different samples.
  • Comment and documentation patterns — LLMs generate characteristic documentation styles that differ from typical malware (which rarely includes comments).
  • Error handling consistency — AI-generated code often includes uniform exception handling patterns uncommon in hand-crafted malware.
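The variant-clustering and embedding-similarity signals above can be approximated with the standard library alone. A minimal sketch, using difflib as a crude stand-in for fuzzy hashing or code embeddings (the `normalize` helper is illustrative, not a real tokenizer):

```python
import difflib
import re

PY_KEYWORDS = {"def", "return", "for", "in", "if", "else", "import", "while"}

def normalize(code: str) -> str:
    """Map every non-keyword identifier to 'ID' so renaming cannot hide similarity."""
    out = []
    for tok in re.findall(r"[A-Za-z_]\w*|[^\sA-Za-z_]+", code):
        if (tok[0].isalpha() or tok[0] == "_") and tok not in PY_KEYWORDS:
            out.append("ID")
        else:
            out.append(tok)
    return " ".join(out)

def similarity(a: str, b: str) -> float:
    """Structural similarity in [0, 1] after identifier normalisation."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()

# Two "variants": identical logic, renamed identifiers
variant_a = "def run(data):\n    key = 7\n    return [b ^ key for b in data]"
variant_b = "def go(blob):\n    k = 7\n    return [x ^ k for x in blob]"

print(round(similarity(variant_a, variant_b), 2))  # 1.0 after normalisation
```

Real pipelines would substitute ssdeep/TLSH digests or code-embedding cosine similarity, but the principle is the same: compare structure after stripping the surface mutations a polymorphic engine controls.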

AI-Based Malware Analysis Tools

The same AI capabilities that assist attackers also empower defenders:

  • LLM-assisted reverse engineering — feed decompiled code to GPT-4 or Claude for rapid functional analysis. Models excel at explaining obfuscated logic, identifying known technique patterns, and suggesting deobfuscation approaches.
  • Automated YARA generation — use LLMs to generate YARA rules from malware samples, including rules that detect polymorphic variant families.
  • Sandbox result interpretation — feed sandbox reports to LLMs for automated triage and severity classification.
  • Threat intelligence enrichment — correlate malware indicators with threat intelligence feeds using AI for automated attribution analysis.
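In the same spirit as the `ai_obfuscation_prompt` helper earlier in this chapter, automated YARA generation can start from a simple prompt builder. A hypothetical sketch (the function name and prompt wording are assumptions, not any product's API):

```python
def yara_generation_prompt(family: str, indicators: list[str]) -> str:
    """Build an LLM prompt requesting a draft YARA rule (hypothetical helper)."""
    bullet_list = "\n".join(f"- {i}" for i in indicators)
    return f"""Draft a YARA rule for the suspected malware family '{family}'.

Observed indicators:
{bullet_list}

Requirements:
1. Prefer stable structural strings over volatile ones (paths, timestamps)
2. Make the condition robust to polymorphic identifier renaming
3. Include a meta section: author, date, description
4. State the expected false-positive profile

Output only the rule followed by a short rationale."""

prompt = yara_generation_prompt(
    "PolyLoader.A",
    ["XOR decoder stub with 16-byte rotating key",
     "RWX allocation via VirtualAlloc"],
)
```

As with any generated rule, the output should be reviewed and tested against a clean corpus before deployment.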

Behavioural vs. Signature Detection

AI-driven polymorphism fundamentally undermines signature-based detection. The defensive response must emphasise behavioural analysis:

  • API call sequences — regardless of obfuscation, the malware must ultimately invoke the same APIs (or their native equivalents, e.g. NtAllocateVirtualMemory, NtWriteVirtualMemory, NtCreateThreadEx). Monitor for suspicious call chains: VirtualAlloc + WriteProcessMemory + CreateRemoteThread.
  • Memory indicators — detect executable memory regions with suspicious characteristics: RWX permissions, unbacked memory sections, injected threads.
  • Network behaviour — C2 communication patterns persist even when traffic is encrypted: beaconing intervals, jitter patterns, data volumes.
  • Process lineage — unusual parent-child process relationships (e.g., Excel spawning PowerShell) remain reliable indicators regardless of payload obfuscation.
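The injection call chain above can be checked as an ordered subsequence over a recorded API trace. A deliberately simplified sketch (real EDRs correlate arguments, handles, target PIDs, and timing, not just call names):

```python
INJECTION_CHAIN = ["VirtualAlloc", "WriteProcessMemory", "CreateRemoteThread"]

def contains_chain(trace: list[str], chain: list[str]) -> bool:
    """True if the calls in `chain` appear in `trace` in order (not necessarily adjacent)."""
    it = iter(trace)
    # `call in it` advances the iterator, so order is enforced
    return all(call in it for call in chain)

benign = ["CreateFile", "ReadFile", "CloseHandle"]
suspicious = ["OpenProcess", "VirtualAlloc", "Sleep",
              "WriteProcessMemory", "CreateRemoteThread"]

print(contains_chain(benign, INJECTION_CHAIN))      # False
print(contains_chain(suspicious, INJECTION_CHAIN))  # True
```

Because the check is order-sensitive, the same three calls in a different sequence (a common benign pattern in debuggers and installers) will not trigger it.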

Red Team Reporting Recommendations

When AI-assisted techniques are used during engagements, reports should include:

  • AI tools and models used — specify which models generated or modified offensive code.
  • Technique documentation — describe each AI-assisted technique in sufficient detail for the blue team to build detection rules.
  • Detection gaps identified — explicitly call out where AI-assisted evasion succeeded against the client's defensive stack.
  • Recommended detections — provide specific YARA rules, Sigma rules, or EDR custom rules that would detect the techniques used.
  • Polymorphic variant testing — if polymorphic payloads were used, document how many unique variants were tested and the detection rate across the campaign.
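For the variant-testing recommendation, the headline metric is simply the fraction of generated variants the client's stack flagged. A tiny sketch with made-up numbers (`variant_detection_rate` is a hypothetical helper and the campaign data is illustrative):

```python
def variant_detection_rate(results: dict[str, bool]) -> float:
    """Fraction of tested polymorphic variants flagged by the defensive stack."""
    return sum(results.values()) / len(results)

# Illustrative campaign data: variant name -> detected by EDR?
campaign = {
    "variant_01": True,
    "variant_02": False,
    "variant_03": False,
    "variant_04": True,
    "variant_05": False,
}
print(f"{variant_detection_rate(campaign):.0%} of variants detected")  # 40% of variants detected
```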
ai_malware_detection.py
# Defensive: detecting AI-generated malware characteristics
# Tools and techniques for blue teams and malware analysts
from dataclasses import dataclass

@dataclass
class DetectionSignal:
    name: str
    confidence: str  # LOW, MEDIUM, HIGH
    indicator: str
    false_positive_rate: str

# ── Indicators of AI-Generated Code ─────────────────────

AI_CODE_SIGNALS = [
    DetectionSignal(
        name="Consistent Comment Style",
        confidence="MEDIUM",
        indicator="Uniform docstring format, consistent comment patterns "
                  "that match LLM output styles (e.g., triple-quote docstrings "
                  "on every function, numbered steps in comments)",
        false_positive_rate="HIGH — good developers also write consistent comments"
    ),
    DetectionSignal(
        name="Variable Naming Patterns",
        confidence="LOW",
        indicator="LLMs tend toward descriptive variable names: "
                  "encrypted_payload, decoded_shellcode, target_process. "
                  "Unusual consistency in naming conventions.",
        false_positive_rate="HIGH — common in clean code"
    ),
    DetectionSignal(
        name="Error Handling Patterns",
        confidence="MEDIUM",
        indicator="Generic try/except blocks with generic error messages. "
                  "LLMs often generate overly broad exception handling.",
        false_positive_rate="MEDIUM"
    ),
    DetectionSignal(
        name="Structural Regularity",
        confidence="MEDIUM",
        indicator="Unusually regular code structure — consistent function "
                  "lengths, uniform parameter counts, predictable patterns.",
        false_positive_rate="MEDIUM"
    ),
    DetectionSignal(
        name="Polymorphic Variant Clustering",
        confidence="HIGH",
        indicator="Multiple samples with identical functionality but "
                  "different variable names, dead code, and string encoding. "
                  "Suggests automated mutation engine.",
        false_positive_rate="LOW — strong indicator of automated generation"
    ),
    DetectionSignal(
        name="Semantic Similarity",
        confidence="HIGH",
        indicator="Code embedding analysis shows high cosine similarity "
                  "between samples despite different surface structure.",
        false_positive_rate="LOW"
    )
]

# ── Detection Tools and Approaches ──────────────────────

DETECTION_TOOLS = {
    "Static Analysis": {
        "tools": ["YARA rules", "Sigma rules", "Semgrep", "CodeQL"],
        "approach": "Pattern matching on known AI-generated code structures",
        "effectiveness": "Medium — AI can generate novel patterns"
    },
    "Behavioral Analysis": {
        "tools": ["Any.Run", "Joe Sandbox", "CAPE Sandbox", "Cuckoo"],
        "approach": "Execute in sandbox, monitor API calls and behavior",
        "effectiveness": "High — functionality must remain consistent "
                        "regardless of obfuscation"
    },
    "ML-Based Detection": {
        "tools": ["Ember", "MalConv", "SOREL-20M dataset", "Custom models"],
        "approach": "Train classifiers on AI-generated vs human-written code",
        "effectiveness": "Emerging — promising but limited training data"
    },
    "Code Similarity": {
        "tools": ["ssdeep (fuzzy hashing)", "TLSH", "BinDiff", "Diaphora"],
        "approach": "Identify variants despite surface-level mutations",
        "effectiveness": "High for polymorphic families — fuzzy hashing "
                        "catches structural similarity"
    },
    "LLM-Assisted Analysis": {
        "tools": ["GPT-4 / Claude for analysis", "Custom fine-tuned models"],
        "approach": "Use AI to analyze suspected AI-generated malware",
        "effectiveness": "High — LLMs can identify generation patterns "
                        "and deobfuscate code"
    }
}

def generate_detection_report():
    """Generate a comprehensive detection capabilities report."""
    print("=" * 60)
    print("AI-Generated Malware Detection Report")
    print("=" * 60)
    
    print("\n[1] Code Signals")
    for signal in AI_CODE_SIGNALS:
        print(f"\n  Signal: {signal.name}")
        print(f"  Confidence: {signal.confidence}")
        print(f"  FP Rate: {signal.false_positive_rate}")
        print(f"  Indicator: {signal.indicator}")
    
    print("\n" + "-" * 60)
    print("[2] Detection Tooling")
    for category, info in DETECTION_TOOLS.items():
        print(f"\n  Category: {category}")
        print(f"  Tools: {', '.join(info['tools'])}")
        print(f"  Approach: {info['approach']}")
        print(f"  Effectiveness: {info['effectiveness']}")
    
    print("\n" + "-" * 60)
    print("[3] Red Team Reporting Recommendations")
    print("  - Document all AI-assisted techniques used in engagement")
    print("  - Provide detection signatures for AI-generated payloads")
    print("  - Include AI tool versions and prompts (sanitized) in report")
    print("  - Recommend specific detection rules for observed gaps")
    print("  - Test client EDR against polymorphic variant families")
    print("=" * 60)

generate_detection_report()
🎯

AI Malware Analysis Labs

Hands-on exercises focused on analysing AI-generated malware techniques — not creating them. These labs build defensive skills.

🔧
Analyse Polymorphic Samples with Fuzzy Hashing (Custom Lab, medium)
ssdeep fuzzy hashing · TLSH similarity scoring · Variant family clustering · YARA rule generation
🔧
AMSI Bypass Detection Engineering (Custom Lab, hard)
AMSI bypass categorization · ETW event monitoring · Memory integrity checking · EDR detection rule writing
🔧
LLM-Assisted Malware Reverse Engineering (Custom Lab, hard)
Decompiled code analysis with GPT-4 · Obfuscation identification · Automated deobfuscation · Threat intelligence correlation