Advanced

MITRE ATT&CK: T1566 (Phishing), T1598 (Phishing for Information)

AI Social Engineering

AI has fundamentally transformed social engineering from a craft into a scalable weapon. Deepfake video, real-time voice cloning, and LLM-generated phishing can now bypass human intuition at scale. Understanding these techniques is essential for red teamers and defenders alike.

Legal & Ethical Boundaries

AI social engineering tools can cause serious real-world harm. Only use these techniques in authorised red team engagements with explicit written scope that includes social engineering. Deepfakes and voice cloning without consent may violate laws in your jurisdiction.

The AI Social Engineering Threat Landscape

AI Social Engineering Attack Surface (2026)

Generation

Text / Email: LLM-crafted phishing
Voice Clone: 3s sample → full voice
Deepfake Video: Real-time face swap + lip sync

Delivery: Email, Phone (Vishing), Video Call, SMS / Chat

Objectives: Credential harvest, Wire transfer, MFA bypass, Access

Real-World Incidents

2019, CEO voice clone: Criminals used AI voice cloning to impersonate a chief executive by phone, authorising a fraudulent €220K wire transfer.
2024, $25M deepfake heist: A Hong Kong finance worker was tricked into transferring funds after a video call with AI-generated deepfakes of senior executives.
2025, election deepfakes: AI-generated robocalls mimicking political candidates were used to suppress voter turnout in multiple countries.

Why Attackers Love AI

Scale: Generate thousands of unique, personalised phishing emails in minutes
Quality: Perfect grammar, cultural context, and writing style mimicry
Speed: Some zero-shot voice cloning models need only a few seconds of reference audio (roughly 3 seconds at the low end)
Cost: Open-source models make deepfakes close to free to produce, apart from compute
Evasion: Each output is unique, which defeats signature-based email filters

1. LLM-Powered Phishing

Traditional phishing relies on templates that security-aware users learn to spot. AI-generated phishing is contextually unique, grammatically perfect, and can be personalised using OSINT data scraped from LinkedIn, social media, and company websites.

Red Team Simulation Framework

python

# Phishing simulation framework for authorised red team engagements
# REQUIRES: Written authorisation with social engineering in scope

import openai
import json

def generate_phishing_pretext(target_info: dict, scenario: str) -> dict:
    """Generate a context-appropriate phishing pretext.
    
    Args:
        target_info: OSINT data about the target (name, role, company, interests)
        scenario: Attack scenario (credential_harvest, malware_delivery, wire_fraud)
    """
    prompt = f"""You are simulating a phishing email for an authorised red team engagement.
    
Target profile:
- Name: {target_info['name']}
- Role: {target_info['role']}
- Company: {target_info['company']}
- Recent activity: {target_info.get('recent_activity', 'N/A')}

Scenario: {scenario}

Generate a realistic phishing email that would be contextually appropriate for this 
target. Include subject line, sender name, and email body. The email should leverage 
the target's role and recent activity for credibility.

Format as JSON: {{"subject": "", "from_name": "", "from_address": "", "body": ""}}"""

    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7
    )
    return json.loads(response.choices[0].message.content)

# Example usage in authorised engagement
target = {
    "name": "Jane Smith",
    "role": "VP of Engineering", 
    "company": "Acme Corp",
    "recent_activity": "Spoke at CloudConf 2026 about Kubernetes migration"
}

email = generate_phishing_pretext(target, "credential_harvest")
print(f"Subject: {email['subject']}")
print(f"From: {email['from_name']} <{email['from_address']}>")
print(f"\n{email['body']}")

GoPhish + LLM Integration

For full red team campaigns, integrate LLM-generated content with GoPhish to track open rates, click rates, and credential submissions. Generate unique email content per target to defeat email clustering defences.
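
The rates mentioned above can be computed from exported per-recipient results. The following is a minimal sketch, assuming each result is a dict with a "status" field whose values follow GoPhish's usual labels ("Email Sent", "Email Opened", "Clicked Link", "Submitted Data") and that a later stage implies the earlier ones. The function name `campaign_rates` and the record shape are illustrative, not part of GoPhish.

```python
from collections import Counter

# Assumed funnel order: a recipient at a later stage has passed the earlier ones.
FUNNEL = ["Email Sent", "Email Opened", "Clicked Link", "Submitted Data"]


def campaign_rates(results: list[dict]) -> dict[str, float]:
    """Return open, click, and submission rates as fractions of emails sent."""
    reached = Counter()
    for result in results:
        status = result.get("status")
        if status not in FUNNEL:
            continue  # errored or unsent messages are not counted
        # Credit the recipient with every stage up to and including their status.
        for stage in FUNNEL[: FUNNEL.index(status) + 1]:
            reached[stage] += 1

    sent = reached["Email Sent"]
    if sent == 0:
        return {"open_rate": 0.0, "click_rate": 0.0, "submit_rate": 0.0}
    return {
        "open_rate": reached["Email Opened"] / sent,
        "click_rate": reached["Clicked Link"] / sent,
        "submit_rate": reached["Submitted Data"] / sent,
    }
```

Reporting these as a funnel rather than as raw counts makes it easier to see where a control (mail filtering, user reporting, credential-page blocking) actually stopped the simulated attack.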

2. Voice Cloning & Vishing

Modern voice cloning models need as little as 3 seconds of audio to produce a convincing clone. Combined with real-time speech-to-speech models, attackers can conduct live phone calls in someone else's voice.

Voice Cloning Tools (Research/Red Team)

Tool	Type	Sample Needed	Real-time?
ElevenLabs	Cloud API	~1 min audio	Yes (streaming)
OpenVoice	Open-source	~10 seconds	Near real-time
RVC (Retrieval-based Voice Conversion)	Open-source	~10 min (training)	Yes
Fish Speech	Open-source	~3 seconds	Near real-time
F5-TTS	Open-source	~5 seconds	Near real-time
StyleTTS2	Open-source	~10 seconds	No (batch)
Parler TTS	Open-source (HF)	~10 seconds	No (batch)

Red Team Vishing Workflow

bash

# Vishing attack simulation workflow (authorised engagement only)

# Step 1: Collect voice sample from public sources
# LinkedIn videos, YouTube talks, podcast appearances, earnings calls
yt-dlp -x --audio-format wav "https://youtube.com/watch?v=TARGET_TALK"

# Step 2: Clone voice with OpenVoice (local, no data leakage)
git clone https://github.com/myshell-ai/OpenVoice.git
cd OpenVoice
pip install -e .

python openvoice_cli.py \
  --reference_audio target_sample.wav \
  --text "Hi, this is [Name] from IT. We detected unusual activity on your account. 
          I need you to verify your identity by logging into our security portal." \
  --output vishing_sample.wav

# Step 3: Real-time voice conversion for live calls
# Use RVC or SoVITS for live voice conversion during a phone call
# Pipe microphone → voice model → VOIP output

# Step 4: Combine with AI-generated pretext
# Feed OSINT about the target into an LLM to generate a contextual script
# The pretext should reference real projects, people, or events

3. Deepfake Video

Real-time deepfake technology allows attackers to impersonate anyone on a video call. This has already been used in the wild for fraud, and the barrier to entry is dropping rapidly.

Deepfake Tools

DeepFaceLive: Real-time face swap for video calls (open-source)
SimSwap: High-fidelity face swapping with single image
Wav2Lip: Accurate lip sync for any face with any audio
FaceFusion: Next-gen face swapping and enhancement
Deep-Live-Cam: Real-time face swap for video calls (successor to roop)
LivePortrait: Animate portraits from a single source image

Attack Scenarios

Executive impersonation: Deepfake CEO on Zoom authorising wire transfers
IT help desk: Fake IT admin on Teams requesting credentials
Vendor impersonation: Fake supplier representative changing payment details
KYC bypass: Deepfake video verification to open fraudulent accounts

3.5 Video Synthesis & the Detection Arms Race

The 2024–2026 wave of frontier video generators (OpenAI Sora 2, Google Veo 3, Runway Gen-3 Alpha, HeyGen 4, Hedra Character-3, Synthesia 2.0) collapsed the cost of producing a 30-second photorealistic avatar from "film studio" to "a $20 subscription." Real-time avatar pipelines (D-ID Live, HeyGen Streaming Avatar, Hedra) now drive Zoom / Teams / WhatsApp video at <500 ms end-to-end latency. Combined with voice cloning above, an attacker can run a fully interactive deepfake call against a finance team or a help desk.

Real-world incidents (2024–2026)

Arup, Hong Kong (Jan 2024) — finance employee wired US$25.6 M after a deepfake video call with a fake "CFO" plus several fake colleagues.
Ferrari (July 2024) — attempted CEO voice/video impersonation against an executive; foiled by a challenge question only the real CEO could answer.
WPP (May 2024) — deepfake of CEO Mark Read on Teams used to launch a wire-fraud attempt against an exec.
2025 — multiple US-state political deepfake robocalls and an OpenAI-flagged use of voice cloning in influence ops (publicly disclosed).

2025–2026 Generation Stack

OpenAI Sora 2: Long-form text-to-video, audio-aligned; watermarked with C2PA + SynthID-like metadata.
Google Veo 3: Native synced audio + video; 1080p / 8s clips; SynthID watermark embedded.
Runway Gen-3 Alpha / Gen-4: Image-to-video, motion brush, subject consistency.
HeyGen 4 / Streaming Avatar: Realtime avatars on live video calls.
Hedra Character-3: Audio-driven talking-head with strong lip-sync.
D-ID Live / Synthesia 2.0: Enterprise streaming avatars; used in real-world help-desk impersonation tests.
Open-source: Wan 2.1, Hunyuan Video, Mochi 1, CogVideoX, LTX-Video — self-hostable, no provenance metadata by default.

Detection Stack

Reality Defender: commercial deepfake detection across image / video / audio; DOD & bank deployments.
Sensity AI: deepfake + AI-generated content detection; KYC focus.
Intel FakeCatcher: remote photoplethysmography (rPPG) — looks for blood-flow micro-changes that GANs don't reproduce.
Microsoft Video Authenticator + Content Credentials (C2PA).
Google SynthID: imperceptible watermark detector for Veo / Imagen / Lyria outputs.
OpenAI DALL·E / Sora detector + C2PA Content Credentials.
TrueMedia.org / Hive Moderation: ensemble classifiers used by newsrooms.
Open-source: DeepfakeBench, AltFreezing, DFDC baseline models.

Detection is necessary but not sufficient

Independent evaluations consistently show that even leading deepfake detectors lose significant accuracy on the newest generators within months of release. Treat detection as a signal, not a verdict. Combine with provenance metadata (C2PA), out-of-band verification, and process controls (dual-control wire transfers, callback rules).
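
One way to encode "a signal, not a verdict" is to make process controls mandatory for high-value requests regardless of the detector output. The sketch below is illustrative only: the `RequestSignals` fields, the `decide` function, and the 0.8 and 0.4 thresholds are assumptions, not values taken from any detector vendor.

```python
from dataclasses import dataclass


@dataclass
class RequestSignals:
    detector_score: float       # 0.0 (likely real) to 1.0 (likely synthetic)
    has_valid_provenance: bool  # e.g. a verified C2PA manifest was present
    callback_verified: bool     # identity confirmed on a known-good number
    second_approver: bool       # an independent person approved the request


def decide(signals: RequestSignals, high_value: bool) -> str:
    """Return 'approve', 'escalate', or 'reject' for a media-backed request."""
    # Process controls come first: a low detector score is not proof that the
    # media is genuine, so it can never waive callback or dual control.
    if high_value and not (signals.callback_verified and signals.second_approver):
        return "escalate"
    # A high score is a strong reason to stop even when controls were followed.
    if signals.detector_score >= 0.8:  # assumed threshold
        return "reject"
    # Mid-range scores without provenance metadata get a human review.
    if signals.detector_score >= 0.4 and not signals.has_valid_provenance:
        return "escalate"
    return "approve"
```

The ordering is the design choice: the detector can only make a decision stricter, never looser.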

4. Detection & Defence

Defending against AI social engineering requires both technical controls and human awareness training. Traditional email filters are insufficient against LLM-generated, contextually unique content.

Technical Defences

Voice verification protocols: Callback procedures with pre-shared code words
Deepfake detection models: Microsoft Video Authenticator, Intel FakeCatcher
AI email analysis: Analyse writing style deviation from known sender patterns
DMARC / SPF / DKIM: Still essential — blocks impersonation at the email protocol level
Out-of-band verification: Verify high-value requests via a separate communication channel
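
As a small illustration of the SPF / DKIM / DMARC item above, the following sketch reads the verdicts a receiving mail server records in the Authentication-Results header (RFC 8601). It assumes the header was added by your own mail gateway; a header supplied by the sender proves nothing, so a real deployment would also check the authserv-id. The function names are hypothetical.

```python
import re
from email import message_from_string

# Matches result tokens such as "spf=pass" or "dmarc=fail".
_RESULT = re.compile(r"\b(spf|dkim|dmarc)=(\w+)", re.IGNORECASE)


def auth_results(raw_message: str) -> dict[str, str]:
    """Extract spf/dkim/dmarc verdicts from Authentication-Results headers."""
    msg = message_from_string(raw_message)
    verdicts: dict[str, str] = {}
    for header in msg.get_all("Authentication-Results", []):
        for method, result in _RESULT.findall(header):
            # Keep the first verdict seen for each method.
            verdicts.setdefault(method.lower(), result.lower())
    return verdicts


def passes_dmarc(raw_message: str) -> bool:
    """True only when a DMARC verdict is present and equals 'pass'."""
    return auth_results(raw_message).get("dmarc") == "pass"
```

A DMARC pass only shows that the message came from the domain it claims. It says nothing about lookalike domains or compromised mailboxes, which is why the out-of-band step remains necessary.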

Human Defences

AI-aware training: Teach staff that voice and video can be faked
Challenge phrases: Pre-agreed words for verifying identity in calls
Dual authorisation: Wire transfers require 2 people to approve
Red team exercises: Regular simulated attacks with AI-generated content
Slow down urgency: Train staff to pause when pressured for immediate action
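
Dual authorisation is easiest to enforce when the payment workflow itself refuses to release funds without two distinct approvers. A minimal sketch, assuming a hypothetical `WireRequest` record and a default of two approvers:

```python
from dataclasses import dataclass, field


@dataclass
class WireRequest:
    requester: str
    amount: float
    approvers: set[str] = field(default_factory=set)

    def approve(self, approver: str) -> None:
        # The requester can never count as one of the approvers.
        if approver == self.requester:
            raise ValueError("requester cannot approve their own request")
        self.approvers.add(approver)

    def is_releasable(self, required: int = 2) -> bool:
        # Using a set means repeat approvals from one person count once.
        return len(self.approvers) >= required
```

Because the rule lives in the workflow rather than in a person's judgement, a convincing voice or video call to a single employee is not enough to move money.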

Red Team Reporting

When reporting AI social engineering findings, document: the AI model used, the OSINT data that enabled personalisation, the attack success rate, and specific recommendations for that organisation. Include audio/video samples (with consent) to demonstrate the realism of the attack to executives.

4.5 Biometric & MFA Bypass, Smishing, RCS Deepfakes

AI-generated audio and video have moved past chat fraud into account-recovery, KYC onboarding, and voice-MFA bypass. Banks, exchanges, and gig platforms that rely on "selfie video + ID" or "voice passphrase" are the soft underbelly.

Voice-MFA & call-centre bypass

Pindrop, Nuance Gatekeeper, ID R&D, Daon: commercial voice-biometrics now ship a separate "synthetic-speech detector" alongside speaker verification.
Real-world: Vice (2023) showed an ElevenLabs clone defeating Lloyds Bank voice ID; multiple US banks have since added liveness challenges.
Defence: require a knowledge factor and a possession factor on top of voice; treat voice as a low-assurance signal.

Selfie-liveness / KYC bypass

Onfido, Veriff, iProov, Sumsub, Persona: 2024–2026 disclosures show deepfake injection via virtual cameras (OBS + Deep-Live-Cam) bypassing passive liveness.
Active liveness (head turns, randomised challenges) blocks most pre-rendered deepfakes; iProov Flashmark uses a randomised face illumination as a one-time challenge.
Defence: insist on active liveness with randomised challenges, hardware attestation of the camera (e.g., iOS DeviceCheck / Android Play Integrity), document-NFC chip read.
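
The value of a randomised challenge is that a pre-rendered clip cannot anticipate it. The sketch below shows only the challenge bookkeeping, under stated assumptions: the action list, the 30-second lifetime, and both function names are hypothetical, and recognising the actions in video is left to the liveness vendor.

```python
import secrets
import time
from typing import Optional

# Hypothetical action vocabulary; a real product defines its own.
ACTIONS = ["turn head left", "turn head right", "look up", "blink twice", "smile"]


def new_challenge(steps: int = 3, ttl_seconds: int = 30) -> dict:
    """Create a one-time liveness challenge with a random action sequence."""
    rng = secrets.SystemRandom()  # OS-backed randomness, not a seeded PRNG
    return {
        "nonce": secrets.token_urlsafe(16),
        "actions": rng.sample(ACTIONS, k=steps),  # distinct actions, random order
        "expires_at": time.time() + ttl_seconds,
    }


def check_response(challenge: dict, nonce: str, observed: list[str],
                   now: Optional[float] = None) -> bool:
    """Accept only the exact sequence, for the right nonce, before expiry."""
    now = time.time() if now is None else now
    if now > challenge["expires_at"]:
        return False
    # Constant-time comparison of the nonce issued for this session.
    if not secrets.compare_digest(nonce.encode(), challenge["nonce"].encode()):
        return False
    return observed == challenge["actions"]
```

The short expiry matters as much as the randomness: it limits the time an attacker has to render a matching response.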

Smishing & RCS deepfakes

RCS (Rich Communication Services) is now default on Android and iOS 18+; verified-sender branding is widely abused via lookalike business profiles.
AI-generated short voice notes / video clips embedded in RCS / WhatsApp / iMessage are the new pretext vehicle — "hi, it's your daughter, my phone broke…".
Telecoms response: STIR/SHAKEN A-attestation for voice; for messaging, Google Verified Calls + Apple Business Connect; spam-AI filters in carriers.
Defence: family/team challenge phrases; out-of-band callback to known number; user training that attaches voice/video samples to phishing exercises.
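
Where a help desk tool checks a team challenge phrase, the phrase should be stored like a password rather than in plain text. A minimal sketch using only the standard library; the function names, the normalisation rule, and the iteration count are assumptions for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def _derive(phrase: str, salt: bytes) -> bytes:
    # Normalise case and whitespace so spoken phrases match when typed in.
    normalised = " ".join(phrase.lower().split())
    return hashlib.pbkdf2_hmac("sha256", normalised.encode("utf-8"), salt, ITERATIONS)


def enroll(phrase: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) to store in place of the phrase itself."""
    salt = os.urandom(16)
    return salt, _derive(phrase, salt)


def verify(phrase: str, salt: bytes, expected: bytes) -> bool:
    """Check a spoken or typed phrase against the stored digest."""
    return hmac.compare_digest(_derive(phrase, salt), expected)
```

A challenge phrase is still a shared secret that can leak, so it supplements the out-of-band callback and does not replace it.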

Phishing-resistant MFA is the answer

Replace voice / SMS / TOTP with FIDO2 / WebAuthn passkeys for both consumer and workforce.
For workforce: Windows Hello for Business / YubiKey 5 / iCloud Keychain passkeys + Conditional Access requiring "phishing-resistant MFA."
For payments: out-of-band approval in the bank app rather than voice authorisation.
Deepfakes do not bypass FIDO origin binding — the strongest single mitigation in this entire domain.
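
Origin binding works because the browser, not the user, writes the site origin into the signed client data. The sketch below shows that one check in isolation, assuming the WebAuthn clientDataJSON fields `type`, `challenge`, and `origin`. It is deliberately partial: a real relying party must also verify the authenticator signature and RP ID hash, ideally with a maintained WebAuthn library. The function name `check_client_data` is hypothetical.

```python
import base64
import hmac
import json


def _b64url_decode(data: str) -> bytes:
    # WebAuthn uses unpadded base64url, so restore the padding first.
    return base64.urlsafe_b64decode(data + "=" * (-len(data) % 4))


def check_client_data(client_data_b64: str, expected_origin: str,
                      expected_challenge: bytes,
                      expected_type: str = "webauthn.get") -> bool:
    """Check the type, origin, and challenge recorded by the browser."""
    client_data = json.loads(_b64url_decode(client_data_b64))
    if client_data.get("type") != expected_type:
        return False
    # A lookalike phishing domain produces a different origin string here.
    if client_data.get("origin") != expected_origin:
        return False
    received = _b64url_decode(client_data.get("challenge", ""))
    return hmac.compare_digest(received, expected_challenge)
```

However convincing the caller sounds, an assertion produced on a lookalike domain carries that domain's origin and fails this comparison.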

5. Building Your AI SE Toolkit

bash

# Recommended AI social engineering toolkit for red teamers
# All tools should be run in an isolated VM

# Text generation (phishing pretexts)
pip install openai           # GPT-4o API for phishing generation
# OR use local models:
ollama pull dolphin-mixtral   # Local model

# Voice cloning
git clone https://github.com/myshell-ai/OpenVoice.git
pip install -e OpenVoice/
# Alternative: Fish Speech (zero-shot, 3s sample)
git clone https://github.com/fishaudio/fish-speech.git
pip install -e fish-speech/
# Alternative: F5-TTS (zero-shot voice cloning)
pip install f5-tts

# Deepfake video 
git clone https://github.com/iperov/DeepFaceLive.git
# OR real-time face swap (successor to roop):
git clone https://github.com/hacksider/Deep-Live-Cam.git
pip install -r Deep-Live-Cam/requirements.txt

# OSINT for target profiling
pip install theHarvester
pip install social-analyzer

# Campaign management
# GoPhish for email campaigns: https://getgophish.com
# Track: open rate, click rate, credential harvest rate

# Audio sample collection
pip install yt-dlp            # Download public talks/interviews

Getting Started

Start by understanding the AI pentesting fundamentals, then practice prompt engineering before attempting social engineering simulations.

Social Engineering Practice Labs

GoPhish AI Phishing Campaign (Custom Lab, medium): LLM phishing generation, campaign tracking, credential harvesting
OpenVoice Voice Cloning Lab (Custom Lab, hard): Voice cloning, vishing simulation, audio deepfake detection
Deep-Live-Cam Deepfake Exercise (Custom Lab, hard): Real-time face swap, video call impersonation, deepfake detection

Offensive Operations

Operator Playbook

Plan and measure authorized AI-enabled social engineering simulations without enabling real-world fraud or non-consensual impersonation.

Authorized use only

Offensive Focus

Assess how AI improves pretext quality, personalization, timing, and channel realism inside written scope.
Use synthetic personas, consented voice/video assets, and controlled delivery infrastructure.
Measure business process resilience, identity verification, and escalation handling.

Evidence To Capture

Written scope and allowed test classes
Timestamped prompts, retrieved context, tool calls, and response artifacts
Request IDs, model/provider/version, policy decisions, and tenant or user role
Screenshots or exported logs that reproduce the finding without exposing client secrets
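
The evidence items above can be captured as one structured record per artifact, with a hash so the artifact can later be shown to be unmodified. A minimal sketch; the field names and the `evidence_record` helper are assumptions, not a standard schema.

```python
import hashlib
from datetime import datetime, timezone


def evidence_record(artifact: bytes, *, engagement_id: str, test_class: str,
                    model: str, request_id: str) -> dict:
    """Describe one captured artifact without embedding its contents."""
    return {
        "captured_at": datetime.now(timezone.utc).isoformat(),  # UTC timestamp
        "engagement_id": engagement_id,  # ties the record to the written scope
        "test_class": test_class,        # must be an allowed test class
        "model": model,                  # model / provider / version string
        "request_id": request_id,
        "sha256": hashlib.sha256(artifact).hexdigest(),
        "size_bytes": len(artifact),
    }
```

Storing the hash rather than the artifact keeps client secrets out of the findings log while still letting a reviewer confirm which file a finding refers to.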

Offensive Test Cases

AI-assisted pretext review

Objective: Generate and score benign campaign variants for realism, policy triggers, and approval workflow gaps.
Authorized setup: Use fictional identities or approved internal personas and preapproved themes.
Evidence: Pretext variants, approval records, delivery constraints, and detection outcomes.

Voice/video verification tabletop

Objective: Evaluate whether teams follow verification procedures when presented with simulated AI impersonation risk.
Authorized setup: Use consented or synthetic media in a tabletop or controlled exercise.
Evidence: Scenario script, participant decisions, verification steps, and process gaps.

Common Findings

Processes rely on voice, video, or writing style as identity proof.
Helpdesk and finance workflows lack out-of-band verification under pressure.
Awareness training does not cover AI-enhanced personalization and synthetic media.

Lab Ideas

Create a tabletop deepfake incident script with fictional executives.
Compare generic and AI-personalized phishing lures in a non-delivery lab.
Draft a verification checklist for high-risk requests.
