Biometric Defense
Intermediate
T1589 T1125

Disrupting Facial Recognition

Modern facial recognition systems depend on consistent landmarks, high-quality captures, and stable feature embeddings to match identities. Counter-surveillance focuses on systematically degrading these conditions across every stage of the FR pipeline.

Key Insight

Facial recognition is not a single algorithm — it's a pipeline with distinct stages. Each stage has different failure modes, which means there are multiple independent points where defensive disruption can reduce match confidence.

The FR Pipeline & Attack Surface

Every facial recognition system processes input through these stages. Understanding the pipeline reveals where defenses are most effective.

Stage 1: Detection

Localize face bounding boxes in the frame using MTCNN, RetinaFace, or YOLO-Face.

Weakness: Occlusion, angle, crowded backgrounds, unusual aspect ratios reduce reliable face box extraction.

Stage 2: Alignment

Normalize face geometry using 5-point landmark detection (eye corners, nose tip, mouth corners).

Weakness: Asymmetric patterns (hair, brim, frames) disrupt consistent landmark localization.

Stage 3: Embedding

Extract a 128–512 dimensional feature vector using ArcFace, FaceNet, or CosFace deep models.

Weakness: Variable lighting and reflectivity create unstable embeddings. Low-quality input = noisy vectors.

Stage 4: Matching

Compare embedding vectors using cosine similarity or L2 distance against a gallery database.

Weakness: Noisy embeddings increase false-reject rates and reduce match confidence below alert thresholds.
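The matching stage reduces to simple vector math. A minimal sketch of both metrics on toy embeddings (the vectors and the 0.68 threshold here are illustrative stand-ins, not values from any real deployment):

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """1 - cosine similarity: 0 = identical direction, larger = more different."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def l2_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Euclidean distance between embeddings (often computed on L2-normalized vectors)."""
    return float(np.linalg.norm(a - b))

rng = np.random.default_rng(42)
probe = rng.normal(size=512)                                # embedding from a live capture
enrolled = probe + rng.normal(scale=0.1, size=512)          # same identity, slight capture noise
stranger = rng.normal(size=512)                             # unrelated identity

THRESHOLD = 0.68  # illustrative cosine-distance alert threshold

for name, gallery in [("enrolled", enrolled), ("stranger", stranger)]:
    d = cosine_distance(probe, gallery)
    print(f"{name}: cosine_dist={d:.3f} l2={l2_distance(probe, gallery):.3f} "
          f"match={'YES' if d < THRESHOLD else 'NO'}")
```

Noisy captures push the probe's distance toward the threshold, which is exactly why the defensive techniques below target embedding stability.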

Modern FR Model Landscape

Understanding which models are deployed helps calibrate which defenses are relevant.

Model         Embedding Dim    Architecture              Typical Deployment                 Key Weakness
ArcFace       512              ResNet-100                Law enforcement, commercial APIs   Sensitive to pose > 60°
FaceNet       128              Inception-ResNet          Google, mobile apps                Lower accuracy on dark-skinned faces
CosFace       512              ResNet-50                 Access control, kiosks             Quality-sensitive thresholds
RetinaFace    N/A (detector)   ResNet + FPN              Pre-processing for all above       Small face / heavy occlusion misses
InsightFace   512              ResNet-100 / MobileFace   Edge devices, real-time            Mobile variants trade accuracy for speed

Practical Defensive Techniques

These techniques target different pipeline stages. Combining multiple techniques across stages provides stronger defense than any single method.

🎩 Clothing & Silhouette

Brimmed hats reduce forehead/eyebrow keypoints. High collars obscure jaw geometry. Non-reflective, dark fabrics reduce contrast with skin tones.

Pipeline target: Detection + Alignment

🕶️ Eyewear Strategy

Large frames disrupt inter-pupillary distance measurement. IR-filtering lenses reduce eye visibility in night-vision cameras. Reflective coatings can create noise in embeddings.

Pipeline target: Alignment + Embedding

🎨 Patterned Makeup / CV Dazzle

High-contrast asymmetric patterns near eyes/nose/cheeks can reduce landmark confidence. Effectiveness varies significantly by model — requires testing against your threat model's likely detectors.

Pipeline target: Detection + Alignment

📱 Image Hygiene

Limit high-res front-facing portrait uploads. Strip EXIF data. Use Fawkes or LowKey to cloak images before upload. Reduce enrollment-quality photos in searchable databases.

Pipeline target: Gallery quality

🔦 IR-Aware Eyewear

IR LED accessories can flood night-vision cameras with noise. Performance depends heavily on camera sensor filters and ambient illumination balance. Test against local camera types.

Pipeline target: Collection quality

🚶 Movement & Angle

Avoid direct head-on exposure to cameras. Partial occlusion combined with angle changes and movement usually outperforms any single static countermeasure.

Pipeline target: Detection + Embedding stability

Reality Check

No single method is reliable against all vendors. Modern multi-model ensemble systems and multi-camera setups reduce the effectiveness of any one technique. Treat face privacy as probabilistic risk reduction, not guaranteed invisibility.

3-Phase Face Privacy Playbook

Phase 1: Minimize Training Data

  • Audit and remove legacy high-res portraits from public platforms
  • Apply Fawkes cloaking to images before any upload
  • Strip EXIF metadata from all shared media
  • Request deletion from facial recognition databases (PimEyes, Clearview opt-out)
  • Disable auto-tagging features on social platforms

Phase 2: Reduce Capture Quality

  • Use brimmed headwear + large-frame eyewear combination
  • Position body at angles to camera lines (avoid direct frontal exposure)
  • Use environmental factors (sun glare, shadows, reflections)
  • Test IR-aware accessories against local camera types
  • Vary appearance across sessions (hairstyle, eyewear, clothing colors)

Phase 3: Break Correlation

  • Avoid repeatable routes and timing patterns
  • Combine face defenses with device/behavioral controls
  • Use different appearance profiles for different activities
  • Limit clothing/accessory reuse across high-risk venues
  • Regularly rotate defense combinations to prevent pattern learning

Validation & Testing

Measure your defenses objectively rather than assuming they work. Use these tools to establish before/after baselines.

Face Verification Comparison

Compare match distance between baseline and mitigated conditions using DeepFace.

verify_defense.py
python
#!/usr/bin/env python3
# Prerequisites: pip install deepface opencv-python tensorflow
"""Compare two face images and measure verification confidence.
Use this to test how appearance changes affect match scores."""
# DeepFace returns cosine distance by default (0 = identical; larger values = more different)
from deepface import DeepFace
import json

# Test baseline (same person, normal conditions)
baseline = DeepFace.verify(
    img1_path="baseline_front.jpg",
    img2_path="baseline_angle.jpg",
    model_name="ArcFace",
    detector_backend="retinaface"
)
print("Baseline match:", json.dumps(baseline, indent=2))

# Test mitigated (same person, with countermeasures)
mitigated = DeepFace.verify(
    img1_path="baseline_front.jpg",
    img2_path="mitigated_hat_glasses.jpg",
    model_name="ArcFace",
    detector_backend="retinaface"
)
print("Mitigated match:", json.dumps(mitigated, indent=2))
# Positive delta = mitigation INCREASED distance from reference (good — harder to match)
print(f"Confidence delta: {mitigated['distance'] - baseline['distance']:.4f}")

# Example output (illustrative; actual distances vary by image quality and capture conditions):
# Baseline match: {
#   "verified": true, "distance": 0.21, "threshold": 0.68,
#   "model": "ArcFace", "detector_backend": "retinaface", ...
# }
# Mitigated match: {
#   "verified": false, "distance": 0.79, "threshold": 0.68,
#   "model": "ArcFace", "detector_backend": "retinaface", ...
# }
# Confidence delta: 0.5800   (large positive delta = countermeasures pushed the capture away from the reference)

Image Cloaking with Fawkes

Add imperceptible perturbations to photos before uploading to social media, reducing their utility for model training.

cloak-images.sh
bash
# Install Fawkes image cloaking tool
# ⚠ The pip package has been broken since late 2023 — install from source:
git clone https://github.com/Shawn-Shan/fawkes.git && cd fawkes
pip install -e .

# Cloak images before uploading to social media
# This adds imperceptible perturbations that disrupt model training
# -m modes: low (fastest, minimal perturbation), mid (balanced), high (strongest cloak, slower)
fawkes -d ./photos_to_upload/ -m high --gpu 0

# Verify cloaking was applied
fawkes -d ./photos_to_upload/ --mode verify

Detection Rate Testing

Systematically test how different countermeasures affect face detection rates across conditions.

detection_benchmark.py
python
#!/usr/bin/env python3
# Prerequisites: pip install mediapipe opencv-python
"""Test face detection rates under various conditions.
Measures how countermeasures affect detector reliability."""
import cv2
import mediapipe as mp
import os
import json

mp_face = mp.solutions.face_detection
results = {}

conditions = {
    "baseline": "captures/baseline/",
    "hat_only": "captures/hat/",
    "glasses_only": "captures/glasses/",
    "hat_and_glasses": "captures/hat_glasses/",
    "hat_glasses_angle": "captures/hat_glasses_angle/",
    "ir_glasses": "captures/ir_glasses/",
    "patterned_scarf": "captures/patterned_scarf/"
}

with mp_face.FaceDetection(
    model_selection=1,           # model_selection: 0 = short-range (faces within 2m), 1 = full-range (up to 5m)
    min_detection_confidence=0.5 # min_detection_confidence: 0.5 = balanced threshold (lower catches more faces but more false positives)
) as detector:
    for condition, path in conditions.items():
        detected = 0
        total = 0
        confidences = []
        for img_file in os.listdir(path):
            img = cv2.imread(os.path.join(path, img_file))
            if img is None:
                continue
            rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            result = detector.process(rgb)
            total += 1
            if result.detections:
                detected += 1
                confidences.append(result.detections[0].score[0])
        
        avg_conf = sum(confidences) / len(confidences) if confidences else 0
        results[condition] = {
            "detection_rate": f"{detected/total*100:.1f}%" if total > 0 else "N/A",
            "avg_confidence": f"{avg_conf:.3f}",
            "samples": total
        }

print(json.dumps(results, indent=2))

# --- Example output (illustrative values; actual rates depend on camera, distance, and lighting) ---
# {
#   "baseline": {
#     "detection_rate": "100.0%",
#     "avg_confidence": "0.943",
#     "samples": 25
#   },
#   "hat_only": {
#     "detection_rate": "88.0%",
#     "avg_confidence": "0.812",
#     "samples": 25
#   },
#   "glasses_only": {
#     "detection_rate": "96.0%",
#     "avg_confidence": "0.887",
#     "samples": 25
#   },
#   "hat_and_glasses": {
#     "detection_rate": "72.0%",
#     "avg_confidence": "0.694",
#     "samples": 25
#   },
#   "ir_glasses": {
#     "detection_rate": "32.0%",
#     "avg_confidence": "0.531",
#     "samples": 25
#   },
#   "patterned_scarf": {
#     "detection_rate": "44.0%",
#     "avg_confidence": "0.578",
#     "samples": 25
#   }
# }

Embedding Stability Analysis

Measure how defensive conditions affect embedding cosine similarity — lower similarity means harder to match.

embedding_analysis.py
python
#!/usr/bin/env python3
# Prerequisites: pip install deepface numpy
"""Analyze face embedding stability across conditions.
Lower cosine similarity = harder to match across captures."""
from deepface import DeepFace
import numpy as np

def get_embedding(img_path, model="ArcFace"):
    result = DeepFace.represent(img_path, model_name=model, detector_backend="retinaface")
    return np.array(result[0]["embedding"])

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Reference embedding (enrollment photo)
ref = get_embedding("reference_portrait.jpg")

# Test conditions
test_images = {
    "normal_lighting": "test_normal.jpg",
    "low_light": "test_lowlight.jpg",
    "hat_brim": "test_hat.jpg",
    "reflective_glasses": "test_reflective.jpg",
    "hat_plus_glasses": "test_hat_glasses.jpg",
    "angle_30_deg": "test_angle30.jpg",
    "distance_10m": "test_distance10m.jpg",
}

print(f"{'Condition':<25} {'Cosine Sim':>12} {'Would Match':>12}")
print("-" * 52)
for condition, path in test_images.items():
    try:
        emb = get_embedding(path)
        sim = cosine_similarity(ref, emb)
        # 0.68 = commonly used cosine similarity threshold for face match (model-dependent; VGG-Face typical range: 0.60–0.75)
        match = "YES" if sim > 0.68 else "NO"
        print(f"{condition:<25} {sim:>12.4f} {match:>12}")
    except Exception:
        print(f"{condition:<25} {'FAILED':>12} {'N/A':>12}")

# Example output (illustrative values):
# Condition                   Cosine Sim  Would Match
# ----------------------------------------------------
# normal_lighting                 0.9234          YES
# low_light                       0.8121          YES
# hat_brim                        0.7415          YES
# reflective_glasses              0.6418           NO   ← effective
# hat_plus_glasses                0.5843           NO   ← effective
# angle_30_deg                    0.7102          YES
# distance_10m                    0.6312           NO   ← effective

Key Metrics to Track

Metric                 Baseline        Mitigated Target   How to Measure
Detection Rate         95–99%          < 60%              MediaPipe / RetinaFace across 50+ captures
Cosine Similarity      > 0.75          < 0.55             DeepFace embedding comparison
Verification Pass      TRUE            FALSE              DeepFace.verify threshold check
Cross-Model Transfer   Match on 4/4    Match on ≤ 1/4     Test ArcFace, FaceNet, VGG-Face, DeepID
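The cross-model transfer metric is just a tally over per-model verification results. A sketch of the scoring logic (the model names mirror the table above, but the distances and thresholds below are illustrative placeholders; in practice each pair comes from a per-model DeepFace.verify run):

```python
def cross_model_transfer(results: dict) -> tuple:
    """results maps model name -> (distance, threshold); a match is distance < threshold.
    Returns (models_matched, models_tested). Mitigated target: matched on <= 1 of 4."""
    matched = sum(1 for dist, thr in results.values() if dist < thr)
    return matched, len(results)

# Illustrative per-model results for one mitigated capture
mitigated = {
    "ArcFace":  (0.79, 0.68),    # no match: above threshold
    "Facenet":  (1.02, 1.10),    # still matches: mitigation didn't transfer to this model
    "VGG-Face": (0.55, 0.40),    # no match
    "DeepID":   (0.09, 0.015),   # no match
}
matched, tested = cross_model_transfer(mitigated)
print(f"Matched on {matched}/{tested} models")  # → Matched on 1/4 models
```

Models frequently disagree, so a countermeasure validated against one embedding model should always be re-scored this way against the others before being trusted.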

Deepfake & Face Synthesis Threats

Face synthesis models create a parallel threat surface: attackers can generate realistic face images to bypass liveness checks, spoof identity verification, or create deniable false identities.

Face-Swap Attacks

Tools like DeepFaceLab, FaceFusion, and roop can swap one face onto another in real time or in post-production video.

  • Threat: Real-time video call spoofing bypasses KYC/liveness checks
  • Detection: Look for temporal flickering, blending artifacts, and gaze inconsistency
  • Defense: Multi-factor liveness (depth, IR, challenge-response) defeats most face-swap pipelines
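The temporal-flickering cue above can be turned into a crude numeric score by differencing consecutive face crops: a stable live face changes little frame to frame, while an unstable blend jitters. This is only a toy proxy for real detectors, and the frame stacks below are synthetic stand-ins:

```python
import numpy as np

def flicker_score(face_frames: np.ndarray) -> float:
    """Mean absolute frame-to-frame pixel change inside the face crop.
    face_frames: (T, H, W) grayscale, values 0-255. Higher = more temporal jitter."""
    return float(np.abs(np.diff(face_frames.astype(np.float64), axis=0)).mean())

rng = np.random.default_rng(1)
base = rng.uniform(0, 255, size=(32, 32))
# Live face: only mild sensor noise between frames
stable = np.stack([base + rng.normal(scale=1.0, size=base.shape) for _ in range(10)])
# Face-swap stand-in: unstable blending adds much larger frame-to-frame change
swapped = np.stack([base + rng.normal(scale=12.0, size=base.shape) for _ in range(10)])
print(f"stable={flicker_score(stable):.1f}  suspect={flicker_score(swapped):.1f}")
```

In a real pipeline the crops would come from a face detector run per frame, and the score would be one weak signal among several, not a verdict on its own.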

GAN-Generated Faces

StyleGAN2/3 and diffusion models generate photorealistic faces of people who don't exist — useful for creating false identities at scale.

  • Threat: Synthetic profiles for social engineering and sock-puppet networks
  • Detection: Spectral analysis reveals GAN fingerprints; check for pupil asymmetry and ear inconsistencies
  • Tools: DM Image Detection, Microsoft Video Authenticator
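The spectral-analysis idea above can be prototyped with a plain 2D FFT: GAN upsampling tends to leave periodic artifacts that shift energy into the high-frequency band relative to natural images. A toy sketch of the statistic (the cutoff and the synthetic "images" are illustrative stand-ins, not a validated detector):

```python
import numpy as np

def high_freq_ratio(gray: np.ndarray, cutoff: float = 0.25) -> float:
    """Fraction of spectral energy beyond `cutoff` of the Nyquist radius."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = gray.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)  # normalized radius
    return float(spec[r > cutoff].sum() / spec.sum())

rng = np.random.default_rng(0)
# Stand-in for a natural image: energy concentrated at low frequencies
smooth = np.cumsum(np.cumsum(rng.normal(size=(64, 64)), axis=0), axis=1)
# Stand-in for upsampling/checkerboard artifacts: flat (white) spectrum
noisy = rng.normal(size=(64, 64))
print(f"smooth: {high_freq_ratio(smooth):.3f}  noisy: {high_freq_ratio(noisy):.3f}")
```

Production detectors azimuthally average the spectrum and train a classifier on the resulting profile rather than thresholding a single ratio.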

Liveness Check Bypass

Attackers replay 3D-rendered face models or manipulated video streams to defeat basic liveness detection.

  • Replay attacks: Pre-recorded video played to camera defeats 2D liveness
  • 3D mask attacks: Silicone or 3D-printed masks fool depth sensors below certain thresholds
  • Mitigation: Active liveness (random gesture prompts) + multi-spectral IR imaging
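Active liveness is essentially a nonce protocol: the verifier issues an unpredictable gesture sequence and rejects responses that don't match it in order and within a time window, which is what defeats pre-recorded replays. A minimal sketch of the verifier-side logic (the gesture names and 10-second window are illustrative choices):

```python
import secrets
import time

GESTURES = ["turn_left", "turn_right", "blink", "nod", "open_mouth"]

def issue_challenge(n: int = 3) -> list:
    """Unpredictable gesture sequence; a replayed video cannot anticipate it."""
    return [secrets.choice(GESTURES) for _ in range(n)]

def verify_response(challenge: list, observed: list, issued_at: float,
                    window: float = 10.0) -> bool:
    """observed: (gesture, timestamp) pairs extracted from the video stream."""
    if [g for g, _ in observed] != challenge:
        return False                    # wrong gestures or wrong order
    return all(issued_at <= t <= issued_at + window for _, t in observed)

challenge = issue_challenge()
t0 = time.monotonic()
live = [(g, t0 + i) for i, g in enumerate(challenge)]         # compliant response
replay = [(g, t0 + 60 + i) for i, g in enumerate(challenge)]  # too late: pre-recorded
print(verify_response(challenge, live, t0), verify_response(challenge, replay, t0))  # → True False
```

Real systems pair this with passive depth/IR checks, since a sufficiently fast face-swap pipeline can perform gestures on demand.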

Cloud FR API Testing

Commercial FR APIs (AWS Rekognition, Azure Face, Google Vision) have different robustness profiles against synthetic and adversarial inputs.

  • AWS Rekognition: Free tier allows 5K images/month for testing
  • Azure Face API: Includes built-in liveness detection since 2023
  • Demographic bias: NIST FRVT reports document accuracy disparities across demographics — review before trusting vendor claims

Deepfake Awareness

Deepfake detection is an arms race. Current detectors catch ~95% of known generators but new models routinely evade older detectors. Treat deepfake detection as a probabilistic signal, not a binary gate. Combine with behavioral analysis and out-of-band verification.

Defense Strategy Summary

  • Degrade detection: combine headwear + eyewear + angle to push detection rates below reliable thresholds
  • Destabilize embeddings: vary appearance profiles to create inconsistent feature vectors
  • Break temporal correlation: vary routes, timing, and appearance combinations across sessions
  • Validate continuously: run quarterly benchmark tests to confirm controls remain effective against updated models

🎯 Facial Recognition Defense Labs

Hands-on exercises to measure and improve your face privacy posture.

🔧 FR Pipeline Benchmark (Custom Lab, medium)
  • Install DeepFace and MediaPipe
  • Capture baseline photos (frontal, angled, different lighting)
  • Apply hat/glasses/angle countermeasures and recapture
  • Compare detection rates and embedding similarity scores
  • Document which combinations produce the largest confidence drops

🔧 Image Cloaking Lab (Custom Lab, easy)
  • Install the Fawkes image cloaking tool
  • Cloak a set of portrait photos at different protection levels
  • Verify cloaked images are visually identical to the originals
  • Test cloaked vs. uncloaked images against DeepFace verification
  • Measure impact on face search services (PimEyes)