Intermediate

Secrets Scanning & Harvesting

Developers leak credentials constantly: API keys in commits, tokens in build logs, passwords in config files. Your job is to find them before (or after) the developers do.

Low-Hanging Fruit

Secret scanning is often the fastest path to initial access. A single leaked AWS key or GitHub token can compromise entire organizations.

Where Secrets Hide

📁 Source Code

  • Git history (deleted but recoverable)
  • Config files (.env, config.yml)
  • Hardcoded in source
  • Test files with real creds

🔧 CI/CD Artifacts

  • Build logs (masked but leaked)
  • Pipeline artifacts
  • Container layers
  • Debug outputs

☁️ Cloud & Infra

  • S3 buckets (public)
  • Instance metadata
  • Terraform state files
  • Kubernetes secrets (base64)
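
The "(base64)" note on Kubernetes secrets matters because base64 is an encoding, not encryption: anyone who can read the Secret object can recover the values. A minimal sketch of that decoding step follows. The function name and the sample values are hypothetical, and the input is assumed to be the `data` mapping from a Secret manifest.

```python
import base64


def decode_secret_data(data: dict[str, str]) -> dict[str, str]:
    """Decode the base64 values found under a Secret manifest's `data` key."""
    decoded = {}
    for key, value in data.items():
        raw = base64.b64decode(value)
        # Not every secret is text (keystores, certificates in DER form),
        # so fall back to a hex dump when the bytes are not valid UTF-8.
        try:
            decoded[key] = raw.decode("utf-8")
        except UnicodeDecodeError:
            decoded[key] = raw.hex()
    return decoded


# Hypothetical values, shaped like the `data` field of a Secret.
example = {"username": "YWRtaW4=", "password": "czNjcjN0"}
print(decode_secret_data(example))
```

Any value recovered this way is as sensitive as a plaintext credential and should be handled the same way.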

🌐 Public Exposure

  • Pastebin / Gists
  • Stack Overflow answers
  • Documentation sites
  • JS source maps

TruffleHog

A widely used secret scanner. It scans full git history and can verify whether a discovered secret is still valid by testing it against the issuing service.

bash
# Install (the commands below use the v3 CLI; the "trufflehog" package on PyPI
# is the older v2 tool, which uses a different syntax)
brew install trufflehog
# or run the official container image
docker run --rm -it trufflesecurity/trufflehog:latest git https://github.com/target/repo --only-verified

# Scan a GitHub repo (includes all history)
trufflehog git https://github.com/target/repo --only-verified

# Scan local repo
trufflehog git file://. --only-verified

# Scan entire GitHub org
trufflehog github --org=target-org --only-verified

# Include unverified findings (noisier, but catches secrets that cannot be verified)
trufflehog git https://github.com/target/repo

# Output as JSON for parsing (one JSON object per line)
trufflehog git https://github.com/target/repo --json > secrets.json

# Scan specific branch
trufflehog git https://github.com/target/repo --branch=develop

# Scan filesystem (non-git)
trufflehog filesystem /path/to/code

# Scan S3 bucket
trufflehog s3 --bucket=target-bucket
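
The `--json` output is easier to triage once it is summarized. The sketch below assumes the output is one JSON object per line and that each finding carries `Verified` and `DetectorName` fields. Those field names are assumptions, so check them against your own `secrets.json`. The function name is hypothetical.

```python
import json
from collections import Counter


def summarize_findings(lines):
    """Count verified findings per detector from JSON-lines scanner output."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and any non-JSON log output
        try:
            finding = json.loads(line)
        except json.JSONDecodeError:
            continue
        # Assumed field names: "Verified" (bool) and "DetectorName" (str).
        if finding.get("Verified"):
            counts[finding.get("DetectorName", "unknown")] += 1
    return counts


# Usage, assuming secrets.json was produced with the --json flag:
# with open("secrets.json") as fh:
#     for detector, count in summarize_findings(fh).most_common():
#         print(detector, count)
```

Counting per detector is just one possible view; the same loop could collect file paths or commit hashes if those fields are present in your output.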

GitLeaks

bash
# Install
brew install gitleaks
# or download from releases

# Scan repo (walks the commit history of the checked-out branch by default)
gitleaks detect --source=/path/to/repo

# Scan the history of all branches
gitleaks detect --source=/path/to/repo --log-opts="--all"

# Scan only the files on disk, ignoring git history
gitleaks detect --source=/path/to/repo --no-git

# Scan remote repo (Gitleaks v8 reads local paths, so clone first)
git clone https://github.com/target/repo
gitleaks detect --source=repo

# Output report
gitleaks detect --source=. --report-path=leaks.json --report-format=json

# Use custom rules
gitleaks detect --source=. --config=custom-rules.toml

# Scan staged changes (pre-commit hook)
gitleaks protect --staged

# Verbose output
gitleaks detect --source=. --verbose
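
A history scan often reports the same secret once per commit that touched it. The sketch below groups a JSON report so each distinct secret appears once with all of its locations. It assumes the report is a JSON array of objects with `RuleID`, `Secret`, `File`, and `StartLine` keys. Treat those key names as assumptions to verify against your own `leaks.json`. The function name is hypothetical.

```python
import json
from collections import defaultdict


def group_by_secret(findings):
    """Group findings so a secret seen in many places is listed once."""
    grouped = defaultdict(set)
    for finding in findings:
        # Assumed keys: RuleID, Secret, File, StartLine.
        key = (finding.get("RuleID", "unknown"), finding.get("Secret", ""))
        location = f"{finding.get('File', '?')}:{finding.get('StartLine', '?')}"
        grouped[key].add(location)
    return {key: sorted(locations) for key, locations in grouped.items()}


# Usage, assuming leaks.json was written with --report-format=json:
# with open("leaks.json") as fh:
#     for (rule, _secret), locations in group_by_secret(json.load(fh)).items():
#         print(rule, len(locations), "location(s)")
```

The usage example prints only the rule name and a count, which keeps the secret values out of terminal scrollback.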

GitHub Dorking

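bash
# Queries for the GitHub search bar. GitHub's current code search documents
# path: where these use the legacy filename: qualifier, so swap it in if a
# filename: query returns nothing.

# AWS Keys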
org:targetcompany AWS_ACCESS_KEY_ID
org:targetcompany AKIA
filename:.env AWS_SECRET_ACCESS_KEY

# API Keys & Tokens
org:targetcompany api_key
org:targetcompany apikey
org:targetcompany api-key
org:targetcompany "Authorization: Bearer"

# Passwords
org:targetcompany password
org:targetcompany passwd
org:targetcompany "password ="
filename:config password

# Private Keys
org:targetcompany "BEGIN RSA PRIVATE KEY"
org:targetcompany "BEGIN OPENSSH PRIVATE KEY"
filename:id_rsa

# Database Credentials
org:targetcompany "mongodb+srv://"
org:targetcompany "postgres://"
org:targetcompany "mysql://"
org:targetcompany connection_string

# Cloud Config
org:targetcompany "client_secret"
org:targetcompany "subscription_id"
org:targetcompany "tenant_id"

# Specific Files
org:targetcompany filename:.env
org:targetcompany filename:credentials
org:targetcompany filename:secrets.yml
org:targetcompany filename:.npmrc _authToken
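
Typing each query by hand gets tedious across several organizations. The sketch below generates the `org:`-scoped queries and matching search URLs from a keyword list. The function name, the short keyword list, and the use of the `https://github.com/search?type=code&q=` URL form are illustrative assumptions.

```python
from urllib.parse import quote_plus

# A small sample of the search terms listed above; extend as needed.
KEYWORDS = [
    "AWS_ACCESS_KEY_ID",
    '"BEGIN RSA PRIVATE KEY"',
    '"postgres://"',
    "filename:.env",
]


def build_dorks(org, keywords=KEYWORDS):
    """Return (query, search URL) pairs scoped to one GitHub organization."""
    base = "https://github.com/search?type=code&q="
    pairs = []
    for keyword in keywords:
        query = f"org:{org} {keyword}"
        pairs.append((query, base + quote_plus(query)))
    return pairs


for query, url in build_dorks("targetcompany"):
    print(f"{query}\n  {url}")
```

Code search requires a signed-in session, so the URLs are meant to be opened in a browser rather than fetched by a script.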

CI/CD Log Harvesting

bash
# GitHub Actions logs
# Go to: Actions > Workflow Run > Download logs
# Or via API (the endpoint redirects to a zip archive, so follow the redirect and save it):
curl -L -H "Authorization: token $GITHUB_TOKEN" \
  -o logs.zip \
  https://api.github.com/repos/owner/repo/actions/runs/RUN_ID/logs
unzip logs.zip -d logs/

# Common leakage patterns in logs:
# - echo statements printing env vars
# - Debug mode enabled
# - Error messages with connection strings
# - npm install showing .npmrc content

# Search downloaded logs (-E enables | alternation, -i ignores case)
grep -rEi "password|secret|key|token|api" logs/

# Jenkins console output
curl -s -u user:token http://jenkins/job/JobName/lastBuild/consoleText | \
  grep -Ei "password|secret|key|token"

# GitLab job logs
curl --header "PRIVATE-TOKEN: $TOKEN" \
  "https://gitlab.com/api/v4/projects/ID/jobs/JOB_ID/trace"
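
The keyword search over downloaded logs can also be run directly on the zip archive without unpacking it. This is a minimal sketch assuming the archive holds UTF-8 text logs. The function name is hypothetical and the keyword list mirrors the grep pattern above.

```python
import io
import re
import zipfile

# Same keywords as the grep search, matched case-insensitively.
KEYWORDS = re.compile(r"password|secret|key|token|api", re.IGNORECASE)


def scan_log_archive(zip_bytes):
    """Yield (file name, line number, line) for keyword hits in a zip of logs."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # directory entry, nothing to read
            text = archive.read(name).decode("utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if KEYWORDS.search(line):
                    yield name, number, line.strip()


# Usage, assuming logs.zip was downloaded from a workflow run:
# with open("logs.zip", "rb") as fh:
#     for name, number, line in scan_log_archive(fh.read()):
#         print(f"{name}:{number}: {line}")
```

Keyword matching is deliberately broad and will produce false positives; it is a first pass to decide which log files deserve a closer read.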

Secret Patterns

Secret Type    | Pattern                           | Example
AWS Access Key | AKIA[0-9A-Z]{16}                  | AKIAIOSFODNN7EXAMPLE
GitHub Token   | ghp_[a-zA-Z0-9]{36}               | ghp_xxxxxxxxxxxxxxxxxxxx
Slack Token    | xox[baprs]-[0-9a-zA-Z-]+          | xoxb-123456789-abcdefgh
Private Key    | -----BEGIN .* PRIVATE KEY-----    | -----BEGIN RSA PRIVATE KEY-----
JWT            | eyJ[a-zA-Z0-9_-]*\.[a-zA-Z0-9_-]* | eyJhbGciOiJIUzI1NiIs...
Stripe Key     | sk_live_[0-9a-zA-Z]{24}           | sk_live_xxxxxxxxxxxxxxxx
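
As a rough illustration of how these patterns are applied, the sketch below compiles several of them with the repetition counts written as `{n}` quantifiers and scans a string. The function name and the redaction length are hypothetical choices, and real scanners add many more rules plus entropy and verification checks.

```python
import re

# Patterns from the table above, with repetition counts as {n} quantifiers.
PATTERNS = {
    "AWS Access Key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GitHub Token": re.compile(r"ghp_[a-zA-Z0-9]{36}"),
    "Slack Token": re.compile(r"xox[baprs]-[0-9a-zA-Z-]+"),
    "Private Key": re.compile(r"-----BEGIN .* PRIVATE KEY-----"),
    "Stripe Key": re.compile(r"sk_live_[0-9a-zA-Z]{24}"),
}


def find_secrets(text):
    """Return (secret type, redacted match) pairs found in the text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            # Keep only a short prefix so reports do not re-leak the secret.
            hits.append((label, match.group(0)[:8] + "..."))
    return hits


print(find_secrets("aws_key = AKIAIOSFODNN7EXAMPLE"))
# → [('AWS Access Key', 'AKIAIOSF...')]
```

Redacting matches in the output is a design choice: scan results tend to end up in tickets and chat, which are themselves places where secrets leak.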

Container Layer Analysis

bash
# Pull and analyze image layers
docker pull target-image:latest

# Use dive to explore layers
dive target-image:latest
# Look for: .env files, config files, removed secrets

# Extract all layers manually
docker save target-image:latest -o image.tar
tar -xf image.tar
# Each layer is itself a tar archive; list its contents and search

# Search extracted layers (-E enables | alternation)
# Older docker save output names layers */layer.tar; newer OCI-layout output
# may store them as extensionless files under blobs/sha256/, so adjust -name
find . -name "*.tar" -exec tar -tf {} \; | grep -Ei "env|config|secret"

# Use container-diff
container-diff analyze target-image:latest --type=file

# Trivy for secrets in images
trivy image --scanners secret target-image:latest
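
The manual "extract each layer and search" step can be done in one pass without relying on layer file names. The sketch below opens a `docker save` tarball, tries to read every file inside it as a nested tar, and lists entries whose paths match the same keywords as the grep above. The function name is hypothetical, and each layer is loaded fully into memory, which is an assumption that only suits small images.

```python
import io
import re
import tarfile

SUSPICIOUS = re.compile(r"env|config|secret", re.IGNORECASE)


def suspicious_files(image_tar_path):
    """List (layer, path) pairs for suspicious file names in an image tarball."""
    hits = []
    with tarfile.open(image_tar_path) as image:
        for member in image.getmembers():
            if not member.isfile():
                continue
            blob = image.extractfile(member).read()
            try:
                layer = tarfile.open(fileobj=io.BytesIO(blob))
            except tarfile.ReadError:
                continue  # manifest, config JSON, or another non-layer file
            with layer:
                for entry in layer.getnames():
                    if SUSPICIOUS.search(entry):
                        hits.append((member.name, entry))
    return hits


# Usage, assuming image.tar came from `docker save`:
# for layer, path in suspicious_files("image.tar"):
#     print(layer, path)
```

Because every layer is inspected, a file that a later layer deleted still shows up under the layer that added it, which is exactly where removed secrets tend to survive.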

Validating Found Secrets

bash
# AWS - Test if key is valid (pass the leaked pair via environment variables)
AWS_ACCESS_KEY_ID=LEAKED_KEY_ID AWS_SECRET_ACCESS_KEY=LEAKED_SECRET \
  aws sts get-caller-identity

# GitHub - Test token
curl -H "Authorization: token LEAKED_TOKEN" https://api.github.com/user

# Slack - Test token
curl -X POST https://slack.com/api/auth.test -d "token=LEAKED_TOKEN"

# Stripe - Test key
curl https://api.stripe.com/v1/charges -u sk_live_LEAKED:

# Google Cloud - Test service account
gcloud auth activate-service-account --key-file=leaked-key.json
gcloud projects list

# Azure - Test credentials
az login --service-principal -u CLIENT_ID -p CLIENT_SECRET --tenant TENANT_ID

# NPM - Test token
npm whoami --registry=https://registry.npmjs.org --//registry.npmjs.org/:_authToken=LEAKED

# TruffleHog validates automatically with --only-verified flag

Secret Lifecycle & Exposure Points

flowchart LR
    A["🔑 Secret Created API Key / Token / Password"] --> B{Where Does It Go?}
    B -->|"Committed"| C["📁 Git History Forever Recoverable"]
    B -->|"Build Arg"| D["📦 Docker Layer Image Metadata"]
    B -->|"CI Variable"| E["📝 Build Logs Echo / Debug Output"]
    B -->|"Artifact"| F["📤 Published Artifacts Release Binaries"]
    C --> G["🚨 Exposed"]
    D --> G
    E --> G
    F --> G
    G --> H["🛡️ Detection TruffleHog / Gitleaks GitHub Secret Scanning"]
    H --> I["🔄 Rotation Revoke & Replace"]
    style A fill:#ede9fe,stroke:#7c3aed
    style G fill:#fee2e2,stroke:#dc2626
    style H fill:#fef3c7,stroke:#d97706
    style I fill:#dcfce7,stroke:#16a34a

Real-World: Uber Breach via Hardcoded Credentials (2016)

Engineers committed AWS credentials into a private GitHub repo. Attackers accessed the repo, found the keys, and exfiltrated data on 57 million riders and drivers. The breach resulted in a $148M settlement. Lesson: Private repos are not secret vaults — scan everything.

🎯 Secrets Scanning Labs

Practice finding and remediating leaked secrets.

🔧 Git History Secret Mining (Custom Lab, easy)

  • Clone a repository with intentionally committed secrets in old commits
  • Run TruffleHog with --since-commit and --entropy flags
  • Run Gitleaks with a custom .gitleaks.toml config
  • Compare results: TruffleHog (verified) vs Gitleaks (regex) vs Hardcodes
  • Use git-filter-repo to permanently remove secrets from history

🔧 CI/CD Log Harvesting (Custom Lab, medium)

  • Access a Jenkins/GitLab instance and review build console output
  • Search for masked and unmasked secrets in build logs
  • Check environment variable dumps from debug steps
  • Analyze artifact stores for bundled .env or credentials files
  • Implement log redaction and secret masking policies

📋 Framework Alignment

OWASP CI/CD: CICD-SEC-6 (Insufficient Credential Hygiene), CICD-SEC-7 (Insecure System Configuration) | MITRE ATT&CK: T1552.001 (Credentials In Files), T1552.004 (Private Keys) | CIS Controls: 16.4 (Establish Process for Credential Mgmt), 3.12 (Network Segment Sensitive Data)