Intermediate

Secrets Scanning & Harvesting

Developers leak credentials constantly: API keys in commits, tokens in build logs, passwords in config files. Your job is to find them before (or after) the developers notice.

Low-Hanging Fruit

Secret scanning is often the fastest path to initial access. A single leaked AWS key or GitHub token can compromise entire organizations.

Where Secrets Hide

📁 Source Code

  • Git history (deleted but recoverable)
  • Config files (.env, config.yml)
  • Hardcoded in source
  • Test files with real creds

🔧 CI/CD Artifacts

  • Build logs (masked but leaked)
  • Pipeline artifacts
  • Container layers
  • Debug outputs

☁️ Cloud & Infra

  • S3 buckets (public)
  • Instance metadata
  • Terraform state files
  • Kubernetes secrets (base64)
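
The "(base64)" note on Kubernetes secrets matters because base64 is an encoding, not encryption: anyone who can read a Secret object can recover the value. A minimal sketch, assuming you already hold one encoded value; the helper name `decode_secret_value` and the sample value are made up for illustration.

```shell
# Decode a single base64-encoded value copied from a Secret's .data map.
# decode_secret_value is a hypothetical helper, not part of kubectl.
# Uses `base64 -d` (GNU coreutils); older macOS versions use -D instead.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

# Sample value (base64 of "password123"), invented for this example.
decode_secret_value 'cGFzc3dvcmQxMjM='
echo
```

With cluster access, `kubectl get secret NAME -o jsonpath='{.data.KEY}'` prints the encoded value that such a helper would take as input.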

🌐 Public Exposure

  • Pastebin / Gists
  • Stack Overflow answers
  • Documentation sites
  • JS source maps

TruffleHog

Widely treated as the go-to tool for secret detection. It scans git history and verifies whether the secrets it finds are still valid.

bash
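# Install (v3, the Go rewrite that provides the subcommands used below)
brew install trufflehog
# or download a release binary from the project's GitHub releases page
# Note: `pip install trufflehog` installs the legacy v2 tool, which has a different CLI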

# Scan a GitHub repo (includes all history)
trufflehog git https://github.com/target/repo --only-verified

# Scan local repo
trufflehog git file://. --only-verified

# Scan entire GitHub org
trufflehog github --org=target-org --only-verified

# Include unverified findings too (noisier, but catches secrets that cannot be verified)
trufflehog git https://github.com/target/repo

# Output as JSON for parsing
trufflehog git https://github.com/target/repo --json > secrets.json

# Scan specific branch
trufflehog git https://github.com/target/repo --branch=develop

# Scan filesystem (non-git)
trufflehog filesystem /path/to/code

# Scan S3 bucket
trufflehog s3 --bucket=target-bucket
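
The JSON output is easier to triage once it is summarized. The sketch below counts findings per detector and counts verified hits using only grep and friends. It assumes the report holds one JSON object per line with `DetectorName` and `Verified` keys; field names may differ between TruffleHog versions, and both function names are hypothetical.

```shell
# Summarize a TruffleHog --json report without needing jq.
# Assumes one JSON object per line containing "DetectorName" and "Verified".

# Count findings per detector, most frequent first.
summarize_findings() {
  grep -o '"DetectorName":"[^"]*"' "$1" | cut -d'"' -f4 | sort | uniq -c | sort -rn
}

# Count lines marked as verified.
count_verified() {
  grep -c '"Verified":true' "$1"
}

# Usage (after running the --json scan above):
#   summarize_findings secrets.json
#   count_verified secrets.json
```

A proper JSON parser is more robust than grep if the report format changes; this version simply avoids an extra dependency.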

GitLeaks

bash
# Install
brew install gitleaks
# or download from releases

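# Scan repo (walks git history by default)
gitleaks detect --source=/path/to/repo

# Scan history across all branches and refs
gitleaks detect --source=/path/to/repo --log-opts="--all"

# Scan only the current working tree, ignoring git history
gitleaks detect --source=/path/to/repo --no-git

# Scan remote repo (clone first; --source expects a local path)
git clone https://github.com/target/repo
gitleaks detect --source=repo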

# Output report
gitleaks detect --source=. --report-path=leaks.json --report-format=json

# Use custom rules
gitleaks detect --source=. --config=custom-rules.toml

# Scan staged changes (pre-commit hook)
gitleaks protect --staged

# Verbose output
gitleaks detect --source=. --verbose
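
A custom rules file for `--config` can be quite small. The sketch below writes one from the shell, assuming the Gitleaks v8 TOML layout (`[extend]` plus `[[rules]]` tables). The rule id, the `acme_` prefix, and the token format are invented for illustration, and `write_custom_rules` is a hypothetical helper.

```shell
# Write a minimal Gitleaks config containing one custom rule.
# Everything about the rule (id, prefix, length) is a made-up example.
write_custom_rules() {
  cat > "$1" <<'EOF'
title = "custom rules"

# Keep the built-in rules and add to them
[extend]
useDefault = true

[[rules]]
id = "acme-internal-token"
description = "Hypothetical internal API token"
regex = '''acme_[0-9a-f]{32}'''
keywords = ["acme_"]
EOF
}

# Usage:
#   write_custom_rules custom-rules.toml
#   gitleaks detect --source=. --config=custom-rules.toml
```

The `keywords` entry lets the scanner skip files that cannot match, which helps on large repositories.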

GitHub Dorking

bash
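# These are GitHub code search queries, not shell commands.
# The newer GitHub code search uses path: where these use the legacy filename: qualifier.

# AWS Keys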
org:targetcompany AWS_ACCESS_KEY_ID
org:targetcompany AKIA
filename:.env AWS_SECRET_ACCESS_KEY

# API Keys & Tokens
org:targetcompany api_key
org:targetcompany apikey
org:targetcompany api-key
org:targetcompany "Authorization: Bearer"

# Passwords
org:targetcompany password
org:targetcompany passwd
org:targetcompany "password ="
filename:config password

# Private Keys
org:targetcompany "BEGIN RSA PRIVATE KEY"
org:targetcompany "BEGIN OPENSSH PRIVATE KEY"
filename:id_rsa

# Database Credentials
org:targetcompany "mongodb+srv://"
org:targetcompany "postgres://"
org:targetcompany "mysql://"
org:targetcompany connection_string

# Cloud Config
org:targetcompany "client_secret"
org:targetcompany "subscription_id"
org:targetcompany "tenant_id"

# Specific Files
org:targetcompany filename:.env
org:targetcompany filename:credentials
org:targetcompany filename:secrets.yml
org:targetcompany filename:.npmrc _authToken
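
Typing these queries per target gets repetitive, so a small generator helps. This is only a sketch: `build_dorks` is a hypothetical helper and the keyword list is a short sample of the queries above, not a complete set.

```shell
# Print one GitHub search query per keyword for a given org.
# Quoted keywords keep their double quotes so they search as exact phrases.
build_dorks() {
  org="$1"
  for kw in AKIA AWS_SECRET_ACCESS_KEY api_key '"BEGIN RSA PRIVATE KEY"' '"postgres://"'; do
    printf 'org:%s %s\n' "$org" "$kw"
  done
}

build_dorks targetcompany
```

Each output line can be pasted into the search box as-is.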

CI/CD Log Harvesting

bash
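# GitHub Actions logs
# Go to: Actions > Workflow Run > Download logs
# Or via API (responds with a redirect to a zip archive, so follow it with -L):
curl -L -H "Authorization: token $GITHUB_TOKEN" \
  -o logs.zip \
  https://api.github.com/repos/owner/repo/actions/runs/RUN_ID/logs
unzip logs.zip -d logs/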

# Common leakage patterns in logs:
# - echo statements printing env vars
# - Debug mode enabled
# - Error messages with connection strings
# - npm install showing .npmrc content

# Search downloaded logs (-E so that | means "or", -i to ignore case)
grep -rEi "password|secret|key|token|api" logs/

# Jenkins console output
curl -u user:token http://jenkins/job/JobName/lastBuild/consoleText | \
  grep -Ei "password|secret|key|token"

# GitLab job logs
curl --header "PRIVATE-TOKEN: $TOKEN" \
  "https://gitlab.com/api/v4/projects/ID/jobs/JOB_ID/trace"

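Once logs are on disk, the search step can be wrapped so every hit carries its file and line number. A minimal sketch: `scan_logs` is a hypothetical helper and the keyword list is a starting point rather than an exhaustive set. Note that grep needs `-E` for `|` to act as alternation.

```shell
# Search a directory of downloaded CI logs for credential-like keywords.
# -r recurse, -E extended regex (so | means "or"), -i ignore case,
# -n prefix each match with its line number.
scan_logs() {
  grep -rEin 'password|passwd|secret|token|api[_-]?key' "$1"
}

# Usage:
#   scan_logs logs/
```

Output lines take the form `path:line:text`, which makes it easy to jump back to the surrounding context in the log.
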
Secret Patterns

Secret Type    | Pattern                            | Example
AWS Access Key | AKIA[0-9A-Z]{16}                   | AKIAIOSFODNN7EXAMPLE
GitHub Token   | ghp_[a-zA-Z0-9]{36}                | ghp_xxxxxxxxxxxxxxxxxxxx
Slack Token    | xox[baprs]-[0-9a-zA-Z-]+           | xoxb-123456789-abcdefgh
Private Key    | -----BEGIN .* PRIVATE KEY-----     | -----BEGIN RSA PRIVATE KEY-----
JWT            | eyJ[a-zA-Z0-9_-]*\.[a-zA-Z0-9_-]*  | eyJhbGciOiJIUzI1NiIs...
Stripe Key     | sk_live_[0-9a-zA-Z]{24}            | sk_live_xxxxxxxxxxxxxxxx
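
Several of these patterns can be combined into one extended regular expression and run over any text file. The sketch below writes the repetition counts explicitly as `{n}`; `find_candidates` is a hypothetical helper, and a match is only a candidate until it has been validated.

```shell
# Extract candidate secrets from a file using a few of the patterns above.
# -o prints only the matching text; -E enables {n} repetition counts.
find_candidates() {
  grep -oE 'AKIA[0-9A-Z]{16}|ghp_[a-zA-Z0-9]{36}|xox[baprs]-[0-9a-zA-Z-]+|sk_live_[0-9a-zA-Z]{24}' "$1"
}

# Usage:
#   find_candidates path/to/file
```

Fixed-length patterns such as the AWS one produce few false positives; open-ended ones such as the Slack pattern are noisier.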

Container Layer Analysis

bash
# Pull and analyze image layers
docker pull target-image:latest

# Use dive to explore layers
dive target-image:latest
# Look for: .env files, config files, removed secrets

# Extract all layers manually
docker save target-image:latest -o image.tar
tar -xf image.tar
# Each layer is a tar, extract and search

# Search extracted layers (-E so that | means "or")
find . -name "*.tar" -exec tar -tf {} \; | grep -Ei "env|config|secret"

# Use container-diff
container-diff analyze target-image:latest --type=file

# Trivy for secrets in images
trivy image --scanners secret target-image:latest
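
The manual layer search is more useful when each hit is tied to the layer it came from, since a file deleted in a later layer still exists in the earlier one. A sketch under one assumption: layer archives end in `.tar`, as in the find command above (some Docker versions name layer blobs differently). Both helper names are hypothetical.

```shell
# List files inside one layer tarball whose paths look secret-related.
list_suspicious() {
  tar -tf "$1" | grep -Ei '(^|/)\.env|secret|credential|id_rsa|\.pem$'
}

# Run list_suspicious over every layer under an extracted `docker save` archive,
# prefixing each hit with the layer file it was found in.
scan_layers() {
  find "$1" -type f -name '*.tar' | while IFS= read -r layer; do
    list_suspicious "$layer" | while IFS= read -r path; do
      printf '%s: %s\n' "$layer" "$path"
    done
  done
}

# Usage (from the directory where image.tar was extracted):
#   scan_layers .
```

To read a flagged file, extract just that path from the layer with `tar -xf LAYER PATH`.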

Validating Found Secrets

bash
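# AWS - Test if key is valid (export the leaked pair first)
export AWS_ACCESS_KEY_ID=LEAKED_KEY_ID
export AWS_SECRET_ACCESS_KEY=LEAKED_SECRET
aws sts get-caller-identity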

# GitHub - Test token
curl -H "Authorization: token LEAKED_TOKEN" https://api.github.com/user

# Slack - Test token  
curl -X POST https://slack.com/api/auth.test -d "token=LEAKED_TOKEN"

# Stripe - Test key
curl https://api.stripe.com/v1/charges -u sk_live_LEAKED:

# Google Cloud - Test service account
gcloud auth activate-service-account --key-file=leaked-key.json
gcloud projects list

# Azure - Test credentials
az login --service-principal -u CLIENT_ID -p CLIENT_SECRET --tenant TENANT_ID

# NPM - Test token
npm whoami --registry=https://registry.npmjs.org --//registry.npmjs.org/:_authToken=LEAKED

# TruffleHog runs equivalent checks itself; --only-verified keeps only the secrets that passed
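
When working through a pile of findings, it helps to sort each one by provider before picking a check from the list above. The sketch below guesses the provider from the prefixes in the pattern table and makes no network calls; `classify_secret` is a hypothetical helper.

```shell
# Guess the provider from a secret's prefix so the matching check can be chosen.
# Prefixes follow the Secret Patterns table; anything else is reported as unknown.
classify_secret() {
  case "$1" in
    AKIA*)        echo aws ;;
    ghp_*)        echo github ;;
    xox[baprs]-*) echo slack ;;
    sk_live_*)    echo stripe ;;
    eyJ*)         echo jwt ;;
    *)            echo unknown ;;
  esac
}

classify_secret AKIAIOSFODNN7EXAMPLE
```

A prefix match says nothing about validity; it only narrows down which of the validation commands is worth running.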

Tools