Reconnaissance

Active Reconnaissance

Active reconnaissance involves direct interaction with target systems to gather detailed information about web technologies, application structure, and potential vulnerabilities. This phase generates traffic detectable by security monitoring.

Detection Risk

Active reconnaissance leaves traces in target logs, and WAFs, IDS/IPS, and SIEM systems may detect and alert on scanning activity. Always obtain written authorization before proceeding, and coordinate with the client's security team if stealth is required.
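
If stealth is required, throttle your tooling rather than running it at default speed. A minimal sketch, assuming reasonably current httpx and katana builds (both expose a -rate-limit flag):

throttled-recon.sh
bash
# Cap httpx at 10 requests per second
cat subdomains.txt | httpx -rate-limit 10 -tech-detect

# Crawl slowly: low rate limit, low concurrency
katana -u https://example.com -rate-limit 5 -c 2

# Conservative nmap timing plus an explicit per-probe delay
nmap -T2 --scan-delay 1s -p 80,443 example.com

# Manual probing with a pause between requests
while read -r url; do
  curl -sI "$url"
  sleep 2
done < urls.txt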

Tools & Resources

  • WhatWeb — web technology and CMS fingerprinting (brew install whatweb)
  • httpx — fast HTTP probing & technology detection (go install ...httpx@latest)
  • katana — next-generation web crawling framework (go install ...katana@latest)
  • waybackurls — fetch historical URLs from the Wayback Machine (go install github.com/tomnomnom/waybackurls@latest)

Technology Fingerprinting

Identify web servers, frameworks, CMS platforms, and client-side technologies to understand the technology stack and potential vulnerabilities.

WhatWeb

whatweb.sh
bash
# Basic fingerprinting
whatweb https://example.com

# Verbose output with full details
whatweb -v https://example.com

# Aggressive mode (more plugins, more requests)
whatweb -a 3 https://example.com

# Output to JSON
whatweb --log-json=output.json https://example.com

# Scan multiple targets
whatweb -i targets.txt -v

# Quiet mode (just technologies)
whatweb -q https://example.com
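
The JSON log is easier to triage at scale. A minimal jq sketch, assuming whatweb's JSON log format (an array of result objects with target and plugins keys):

whatweb-parse.sh
bash
# Fingerprint many targets, then summarize detected plugins per target
# Assumes the JSON log is an array of {"target": ..., "plugins": {...}} objects
whatweb -i targets.txt --log-json=output.json
jq -r '.[] | select(.plugins) | "\(.target): \(.plugins | keys | join(", "))"' output.json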

httpx Technology Detection

httpx-tech.sh
bash
# Basic HTTP probing
echo "example.com" | httpx

# Technology detection
echo "example.com" | httpx -tech-detect

# Full details
cat subdomains.txt | httpx -title -status-code -tech-detect -content-length

# Include response headers (JSON output only)
cat subdomains.txt | httpx -json -include-response-header

# JSON output
cat subdomains.txt | httpx -tech-detect -json -o results.json

# Filter by status
cat subdomains.txt | httpx -mc 200,301,302 -tech-detect
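
With JSON output in hand, hosts can be grouped by detected technology to prioritize testing. A sketch assuming httpx's JSON lines carry url and tech fields:

httpx-triage.sh
bash
# Group live hosts by detected technology (requires jq)
cat results.json | jq -r 'select(.tech) | .tech[] as $t | "\($t)\t\(.url)"' | sort

# Count hosts per technology
cat results.json | jq -r 'select(.tech) | .tech[]' | sort | uniq -c | sort -rn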

Manual Header Analysis

header-analysis.sh
bash
# Curl headers
curl -I https://example.com

# Full response with headers
curl -i https://example.com

# Follow redirects
curl -IL https://example.com

# Look for revealing headers:
# Server: Apache/2.4.41
# X-Powered-By: PHP/7.4.3
# X-AspNet-Version: 4.0.30319
# X-Generator: Drupal 8
# Via: 1.1 vegur (Heroku)

# Check security headers
curl -sI https://example.com | grep -iE "(strict-transport|x-frame|x-content-type|x-xss|content-security)"

# Nmap HTTP enumeration
nmap -sV -p 80,443 --script http-headers,http-server-header example.com
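
The same header checks scale to a whole host list with a small loop. A minimal sketch using only curl and grep:

header-check.sh
bash
# Report hosts that are missing common security headers
for host in $(cat live_hosts.txt); do
  headers=$(curl -sI -m 10 "$host")
  for h in strict-transport-security x-frame-options content-security-policy; do
    echo "$headers" | grep -qi "^$h" || echo "$host missing $h"
  done
done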

Web Crawling & Spidering

Crawl the target to discover endpoints, parameters, forms, and hidden functionality.

Katana

katana.sh
bash
# Basic crawling
katana -u https://example.com

# Depth control
katana -u https://example.com -d 5

# Output to file
katana -u https://example.com -o crawl_results.txt

# Include JavaScript parsing
katana -u https://example.com -jc

# Headless browser mode (renders JavaScript)
katana -u https://example.com -headless

# With custom headers
katana -u https://example.com -H "Cookie: session=abc123"

# Filter by extension
katana -u https://example.com -ef png,jpg,gif,svg,css,woff

# JSON output
katana -u https://example.com -json -o crawl.json

# Multiple targets
katana -list urls.txt -d 3 -o all_crawl.txt
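
Raw crawl output is most useful once reduced to unique paths and flagged URLs. A sketch using standard text tools over the results file from above:

crawl-triage.sh
bash
# Unique paths (strip scheme, host, and query string)
sed -E 's|^https?://[^/]*||; s|\?.*||' crawl_results.txt | sort -u > unique_paths.txt

# URLs worth manual review
grep -iE "(admin|api|debug|upload|config|backup)" crawl_results.txt | sort -u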

gospider

gospider.sh
bash
# Basic spidering
gospider -s https://example.com

# With depth and concurrency
gospider -s https://example.com -d 3 -c 10

# Output to directory
gospider -s https://example.com -o output_dir

# Include subdomains
gospider -s https://example.com --subs

# Custom User-Agent
gospider -s https://example.com -u "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

# With cookies
gospider -s https://example.com --cookie "session=abc123"

# Filter JS files
gospider -s https://example.com | grep "\.js" | sort -u

hakrawler

hakrawler.sh
bash
# Basic crawl
echo "https://example.com" | hakrawler

# With depth
echo "https://example.com" | hakrawler -d 3

# Include subdomains
echo "https://example.com" | hakrawler -subs

# JavaScript parsing
echo "https://example.com" | hakrawler -js

# Custom headers
echo "https://example.com" | hakrawler -h "Authorization: Bearer token"

# Filter unique URLs
echo "https://example.com" | hakrawler | sort -u

Archive & Historical Data

Historical URL data reveals removed endpoints, old parameters, and hidden functionality that may still be accessible.

historical-urls.sh
bash
# Wayback Machine URLs
waybackurls example.com | sort -u > wayback_urls.txt

# GAU (GetAllUrls) - Multiple sources
gau example.com | sort -u > all_urls.txt

# GAU with subdomains
gau --subs example.com

# GAU providers: wayback, commoncrawl, otx, urlscan
gau --providers wayback,otx example.com

# Filter interesting endpoints
waybackurls example.com | grep -iE "\.(php|asp|aspx|jsp|json|xml|config|sql|log|bak|old)$"

# Find parameters
waybackurls example.com | grep "?" | cut -d "?" -f 2 | tr "&" "\n" | cut -d "=" -f 1 | sort -u

# Extract JS files for analysis
waybackurls example.com | grep "\.js$" | sort -u > js_files.txt

# Check if old URLs still work
cat wayback_urls.txt | httpx -silent -mc 200 -o still_alive.txt
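
Comparing archived URLs against the live crawl highlights endpoints the crawler never found. A sketch with comm, assuming both lists exist from the steps above:

wayback-diff.sh
bash
# URLs the archive knows about but the live crawl missed
sort -u wayback_urls.txt > wayback_sorted.txt
sort -u crawl_results.txt > crawl_sorted.txt
comm -23 wayback_sorted.txt crawl_sorted.txt > archive_only.txt

# Probe archive-only URLs; auth errors (401/403) are interesting too
cat archive_only.txt | httpx -silent -mc 200,301,302,401,403 -o archive_alive.txt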

JavaScript Analysis

JavaScript files often contain API endpoints, secrets, and hardcoded credentials.

js-analysis.sh
bash
# Extract JS file URLs from crawl
katana -u https://example.com -jc | grep -E "\.js(\?|$)" > js_files.txt

# Download all JS files (create the output dir first; quote URLs)
mkdir -p js
while read -r url; do
  curl -s "$url" -o "js/$(echo "$url" | md5sum | cut -d' ' -f1).js"
done < js_files.txt

# LinkFinder - Extract endpoints from JS
linkfinder -i https://example.com/app.js -o cli

# Run on all JS files
cat js_files.txt | while read url; do
  linkfinder -i "$url" -o cli
done | sort -u > endpoints.txt

# SecretFinder - Find secrets in JS
python3 SecretFinder.py -i https://example.com/app.js -o cli

# JS Miner
python3 jsminer.py -u https://example.com/app.js

# Grep for interesting patterns (note the escaped quotes)
curl -s https://example.com/app.js | grep -oE "(api|endpoint|url|path|secret|key|token|auth|password)['\"][^'\"]*['\"]"

# JSBeautifier for minified code
js-beautify app.min.js > app.js

JS Secrets to Look For

API Keys & Tokens

  • AWS access keys
  • Google API keys
  • Firebase credentials
  • Stripe/payment keys
  • OAuth tokens

Hidden Endpoints

  • Admin API routes
  • Debug endpoints
  • Internal services
  • Undocumented features
  • GraphQL schemas
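
A grep pass over the downloaded JS files can surface several of the key formats listed above. A sketch using well-known public key-format patterns, run against the js/ directory populated by the download loop earlier (the generic patterns are heuristics and will produce false positives):

js-secrets-grep.sh
bash
# AWS access key IDs (documented public format)
grep -rhoE "AKIA[0-9A-Z]{16}" js/ | sort -u

# Google API keys
grep -rhoE "AIza[0-9A-Za-z_-]{35}" js/ | sort -u

# Generic key/secret/token assignments (heuristic)
grep -rhoiE "(api[_-]?key|secret|token)['\"]?[[:space:]]*[:=][[:space:]]*['\"][^'\"]{8,}" js/ | sort -u

# Relative API paths hinting at hidden endpoints
grep -rhoE "\"/(api|internal|admin)/[a-zA-Z0-9_/.-]+\"" js/ | sort -u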

Parameter Discovery

param-discovery.sh
bash
# Arjun - Parameter discovery
arjun -u https://example.com/endpoint

# With wordlist
arjun -u https://example.com/endpoint -w params.txt

# Multiple endpoints
arjun -i urls.txt -oT params_found.txt

# ParamSpider
paramspider -d example.com -o params.txt

# x8 - Hidden parameter discovery
x8 -u "https://example.com/endpoint" -w params.txt

# Extract params from crawl results
cat crawl_results.txt | grep "?" | cut -d "?" -f 2 | tr "&" "\n" | cut -d "=" -f 1 | sort -u > found_params.txt

# Common parameters to test
# id, user, username, email, file, path, url, redirect, token, 
# debug, test, admin, page, search, query, action, type
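
Parameter names can also be brute-forced directly with ffuf by placing FUZZ in the parameter position and filtering out the baseline response. A sketch assuming a params wordlist; the -fs value is a placeholder for your measured baseline size:

param-fuzz.sh
bash
# Fuzz GET parameter names (replace 4242 with the baseline response size)
ffuf -u "https://example.com/endpoint?FUZZ=1" -w params.txt -fs 4242

# Fuzz POST parameter names
ffuf -u "https://example.com/endpoint" -X POST -d "FUZZ=1" \
  -H "Content-Type: application/x-www-form-urlencoded" -w params.txt -fs 4242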

Virtual Host Discovery

vhost-discovery.sh
bash
# ffuf for vhost discovery
ffuf -w vhosts.txt -u http://example.com -H "Host: FUZZ.example.com" -fc 404

# Filter by response size (exclude default)
ffuf -w vhosts.txt -u http://example.com -H "Host: FUZZ.example.com" -fs 1234

# Gobuster vhost mode
gobuster vhost -u http://example.com -w vhosts.txt -t 50

# wfuzz
wfuzz -c -w vhosts.txt -u "http://example.com" -H "Host: FUZZ.example.com" --hc 404

# Common vhost wordlists
# SecLists/Discovery/DNS/subdomains-top1million-5000.txt
# /usr/share/amass/wordlists/all.txt

# Manual testing
curl -s http://10.10.10.10 -H "Host: dev.example.com"
curl -s http://10.10.10.10 -H "Host: staging.example.com"
curl -s http://10.10.10.10 -H "Host: admin.example.com"

Active Recon Checklist

🔍 Technology Stack

  • ☐ Web server identified
  • ☐ Framework/CMS detected
  • ☐ Programming language determined
  • ☐ Version numbers noted
  • ☐ CDN/WAF identified

🕷️ Crawling

  • ☐ Application crawled
  • ☐ Forms identified
  • ☐ File uploads found
  • ☐ API endpoints discovered
  • ☐ Admin panels located

📜 JavaScript Analysis

  • ☐ JS files extracted
  • ☐ Endpoints extracted
  • ☐ Secrets searched
  • ☐ API schemas found
  • ☐ Source maps checked

📋 Documentation

  • ☐ Wayback URLs collected
  • ☐ Parameters documented
  • ☐ Virtual hosts tested
  • ☐ Attack surface mapped
  • ☐ Priority targets identified

Information

With comprehensive reconnaissance complete, proceed to Scanning to identify specific vulnerabilities in the discovered assets.