File Upload Security

Risk Severity: 🔴 Critical
Fix Effort: 🏗️ High (Significant Work)
Est. Time: ⏱️ 4-8 hours
Reference: A03:2021, CWE-434

Insecure file uploads can lead to remote code execution, cross-site scripting, denial of service, and server compromise. This guide covers comprehensive defenses for file upload functionality.

Why File Uploads Are Critical

File upload vulnerabilities consistently rank among the most dangerous web security flaws. A single misconfiguration can allow an attacker to upload a web shell and gain complete control of your server. The key principle: never trust any aspect of user-uploaded files, whether the filename, extension, Content-Type header, or even the file contents.

Common Attack Vectors

Understanding attack techniques helps you build better defenses. Here are the most common methods attackers use to exploit file upload functionality:

⚠️ Code Execution

  • shell.php → Web shell upload
  • shell.php.jpg → Double extension bypass
  • shell.php%00.jpg → Null byte injection
  • .htaccess → Apache config override
  • Polyglot files (valid image + valid code)

⚠️ Other Attacks

  • SVG/HTML → Stored XSS
  • XXE via XML/DOCX/XLSX
  • DoS via zip bombs/large files
  • Path traversal (../../../etc/passwd)
  • SSRF via image URLs
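
As a rough illustration of the zip-bomb item above, the following sketch inspects an archive's declared sizes before anything is extracted. The limits and the names `check_archive_limits` and `make_zip` are assumptions made for this example, not part of any library. Declared sizes can be forged, so a byte cap during extraction is still needed.

```python
import io
import zipfile

# Assumed limits for illustration; tune them for your application.
MAX_MEMBERS = 1000
MAX_TOTAL_UNCOMPRESSED = 50 * 1024 * 1024  # 50 MB
MAX_COMPRESSION_RATIO = 100


def check_archive_limits(data: bytes) -> None:
    """Raise ValueError if the archive's declared sizes exceed the limits."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile as exc:
        raise ValueError("Not a valid zip archive") from exc

    with archive:
        members = archive.infolist()
        if len(members) > MAX_MEMBERS:
            raise ValueError("Too many entries in archive")

        total = sum(m.file_size for m in members)
        compressed = sum(m.compress_size for m in members)
        if total > MAX_TOTAL_UNCOMPRESSED:
            raise ValueError("Declared uncompressed size too large")
        if compressed > 0 and total / compressed > MAX_COMPRESSION_RATIO:
            raise ValueError("Suspicious compression ratio")


def make_zip(name: str, payload: bytes) -> bytes:
    """Demo helper: build an in-memory zip with one deflated member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


check_archive_limits(make_zip("notes.txt", b"hello world"))  # passes silently
try:
    check_archive_limits(make_zip("zeros.bin", b"\0" * 10_000_000))
except ValueError as exc:
    print("rejected:", exc)
```

The same idea applies to DOCX/XLSX uploads, which are zip containers.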

🔬 Attack Technique Deep Dive

Web Shells: Attackers upload executable scripts (PHP, ASP, JSP) that provide remote command execution. Once accessed via browser, they have full server access.
Double Extension (shell.php.jpg): Exploits misconfigured servers that parse files based on non-final extensions. Apache with improper handlers may execute .php.jpg as PHP.
Polyglot Files: Files that are valid in multiple formats simultaneously. A JPEG can contain valid PHP code in comment sections that executes if the server processes it as PHP.
.htaccess Attacks: On Apache servers, uploading a .htaccess file can reconfigure the upload directory to execute any file type as PHP.
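
To make the polyglot idea concrete, here is a minimal sketch of why a signature check alone is not enough. The payload is inert text, and `looks_like_gif` and `build_polyglot_sample` are hypothetical names used only for this demonstration.

```python
GIF_SIGNATURE = b"GIF89a"


def looks_like_gif(data: bytes) -> bool:
    """Naive check: compares only the leading signature bytes."""
    return data.startswith(GIF_SIGNATURE)


def build_polyglot_sample() -> bytes:
    """Test payload: a GIF signature followed by inert PHP source text."""
    return GIF_SIGNATURE + b"\x01\x00\x01\x00\x00\x00\x00" + b"<?php echo 'test'; ?>"


sample = build_polyglot_sample()
print(looks_like_gif(sample))  # True: the signature check passes
print(b"<?php" in sample)      # True: the script text is still in the file
```

This is why the later layers (re-encoding images, random filenames with a server-chosen extension, and non-executable storage) matter even when the magic bytes look correct.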

Multi-Layer Validation

Defense in Depth

Never rely on a single validation method. Attackers can bypass extension checks, MIME types, and even magic bytes. Use all validation layers together.

The following Python example demonstrates a comprehensive validation approach. Each step addresses a specific attack vector, and together they form a robust defense against malicious uploads.

🔍 Validation Layers Explained

1. File Existence Check

Ensures a file was actually submitted and has a filename

2. Size Validation

Prevents DoS attacks via huge files. Check size BEFORE reading entire file into memory

3. Extension Allowlist

Only allow specific extensions. Uses secure_filename() to sanitize

4. Magic Byte Validation

Uses python-magic to check actual file type from bytes, not headers

5. Extension-Type Matching

Ensures claimed extension matches detected file type (prevents shell.php.jpg)

6. Image Reprocessing

Re-saves images using PIL to strip EXIF metadata and embedded payloads (polyglot prevention)

7. Random Filename

Generates UUID-based name to prevent path traversal and filename attacks

8. Path Traversal Check

Final verification that resolved path stays within upload directory
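
Layer 2 can also be enforced while reading, which helps when the stream is not seekable or the declared length cannot be trusted. The sketch below is one possible approach, assuming only that the upload object exposes a `read(n)` method. `read_limited` and `UploadTooLarge` are names invented for this example.

```python
import io

MAX_FILE_SIZE = 5 * 1024 * 1024  # 5 MB, matching the limit used in this guide
CHUNK_SIZE = 64 * 1024


class UploadTooLarge(Exception):
    pass


def read_limited(stream, max_bytes: int = MAX_FILE_SIZE) -> bytes:
    """Read the stream in chunks and stop as soon as max_bytes is exceeded."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            # Abort early: at most one chunk beyond the limit is ever held
            raise UploadTooLarge(f"Upload exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


data = read_limited(io.BytesIO(b"x" * 1000))
print(len(data))
```

A request-level limit in the web server or framework is still worth configuring, since it rejects oversized bodies before application code runs.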

Complete Validation Example (Python)

Dependencies: pip install python-magic Pillow Werkzeug

python
import os
import magic
import hashlib
from PIL import Image
from werkzeug.utils import secure_filename
from uuid import uuid4

# Configuration
ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg', 'gif', 'pdf'}
ALLOWED_MIMES = {
    'image/png': 'png',
    'image/jpeg': 'jpg', 
    'image/gif': 'gif',
    'application/pdf': 'pdf'
}
MAX_FILE_SIZE = 5 * 1024 * 1024  # 5MB
UPLOAD_DIR = '/var/uploads'  # Outside webroot!

class FileUploadError(Exception):
    pass

def validate_and_save_file(file_storage):
    """Comprehensive file upload validation"""
    
    # 1. Check if file exists
    if not file_storage or not file_storage.filename:
        raise FileUploadError("No file provided")
    
    # 2. Check file size (before reading entire file)
    file_storage.seek(0, os.SEEK_END)
    size = file_storage.tell()
    file_storage.seek(0)
    
    if size > MAX_FILE_SIZE:
        raise FileUploadError(f"File too large (max {MAX_FILE_SIZE // 1024 // 1024}MB)")
    
    if size == 0:
        raise FileUploadError("Empty file")
    
    # 3. Validate extension (from original filename)
    original_name = secure_filename(file_storage.filename)
    ext = original_name.rsplit('.', 1)[-1].lower() if '.' in original_name else ''
    
    if ext not in ALLOWED_EXTENSIONS:
        raise FileUploadError(f"Extension not allowed: {ext}")
    
    # 4. Validate MIME type using magic bytes (not Content-Type header!)
    file_content = file_storage.read()
    file_storage.seek(0)
    
    detected_mime = magic.from_buffer(file_content, mime=True)
    if detected_mime not in ALLOWED_MIMES:
        raise FileUploadError(f"File type not allowed: {detected_mime}")
    
    # 5. Ensure extension matches detected type ('jpeg' is an alias for 'jpg')
    expected_ext = ALLOWED_MIMES[detected_mime]
    normalized_ext = 'jpg' if ext == 'jpeg' else ext
    if normalized_ext != expected_ext:
        raise FileUploadError("Extension doesn't match file type")
    
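    # 6. Additional validation for images - re-process to strip metadata
    if detected_mime.startswith('image/'):
        try:
            img = Image.open(file_storage)
            img.verify()  # Verify it's a valid image (object is unusable afterwards)
            file_storage.seek(0)
            
            # Re-open and copy only the pixel data to strip EXIF/metadata
            # (for animated GIFs this keeps the first frame only)
            img = Image.open(file_storage)
            clean_img = Image.new(img.mode, img.size)
            clean_img.putdata(list(img.getdata()))
            if img.mode == 'P':
                # Palette images (e.g. GIF) need their palette copied as well
                clean_img.putpalette(img.getpalette())
        except Exception as e:
            raise FileUploadError(f"Invalid image file: {e}")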
    
    # 7. Generate safe filename (never use original)
    safe_name = f"{uuid4().hex}.{expected_ext}"
    
    # 8. Save to upload directory
    save_path = os.path.join(UPLOAD_DIR, safe_name)
    
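    # Double-check path traversal protection. Compare whole path components,
    # not string prefixes, so a sibling like '/var/uploads_evil' is rejected.
    upload_root = os.path.realpath(UPLOAD_DIR)
    if os.path.commonpath([upload_root, os.path.realpath(save_path)]) != upload_root:
        raise FileUploadError("Invalid save path")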
    
    if detected_mime.startswith('image/'):
        clean_img.save(save_path)
    else:
        with open(save_path, 'wb') as f:
            f.write(file_content)
    
    return safe_name

Secure Storage

Even with perfect validation, how you store and serve uploaded files is critical. Many breaches occur because validated files are stored in locations where they can be executed.

Storage Best Practices

✅ Store Outside Webroot

Upload files to a directory not served by the web server. Serve through a controller that validates access.

✅ Use Random Filenames

Never use user-supplied filenames. Generate random names (UUID) and store original name in database.

✅ Separate Domain for User Content

Serve user uploads from a separate domain to prevent cookie theft via XSS. A different registrable domain is safer than a subdomain such as cdn.example.com, because cookies scoped to example.com are also sent to its subdomains.

✅ Use Cloud Storage

S3, Azure Blob, GCS with proper IAM. Uploaded files never touch the application server's filesystem, which removes the risk of them being executed there.
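
The random-filename practice can be sketched with the standard library as follows. The table layout and the function names (`init_db`, `register_upload`, `lookup_upload`) are assumptions for illustration; a real application would use its existing database layer.

```python
import sqlite3
from uuid import uuid4


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uploads ("
        " stored_name TEXT PRIMARY KEY,"
        " original_name TEXT NOT NULL,"
        " owner_id INTEGER NOT NULL)"
    )


def register_upload(conn, original_name: str, owner_id: int, ext: str) -> str:
    """Create a random on-disk name and remember the user's name as metadata.

    `ext` must come from server-side type detection, never from the upload.
    """
    stored_name = f"{uuid4().hex}.{ext}"
    conn.execute(
        "INSERT INTO uploads (stored_name, original_name, owner_id) VALUES (?, ?, ?)",
        (stored_name, original_name, owner_id),
    )
    conn.commit()
    return stored_name


def lookup_upload(conn, stored_name: str):
    """Return (original_name, owner_id) or None if the name is unknown."""
    return conn.execute(
        "SELECT original_name, owner_id FROM uploads WHERE stored_name = ?",
        (stored_name,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
init_db(conn)
name = register_upload(conn, "../../evil.php.jpg", owner_id=7, ext="jpg")
print(name, lookup_upload(conn, name))
```

The user-supplied name is stored only as data and never becomes part of a filesystem path.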

Nginx Configuration

What this does: Configures Nginx to serve uploaded files safely by preventing script execution and forcing downloads. The internal directive creates a secure pattern where files are only accessible via your application (X-Accel-Redirect), never directly.

nginx
# Nginx - Disable execution in upload directory
location /uploads {
    # Serve files as downloads, not executable
    add_header Content-Disposition "attachment";
    
    # Disable script execution
    location ~ \.php$ {
        deny all;
    }
    
    # Set safe Content-Type
    default_type application/octet-stream;
    
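    # Serve HTML/SHTML as plain text. add_header cannot override Content-Type,
    # so clear the extension map and set default_type instead.
    location ~ \.s?html?$ {
        types { }
        default_type text/plain;
    }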
}

# Better: Serve through application controller
location /files {
    internal;  # Only accessible via X-Accel-Redirect
    alias /var/uploads/;
}
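
The X-Accel-Redirect pattern might look like the following on the application side, sketched as a plain WSGI callable so it needs no framework. The public path `/download/`, the `is_allowed` callback, and the 32-hex-character name format are assumptions for this example. The internal prefix `/files/` matches the internal location in the Nginx configuration.

```python
import re

STORED_NAME_RE = re.compile(r"[0-9a-f]{32}\.[a-z0-9]{1,5}")


def make_download_app(is_allowed, internal_prefix="/files/"):
    """Build a WSGI app that authorizes a request, then hands off to Nginx.

    is_allowed(environ, stored_name) is a hypothetical access-control hook.
    """
    def app(environ, start_response):
        stored_name = environ.get("PATH_INFO", "").rsplit("/", 1)[-1]

        # Accept only names shaped like the ones the upload code generates
        if not STORED_NAME_RE.fullmatch(stored_name):
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not found"]

        if not is_allowed(environ, stored_name):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]

        # Empty body: Nginx replaces it with the file at the internal location
        start_response("200 OK", [
            ("X-Accel-Redirect", internal_prefix + stored_name),
            ("Content-Type", "application/octet-stream"),
            ("Content-Disposition", "attachment"),
        ])
        return [b""]

    return app


app = make_download_app(lambda environ, name: True)
```

Other security headers, such as X-Content-Type-Options, are best set with add_header inside the internal location, since Nginx generates the final response.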

Safe Content Delivery

When serving user-uploaded files, you must set proper HTTP headers to prevent browsers from misinterpreting the content. This Flask example shows a secure file serving endpoint with access control, safe content types, and security headers.

🔒 Key Security Headers

  • X-Content-Type-Options: nosniff - Prevents browser from guessing content type
  • Content-Security-Policy: default-src 'none' - Blocks all scripts/styles in the response
  • X-Frame-Options: DENY - Prevents clickjacking via iframes
  • Content-Disposition: attachment - Forces download instead of rendering
python
from flask import send_file, abort
import os

SAFE_CONTENT_TYPES = {
    'png': 'image/png',
    'jpg': 'image/jpeg',
    'jpeg': 'image/jpeg', 
    'gif': 'image/gif',
    'pdf': 'application/pdf',
}

@app.route('/files/<file_id>')
def serve_file(file_id):
    # Look up file in database
    file_record = db.get_file(file_id)
    if not file_record:
        abort(404)
    
    # Check user has permission
    if not current_user.can_access(file_record):
        abort(403)
    
    # Get file path
    file_path = os.path.join(UPLOAD_DIR, file_record.stored_name)
    if not os.path.exists(file_path):
        abort(404)
    
    # Determine safe content type
    ext = file_record.stored_name.rsplit('.', 1)[-1].lower()
    content_type = SAFE_CONTENT_TYPES.get(ext, 'application/octet-stream')
    
    response = send_file(
        file_path,
        mimetype=content_type,
        as_attachment=True,  # Always force download instead of inline rendering
        download_name=file_record.original_name  # Requires Flask 2.0+
    )
    
    # Security headers
    response.headers['X-Content-Type-Options'] = 'nosniff'
    response.headers['Content-Security-Policy'] = "default-src 'none'"
    response.headers['X-Frame-Options'] = 'DENY'
    
    return response
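
If some types should render inline (for example, re-encoded images) while everything else downloads, the header choice can be isolated in a small helper. This is a framework-independent sketch of one possible policy. `build_download_headers` and `INLINE_TYPES` are hypothetical names, and the endpoint above forces downloads for all types instead.

```python
SAFE_CONTENT_TYPES = {
    "png": "image/png",
    "jpg": "image/jpeg",
    "jpeg": "image/jpeg",
    "gif": "image/gif",
    "pdf": "application/pdf",
}

# Assumed policy: only re-encoded raster images may render in the browser
INLINE_TYPES = {"image/png", "image/jpeg", "image/gif"}


def build_download_headers(stored_name: str) -> dict:
    """Pick a content type from the allowlist and attach security headers."""
    ext = stored_name.rsplit(".", 1)[-1].lower() if "." in stored_name else ""
    content_type = SAFE_CONTENT_TYPES.get(ext, "application/octet-stream")
    disposition = "inline" if content_type in INLINE_TYPES else "attachment"
    return {
        "Content-Type": content_type,
        "Content-Disposition": disposition,
        "X-Content-Type-Options": "nosniff",
        "Content-Security-Policy": "default-src 'none'",
        "X-Frame-Options": "DENY",
    }


print(build_download_headers("3f2a.pdf")["Content-Disposition"])
```

Unknown extensions fall back to application/octet-stream with a forced download, so a type that slipped past validation is never rendered.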

Malware Scanning

Even with content validation, malicious payloads can slip through. Integrate antivirus scanning as an additional defense layer. This example demonstrates both local (ClamAV) and cloud-based (VirusTotal) scanning approaches.

🖥️ ClamAV (Local)

  • ✓ Free and open source
  • ✓ No API rate limits
  • ✓ Fast (no network latency)
  • ✓ Privacy (files stay local)
  • ⚠️ Requires local installation
  • ⚠️ Signature updates needed

☁️ VirusTotal (Cloud)

  • ✓ 70+ antivirus engines
  • ✓ No local setup required
  • ✓ Constantly updated
  • ⚠️ Free tier rate limits
  • ⚠️ Privacy concerns (files uploaded)
  • ⚠️ Network dependency
Pro Tip

Use a hash-first approach with VirusTotal: calculate the SHA-256 hash and check if it's already been scanned. This avoids uploading sensitive files while still benefiting from crowd-sourced threat intelligence.
python
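import hashlib
import logging
import os

import clamd

logger = logging.getLogger(__name__)

def scan_file_for_malware(file_path: str) -> bool:
    """Scan file with ClamAV. Returns True only if the scan reports it clean."""
    try:
        cd = clamd.ClamdUnixSocket()
        result = cd.scan(file_path)
    except clamd.ConnectionError:
        logger.error("ClamAV not available - rejecting upload for safety")
        return False

    # clamd returns {path: (status, detail)}; status is 'OK', 'FOUND' or 'ERROR'
    status = (result or {}).get(file_path)
    if status and status[0] == 'OK':
        return True  # No threats found

    if status and status[0] == 'FOUND':
        # Log the detection
        logger.warning(f"Malware detected in {file_path}: {status[1]}")
        # Delete the file
        os.remove(file_path)
    else:
        # 'ERROR' (e.g. clamd cannot read the file) or no result: fail closed
        logger.error(f"ClamAV could not scan {file_path}: {status}")
    return False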

# Alternative: VirusTotal API for cloud scanning
import requests

def scan_with_virustotal(file_path: str, api_key: str) -> bool:
    """Scan file with VirusTotal"""
    # Calculate file hash
    with open(file_path, 'rb') as f:
        file_hash = hashlib.sha256(f.read()).hexdigest()
    
    # Check if already scanned
    response = requests.get(
        f'https://www.virustotal.com/api/v3/files/{file_hash}',
        headers={'x-apikey': api_key},
        timeout=10
    )
    
    if response.status_code == 200:
        data = response.json()
        stats = data['data']['attributes']['last_analysis_stats']
        if stats['malicious'] > 0 or stats['suspicious'] > 0:
            return False
        return True
    
    if response.status_code != 404:
        # Rate limit, auth failure, etc.: no verdict available, so fail closed
        return False
    
    # If not found, upload for scanning (implement as needed)
    return True

Framework-Specific Examples

Here are example implementations of secure file upload handling for popular web frameworks. Each example includes an allowlist check, size limits, magic byte validation, and randomized storage names.

Express.js (Node.js)

This Express.js example uses Multer for multipart form handling with custom storage configuration. The file-type library validates magic bytes after upload to prevent extension spoofing. Files are stored outside the webroot with randomized names.

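Dependencies: npm install multer file-type@16 (crypto is built into Node.js and needs no install; file-type 17 and later are ESM-only, so require() needs the 16.x line)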
javascript
const fs = require('fs');
const multer = require('multer');
const path = require('path');
const crypto = require('crypto');
const fileType = require('file-type');

// Configure multer
const storage = multer.diskStorage({
  destination: '/var/uploads',  // Outside webroot
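  filename: (req, file, cb) => {
    // Generate random filename. Derive the extension from the MIME type that
    // fileFilter already allowlisted, never from the user-supplied name.
    const extByMime = { 'image/jpeg': '.jpg', 'image/png': '.png', 'image/gif': '.gif' };
    const name = crypto.randomBytes(16).toString('hex') + extByMime[file.mimetype];
    cb(null, name);
  }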
});

const fileFilter = (req, file, cb) => {
  const allowedMimes = ['image/jpeg', 'image/png', 'image/gif'];
  if (allowedMimes.includes(file.mimetype)) {
    cb(null, true);
  } else {
    cb(new Error('Invalid file type'), false);
  }
};

const upload = multer({
  storage,
  fileFilter,
  limits: {
    fileSize: 5 * 1024 * 1024  // 5MB
  }
});

// Validate magic bytes after upload
const validateMagicBytes = async (filePath) => {
  const type = await fileType.fromFile(filePath);
  const allowedTypes = ['image/jpeg', 'image/png', 'image/gif'];
  return type && allowedTypes.includes(type.mime);
};

app.post('/upload', upload.single('file'), async (req, res) => {
  if (!req.file) {
    return res.status(400).json({ error: 'No file uploaded' });
  }
  
  // Validate magic bytes
  const isValid = await validateMagicBytes(req.file.path);
  if (!isValid) {
    fs.unlinkSync(req.file.path);  // Delete invalid file
    return res.status(400).json({ error: 'Invalid file content' });
  }
  
  res.json({ filename: req.file.filename });
});

ASP.NET Core

This C#/.NET Core service class validates extensions against an allowlist, checks file signatures (magic bytes) using direct byte comparison, and generates GUID-based filenames to prevent overwrites and path traversal attacks.

📘 How Magic Byte Validation Works

The _fileSignatures dictionary maps extensions to their expected file headers. For example, PNG files always start with 0x89 0x50 0x4E 0x47 (the byte 0x89 followed by the ASCII letters PNG). The code reads the first N bytes of the uploaded file and compares them with SequenceEqual().

csharp
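public class FileUploadService
{
    // Assumed location; inject it from configuration in a real application
    private readonly string _uploadDirectory = "/var/uploads";  // Outside webroot
    private readonly string[] _allowedExtensions = { ".jpg", ".jpeg", ".png", ".gif", ".pdf" };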
    private readonly Dictionary<string, byte[]> _fileSignatures = new()
    {
        { ".jpg", new byte[] { 0xFF, 0xD8, 0xFF } },
        { ".jpeg", new byte[] { 0xFF, 0xD8, 0xFF } },
        { ".png", new byte[] { 0x89, 0x50, 0x4E, 0x47 } },
        { ".gif", new byte[] { 0x47, 0x49, 0x46, 0x38 } },
        { ".pdf", new byte[] { 0x25, 0x50, 0x44, 0x46 } }
    };
    
    public async Task<string> UploadFileAsync(IFormFile file)
    {
        // Check extension
        var ext = Path.GetExtension(file.FileName).ToLowerInvariant();
        if (!_allowedExtensions.Contains(ext))
            throw new InvalidOperationException("Invalid file extension");
        
        // Check size
        if (file.Length > 5 * 1024 * 1024)
            throw new InvalidOperationException("File too large");
        
        // Validate magic bytes
        using var reader = new BinaryReader(file.OpenReadStream());
        var headerBytes = reader.ReadBytes(_fileSignatures[ext].Length);
        
        if (!headerBytes.SequenceEqual(_fileSignatures[ext]))
            throw new InvalidOperationException("File signature mismatch");
        
        // Generate safe filename
        var safeFileName = $"{Guid.NewGuid()}{ext}";
        var uploadPath = Path.Combine(_uploadDirectory, safeFileName);
        
        // Save file
        using var stream = new FileStream(uploadPath, FileMode.Create);
        await file.CopyToAsync(stream);
        
        return safeFileName;
    }
}
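
For readers following the Python examples, the same signature comparison can be sketched with the standard library alone. The table mirrors the byte sequences in the C# dictionary, and `signature_matches` is a name chosen for this example.

```python
FILE_SIGNATURES = {
    "jpg": b"\xff\xd8\xff",
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG",
    "gif": b"GIF8",
    "pdf": b"%PDF",
}


def signature_matches(ext: str, header: bytes) -> bool:
    """Return True if header starts with the signature registered for ext."""
    signature = FILE_SIGNATURES.get(ext.lower().lstrip("."))
    if signature is None:
        return False  # Unknown extension: reject rather than guess
    return header.startswith(signature)


print(signature_matches(".png", b"\x89PNG\r\n\x1a\n"))
print(signature_matches(".jpg", b"<?php echo 1; ?>"))
```

A prefix match like this is only one layer. It confirms how the file starts, not that the rest of the file is harmless.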

🧪 Testing Verification

File Upload Test Cases

text
# Extension bypass tests - all should be REJECTED
shell.php
shell.php.jpg
shell.php%00.jpg
shell.pHp
shell.php5
shell.phtml
.htaccess

# Content mismatch - should be REJECTED
# File with .jpg extension but PHP source as content (e.g. starting with <?php)

# Path traversal - should be REJECTED
../../../etc/passwd
....//....//etc/passwd

# Large file - should be REJECTED
# File exceeding size limit

Web Shell Detection

bash
# Check if uploaded files are accessible
curl https://site.com/uploads/test.jpg
# Should work for legitimate images

# Check if PHP executes in upload directory
curl https://site.com/uploads/test.php
# Should return 403/404 or raw PHP code (NOT execute)

# Check Content-Type header
curl -I https://site.com/uploads/test.jpg
# Should have X-Content-Type-Options: nosniff

⚠️ Common Mistakes

❌ Trusting Client Headers

python
# DON'T: Content-Type is user-controlled
if file.content_type == 'image/jpeg':
    save(file)  # Attacker sets any type

Headers are trivially spoofed

✅ Correct Approach

python
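# DO: Check actual file bytes
import magic
# mime=True returns a MIME type such as 'image/jpeg' rather than a description
mime = magic.from_buffer(file.read(2048), mime=True)
file.seek(0)  # Rewind so save() writes the whole file
if mime in {'image/jpeg', 'image/png', 'image/gif'}:
    save(file)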

Verify file type from actual content

❌ Storing in Webroot

python
# DON'T: Direct web access
save(f'/var/www/html/uploads/{filename}')
# Attacker can access at /uploads/shell.php

Uploaded files directly executable

✅ Correct Approach

python
# DO: Store outside webroot, serve via app
save(f'/data/uploads/{uuid4()}.dat')
# Serve through application with proper checks

No direct execution possible

Verification Checklist

Use this checklist during code reviews and security assessments to verify file upload implementations are properly secured. All items should be checked for a production-ready system.

  • โ˜ File extension validated against allowlist
  • โ˜ Magic bytes (file signature) validated
  • โ˜ Content-Type header NOT trusted (validate server-side)
  • โ˜ File size limits enforced
  • โ˜ Filenames sanitized/randomized
  • โ˜ Files stored outside webroot
  • โ˜ No script execution in upload directory
  • โ˜ X-Content-Type-Options: nosniff header set
  • โ˜ Path traversal prevention (../ in filenames)
  • โ˜ Images reprocessed to strip metadata/payloads
  • โ˜ Antivirus scanning implemented
  • โ˜ Upload rate limiting in place