Email attachments are so commonplace today that we barely notice them. We click "Attach," select a file, and expect it to arrive intact at the destination—whether that's a 15 KB invoice PDF or a 25 MB presentation deck. But this seamless experience masks considerable engineering complexity.
An email attachment involves every MIME concept we've studied working in concert: the file is identified with a Content-Type, encoded with Base64 for safe transport, wrapped in multipart structure with the message body, and labeled with Content-Disposition for proper handling at the destination.
This page explores the complete attachment lifecycle—from user selection through encoding, transmission, reception, security scanning, and extraction. Understanding this flow is essential for anyone building email systems, debugging delivery problems, or implementing file-handling features in applications.
By the end of this page, you will understand how email clients construct attachments, the complete MIME structure of a message with attachments, size limits and how they're enforced, security implications and scanning, inline vs attached content differences, and troubleshooting attachment problems.
An email attachment is simply a MIME part with specific characteristics. Let's dissect exactly how a file becomes an attachment.
Required Components
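Every attachment consists of four elements:
- A Content-Type header identifying the file format
- A Content-Transfer-Encoding header naming the encoding (normally base64)
- A Content-Disposition header marking the part as an attachment and suggesting a filename
- The encoded file content as the part body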
Optional but Common
```
# A complete attachment part within a multipart message

--boundary_string_abc123
Content-Type: application/pdf; name="quarterly-report-Q1-2024.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
  filename="quarterly-report-Q1-2024.pdf";
  size=2458976;
  creation-date="Fri, 15 Mar 2024 09:30:00 GMT";
  modification-date="Mon, 18 Mar 2024 14:22:00 GMT"
Content-Description: Q1 2024 Financial Report

JVBERi0xLjcKJeLjz9MKMSAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwovUGFnZXMgMiAwIFIKPj4K
ZW5kb2JqCjUgMCBvYmoKPDwKL0xlbmd0aCA0NAovRmlsdGVyIC9GbGF0ZURlY29kZQo+PgpzdHJl
YW0KeJwzNzJQMDJWMDY0MLe0VEhJ...
(thousands more lines of Base64)
...JSVFBRgo=

--boundary_string_abc123
(next part or closing delimiter)
```

Breaking Down Each Header
Content-Type: application/pdf; name="quarterly-report-Q1-2024.pdf"
The MIME type identifies the file format. The name parameter is a legacy way to suggest a filename (Content-Disposition is preferred).
Content-Transfer-Encoding: base64
Indicates the file content is Base64-encoded. Email clients must decode before saving.
Content-Disposition: attachment; filename="..."
The attachment disposition tells clients to offer downloading rather than displaying inline. Parameters include:
- filename (primary filename suggestion)
- filename* (UTF-8 encoded filename for international characters)
- size (file size in bytes, informational)
- creation-date / modification-date (file timestamps)

The Body
After the blank line following headers, the entire file content appears as Base64 text. For a 2.5 MB PDF, this is approximately 3.3 MB of Base64 text, wrapped at 76 characters per line.
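The encoding step described above can be sketched in a few lines. This is a minimal illustration, assuming Node.js; the helper name `encodeBase64Body` is hypothetical and not part of any mail library.

```typescript
import { Buffer } from 'node:buffer';

// Encode raw bytes as Base64 and wrap the result at 76 characters per line,
// joining lines with CRLF as MIME bodies conventionally do.
function encodeBase64Body(data: Uint8Array, lineLength = 76): string {
  const b64 = Buffer.from(data).toString('base64');
  const lines: string[] = [];
  for (let i = 0; i < b64.length; i += lineLength) {
    lines.push(b64.slice(i, i + lineLength));
  }
  return lines.join('\r\n');
}

// Usage: 100 bytes become 136 Base64 characters, i.e. one full line plus 60 more.
const wrappedBody = encodeBase64Body(new Uint8Array(100));
console.log(wrappedBody.split('\r\n').map((line) => line.length)); // [ 76, 60 ]
```

Every 3 input bytes become 4 output characters, which is where the roughly one-third size increase comes from.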
You'll often see the filename specified in both Content-Type (name=) and Content-Disposition (filename=). Content-Disposition, defined in RFC 2183, is the preferred source and modern clients read it first, but legacy clients may check only Content-Type. For maximum compatibility, include both with identical values.
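As a rough sketch of how a sender might emit these headers, the following builds the three header lines for one attachment and adds a `filename*` parameter only when the name contains non-ASCII characters. The function names are hypothetical, the ASCII fallback policy (replacing unsupported characters with `_`) is an assumption, and header folding for long lines is left out.

```typescript
// Percent-encode a UTF-8 value for the extended parameter form (filename*).
// encodeURIComponent leaves ' ( ) * unescaped, so those are escaped by hand.
function encodeExtendedValue(value: string): string {
  return encodeURIComponent(value).replace(
    /['()*]/g,
    (c) => '%' + c.charCodeAt(0).toString(16).toUpperCase(),
  );
}

// Hypothetical helper: returns the header lines for one attachment part.
function buildAttachmentHeaders(filename: string, contentType: string): string[] {
  // ASCII-only fallback for clients that read just the plain parameters.
  const ascii = filename.replace(/[^\x20-\x7e]/g, '_').replace(/["\\]/g, '_');
  let disposition = `Content-Disposition: attachment; filename="${ascii}"`;
  if (ascii !== filename) {
    disposition += `; filename*=UTF-8''${encodeExtendedValue(filename)}`;
  }
  return [
    `Content-Type: ${contentType}; name="${ascii}"`,
    'Content-Transfer-Encoding: base64',
    disposition,
  ];
}

// Usage with a non-ASCII name (written with escapes to keep the source ASCII).
console.log(buildAttachmentHeaders('r\u00e9sum\u00e9.pdf', 'application/pdf').join('\r\n'));
```

The same fallback value is used for both `name=` and `filename=`, so the two plain parameters carry identical values.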
When you send an email with text, HTML formatting, and attachments, the resulting MIME structure can be surprisingly deep. Let's examine a realistic complete message.
The Typical Structure
A formatted email with inline images and attachments uses nested multipart types:
multipart/mixed (wraps everything)
├── multipart/alternative (body formats)
│ ├── text/plain (plain text fallback)
│ └── multipart/related (HTML + inline images)
│ ├── text/html (formatted body)
│ ├── image/png (inline logo)
│ └── image/jpeg (inline photo)
├── application/pdf (attachment 1)
└── application/vnd.ms-excel (attachment 2)
This structure allows:
```
From: sales@company.com
To: customer@example.com
Subject: Your Invoice and Statement
Date: Wed, 20 Mar 2024 10:15:30 -0700
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=====MIXED_BOUNDARY====="

--=====MIXED_BOUNDARY=====
Content-Type: multipart/alternative; boundary="=====ALT_BOUNDARY====="

--=====ALT_BOUNDARY=====
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Dear Customer,

Please find your invoice and monthly statement attached.

Invoice: #INV-2024-0342
Amount: $1,234.56
Due Date: April 19, 2024

Payment can be made via the link in the HTML version of this email,
or by check mailed to our address.

Best regards,
Accounts Receivable

--=====ALT_BOUNDARY=====
Content-Type: multipart/related; boundary="=====RELATED_BOUNDARY====="; type="text/html"

--=====RELATED_BOUNDARY=====
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE html>
<html>
<head>
  <style>
    body { font-family: 'Segoe UI', Arial, sans-serif; line-height: 1.6; }
    .header { background: #1e40af; color: white; padding: 20px; }
    .invoice-box {
      background: #f8fafc;
      border: 1px solid #e2e8f0;
      padding: 20px;
      margin: 20px 0;
    }
    .amount { font-size: 24px; color: #059669; font-weight: bold; }
    .pay-btn {
      background: #2563eb;
      color: white;
      padding: 12px 24px;
      text-decoration: none;
      border-radius: 6px;
    }
  </style>
</head>
<body>
  <div class=3D"header">
    <img src=3D"cid:logo@company.com" alt=3D"Company Logo" height=3D"40">
  </div>
  <p>Dear Customer,</p>
  <div class=3D"invoice-box">
    <p><strong>Invoice:</strong> #INV-2024-0342</p>
    <p class=3D"amount">Amount Due: $1,234.56</p>
    <p><strong>Due Date:</strong> April 19, 2024</p>
  </div>
  <p><a href=3D"https://pay.company.com/inv-0342" class=3D"pay-btn">
    Pay Now</a></p>
  <p>Please find your invoice and statement attached.</p>
</body>
</html>

--=====RELATED_BOUNDARY=====
Content-Type: image/png
Content-Transfer-Encoding: base64
Content-ID: <logo@company.com>
Content-Disposition: inline; filename="logo.png"

iVBORw0KGgoAAAANSUhEUgAAAGQAAAAoCAYAAAAIeF9DAAAAlUlEQVR42u3QMQEAAAgDoGn/zmYHDwQIdhIJCREJCREJCREJCREJCREJCREJCREJCREJCREJCREJCREJCREJCREJCRERkZAQCQkRCQkRCQkRCQkRCQkRCQkRCQkRCQkRCQkR...

--=====RELATED_BOUNDARY=====--

--=====ALT_BOUNDARY=====--

--=====MIXED_BOUNDARY=====
Content-Type: application/pdf; name="Invoice-2024-0342.pdf"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="Invoice-2024-0342.pdf"; size=245632

JVBERi0xLjcKJeLjz9MKMSAwIG9iago8PAovVHlwZSAvQ2F0YWxvZwovUGFnZXMgMiAwIFIKPj4K
ZW5kb2JqCjUgMCBvYmoKPDwKL0xlbmd0aCA0NAovRmls...
(Base64 PDF content - ~330 KB of text for a 245 KB PDF)
...JSVFT0YK

--=====MIXED_BOUNDARY=====
Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;
  name="Statement-March-2024.xlsx"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="Statement-March-2024.xlsx"; size=89744

UEsDBBQAAAAIAGCKOU4NcwzNfQEAAOcCAAALABwAX3JlbHMvLnJlbHNVVAkAA5H+6VyR/ulcdXgL
AAEE6AMAAAToAwAAzZLBTsMwDIbv...
...(Base64 Excel content)...
AABQSwECLQAUAAAACAAA

--=====MIXED_BOUNDARY=====--
```

Structure Breakdown
This email demonstrates the full MIME hierarchy:
multipart/mixed (outer container)
multipart/alternative (first part of mixed)
multipart/related (within alternative)
Attachment parts (subsequent parts of mixed)
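To make the hierarchy concrete, here is a small sketch that models the message above as a tree and walks it to find the attachment parts. The `MimeNode` shape and the `collectAttachments` function are illustrative assumptions, not the types of any real parser.

```typescript
// Illustrative model of a parsed MIME tree; not the type of any real library.
interface MimeNode {
  contentType: string;
  disposition?: 'inline' | 'attachment';
  filename?: string;
  children?: MimeNode[];
}

// Depth-first walk: multipart nodes are containers, leaves may be attachments.
function collectAttachments(node: MimeNode, found: MimeNode[] = []): MimeNode[] {
  if (node.children) {
    for (const child of node.children) collectAttachments(child, found);
  } else if (node.disposition === 'attachment') {
    found.push(node);
  }
  return found;
}

// The invoice message from the example, reduced to its structure.
const invoiceMessage: MimeNode = {
  contentType: 'multipart/mixed',
  children: [
    {
      contentType: 'multipart/alternative',
      children: [
        { contentType: 'text/plain' },
        {
          contentType: 'multipart/related',
          children: [
            { contentType: 'text/html' },
            { contentType: 'image/png', disposition: 'inline', filename: 'logo.png' },
          ],
        },
      ],
    },
    { contentType: 'application/pdf', disposition: 'attachment', filename: 'Invoice-2024-0342.pdf' },
    {
      contentType: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
      disposition: 'attachment',
      filename: 'Statement-March-2024.xlsx',
    },
  ],
};

console.log(collectAttachments(invoiceMessage).map((n) => n.filename));
// [ 'Invoice-2024-0342.pdf', 'Statement-March-2024.xlsx' ]
```

The inline logo is skipped because it belongs to the HTML body rather than to the list of downloadable files.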
The distinction between inline and attached content is subtle but important. It determines whether content is displayed within the message body or offered for download.
Content-Disposition Values
Content-Disposition: inline → Display within message
Content-Disposition: attachment → Offer for download
When to Use Each
| Disposition | Use Case | Example |
|---|---|---|
| inline | Images shown in HTML body | Company logo, embedded photo |
| inline | Simple images that ARE the message | Single image email |
| attachment | Files user should save | PDFs, documents, archives |
| attachment | Unknown/dangerous file types | Executables, scripts |
| attachment | Large media files | Videos, large images |
| (neither) | Legacy compatibility | Some clients ignore header |
```
# INLINE: Image displayed within HTML body
Content-Type: image/jpeg
Content-Transfer-Encoding: base64
Content-ID: <product-photo@newsletter.example.com>
Content-Disposition: inline; filename="product.jpg"

/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAUDBA...

# In HTML:
<img src="cid:product-photo@newsletter.example.com" alt="New Product">

---

# ATTACHMENT: File offered for download
Content-Type: image/jpeg
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="product-hi-res.jpg"

/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAUDBA...

# Same image data, but client shows download prompt instead of displaying

---

# Content-ID enables inline referencing
# The cid: URL scheme links HTML to MIME parts

HTML:   <img src="cid:unique-id@domain">
                 ↓ matches ↓
Header: Content-ID: <unique-id@domain>
```

Different email clients interpret disposition differently. Some clients display inline images automatically; others require user action. Some ignore disposition entirely and decide based on Content-Type. For critical applications, test across multiple clients.
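One possible way for a client to settle on a handling mode is sketched below. The `PartHeaders` shape and the fallback order (Content-ID, then text types) are assumptions chosen for illustration; real clients differ, as noted above. Treating an unrecognized disposition type as an attachment follows the recommendation in RFC 2183.

```typescript
type Handling = 'inline' | 'attachment';

// Hypothetical, already-parsed headers of a single MIME part.
interface PartHeaders {
  contentType: string;         // e.g. "image/png"
  contentDisposition?: string; // raw header value, e.g. 'inline; filename="a.png"'
  contentId?: string;          // raw header value, e.g. "<logo@company.com>"
}

function decideHandling(headers: PartHeaders): Handling {
  const disposition = headers.contentDisposition?.split(';')[0].trim().toLowerCase();
  if (disposition === 'inline') return 'inline';
  // Explicit "attachment" and any unrecognized disposition type are both downloads.
  if (disposition) return 'attachment';

  // No disposition header at all: fall back on other hints (assumed policy).
  if (headers.contentId) return 'inline';
  const type = headers.contentType.toLowerCase();
  if (type.startsWith('text/plain') || type.startsWith('text/html')) return 'inline';
  return 'attachment';
}

console.log(decideHandling({ contentType: 'image/jpeg', contentDisposition: 'inline; filename="product.jpg"' })); // inline
console.log(decideHandling({ contentType: 'image/jpeg', contentDisposition: 'attachment; filename="product-hi-res.jpg"' })); // attachment
console.log(decideHandling({ contentType: 'application/pdf' })); // attachment
```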
The cid: URL Scheme
Content-ID enables referencing MIME parts from HTML. The scheme is:
cid:content-id-value
Where content-id-value matches the Content-ID header (without angle brackets).
Example Flow:
Content-ID: <logo-2024@company.example.com>
<img src="cid:logo-2024@company.example.com">
Why Not Use Regular URLs for Images?
External images (https://example.com/image.jpg) in email have issues:
Inline images via cid: display immediately, work offline, and respect recipient privacy.
Email systems impose various size limits that affect attachments. Understanding where limits are enforced helps diagnose delivery failures.
Where Limits Are Enforced
Attachment limits exist at multiple points:
Common Limit Values
| Provider/System | Max Attachment | Max Message | Notes |
|---|---|---|---|
| Gmail | 25 MB | ~25 MB | Google Drive links suggested for larger |
| Microsoft 365 | 25 MB (web) | 25-150 MB | Outlook desktop may have higher limits |
| Yahoo Mail | 25 MB | ~25 MB | Similar to Gmail |
| Exchange Server | 10-150 MB | Configurable | Admin-defined limits |
| Corporate Gateways | 5-20 MB | Varies | Often more restrictive |
| SMTP Standard | No limit specified | Implementation-defined | RFC 5321 only requires servers to accept at least 64 KB; servers advertise their actual limit via the SIZE extension (RFC 1870) |
The 33% Overhead Problem
Base64 encoding increases attachment size by approximately 33%. A 25 MB file becomes ~33.3 MB after encoding. When limits are advertised as "25 MB," it's often ambiguous whether this means:
- the size of the original file before encoding, or
- the size of the complete encoded message.
Safe practice: assume the limit applies to the final encoded message size, and keep files at or below roughly 70% of the advertised limit, since line breaks and headers add a little on top of the 33% Base64 overhead.
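The arithmetic can be checked with a short calculation. This sketch assumes Base64 lines of 76 characters terminated by CRLF; the `headerAllowance` of 4096 bytes is an arbitrary placeholder for headers and body text, not a measured value.

```typescript
const CRLF_BYTES = 2;

// Size in bytes of a Base64 body wrapped at lineLength characters per line.
function encodedSize(rawBytes: number, lineLength = 76): number {
  const base64Chars = Math.ceil(rawBytes / 3) * 4;
  const lines = Math.ceil(base64Chars / lineLength);
  return base64Chars + lines * CRLF_BYTES;
}

// True if the encoded attachment plus an assumed header allowance fits the limit.
function fitsLimit(rawBytes: number, limitBytes: number, headerAllowance = 4096): boolean {
  return encodedSize(rawBytes) + headerAllowance <= limitBytes;
}

const MB = 1024 * 1024;
console.log((encodedSize(25 * MB) / MB).toFixed(1)); // 34.2
console.log(fitsLimit(25 * MB, 25 * MB));            // false
console.log(fitsLimit(18 * MB, 25 * MB));            // true
```

With line breaks included, the expansion factor is closer to 1.37 than 1.33, which is why a file at 75% of the limit can still be rejected.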
Error Responses for Oversized Messages
When messages are rejected for size:
552 5.3.4 Message size exceeds fixed limit
552 5.2.3 Message exceeds maximum fixed size
552 Requested mail action aborted: exceeded storage allocation
These errors typically result in a bounce message (NDR) to the sender.
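When processing bounces or SMTP logs, replies like these can be recognized programmatically. The sketch below parses a single-line reply and flags likely size rejections. Treating every 552 as size-related is a simplifying assumption, since the same code can also indicate a full mailbox; the function names are hypothetical.

```typescript
interface SmtpReply {
  code: number;      // three-digit reply code, e.g. 552
  enhanced?: string; // enhanced status code, e.g. "5.3.4", if present
  text: string;      // remaining human-readable text
}

// Parse one reply line such as "552 5.3.4 Message size exceeds fixed limit".
function parseSmtpReply(line: string): SmtpReply | null {
  const match = /^(\d{3})[ -](?:(\d\.\d{1,3}\.\d{1,3})\s+)?(.*)$/.exec(line.trim());
  if (!match) return null;
  return { code: Number(match[1]), enhanced: match[2], text: match[3] };
}

// Heuristic: 5.3.4 and 5.2.3 are the size-related enhanced codes shown above.
function isSizeRejection(reply: SmtpReply): boolean {
  return reply.code === 552 || reply.enhanced === '5.3.4' || reply.enhanced === '5.2.3';
}

const reply = parseSmtpReply('552 5.3.4 Message size exceeds fixed limit');
console.log(reply, reply !== null && isSizeRejection(reply));
```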
Some email gateways silently strip large attachments rather than bouncing the message. The recipient receives the email without attachments, possibly without any notification. If an attachment is critical, request confirmation of receipt or use alternative delivery methods for large files.
Alternatives for Large Files
When email size limits are problematic:
Enterprise Solutions
Corporate environments often use:
Email attachments are a primary vector for malware distribution. Understanding security implications is essential for anyone handling email at scale.
Attack Vectors via Attachments
Content-Type Spoofing
Attackers may set misleading Content-Type headers:
# Malicious: executable disguised as PDF
Content-Type: application/pdf; name="invoice.exe"
Content-Disposition: attachment; filename="invoice.pdf.exe"
Naive clients might show a PDF icon (based on Content-Type) while the file is actually an executable. Security scanning must examine actual file content, not just headers.
Security Scanning Approaches
| Extension | Type | Reason for Blocking |
|---|---|---|
| .exe, .com, .scr | Executables | Direct malware execution |
| .bat, .cmd, .ps1 | Scripts | Command/PowerShell execution |
| .vbs, .js, .wsf | Scripts | Windows Script Host execution |
| .msi, .msp | Installers | Software installation |
| .docm, .xlsm, .pptm | Macro-enabled Office | Macro malware |
| .jar, .jnlp | Java | Arbitrary Java code execution |
| .hta, .html | HTML | Script execution, phishing |
| .iso, .img | Disk images | Can contain executables |
| .lnk | Shortcuts | Can execute commands |
Some malware arrives in password-protected ZIP files with the password in the email body. This bypasses scanning (the file can't be examined) while giving the user the means to open it. Users should be extremely suspicious of unsolicited password-protected attachments.
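A scanner can at least detect that a ZIP is encrypted, even if it cannot look inside. The sketch below reads the general-purpose flag word of the first local file header, where bit 0 marks an encrypted entry. It is deliberately minimal: only the first entry is inspected, and the function name is hypothetical.

```typescript
import { Buffer } from 'node:buffer';

const ZIP_LOCAL_HEADER_SIGNATURE = 0x04034b50; // bytes "PK\x03\x04", little-endian
const ZIP_FLAG_ENCRYPTED = 0x0001;             // bit 0 of the general-purpose flags

// Returns true if the first entry of a ZIP archive is marked as encrypted.
function isFirstZipEntryEncrypted(data: Buffer): boolean {
  // Layout: signature (4 bytes), version needed (2 bytes), flags (2 bytes), ...
  if (data.length < 8 || data.readUInt32LE(0) !== ZIP_LOCAL_HEADER_SIGNATURE) {
    return false;
  }
  return (data.readUInt16LE(6) & ZIP_FLAG_ENCRYPTED) !== 0;
}

// Usage with hand-built headers: flags 0x0001 (encrypted) versus 0x0000 (plain).
console.log(isFirstZipEntryEncrypted(Buffer.from([0x50, 0x4b, 0x03, 0x04, 0x14, 0x00, 0x01, 0x00]))); // true
console.log(isFirstZipEntryEncrypted(Buffer.from([0x50, 0x4b, 0x03, 0x04, 0x14, 0x00, 0x00, 0x00]))); // false
```

A gateway might quarantine or flag such archives rather than deliver them unscanned; that policy choice is outside this sketch.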
```typescript
// TypeScript: Basic attachment security validation

interface AttachmentValidation {
  isSafe: boolean;
  warnings: string[];
  errors: string[];
}

const BLOCKED_EXTENSIONS = new Set([
  'exe', 'com', 'scr', 'bat', 'cmd', 'ps1', 'vbs', 'js', 'wsf',
  'msi', 'msp', 'docm', 'xlsm', 'pptm', 'jar', 'jnlp', 'hta',
  'iso', 'img', 'lnk', 'pif', 'application', 'gadget', 'msc',
  'reg', 'scf', 'ws'
]);

const DANGEROUS_CONTENT_TYPES = new Set([
  'application/x-msdownload',
  'application/x-msdos-program',
  'application/x-executable',
  'application/x-sh',
  'application/x-shellscript',
]);

function validateAttachment(
  filename: string,
  contentType: string,
  fileSize: number,
  fileMagicBytes: Buffer
): AttachmentValidation {
  const warnings: string[] = [];
  const errors: string[] = [];

  // Extract extension(s); the base name itself never counts as an extension
  const parts = filename.toLowerCase().split('.');
  const extension = parts.length > 1 ? (parts.pop() ?? '') : '';
  const secondExtension = parts.length > 1 ? (parts.pop() ?? null) : null;

  // Check for blocked extensions
  if (BLOCKED_EXTENSIONS.has(extension)) {
    errors.push(`Blocked file extension: .${extension}`);
  }

  // Check for double extension trick (e.g. invoice.pdf.exe or script.js.txt)
  if (
    secondExtension &&
    (BLOCKED_EXTENSIONS.has(extension) || BLOCKED_EXTENSIONS.has(secondExtension))
  ) {
    errors.push(`Suspicious double extension: .${secondExtension}.${extension}`);
  }

  // Check dangerous content types
  if (DANGEROUS_CONTENT_TYPES.has(contentType.toLowerCase())) {
    errors.push(`Dangerous content type: ${contentType}`);
  }

  // Check content type vs extension mismatch
  if (contentType === 'application/pdf' && extension !== 'pdf') {
    warnings.push(`Content-Type/extension mismatch: ${contentType} vs .${extension}`);
  }

  // Verify magic bytes match claimed type
  const actualType = detectFileType(fileMagicBytes);
  if (actualType && !contentType.includes(actualType)) {
    warnings.push(`Content type mismatch: header says ${contentType}, file appears to be ${actualType}`);
  }

  // Check for ZIP bomb potential (rough heuristic only)
  if ((extension === 'zip' || extension === 'gz') && fileSize < 1000) {
    warnings.push('Very small archive - potential zip bomb');
  }

  // Check for excessively large files
  if (fileSize > 25 * 1024 * 1024) {
    warnings.push(`Large attachment: ${(fileSize / 1024 / 1024).toFixed(1)} MB`);
  }

  return {
    isSafe: errors.length === 0,
    warnings,
    errors
  };
}

function detectFileType(magic: Buffer): string | null {
  // Check magic bytes (file signatures)
  if (magic[0] === 0xFF && magic[1] === 0xD8) return 'image/jpeg';
  if (magic[0] === 0x89 && magic[1] === 0x50) return 'image/png';
  if (magic[0] === 0x25 && magic[1] === 0x50) return 'application/pdf';
  if (magic[0] === 0x50 && magic[1] === 0x4B) return 'application/zip';
  if (magic[0] === 0x4D && magic[1] === 0x5A) return 'application/x-msdownload'; // EXE
  // ... more magic byte checks
  return null;
}

// Usage
const result = validateAttachment(
  'invoice.pdf.exe',
  'application/pdf',
  45000,
  Buffer.from([0x4D, 0x5A, 0x90, 0x00]) // MZ header = executable
);

console.log(result);
// {
//   isSafe: false,
//   warnings: [
//     'Content-Type/extension mismatch: application/pdf vs .exe',
//     'Content type mismatch: header says application/pdf, file appears to be application/x-msdownload'
//   ],
//   errors: ['Blocked file extension: .exe', 'Suspicious double extension: .pdf.exe']
// }
```

Whether building an email client, processing incoming messages, or migrating data, extracting attachments is a common task. Here's the complete process.
Extraction Algorithm
```typescript
// TypeScript: Complete attachment extraction

import { simpleParser, ParsedMail } from 'mailparser';
import * as fs from 'fs/promises';
import * as path from 'path';

interface ExtractedAttachment {
  filename: string;
  originalFilename: string;
  contentType: string;
  size: number;
  content: Buffer;
  contentId?: string;
  isInline: boolean;
}

async function extractAttachments(emailSource: Buffer | string): Promise<ExtractedAttachment[]> {
  const parsed: ParsedMail = await simpleParser(emailSource);
  const extracted: ExtractedAttachment[] = [];

  if (!parsed.attachments) {
    return extracted;
  }

  for (const attachment of parsed.attachments) {
    // Sanitize filename
    const sanitizedFilename = sanitizeFilename(
      attachment.filename || `attachment-${extracted.length + 1}`
    );

    extracted.push({
      filename: sanitizedFilename,
      originalFilename: attachment.filename ?? 'unknown',
      contentType: attachment.contentType,
      size: attachment.size,
      content: attachment.content,
      contentId: attachment.contentId,
      isInline: attachment.contentDisposition === 'inline'
    });
  }

  return extracted;
}

function sanitizeFilename(filename: string): string {
  // Remove directory traversal
  let sanitized = path.basename(filename);

  // Remove null bytes and control characters
  sanitized = sanitized.replace(/[\x00-\x1f\x7f]/g, '');

  // Replace dangerous characters
  sanitized = sanitized.replace(/[<>:"/\\|?*]/g, '_');

  // Handle Windows reserved names
  const reserved = /^(con|prn|aux|nul|com[1-9]|lpt[1-9])(\..*)?$/i;
  if (reserved.test(sanitized)) {
    sanitized = '_' + sanitized;
  }

  // Limit length
  if (sanitized.length > 200) {
    const ext = path.extname(sanitized);
    const base = path.basename(sanitized, ext);
    sanitized = base.substring(0, 200 - ext.length) + ext;
  }

  // Ensure not empty
  if (!sanitized || sanitized === '.' || sanitized === '..') {
    sanitized = 'attachment';
  }

  return sanitized;
}

async function saveAttachments(
  emailSource: Buffer,
  outputDir: string
): Promise<string[]> {
  const attachments = await extractAttachments(emailSource);
  const savedPaths: string[] = [];

  await fs.mkdir(outputDir, { recursive: true });

  for (const attachment of attachments) {
    // Handle duplicate filenames
    let finalPath = path.join(outputDir, attachment.filename);
    let counter = 1;

    while (await fileExists(finalPath)) {
      const ext = path.extname(attachment.filename);
      const base = path.basename(attachment.filename, ext);
      finalPath = path.join(outputDir, `${base}-${counter}${ext}`);
      counter++;
    }

    await fs.writeFile(finalPath, attachment.content);
    savedPaths.push(finalPath);

    console.log(`Saved: ${finalPath} (${formatBytes(attachment.size)})`);
  }

  return savedPaths;
}

async function fileExists(filepath: string): Promise<boolean> {
  try {
    await fs.access(filepath);
    return true;
  } catch {
    return false;
  }
}

function formatBytes(bytes: number): string {
  if (bytes < 1024) return bytes + ' B';
  if (bytes < 1024 * 1024) return (bytes / 1024).toFixed(1) + ' KB';
  return (bytes / 1024 / 1024).toFixed(1) + ' MB';
}

// Usage (top-level await requires an ES module)
const emailBuffer = await fs.readFile('message.eml');
const paths = await saveAttachments(emailBuffer, './extracted-attachments');
console.log(`Extracted ${paths.length} attachments`);
```

Attachment filenames come from untrusted sources. Always sanitize before using for filesystem operations. Path traversal attacks (../../etc/passwd), reserved names (CON, NUL on Windows), and encoding tricks can all cause security issues or crashes.
Handling Inline Content
Inline attachments (Content-Disposition: inline with Content-ID) are referenced by HTML body content. To display emails correctly:
<!-- Original (requires cid: resolution) -->
<img src="cid:logo@company.com" alt="Logo">
<!-- After processing (data URI) -->
<img src="..." alt="Logo">
<!-- Or (local path) -->
<img src="/attachments/msg-123/logo.png" alt="Logo">
For web-based email clients, using data URIs avoids additional HTTP requests but increases HTML size. For native clients, saving to temp files works better for large images.
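The data URI approach can be sketched as a simple rewrite of `src` attributes. This is a minimal illustration using a regular expression rather than an HTML parser, which is an assumption that only holds for simple markup; the `InlinePart` shape and function names are hypothetical.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical shape of an already-decoded inline part.
interface InlinePart {
  contentId: string;   // raw Content-ID header value, e.g. "<logo@company.com>"
  contentType: string; // e.g. "image/png"
  content: Buffer;     // decoded bytes
}

// cid: URLs may be percent-encoded; fall back to the raw value if malformed.
function safeDecode(value: string): string {
  try {
    return decodeURIComponent(value);
  } catch {
    return value;
  }
}

// Replace src="cid:..." references with data URIs built from matching parts.
function inlineCidImages(html: string, parts: InlinePart[]): string {
  const byId = new Map<string, InlinePart>();
  for (const part of parts) {
    // Strip the angle brackets so the key matches the cid: URL value.
    byId.set(part.contentId.trim().replace(/^<|>$/g, ''), part);
  }
  return html.replace(/\bsrc=(["'])cid:([^"']+)\1/gi, (match, quote: string, id: string) => {
    const part = byId.get(safeDecode(id));
    if (!part) return match; // leave unresolved references untouched
    const dataUri = `data:${part.contentType};base64,${part.content.toString('base64')}`;
    return `src=${quote}${dataUri}${quote}`;
  });
}

// Usage with a three-byte stand-in for real image data.
const rewritten = inlineCidImages('<img src="cid:logo@company.com" alt="Logo">', [
  { contentId: '<logo@company.com>', contentType: 'image/png', content: Buffer.from([1, 2, 3]) },
]);
console.log(rewritten); // <img src="data:image/png;base64,AQID" alt="Logo">
```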
When attachments fail, diagnosing the problem requires understanding the complete transmission path. Here are common issues and their solutions.
Symptom: Attachment Missing Entirely
Possible causes:
Diagnosis:
Symptom: Attachment Corrupted
Possible causes:
Diagnosis:
Symptom: "winmail.dat" Instead of Attachments
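Microsoft Outlook can wrap attachments in TNEF (Transport Neutral Encapsulation Format), a proprietary format. Clients that do not understand TNEF typically show a single winmail.dat file in place of the original attachments.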
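A receiving system can detect such parts before presenting them to the user. The sketch below checks the declared content type, the filename, and the TNEF signature at the start of the decoded content. It only detects TNEF and does not unpack it; the function name is hypothetical.

```typescript
import { Buffer } from 'node:buffer';

// TNEF streams begin with this 32-bit signature, stored little-endian.
const TNEF_SIGNATURE = 0x223e9f78;

function looksLikeTnef(
  filename: string | undefined,
  contentType: string,
  content: Buffer,
): boolean {
  // Ignore parameters such as name="winmail.dat" after the media type.
  const type = contentType.split(';')[0].trim().toLowerCase();
  if (type === 'application/ms-tnef' || type === 'application/vnd.ms-tnef') return true;
  if (filename?.toLowerCase() === 'winmail.dat') return true;
  return content.length >= 4 && content.readUInt32LE(0) === TNEF_SIGNATURE;
}

// Usage: a generic content type, but the bytes carry the TNEF signature.
console.log(
  looksLikeTnef('data.bin', 'application/octet-stream', Buffer.from([0x78, 0x9f, 0x3e, 0x22])),
); // true
```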
```bash
#!/bin/bash
# Diagnostic script for email attachment issues
# Requires GNU grep (uses -P for Perl-compatible patterns)

EMAIL_FILE="$1"
if [ -z "$EMAIL_FILE" ] || [ ! -r "$EMAIL_FILE" ]; then
  echo "Usage: $0 <message.eml>" >&2
  exit 1
fi

echo "=== Attachment Diagnostics ==="
echo ""

# Check for multipart structure
echo "1. Multipart Structure:"
if grep -qi "multipart/" "$EMAIL_FILE"; then
  grep -i "content-type:.*multipart" "$EMAIL_FILE"
  echo "   Boundaries declared:"
  grep -oiP 'boundary="?[^";\s]+' "$EMAIL_FILE"
else
  echo "   NOT multipart - unlikely to have attachments"
fi
echo ""

# Count parts (grep -c already prints 0 when nothing matches)
echo "2. MIME Parts (approximate):"
PART_COUNT=$(grep -c "^--" "$EMAIL_FILE" 2>/dev/null)
echo "   Found ~${PART_COUNT:-0} boundary markers"
echo ""

# Check for attachment disposition
echo "3. Attachment Dispositions:"
grep -i "content-disposition:.*attachment" "$EMAIL_FILE" || echo "   No 'attachment' disposition found"
echo ""

# Check for Content-ID (inline)
echo "4. Inline Content (Content-ID):"
grep -i "content-id:" "$EMAIL_FILE" || echo "   No Content-ID headers found"
echo ""

# List Content-Types
echo "5. Content-Types Found:"
grep -i "^content-type:" "$EMAIL_FILE" | head -20
echo ""

# Check for Base64 (header names are case-insensitive, so match with -i)
echo "6. Base64 Encoding:"
if grep -qi "content-transfer-encoding:.*base64" "$EMAIL_FILE"; then
  echo "   Base64 encoding found"
  # Count lines that look like Base64 data (allowing a trailing CR)
  B64_LINES=$(grep -cP '^[A-Za-z0-9+/=]{20,}\r?$' "$EMAIL_FILE")
  echo "   Lines of Base64 data: ~$B64_LINES"
else
  echo "   No Base64 encoding found"
fi
echo ""

# Check filenames
echo "7. Filenames Found:"
grep -oiP 'filename="?[^";\s]+' "$EMAIL_FILE" | sed 's/filename=/   /g'
echo ""

# File size
echo "8. Total Message Size:"
ls -lh "$EMAIL_FILE" | awk '{print "   "$5}'
```

Attachments are where all MIME concepts come together in practical application. Let's consolidate the essential knowledge:
Module Complete: MIME
You've now completed the comprehensive MIME module. From the fundamental purpose of extending email beyond ASCII, through content types, multipart structures, encoding mechanisms, to the practical reality of attachments—you have a complete understanding of the technology that enables rich email communication.
MIME's influence extends far beyond email into HTTP, APIs, file type handling, and data serialization across the modern Internet. The concepts you've learned here form essential knowledge for any engineer working with networked applications.
Congratulations! You've completed the MIME module with a thorough understanding of email attachments—from anatomy and structure through security and troubleshooting. You now understand how seemingly simple file sharing actually involves sophisticated MIME machinery working seamlessly behind the scenes. Next, you'll explore email security mechanisms that protect against spam, phishing, and spoofing.