Content Types: MIME Types and Media Types

The Universal Language of Data Formats

When your browser receives data from a server, how does it know whether to render HTML, display an image, play a video, or trigger a file download? The answer lies in Content Types—formally known as MIME Types (Multipurpose Internet Mail Extensions) or Media Types.

Every HTTP response includes a Content-Type header that declares the format of the body:

Content-Type: text/html; charset=utf-8
Content-Type: application/json
Content-Type: image/png
Content-Type: application/pdf

This seemingly simple mechanism enables the entire richness of the web—documents, APIs, streaming media, file downloads, and interactive applications—all transmitted over the same HTTP protocol with the recipient knowing exactly how to interpret each payload.

Learning Objectives

By the end of this page, you will:
• Understand MIME type structure and registration
• Master common content types for web development
• Implement content negotiation correctly
• Handle character encoding and content disposition
• Avoid security vulnerabilities related to content types

MIME Type Structure and Syntax

MIME types follow a two-level type/subtype structure originally defined in RFC 2045. IANA (Internet Assigned Numbers Authority) maintains the registry of assigned types.

Basic Structure

type/subtype; parameter=value

Examples:

text/html
application/json; charset=utf-8
image/png
multipart/form-data; boundary=----WebKitFormBoundary

Components Explained

Type (Top-Level): The general category of data

text — Human-readable text
image — Visual bitmap or vector data
audio — Sound data
video — Moving images
application — Binary data or application-specific formats
multipart — Multiple body parts
font — Font data (added later)
model — 3D model data

Subtype: Specific format within the type

text/plain, text/html, text/css
image/png, image/jpeg, image/svg+xml
application/json, application/pdf, application/xml

Parameters: Additional format details

charset=utf-8 — Character encoding
boundary=abc123 — Multipart delimiter
profile=high — Codec profile
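
To make the type/subtype/parameter split concrete, here is a minimal Python sketch that takes a header value apart. It relies on the standard library's email.message.Message parser (MIME syntax comes from email); the function name parse_media_type is illustrative and not part of any HTTP framework.

```python
from email.message import Message


def parse_media_type(header_value: str):
    """Split a Content-Type value into (type, subtype, parameters)."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_params() returns the media type itself as the first pair, so skip it.
    params = {name.lower(): value for name, value in msg.get_params()[1:]}
    return msg.get_content_maintype(), msg.get_content_subtype(), params


print(parse_media_type("application/json; charset=utf-8"))
print(parse_media_type("multipart/form-data; boundary=----WebKitFormBoundary"))
```

The parser lowercases the type and subtype, which matches the rule that media type names are case-insensitive.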

MIME Type Categories
Top-Level Type	Description	Common Subtypes
`text`	Human-readable text formats	plain, html, css, javascript, csv, xml
`image`	Image data	png, jpeg, gif, webp, svg+xml, avif
`audio`	Audio/sound data	mpeg, ogg, wav, webm, aac
`video`	Video data	mp4, webm, ogg, avi
`application`	Binary/application data	json, xml, pdf, zip, octet-stream
`multipart`	Multi-part content	form-data, byteranges, mixed
`font`	Font files	woff, woff2, ttf, otf
`model`	3D models	gltf+json, obj, stl

Vendor and Personal Types

Beyond standard types, MIME allows custom types:

Vendor Types (vnd.): Registered by companies

application/vnd.ms-excel
application/vnd.google-apps.document
application/vnd.api+json

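Personal/Experimental Types (prs./x-): The prs. tree is for personal or vanity types, while the x- prefix historically marked unregistered types (a few, such as application/x-www-form-urlencoded, were registered later)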

application/x-www-form-urlencoded
application/prs.custom-format

Note: The x- prefix is deprecated for new types but remains in widespread use (e.g., text/x-python).

Suffix Conventions

Suffixes indicate underlying format structure:

Suffix	Meaning	Example
`+xml`	XML-based	`image/svg+xml`, `application/rss+xml`
`+json`	JSON-based	`application/vnd.api+json`, `application/hal+json`
`+zip`	ZIP-compressed	`application/epub+zip`

This allows generic XML/JSON parsers to process data even without knowing the specific format.
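
As a rough illustration of how a generic client might use the suffix, the Python sketch below extracts it from a media type string. The helper name structured_suffix is hypothetical, and the sketch assumes the input is a syntactically valid media type.

```python
def structured_suffix(media_type: str):
    """Return the structured syntax suffix ("json", "xml", ...) or None."""
    # Drop parameters, then keep only the subtype.
    subtype = media_type.split(";")[0].strip().lower().partition("/")[2]
    _, plus, suffix = subtype.rpartition("+")
    return suffix if plus else None


for value in ("application/vnd.api+json", "image/svg+xml", "application/json"):
    print(value, "->", structured_suffix(value))
```

A client that sees a json suffix on an unfamiliar vendor type can still hand the body to its ordinary JSON parser.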

IANA Registry

IANA maintains the official registry of MIME types at: https://www.iana.org/assignments/media-types

When in doubt about a content type, check this authoritative source. Using registered types ensures interoperability across systems.

Text Content Types

Text types represent human-readable content. Character encoding is crucial for these types.

Common Text Content Types
Content Type	Use Case	Notes
`text/plain`	Plain text without formatting	Default for unknown text; add charset
`text/html`	HTML documents	Web pages; always specify charset
`text/css`	CSS stylesheets	Linked or embedded styles
`text/javascript`	JavaScript code	Preferred over application/javascript
`text/csv`	Comma-separated values	Spreadsheet data; define charset
`text/xml`	XML documents	For human-readable XML; charset UTF-8 default
`text/markdown`	Markdown text	Registered by RFC 7763; specify charset

Character Encoding (Charset)

For text types, charset is critical:

Content-Type: text/html; charset=utf-8
Content-Type: text/plain; charset=iso-8859-1

Why charset matters:

Without charset, browsers may guess (often wrongly)
Wrong charset → garbled characters (mojibake: ç becomes Ã§)
Security implications (charset sniffing attacks)

Best practice: Always specify charset=utf-8 for text content. UTF-8 is the universal standard that supports all languages.
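
The mojibake case mentioned above is easy to reproduce. This short Python sketch encodes a character as UTF-8 and then decodes the same bytes with the wrong charset, which is what a browser effectively does when it guesses incorrectly.

```python
original = "ç"

# What the server actually sends on the wire.
wire_bytes = original.encode("utf-8")

# A recipient that wrongly assumes ISO-8859-1 sees two characters.
misread = wire_bytes.decode("iso-8859-1")

# A recipient that honors charset=utf-8 recovers the original.
correct = wire_bytes.decode("utf-8")

print(wire_bytes, misread, correct)
```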

HTML Charset Declaration

For HTML, you should also declare charset in the document:

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <!-- Or equivalently: -->
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>

The HTTP header takes precedence, but including both ensures correct handling in all scenarios (cached copies, local files, etc.).

JavaScript Content Type Evolution

# Legacy aliases (obsolete under RFC 9239, still accepted by browsers)
application/javascript
application/x-javascript

# Current standard (RFC 9239, 2022)
text/javascript

RFC 9239 officially obsoleted application/javascript in favor of text/javascript. All browsers accept both, but use text/javascript for new projects.

Proper Text Content-Type Usage
# HTML Page
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
 
<!DOCTYPE html>
<html lang="en">
<head><meta charset="UTF-8">...</head>
<body>...</body>
</html>
 
# CSS Stylesheet
HTTP/1.1 200 OK
Content-Type: text/css; charset=utf-8
Cache-Control: public, max-age=31536000
 
body { font-family: 'Inter', sans-serif; }
 
# JavaScript Module
HTTP/1.1 200 OK
Content-Type: text/javascript; charset=utf-8
Cache-Control: public, max-age=31536000
 
export function greet() { console.log('Hello!'); }
 
# CSV Data Export
HTTP/1.1 200 OK
Content-Type: text/csv; charset=utf-8
Content-Disposition: attachment; filename="export.csv"
 
name,email,created_at
"John Doe",john@example.com,2025-01-18

Application Content Types

application/* types represent data that requires specific processing—from structured data formats to binary executables.

Common Application Content Types
Content Type	Use Case	Notes
`application/json`	JSON data (APIs)	Default for REST APIs; UTF-8 assumed
`application/xml`	XML data	Machine-oriented; see also text/xml
`application/pdf`	PDF documents	Always binary
`application/zip`	ZIP archives	Compressed files
`application/gzip`	Gzip-compressed data	Single-file compression
`application/octet-stream`	Generic binary	Unknown binary; triggers download
`application/x-www-form-urlencoded`	Form data	Simple form submissions

application/json — The API Standard

Content-Type: application/json

JSON (JavaScript Object Notation) is the dominant format for web APIs:

Characteristics:

UTF-8 encoding required by RFC 8259 (no charset parameter is defined, so adding one is unnecessary)
Human-readable yet compact
Native JavaScript compatibility
Supported by virtually every programming language

Variants and Extensions:

Content Type	Purpose
`application/json`	Standard JSON
`application/vnd.api+json`	JSON:API specification
`application/hal+json`	HAL (Hypertext Application Language)
`application/ld+json`	JSON-LD (Linked Data)
`application/problem+json`	RFC 7807 error responses
`application/json-patch+json`	JSON Patch (RFC 6902)
`application/merge-patch+json`	JSON Merge Patch (RFC 7396)

application/octet-stream — The Binary Fallback

When the server doesn't know (or won't tell) the specific format:

Content-Type: application/octet-stream
Content-Disposition: attachment; filename="file.bin"

Browsers typically trigger a download dialog for this type. It's the safe default for unknown binary content.
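
A static file server might apply this fallback as in the following Python sketch. It uses the standard library's mimetypes module to guess from the file extension; the helper name content_type_for and the choice to append charset=utf-8 to every text type are assumptions for illustration (they presume the files are stored as UTF-8).

```python
import mimetypes


def content_type_for(path: str) -> str:
    """Pick a Content-Type value for a file path, by extension only."""
    guessed, _encoding = mimetypes.guess_type(path)
    if guessed is None:
        # Unknown extension: fall back to the generic binary type.
        return "application/octet-stream"
    if guessed.startswith("text/"):
        # Assumption: text assets on disk are UTF-8 encoded.
        return f"{guessed}; charset=utf-8"
    return guessed


for name in ("index.html", "photo.png", "firmware.zzunknown"):
    print(name, "->", content_type_for(name))
```

Extension-based guessing says nothing about the actual bytes, so it should not be trusted for user-uploaded files.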

application/x-www-form-urlencoded

The default format for HTML form submissions:

POST /login HTTP/1.1
Content-Type: application/x-www-form-urlencoded

username=john%40example.com&password=s%26cret!

Encoding rules:

Spaces become + or %20
Special characters URL-encoded (@ → %40, & → %26)
Key-value pairs joined by &

Limitation: Cannot handle binary data efficiently; use multipart/form-data for file uploads.
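
The encoding rules can be checked with Python's urllib.parse, as a quick sketch. The first call decodes the request body shown above; the second encodes a hypothetical pair of fields (the names q and tag are made up for the example).

```python
from urllib.parse import parse_qs, urlencode

# Decode the body from the login example.
body = "username=john%40example.com&password=s%26cret!"
fields = parse_qs(body)
print(fields)

# Encode new fields: spaces become "+", and "&" inside a value becomes %26.
encoded = urlencode({"q": "content types", "tag": "a&b"})
print(encoded)
```

Note that urlencode is stricter than some browsers and also percent-encodes characters such as "!"; both forms decode to the same value.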

Application Content Types in Practice
# Standard JSON API Response
HTTP/1.1 200 OK
Content-Type: application/json
 
{
  "data": {
    "id": "12345",
    "type": "users",
    "attributes": {
      "name": "John Doe",
      "email": "john@example.com"
    }
  }
}
 
# RFC 7807 Problem Details for Error Response
HTTP/1.1 400 Bad Request
Content-Type: application/problem+json
 
{
  "type": "https://api.example.com/errors/validation",
  "title": "Validation Error",
  "status": 400,
  "detail": "The email field contains an invalid email address",
  "instance": "/api/users",
  "errors": [
    {
      "field": "email",
      "message": "Invalid email format"
    }
  ]
}
 
# PDF Download
HTTP/1.1 200 OK
Content-Type: application/pdf
Content-Disposition: attachment; filename="report-2025.pdf"
Content-Length: 524288
 
(binary PDF data)

Use application/problem+json for API Errors

RFC 7807 (Problem Details for HTTP APIs, since obsoleted by RFC 9457, which keeps the same media type) standardizes error responses:

{
  "type": "https://example.com/probs/validation",
  "title": "Validation Failed",
  "status": 422,
  "detail": "Username must be at least 3 characters",
  "instance": "/api/users"
}

Using application/problem+json makes errors machine-parseable across different APIs.
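
One way a server might assemble such a response is sketched below in Python. The function problem_response and its return shape (status, headers, body) are hypothetical and framework-neutral; only the five member names and the media type come from the specification.

```python
import json


def problem_response(status, title, detail, type_uri="about:blank", instance=None):
    """Build (status, headers, body) for a problem details error."""
    problem = {
        "type": type_uri,  # "about:blank" is the default defined by the spec
        "title": title,
        "status": status,
        "detail": detail,
    }
    if instance is not None:
        problem["instance"] = instance
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(problem)


status, headers, body = problem_response(
    422,
    "Validation Failed",
    "Username must be at least 3 characters",
    type_uri="https://example.com/probs/validation",
    instance="/api/users",
)
print(status, headers, body)
```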

Image, Audio, and Video Content Types

Media types are crucial for web performance and compatibility. Choosing the right format affects loading speed, quality, and browser support.

Image Content Types
Content Type	Extension	Characteristics	Best For
`image/jpeg`	.jpg, .jpeg	Lossy compression, no transparency	Photos, complex images
`image/png`	.png	Lossless, transparency support	Graphics, screenshots, logos
`image/gif`	.gif	256 colors, animation support	Simple animations, memes
`image/webp`	.webp	Modern, lossy+lossless, transparency	General purpose (best tradeoff)
`image/avif`	.avif	Best compression, newest	When browser support sufficient
`image/svg+xml`	.svg	Vector, scalable, XML-based	Icons, logos, diagrams
`image/x-icon`	.ico	Legacy favicon format	Browser favicons

Image Format Selection

Photo? → JPEG (or WebP for modern browsers)
Graphic/Screenshot? → PNG (or WebP)
Need transparency? → PNG, WebP, or AVIF
Vector/Scalable? → SVG
Animation? → GIF (simple) or WebP/AVIF (modern)
Best compression? → AVIF > WebP > JPEG/PNG

Modern Image Delivery with Picture Element

<picture>
  <source srcset="image.avif" type="image/avif">
  <source srcset="image.webp" type="image/webp">
  <img src="image.jpg" alt="Fallback for older browsers">
</picture>

Servers can also use content negotiation with the Accept header:

# Client request
GET /image HTTP/1.1
Accept: image/avif, image/webp, image/jpeg

# Server response (best format for client)
HTTP/1.1 200 OK
Content-Type: image/avif
Vary: Accept

Audio and Video Content Types
Content Type	Extension	Notes
`video/mp4`	.mp4	H.264/AVC codec; universal support
`video/webm`	.webm	VP8/VP9 codec; open format
`video/ogg`	.ogv	Theora codec; open but less common
`audio/mpeg`	.mp3	MP3 audio; universal
`audio/ogg`	.ogg	Vorbis/Opus codec; high quality
`audio/wav`	.wav	Uncompressed; large files
`audio/webm`	.weba	Opus codec in WebM container

Video Streaming Considerations

For adaptive streaming, content types vary:

# HLS (HTTP Live Streaming)
Content-Type: application/vnd.apple.mpegurl  # .m3u8 playlist
Content-Type: video/mp2t  # .ts segments

# DASH (Dynamic Adaptive Streaming)
Content-Type: application/dash+xml  # .mpd manifest
Content-Type: video/mp4  # .m4s segments

Partial Content for Media Seeking

Media players request specific byte ranges:

GET /video.mp4 HTTP/1.1
Range: bytes=1000000-1999999

HTTP/1.1 206 Partial Content
Content-Type: video/mp4
Content-Range: bytes 1000000-1999999/50000000

This enables seeking without downloading the entire file.
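
On the server side, the Range header has to be turned into concrete byte offsets and a matching Content-Range value. The Python sketch below handles only a single range; the name resolve_range is hypothetical, and returning None stands in for "ignore the header or answer 416", a decision left to the caller.

```python
def resolve_range(range_header: str, total: int):
    """Return (start, end, content_range) for one byte range, else None."""
    unit, _, spec = range_header.partition("=")
    spec = spec.strip()
    if unit.strip() != "bytes" or "," in spec or "-" not in spec:
        return None  # other units and multi-range requests are out of scope
    first, _, last = spec.partition("-")
    try:
        if first == "":
            # Suffix form "bytes=-500": the last 500 bytes.
            length = int(last)
            if length <= 0:
                return None
            start, end = max(total - length, 0), total - 1
        else:
            start = int(first)
            end = int(last) if last else total - 1
    except ValueError:
        return None
    end = min(end, total - 1)
    if start > end:
        return None  # unsatisfiable range
    return start, end, f"bytes {start}-{end}/{total}"


print(resolve_range("bytes=1000000-1999999", 50000000))
```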

SVG Security

SVG files are XML and can contain JavaScript:

<svg xmlns="http://www.w3.org/2000/svg">
  <script>alert('XSS!')</script>
</svg>

Security measures:
• Sanitize SVG uploads (strip scripts)
• Serve user-uploaded SVGs with Content-Security-Policy: script-src 'none'
• Consider Content-Type: image/svg+xml with CSP headers
• Or convert user-uploaded images to PNG/WebP

Multipart Content Types

Multipart types allow a single HTTP message to contain multiple, distinct parts—essential for file uploads and complex data.

Multipart Content Types
Content Type	Use Case	When to Use
`multipart/form-data`	Form submissions with files	File uploads from HTML forms
`multipart/byteranges`	Multiple range responses	When client requests multiple ranges
`multipart/mixed`	Email attachments	Multiple parts of different types
`multipart/alternative`	Same content, different formats	Email: HTML + plain text versions

multipart/form-data Structure

This is the standard for file uploads from browsers:

POST /upload HTTP/1.1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW

------WebKitFormBoundary7MA4YWxkTrZu0gW
Content-Disposition: form-data; name="title"

My Document
------WebKitFormBoundary7MA4YWxkTrZu0gW
Content-Disposition: form-data; name="file"; filename="report.pdf"
Content-Type: application/pdf

(binary PDF content)
------WebKitFormBoundary7MA4YWxkTrZu0gW--

Anatomy of Multipart Messages

Boundary: Unique string that separates parts
- Defined in Content-Type header
- Must not appear in the content
- Prefixed with -- when used
Part Headers: Each part has its own headers
- Content-Disposition: Field name and filename
- Content-Type: MIME type of this part
Terminator: Final boundary with trailing --
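
The pieces above can be assembled by hand, which is a useful way to see where each delimiter goes. This Python sketch builds a multipart/form-data body from text fields and files; encode_multipart is a hypothetical helper, it uses a random hex boundary, and it assumes field names and filenames contain no quotes or line breaks (real clients must escape those).

```python
import uuid


def encode_multipart(fields, files):
    """Return (content_type, body) for text fields and file parts.

    fields: {name: text}; files: {name: (filename, content_type, bytes)}
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")  # boundary prefixed with --
    chunks = []
    for name, value in fields.items():
        chunks += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",  # blank line separates part headers from part content
            value.encode("utf-8"),
        ]
    for name, (filename, content_type, data) in files.items():
        chunks += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode(),
            f"Content-Type: {content_type}".encode(),
            b"",
            data,
        ]
    chunks += [delimiter + b"--", b""]  # terminator with trailing --
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(chunks)


content_type, body = encode_multipart(
    {"title": "My Document"},
    {"file": ("report.pdf", "application/pdf", b"%PDF-1.7")},
)
print(content_type)
print(body.decode("ascii"))
```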

Complete Multipart Form Upload Example
POST /api/documents HTTP/1.1
Host: api.example.com
Content-Type: multipart/form-data; boundary=---------------------------735323031399963166993862150
Content-Length: 834
Authorization: Bearer token123
 
-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="metadata"
Content-Type: application/json
 
{"title": "Q4 Report", "department": "Sales", "confidential": true}
-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="document"; filename="report.pdf"
Content-Type: application/pdf
 
%PDF-1.7
(... binary PDF content ...)
%%EOF
-----------------------------735323031399963166993862150
Content-Disposition: form-data; name="thumbnail"; filename="thumb.png"
Content-Type: image/png
 
PNG
(... binary image content ...)
-----------------------------735323031399963166993862150--

When to Use multipart/form-data vs. JSON

Use multipart/form-data:

File uploads (required)
Mixed content (files + metadata together)
Binary data

Use application/json:

Pure data APIs (no files)
When you need complex nested structures
When clients prefer JSON

Hybrid approaches:

Separate requests: POST JSON metadata, then POST file separately
Base64 in JSON: Encode files as base64 (inefficient but sometimes convenient)
Multipart with JSON part: Include JSON as one part of multipart

Maximum Upload Size

Servers typically limit request body size:

• Nginx: 1 MB by default (client_max_body_size)
• Apache: LimitRequestBody, unlimited by default through 2.4.53 and 1 GiB by default in later releases
• Express.js: 100kb by default for JSON bodies

For large files, consider:
• Direct upload to cloud storage (S3 presigned URLs)
• Chunked uploads (resumable)
• Streaming uploads without buffering

Content Negotiation

Content negotiation allows clients and servers to agree on the best representation of a resource. The client expresses preferences; the server selects the best match.

Accept Header for Content Type Negotiation

GET /api/user/123 HTTP/1.1
Accept: application/json, application/xml;q=0.9, */*;q=0.1

Parsing:

application/json — Quality 1.0 (default if not specified)
application/xml;q=0.9 — Quality 0.9 (slightly less preferred)
*/*;q=0.1 — Any other type at 0.1 (fallback)
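
The selection logic can be sketched in a few lines of Python. This is a simplified reading of the rules, not a complete implementation: it ignores media type parameters other than q, assumes the server's types are written in lowercase, and breaks ties in favor of whichever type the server lists first. The names parse_accept and negotiate are illustrative.

```python
def parse_accept(header: str):
    """Return (media_range, q) pairs from an Accept header value."""
    ranges = []
    for item in header.split(","):
        pieces = [p.strip() for p in item.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media_range, q))
    return ranges


def negotiate(accept_header: str, available):
    """Pick the server type with the highest q, or None (meaning 406)."""
    ranges = parse_accept(accept_header)
    best, best_q = None, 0.0
    for candidate in available:
        main_type = candidate.partition("/")[0]
        specificity, quality = -1, 0.0
        for media_range, q in ranges:
            if media_range == candidate:
                level = 2  # exact match
            elif media_range == f"{main_type}/*":
                level = 1  # type wildcard
            elif media_range == "*/*":
                level = 0  # full wildcard
            else:
                continue
            if level > specificity:  # the most specific range decides q
                specificity, quality = level, q
        if quality > best_q:
            best, best_q = candidate, quality
    return best


accept = "application/json, application/xml;q=0.9, */*;q=0.1"
print(negotiate(accept, ["application/xml", "application/json"]))
```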

Server Response Based on Accept

# If server supports JSON (preferred)
HTTP/1.1 200 OK
Content-Type: application/json
Vary: Accept

{"id": 123, "name": "John"}

# If server only supports XML
HTTP/1.1 200 OK 
Content-Type: application/xml
Vary: Accept

<user><id>123</id><name>John</name></user>

# If no match possible
HTTP/1.1 406 Not Acceptable
Content-Type: application/json

{"error": "Cannot produce a response matching Accept header"}

Content Negotiation Headers
Request Header	Negotiates	Response Header
`Accept`	Content type (MIME)	`Content-Type`
`Accept-Language`	Language	`Content-Language`
`Accept-Encoding`	Compression	`Content-Encoding`
`Accept-Charset`	Character set	charset in `Content-Type`

Language Negotiation

GET /page HTTP/1.1
Accept-Language: es, en-US;q=0.9, en;q=0.8

Server priority:

Spanish (es) — q=1.0 (default)
American English (en-US) — q=0.9
Any English (en) — q=0.8

HTTP/1.1 200 OK
Content-Language: es
Vary: Accept-Language

<html lang="es">...</html>

Encoding Negotiation

GET /script.js HTTP/1.1
Accept-Encoding: gzip, deflate, br

Server returns:

HTTP/1.1 200 OK
Content-Encoding: br
Vary: Accept-Encoding

(brotli-compressed content)
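
A server-side version of this exchange might look like the Python sketch below. It uses gzip because that is what the standard library provides (Brotli needs a third-party package), and for brevity it ignores q-values, so a client sending gzip;q=0 would be handled incorrectly. The function name compress_if_accepted is hypothetical.

```python
import gzip


def compress_if_accepted(body: bytes, accept_encoding: str):
    """Return (body, headers), gzip-compressing when the client offers it."""
    offered = {
        item.split(";")[0].strip().lower()
        for item in accept_encoding.split(",")
    }
    # Vary is set either way: the response depends on Accept-Encoding.
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in offered:
        headers["Content-Encoding"] = "gzip"
        return gzip.compress(body), headers
    return body, headers


payload = b"export function greet() { console.log('Hello!'); }" * 20
compressed, headers = compress_if_accepted(payload, "gzip, deflate, br")
print(headers, len(payload), len(compressed))
```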

The Vary Header

Critically important for caching:

Vary: Accept, Accept-Encoding, Accept-Language

What Vary tells caches: "This response varies based on these request headers. Cache separate versions for different Accept values."

Without Vary: A cache might serve a JSON response to a client that requested XML, or vice versa.
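
Conceptually, a cache folds the listed request headers into its lookup key. The Python sketch below shows that idea only; real caches normalize header values more carefully, and cache_key is a hypothetical name.

```python
def cache_key(method: str, url: str, request_headers: dict, vary: str):
    """Build a lookup key that includes every header named in Vary."""
    normalized = {k.lower(): v.strip() for k, v in request_headers.items()}
    varying = tuple(
        (name.strip().lower(), normalized.get(name.strip().lower(), ""))
        for name in vary.split(",")
        if name.strip()
    )
    return (method.upper(), url, varying)


json_key = cache_key("GET", "/api/user/123", {"Accept": "application/json"}, "Accept")
xml_key = cache_key("GET", "/api/user/123", {"Accept": "application/xml"}, "Accept")
print(json_key != xml_key)
```

With an empty Vary value both requests would collapse to the same key, which is exactly the wrong-representation problem described above.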

Content Negotiation Pitfalls

1. Forgetting Vary: Always include Vary when content negotiation is used. Otherwise, CDNs may cache the wrong representation.

2. Over-negotiating: Too many dimensions (type + language + encoding) creates a cache explosion. Each combination is cached separately.

3. Ignoring Quality Values: Clients specify preferences with q-values. Ignoring them leads to suboptimal responses.

4. Not supporting fallbacks: If a client requests */*, return something rather than 406.

Content-Disposition and File Downloads

The Content-Disposition header controls how browsers handle responses—particularly whether to display inline or trigger a download.

Content-Disposition Examples
# Inline (display in browser)
Content-Disposition: inline
Content-Disposition: inline; filename="document.pdf"
 
# Attachment (trigger download)
Content-Disposition: attachment
Content-Disposition: attachment; filename="report-2025.pdf"
 
# Filename with special characters (RFC 5987)
Content-Disposition: attachment; filename="report.pdf"; filename*=UTF-8''%E6%8A%A5%E5%91%8A.pdf
 
# Complete download response
HTTP/1.1 200 OK
Content-Type: application/pdf
Content-Disposition: attachment; filename="quarterly-report.pdf"
Content-Length: 1048576
Cache-Control: private, no-cache

Content-Disposition Values
Value	Behavior	Use Case
`inline`	Display in browser	PDFs to view, images, HTML
`attachment`	Trigger download dialog	Files to save, exports, binaries
`filename="..."`	Suggested filename	User-friendly download names
`filename*=UTF-8''...`	UTF-8 encoded filename	Non-ASCII filenames

Handling Filename Encoding

Filenames with non-ASCII characters or special characters require careful encoding:

# ASCII filename (simple)
Content-Disposition: attachment; filename="report.pdf"

# Non-ASCII filename (RFC 5987)
Content-Disposition: attachment; filename="report.pdf"; filename*=UTF-8''%E5%A0%B1%E5%91%8A.pdf

# Filename with quotes (escape with backslash)
Content-Disposition: attachment; filename="John's \"Special\" Report.pdf"

Browser compatibility:

Modern browsers: Use filename* with UTF-8 encoding
Include both filename and filename* for compatibility
The filename* takes precedence when both are present
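
Generating both parameters can be sketched in Python with urllib.parse.quote. The helper attachment_header is hypothetical; it expects the caller to supply an ASCII fallback name and only adds filename* when the real name contains non-ASCII characters.

```python
from urllib.parse import quote


def attachment_header(filename: str, ascii_fallback: str) -> str:
    """Build a Content-Disposition value with filename and filename*."""
    # Escape backslashes and double quotes inside the quoted string.
    escaped = ascii_fallback.replace("\\", "\\\\").replace('"', '\\"')
    header = f'attachment; filename="{escaped}"'
    try:
        filename.encode("ascii")
    except UnicodeEncodeError:
        # Percent-encode the UTF-8 bytes for the extended parameter.
        header += f"; filename*=UTF-8''{quote(filename, safe='')}"
    return header


print(attachment_header("報告.pdf", "report.pdf"))
```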

Security Considerations

Path traversal prevention:

# Dangerous - might attempt path traversal
Content-Disposition: attachment; filename="../../../etc/passwd"

# Server should validate:
- Strip path components
- Sanitize to basename only
- Whitelist allowed characters
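
Those three validation steps might be implemented as in this Python sketch. The allowed character set and the fallback name "download" are assumptions chosen for the example, and safe_filename is a hypothetical helper.

```python
import re


def safe_filename(name: str, default: str = "download") -> str:
    """Reduce an untrusted filename to a harmless basename."""
    # Strip path components, treating both / and \ as separators.
    base = name.replace("\\", "/").rsplit("/", 1)[-1]
    # Keep only an allow-list of characters; replace everything else.
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base)
    # Leading dots would produce hidden files or "." / ".." entries.
    cleaned = cleaned.lstrip(".")
    return cleaned or default


for raw in ("../../../etc/passwd", "my report.pdf", ".."):
    print(repr(raw), "->", safe_filename(raw))
```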

Content-Type matters for inline:

# Safe - browser knows how to display
Content-Type: image/png
Content-Disposition: inline

# Dangerous - might execute
Content-Type: text/html
Content-Disposition: inline
# User-uploaded HTML could contain XSS!

Best practice for user uploads:

Validate Content-Type on upload
Store with controlled filenames (not user-provided)
Serve with attachment by default
Add Content-Security-Policy headers

Summary: Mastering Content Types

Content types form the foundation of HTTP's ability to transfer any kind of data. Understanding them is essential for building robust web applications and APIs.

Key Takeaways

•MIME types define data formats — Structure: type/subtype; parameters
•Always specify charset for text — Use UTF-8 as the universal standard
•application/json is the API standard — Consider problem+json for errors
•Modern image formats save bandwidth — WebP/AVIF with fallbacks for compatibility
•multipart/form-data for file uploads — Required for binary file transmission
•Content negotiation enables flexibility — Always include Vary header for caching
•Content-Disposition controls downloads — Use attachment for file downloads, encode filenames properly

You now have comprehensive knowledge of HTTP content types—from basic MIME structure to advanced content negotiation and secure file downloads. In the next page, we'll explore the complete anatomy of HTTP request and response formats, understanding how all these components fit together in the wire protocol.