A Complete Guide to HTTP Caching
Explore this comprehensive guide to HTTP caching, its impact on speed, resilience, cost, and SEO, and critical headers, strategies, and debugging techniques for optimizing web performance.

Caching is the invisible backbone of the web, crucial for making sites feel fast, reliable, and affordable. When implemented effectively, it dramatically reduces latency, eases server load, and enables even fragile infrastructure to withstand sudden surges in demand. Conversely, poor or absent caching results in slow, unstable, and expensive websites.
Table of Contents
- The Business Case for Caching
- Mental Model: Who Caches What?
- Cache Keys and Variants
- Freshness vs. Validation
- Core HTTP Caching Headers
- Observability Helpers
- Freshness & Age Calculations
- Common Misconceptions & Gotchas
- Patterns & Recipes
- Static Assets (JS, CSS, fonts)
- HTML Documents
- Side note on why long HTML TTLs are safe with event-driven purge
- APIs
- Side note on why APIs use short s-maxage + stale-while-revalidate
- Authenticated Dashboards & User-specific Pages
- Side note on the omission of max-age
- Side note on security
- Images & Media
- Side note on client hints
- Beyond Headers: Browser Behaviors
- CDNs in Practice: Cloudflare
- Other Caching Layers in the Stack
- Debugging & Verification
- Caching in the AI-mediated Web
- Wrapping up: Caching as Strategy
Caching, at its core, is about minimizing unnecessary work. Each time a browser, CDN, or proxy requests an unchanged resource from your server, time and bandwidth are wasted. Every time your server rebuilds or re-serves identical content, you incur additional load and cost. Under heavy traffic – be it during Black Friday, a viral news event, or a DDoS attack – these inefficiencies accumulate, potentially causing the entire system to collapse.
Yet, despite its fundamental nature, caching remains one of the most misunderstood aspects of web performance. Many developers:
- Confuse no-cache with "don’t cache," when it actually means "store, but revalidate."
- Opt for no-store as a "safe" default, inadvertently disabling caching entirely.
- Misunderstand how Expires interacts with Cache-Control: max-age.
- Fail to distinguish between public and private, leading to security or performance issues.
- Ignore advanced directives like s-maxage or stale-while-revalidate.
- Don’t realize that CDNs, browsers, proxies, and application caches all layer their own rules on top.
The result is countless sites launched with fragile, inconsistent, or outright broken caching policies. They miss out on significant infrastructure cost savings, frustrate users with sluggish performance, and buckle under loads that better-configured systems would easily handle.
This guide aims to rectify that. In the following sections, we will thoroughly explore the HTTP caching ecosystem:
- How headers like Cache-Control, Expires, ETag, and Age function, both individually and collectively.
- How browsers, CDNs, and application-level caches interpret and enforce these headers.
- The common pitfalls and misconceptions that can challenge even experienced developers.
- Practical strategies for static assets, HTML documents, APIs, and more.
- Modern browser behaviors, including BFCache, speculation rules, and signed exchanges.
- The realities of CDNs, with a detailed examination of Cloudflare’s defaults, peculiarities, and advanced capabilities.
- How to effectively debug and verify caching in real-world scenarios.
By the end, you will not only grasp the nuanced interplay of HTTP caching headers but also gain the knowledge to design and implement a caching strategy that makes your sites faster, more cost-effective, and more reliable.
The Business Case for Caching
Caching is critical because it directly influences four fundamental aspects of a site's performance and scalability:
Speed
Caching eliminates unnecessary network requests. A memory-cache hit in the browser is virtually instantaneous, a stark contrast to the 100–300ms typical for a handshake and first byte without caching. Multiply this across dozens of assets, and the result is smoother page loads, improved Core Web Vitals, and more satisfied users.
Resilience
During demand surges, cache hits significantly amplify capacity. If 80% of traffic is absorbed by a CDN edge, your origin servers only need to manage the remaining 20%. This difference can determine whether your site successfully navigates high-traffic events like Black Friday or collapses under a viral traffic spike.
Cost
Every cache hit means one less expensive origin request. CDN bandwidth is generally affordable; however, uncached origin hits consume CPU, necessitate database queries, and generate outbound traffic, all of which incur costs. Even a 5–10% improvement in cache hit ratio can directly translate into thousands of dollars saved at scale. This doesn't even account for requests cached in users’ browsers, which bypass the CDN entirely!
SEO
Caching enhances both speed and efficiency for search engines. Bots are less aggressive when they detect effective caching headers, allowing them to conserve crawl budget for fresher and deeper content. Furthermore, faster pages directly contribute to Google’s performance signals.
Real-world Scenarios
- A news website successfully avoids a meltdown during a breaking story because 95% of requests are served from the CDN cache.
- An API under sustained load continues to respond consistently due to stale-if-error and validator-based revalidation.
- An e-commerce platform smoothly handles Black Friday traffic because static assets and category pages are configured for long-lived caching at the edge.
Side note on the philosophy of caching
It is worth noting that a subtle anti-culture exists around caching. Some developers perceive it as a makeshift solution – a band-aid applied to slow systems, masking deeper architectural or design flaws. In an ideal world, every request would be inexpensive, every response instantaneous, rendering caching unnecessary. This vision has merit: designing inherently fast systems can indeed circumvent the complexity and fragility that caching can introduce.
In reality, most systems operate in an unpredictable environment, facing sudden demand spikes, long geographic distances, and variable loads. Even the most meticulously architected applications benefit from caching as an amplifier. The key is balance: caching should never excuse poor underlying performance, but it should consistently be a component of your strategy to scale and maintain resilience during traffic surges.
Mental Model: Who Caches What?
Before delving into the intricate details of headers and directives, it is helpful to understand the landscape of who is actually caching your content. Caching isn't a singular event occurring in one place; it's an ecosystem of layers, each with its own rules, scope, and peculiarities.
Browsers
Every browser maintains both a memory cache and a disk cache. The memory cache is exceptionally fast but short-lived, lasting only while a page is open. It aims to prevent redundant network fetches during a single session and is not governed by HTTP caching headers; even resources marked no-store may be reused from memory if requested again within the same page. The disk cache, in contrast, persists across tabs and sessions, can store much larger resources, and does respect HTTP caching headers (though browsers may still apply their own heuristics when metadata is missing).
Proxies
Between the browser and the broader internet, requests often traverse proxies – particularly in corporate environments or ISP-managed networks. These proxies can function as shared caches, storing responses to reduce bandwidth costs or to enforce organizational policies. Unlike CDNs, you typically do not configure them yourself, and their behavior can be opaque.
For example, a corporate proxy might cache software downloads to avoid repeated gigabyte transfers across the same office connection. An ISP might cache popular news images to enhance load times for its customers. The challenge is that these proxies do not always perfectly respect HTTP caching headers; they may apply their own heuristics or overrides. This can lead to inconsistencies, such as a user behind a proxy viewing a stale or stripped-down response long after it should have expired.
While less visible than browser or CDN caches, proxies remain a significant part of the ecosystem. They serve as a reminder that caching is not always under the site owner’s direct control, and that network intermediaries can influence freshness, reuse, and even correctness.
Side note on transparent ISP proxies
In the early 2000s, many ISPs deployed “transparent” proxies that cached popular resources without users or site owners even being aware. These proxies still appear in some regions today. They sit silently between the browser and the origin, caching opportunistically to conserve bandwidth. The drawback is that they sometimes completely disregard cache headers, delivering outdated or inconsistent content. If you have ever noticed a site behaving differently at home versus on mobile data, a transparent proxy might have been the reason.
Shared Caches
Between users and origin servers exist a variety of shared caches – CDNs like Cloudflare or Akamai, ISP-level proxies, corporate gateways, or reverse proxies. These shared layers can dramatically reduce origin load, but they operate with their own logic and can sometimes override or reinterpret origin instructions.
Reverse Proxies
Technologies such as Varnish or NGINX can serve as local accelerators positioned in front of your application servers. They intercept and cache responses close to the origin, effectively smoothing traffic spikes and offloading significant work from your application or database.
Application and Database Caches
Within your stack, systems like Redis or Memcached store fragments of rendered pages, precomputed query results, or sessions. These are not governed by HTTP headers – you define their keys and TTLs yourself – but they are crucial components of the overall caching ecosystem.
Cache Keys and Variants
Every cache requires a method to determine whether two requests are "the same thing." This decision is made using a cache key – essentially, the unique identifier for a stored response.
By default, a cache key is based on the scheme, host, path, and query string of the requested resource. However, in practice, browsers add more dimensions. Most implement double-keyed caching, where the top-level browsing context (the site you are currently on) is also part of the key. This explains why your browser cannot reuse a Google Font downloaded while visiting one site when another, unrelated site requests the exact same font file – each receives its own cache entry, even with an identical URL.
Modern browsers are evolving towards triple-keyed caching, which also incorporates subframe context into the key. This means a resource requested within an embedded iframe may have its own cache entry, distinct from the same resource requested by the top-level page or another iframe. This design enhances privacy (by limiting cross-site tracking via shared cache entries) but also reduces opportunities for cache reuse.
Adding another layer of complexity, HTTP introduces the Vary header. This instructs caches that certain request headers should also be part of the cache key.
Examples:
- Vary: Accept-Encoding → store one copy compressed with gzip, another with brotli.
- Vary: Accept-Language → store separate versions for en-US versus de-DE.
- Vary: Cookie → every unique cookie value creates a separate cache entry (often catastrophic).
- Vary: * → means "you can’t safely reuse this for anyone else," which effectively eliminates cacheability.
This mechanism is powerful and sometimes essential. If your server alters image formats based on Accept headers, or serves AVIF to browsers that support it, you must use Vary: Accept to avoid sending incompatible responses to clients unable to process them. However, Vary is easily misused. Carelessly adding Vary: User-Agent, Vary: Cookie, or Vary: * can explode your cache into thousands of nearly identical entries. The key is to vary only on headers that genuinely alter the response – nothing more.
This is where normalization becomes important. Intelligent CDNs and proxies can simplify cache keys by disregarding differences that are inconsequential. For instance:
- Ignoring analytics query parameters (e.g., ?utm_source=...).
- Treating all iPhones as the same “mobile” variant, rather than keying on every distinct device string.
The goal is to vary only on factors that genuinely change the response. Anything else results in wasted fragmentation and lower hit ratios.
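As a rough illustration of normalization, here is a minimal Python sketch that collapses URLs differing only in tracking parameters into a single cache key. The parameter list is illustrative, not exhaustive:

```python
from urllib.parse import urlsplit, parse_qsl, urlencode, urlunsplit

# Query parameters assumed never to change the response (illustrative list).
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"}

def normalize_cache_key(url: str) -> str:
    """Drop analytics parameters and sort the rest so that
    equivalent URLs collapse into one cache key."""
    scheme, host, path, query, _fragment = urlsplit(url)
    kept = sorted(
        (k, v) for k, v in parse_qsl(query, keep_blank_values=True)
        if k not in IGNORED_PARAMS
    )
    # Fragment is dropped: it never reaches the server anyway.
    return urlunsplit((scheme, host, path, urlencode(kept), ""))
```

Real CDNs perform this at the edge via configuration rather than application code; the point is simply that URLs which produce the same response should produce the same key.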
Side note on No-Vary-Search
A new experimental header, No-Vary-Search, allows servers to instruct caches to ignore specific query parameters when determining cache keys. For example, you could treat ?utm_source= or ?fbclid= as irrelevant, thereby preventing your cache from fragmenting into thousands of variants. Currently, support is limited – Chrome only uses it in conjunction with speculation rules – but if adopted more widely, it could offer a standards-based method to normalize cache keys without relying on CDN configuration.
Freshness vs. Validation
Knowing who is caching your content and how they determine if two requests are identical only answers part of the question. The other crucial aspect is when a stored response can be reused.
Every cache, whether a browser or a CDN, must decide:
- Is this copy still fresh enough to serve as-is?
- Or has it become stale, requiring me to check with the origin?
This is the core trade-off in caching: freshness (serve immediately, fast but risky if outdated) versus validation (double-check with the origin, slower but guaranteed correct).
All the headers we will explore next – HTTP headers like Cache-Control, Expires, ETag, and Last-Modified – help guide this decision-making process.
Core HTTP Caching Headers
Now that we understand who caches content and how they make fundamental decisions, it’s time to examine the core components: the headers that control caching. These are the levers you use to influence every layer of the system – browsers, CDNs, proxies, and beyond.
Broadly, these fall into three categories:
- Freshness controls: instruct caches on how long a response can be served without revalidation.
- Validators: provide an efficient way to check if something has changed.
- Metadata: describe how the response should be stored, keyed, or observed.
Let’s break them down.
The Date Header
Every response should carry a Date header. It represents the server’s timestamp for when the response was generated and serves as the baseline for all freshness and age calculations. If Date is missing or skewed, caches will make their own assumptions.
The Cache-Control (response) Header
This is the most critical header – the control panel for how content should be cached. It includes multiple directives, broadly categorized into two groups:
Freshness directives:
- max-age: Specifies how long (in seconds) the response is fresh.
- s-maxage: Similar to max-age, but applies exclusively to shared caches (e.g., CDNs). It overrides max-age in these contexts.
- immutable: Signals that the resource will never change (ideal for versioned static assets).
- stale-while-revalidate: Allows serving a stale response while a fresh one is fetched in the background.
- stale-if-error: Permits serving stale content if the origin is unavailable or returns errors.
Storage/use directives:
- public: The response may be stored by any cache, including shared ones.
- private: The response may be cached only by the browser, not by shared caches.
- no-cache: Store the response, but revalidate it before serving.
- no-store: Do not store the response at all.
- must-revalidate: Once stale, the response must be revalidated before use.
- proxy-revalidate: Similar to must-revalidate, but specifically targets shared caches.
The Cache-Control (request) Header
Browsers and clients can also send caching directives. These do not alter the server’s headers but influence how caches along the request path behave.
- no-cache: Forces revalidation (but permits the use of stored entries).
- no-store: Bypasses caching entirely.
- only-if-cached: Return a cached response if available; otherwise, return an error (useful for offline scenarios).
- max-age, min-fresh, max-stale: Fine-tune tolerance for staleness.
The Expires Header
An older method of defining freshness, relying on an absolute date/timestamp.
Example: Expires: Tue, 29 Aug 2025 12:00:00 GMT.
- Ignored if Cache-Control: max-age is present.
- Vulnerable to clock skew between servers and clients.
- Still widely observed, often for backward compatibility.
The Pragma Header
The Pragma header dates back to HTTP/1.0 and was used to prevent caching before Cache-Control existed (specifically on requests, asking intermediaries to revalidate content before reuse). Modern browsers and CDNs rely on Cache-Control, but some intermediaries and older systems still respect Pragma. In theory it could accept arbitrary name/value pairs; in practice, only one ever mattered:
Pragma: no-cache
For maximum compatibility – especially when dealing with mixed or legacy infrastructure – it is harmless to include both.
The Age Header
Age indicates how old the response is (in seconds) when delivered. It is intended to be set by shared caches, though not every intermediary implements it consistently. Browsers never set it. Treat it as a helpful signal, not an absolute truth.
Side note on Age
You will only ever observe Age headers from shared caches like CDNs or proxies. Why? Because browsers do not expose their internal cache state to the network – they simply serve responses directly to the user. Shared caches, conversely, need to communicate freshness downstream (to other proxies or to browsers), so they append Age. This is why you will often see Age: 0 on a fresh CDN hit, but never on a pure browser cache hit.
Validator Headers: ETag and Last-Modified
When freshness expires, caches use validators to avoid re-downloading the entire resource.
- ETag:
- A unique identifier (opaque string) for a specific version of a resource.
- Strong ETags ("abc123") signify byte-for-byte identical content.
- Weak ETags (W/"abc123") indicate semantically identical content, though bytes may differ (e.g., re-gzipped).
- Last-Modified:
- Timestamp of when the resource last changed.
- Less precise, but still valuable.
- Supports heuristic freshness when max-age/Expires are absent.
Conditional requests:
- If-None-Match (with ETag) → server replies 304 Not Modified if unchanged.
- If-Modified-Since (with Last-Modified) → similar, but based on date.
Both methods conserve bandwidth and reduce load, as only headers are exchanged.
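To make the exchange concrete, here is a simplified Python sketch of the origin side of a conditional request. It is a sketch only: real servers also handle ranges, multiple validators per header, and method-specific rules.

```python
from email.utils import parsedate_to_datetime

def respond_conditional(request_headers: dict, etag: str, last_modified: str) -> int:
    """Return 304 if the client's validators still match, else 200.
    Per the spec, If-None-Match takes precedence over If-Modified-Since."""
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        # Weak comparison: ignore any W/ prefix on either side.
        def weak(tag: str) -> str:
            return tag.strip().removeprefix("W/")
        client_tags = [weak(t) for t in inm.split(",")]
        return 304 if "*" in client_tags or weak(etag) in client_tags else 200
    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        # Unchanged since the client's timestamp → nothing to resend.
        if parsedate_to_datetime(ims) >= parsedate_to_datetime(last_modified):
            return 304
        return 200
    return 200  # no validators: send the full response
```

A 304 carries headers only, so the body transfer is skipped entirely.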
Side note on strong vs weak ETags
An ETag serves as an identifier for a specific version of a resource. A strong ETag ("abc123") signifies byte-for-byte identical content – if even a single bit changes (such as whitespace), the ETag must change. A weak ETag (W/"abc123") means "semantically the same" – the content may have trivial differences (e.g., compressed differently, reordered attributes) but is still valid for reuse.
Strong ETags offer greater precision but can lead to cache misses if your infrastructure (e.g., different servers behind a load balancer) generates slightly varied outputs. Weak ETags are more forgiving but less strict. Both function with conditional requests; the choice balances precision against practicality.
Side note on ETags vs Cache-Control headers
Cache-Control directives are processed before the ETag. If Cache-Control determines that a resource is stale, the cache then uses the ETag (or Last-Modified) to revalidate with the origin. Consider it this way:
- While fresh: The cache serves the copy immediately, with no validation.
- When stale: The cache sends If-None-Match: "etag-value".
If the origin replies 304 Not Modified, the cache can continue using the stored copy without re-downloading the entire resource. Without Cache-Control, the ETag might be used for heuristic freshness or unconditional revalidation – but this typically involves more frequent trips back to the origin. The two are designed to work in conjunction: Cache-Control defines the lifetime, and ETags handle the check-ups.
The Vary Header
The Vary header instructs caches which request headers should be incorporated into the cache key. It enables a single URL to have multiple valid cached variants. For example, if a server responds with Vary: Accept-Encoding, the cache will store one copy compressed with gzip and another compressed with brotli. Each encoding is treated as a distinct object, with the appropriate one selected based on the subsequent request.
This flexibility is powerful but easily misused. Setting Vary: * is effectively equivalent to stating "this response can never be safely reused for anyone else," rendering it uncacheable in shared caches. Similarly, Vary: Cookie is notorious for plummeting hit rates because every unique cookie value generates a separate cache entry.
The optimal approach is to keep Vary minimal and intentional. Only vary on headers that genuinely alter the response in a meaningful way. Anything else merely fragments your cache, diminishes efficiency, and introduces unnecessary complexity.
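The reuse rule that Vary imposes can be sketched in a few lines of Python. This is a simplification: real caches also treat header names case-insensitively and may normalize values.

```python
def variant_matches(vary: str, stored_request: dict, new_request: dict) -> bool:
    """A stored response may be reused only if every header named in
    Vary carries the same value on both the stored and the new request."""
    if vary.strip() == "*":
        return False  # Vary: * → never reusable for another request
    names = [h.strip() for h in vary.split(",") if h.strip()]
    return all(stored_request.get(n) == new_request.get(n) for n in names)
```

Every name added to Vary multiplies the possible variants, which is why a minimal Vary list keeps hit rates high.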
Observability Helpers
Modern caches don't just make silent decisions – they often append their own debugging headers to help you understand what occurred. The most significant of these is Cache-Status, a new standard that reports whether a response was a HIT or a MISS, how long it resided in cache, and sometimes even why it was revalidated. Many CDNs and proxies also employ the older X-Cache header for the same purpose, typically indicating a simple HIT or MISS flag. Cloudflare goes further with its cf-cache-status header, which differentiates between HIT, MISS, EXPIRED, BYPASS, and DYNAMIC (among other values).
These headers are invaluable for tuning and debugging, as they reveal the cache’s actual decision-making rather than just echoing your origin’s intent. A response might appear cacheable on paper, but if you consistently observe MISS or DYNAMIC, it likely means the intermediary is not following your headers as expected.
Freshness & Age Calculations
Once you grasp who caches content and which headers govern their behavior, the next step is to observe how these elements converge in practice. Every cache – be it a browser, a CDN, or a reverse proxy – adheres to the same fundamental logic:
- Determine how long the response should be considered fresh.
- Calculate the response's current age.
- Compare the two, and decide whether to serve, revalidate, or fetch anew.
This is the hidden mathematics driving every "cache hit" or "cache miss" you will ever encounter.
Freshness Lifetime
The freshness lifetime indicates how long a cache can serve a response without re-checking with the origin. To calculate this for a given request, caches prioritize the following HTTP response headers in a strict order of precedence:
- Cache-Control: max-age (or s-maxage) → overrides everything else.
- Expires → an absolute date, used only if max-age is absent.
- Heuristic freshness → if neither of these directives is present, caches guess.
Example 1: max-age
Date: Tue, 29 Aug 2025 12:00:00 GMT
Cache-Control: max-age=300
Here, the server explicitly instructs caches, "This response is valid for 300 seconds after the Date." This means the response can be considered fresh until 12:05:00 GMT. After that, it becomes stale unless revalidated.
Example 2: Expires
Date: Tue, 29 Aug 2025 12:00:00 GMT
Expires: Tue, 29 Aug 2025 12:10:00 GMT
There is no max-age, but Expires provides an absolute cutoff. Caches compare the Date (12:00:00) with the Expires time (12:10:00). This establishes a 10-minute freshness window: the response is fresh until 12:10:00, then stale.
Example 3: Heuristic
Date: Tue, 29 Aug 2025 12:00:00 GMT
Last-Modified: Mon, 28 Aug 2025 12:00:00 GMT
With no max-age or Expires, caches resort to heuristics. Browsers employ varying approaches; Chrome typically uses 10% of the time since the last modification. In this case, the resource was last modified 24 hours ago, so the response would be considered fresh for 2.4 hours (until approximately 14:24:00 GMT), after which revalidation is triggered.
Current Age
The current age is the cache’s estimate of how old the response is at this moment. The specification provides a formula, but we can simplify it into steps:
- Apparent age = now – Date (if positive, else zero).
- Corrected age = max(apparent age, Age header).
- Resident time = how long the response has been stored in this cache.
- Current age = corrected age + resident time.
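These steps translate directly into code. A minimal Python version, ignoring the response-delay correction in the full specification, with all inputs in seconds:

```python
def current_age(arrival_offset: float, date_value: float,
                age_header: float, resident_time: float) -> float:
    """Simplified current-age estimate.
    arrival_offset/date_value are seconds on a shared clock."""
    apparent_age = max(0.0, arrival_offset - date_value)  # guard against clock skew
    corrected_age = max(apparent_age, age_header)         # trust the larger signal
    return corrected_age + resident_time
```

With the numbers from the examples that follow: a response arriving 5 seconds after its Date, with no Age header, held for 15 seconds, is 20 seconds old; one arriving 40 seconds after its Date with Age: 30, held for 20 seconds, is exactly 60 seconds old.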
Example 4: Simple case
Date: Tue, 29 Aug 2025 12:00:00 GMT
Cache-Control: max-age=60
The response was generated at 12:00:00 and reached the cache at 12:00:05, so it already appeared to be 5 seconds old upon arrival. With no Age header present, the cache then retained it for an additional 15 seconds, making the total current age 20 seconds. Since the response had a max-age of 60 seconds, it was still considered fresh.
Example 5: With Age header
Date: Tue, 29 Aug 2025 12:00:00 GMT
Age: 30
Cache-Control: max-age=60
The origin sends a response stamped with Date: 12:00:00 and also includes Age: 30, indicating that an upstream cache had already held it for 30 seconds. When a downstream cache receives it at 12:00:40, it appears 40 seconds old. The cache takes the higher of the two values (40 vs 30) and then adds the 20 seconds it resides locally until 12:01:00. This results in a total current age of 60 seconds – precisely matching the max-age=60 limit. At this point, the response is no longer fresh and must be revalidated.
Decision Tree
Once a cache has both numbers:
- If current age < freshness lifetime → Serve immediately (fresh hit).
- If current age ≥ freshness lifetime:
  - If within the stale-while-revalidate window → Serve stale content now, revalidate it in the background.
  - If stale-if-error applies and the origin is failing → Serve stale content.
  - Else → Revalidate with origin (conditional GET/HEAD).
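The decision tree can be sketched as a single Python function; the return strings are illustrative labels, not protocol values:

```python
def cache_decision(age: float, lifetime: float, swr: float = 0.0,
                   sie: float = 0.0, origin_up: bool = True) -> str:
    """Serve / serve-stale / revalidate decision for one cached response."""
    if age < lifetime:
        return "serve fresh"
    if age < lifetime + swr:
        # Stale, but inside the stale-while-revalidate allowance.
        return "serve stale, revalidate in background"
    if not origin_up and age < lifetime + sie:
        # Origin is failing: stale-if-error lets us keep answering.
        return "serve stale (origin failing)"
    return "revalidate with origin"
```

The two examples below correspond to the middle branches: age 70 with max-age=60, stale-while-revalidate=30 serves stale while refreshing; age 120 with stale-if-error=600 and a failing origin serves stale outright.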
Example 6: stale-while-revalidate
Cache-Control: max-age=60, stale-while-revalidate=30
A response has Cache-Control: max-age=60, stale-while-revalidate=30. At 12:01:10, the cache’s copy is 70 seconds old – 10 seconds beyond its freshness window. Ordinarily, this would necessitate revalidation before serving. However, stale-while-revalidate permits the cache to instantly serve the stale copy while refreshing it in the background. Because the copy is only 10 seconds into its 30-second stale allowance, the cache can safely serve it while updating in parallel.
Example 7: stale-if-error
Cache-Control: max-age=60, stale-if-error=600
Another response has Cache-Control: max-age=60, stale-if-error=600. At 12:02:00, the copy is 120 seconds old – well past its 60-second freshness lifetime. The cache attempts to fetch a fresh version, but the origin returns a 500 error. Thanks to stale-if-error, the cache is allowed to fall back to its stale copy for up to 600 seconds while the origin remains unavailable, ensuring the user still receives a response.
Why this Matters
Understanding this underlying math clarifies many instances of "unusual" behavior:
- A resource expiring "too soon" might be due to a short max-age or a non-zero Age header.
- A response that appears stale but is served anyway may be covered by stale-while-revalidate or stale-if-error.
- A 304 Not Modified response doesn’t signify caching failure – it indicates that the cache correctly revalidated and conserved bandwidth.
Caches are not mysterious black boxes. They are simply executing these calculations thousands of times per second, across millions of resources. Once you comprehend the math, their behavior becomes predictable – and controllable. However, in practice, developers often stumble over subtle defaults and misleading directive names. Let’s address those misconceptions directly.
Common Misconceptions & Gotchas
Even experienced developers frequently misconfigure caching. The directives are subtle, the defaults are quirky, and the interactions are easily misunderstood. Here are some of the most common pitfalls.
no-cache ≠ “don’t cache”
The name is misleading. no-cache actually means "store this, but revalidate before reusing it." Browsers and CDNs will readily retain a copy, but they will always check back with the origin before serving it. If you truly wish nothing to be stored, you need no-store.
no-store means nothing is kept
no-store is the nuclear option. It instructs every cache – browser, proxy, CDN – not to retain a copy at all. Every single request goes directly to the origin. While appropriate for highly sensitive data (e.g., banking information), it is overkill for most use cases. Many sites employ it reflexively, forfeiting significant performance gains.
max-age=0 vs must-revalidate
They appear similar but are not identical. max-age=0 means "this response is immediately stale." Without must-revalidate, caches are technically permitted to reuse it briefly under certain conditions (e.g., if the origin is temporarily unavailable). Adding must-revalidate removes that leeway, compelling caches to always verify with the origin once freshness has expired.
s-maxage vs max-age
max-age applies universally – to both browsers and shared caches. s-maxage, however, applies only to shared caches like CDNs or proxies, and it overrides max-age in those contexts. This allows you to set a short freshness window for browsers (e.g., max-age=60) but a longer one at the CDN (s-maxage=600). Many developers are unaware that s-maxage even exists.
immutable misuse
immutable instructs browsers "this resource will never change, don’t bother revalidating it." This is excellent for fingerprinted assets (like app.9f2d1.js) that are versioned by filename. However, it is dangerous for HTML or any resource that might change under the same URL. Using it incorrectly can lock users into stale content for months.
Redirect and Error Caching
Caches can and do store redirects and even error responses. A 301 redirect is cacheable by default (often permanently). Even a 404 or 500 error may be cached briefly, depending on headers and CDN settings. Developers are frequently surprised when "temporary" outages persist because an error response was cached.
Clock Skew and Heuristic Surprises
Caches compare Date, Expires, and Age headers to determine freshness. If clocks are not perfectly synchronized, or if explicit headers are absent, caches revert to heuristics. This can lead to unexpected expiry behavior. Explicit freshness directives are always a safer choice.
Cache Fragmentation: Devices & Geography
Caching is straightforward when one URL maps to one response. It becomes complex when responses vary by device or region.
- Device splits: Sites often deliver different HTML or JavaScript for desktop versus mobile. If keyed on User-Agent, every browser/version combination creates a separate cache entry, resulting in a collapse of cache hit rates. Safer options include normalizing User-Agent strings at the CDN, or utilizing Client Hints (Sec-CH-UA, DPR) with carefully controlled Vary headers.
- Geo splits: Serving different content by region (e.g., India vs. Germany) often involves Accept-Language or GeoIP rules. However, every language combination (en, en-US, en-GB) generates a new cache key. Unless you normalize by region/ruleset, your cache will fragment into thousands of variants.
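Normalizing User-Agent into a handful of buckets is the usual fix for device splits. A deliberately crude Python sketch follows; the markers and bucket names are illustrative, and production CDNs use far more robust detection:

```python
def device_bucket(user_agent: str) -> str:
    """Collapse thousands of User-Agent strings into two cache variants."""
    ua = user_agent.lower()
    mobile_markers = ("mobile", "iphone", "android", "ipad")  # illustrative
    return "mobile" if any(m in ua for m in mobile_markers) else "desktop"
```

Keying the cache on this bucket instead of the raw User-Agent turns an unbounded variant space into exactly two entries per URL.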
The trade-off is clear: increased personalization generally leads to decreased caching efficiency. Once these traps are understood, we can transition from theory to practice. Here are the caching "recipes" you can use for different content types.
Patterns & Recipes
Now that we have covered the mechanics and common pitfalls, let’s examine how to implement caching effectively. These are the patterns you will frequently utilize, adapted for various content types.
Static Assets (JS, CSS, fonts)
Goal: Serve instantly, never revalidate, safe to cache for a very long time.
Typical headers:
Cache-Control: public, max-age=31536000, immutable
Why:
- Fingerprinted filenames (e.g., app.9f2d1.js) guarantee uniqueness, allowing old versions to remain cached indefinitely.
- A long max-age ensures they practically never expire.
- immutable prevents browsers from wasting time revalidating.
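Fingerprinting is normally handled by your bundler, but the idea is simple: derive the filename from a hash of the file's contents, so any change produces a new URL. A hand-rolled sketch:

```python
import hashlib

def fingerprint(name, content, digest_len=5):
    """app.js + contents -> app.<hash>.js; new contents -> new name."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    stem, ext = name.rsplit(".", 1)
    return f"{stem}.{digest}.{ext}"

v1 = fingerprint("app.js", b"console.log('v1')")
v2 = fingerprint("app.js", b"console.log('v2')")
print(v1, v2)  # different names, so old cached copies can live forever
```

Because the URL itself changes on every deploy, there is never a reason to expire the old entry: clients simply stop requesting it.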
HTML Documents
The appropriate TTL depends on how frequently your HTML changes and how quickly those changes must appear. Use one of these profiles, and pair long edge TTLs with event-driven purging upon publication or update.
Profile A: High-change (news/homepages):
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=60, stale-if-error=600
ETag: "abc123"
Rationale: Keep browsers very fresh (1 minute), allow the CDN to cushion load for 5 minutes, serve content briefly stale while revalidating for a snappy user experience, and withstand origin wobbles.
Profile B: Low-change (blogs/docs):
Cache-Control: public, max-age=300, s-maxage=86400, stale-while-revalidate=300, stale-if-error=3600
ETag: "abc123"
Rationale: Browsers can reuse content for a few minutes; the CDN can hold it for a day to significantly reduce origin traffic. Upon publication or edit, purge the specific page (and related pages) to implement changes instantaneously.
Logged-in / personalized pages:
Cache-Control: private, no-cache
ETag: "abc123"
Rationale: Allow browser storage but enforce revalidation every time; never share at the CDN.
Side note on why long HTML TTLs are safe with event-driven purge
You can set very long CDN cache expiration times (hours, even days) for HTML, provided you actively bust the cache on important events: publish, update, unpublish. Utilize CDN features like Cache Tags / surrogate keys to purge collections (e.g., “post-123”, “author-jono”), and trigger purges from your CMS. This provides the best of both worlds: instant updates when necessary, and rock-solid performance otherwise.
- If updates must appear within seconds with no manual purge → keep short CDN TTLs (≤5m) + stale-while-revalidate.
- If updates are event-driven (publish/edit) → use long CDN TTLs (hours/days) + automatic purge by tag.
- If content is personalized → don’t share (use private, no-cache + validators).
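The three branches above can be captured in a small helper. The profiles and TTL values mirror the recipes in this section, but are of course starting points to adjust, not fixed rules:

```python
def html_cache_control(personalized, event_driven_purge):
    """Pick a Cache-Control policy for an HTML document."""
    if personalized:
        # never share; the browser may keep a copy but must revalidate
        return "private, no-cache"
    if event_driven_purge:
        # long edge TTL is safe because publish/update purges by tag
        return "public, max-age=300, s-maxage=86400, stale-while-revalidate=300"
    # no purge hook: keep the edge short so changes appear within minutes
    return "public, max-age=60, s-maxage=300, stale-while-revalidate=60"

print(html_cache_control(personalized=True, event_driven_purge=False))
# private, no-cache
```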
APIs
Goal: Balance freshness with performance and resilience.
Typical headers:
Cache-Control: public, s-maxage=30, stale-while-revalidate=30, stale-if-error=300
ETag: "def456"
Why:
- Shared caches (CDNs) can serve results for 30 seconds, reducing load.
- stale-while-revalidate maintains low latency even as responses are refreshed.
- stale-if-error ensures reliability if the backend fails.
- Clients can efficiently revalidate with ETags.
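On the origin side, ETag revalidation is just a comparison against the If-None-Match request header. A framework-agnostic sketch (the handler shape here is illustrative, not a specific framework's API):

```python
import hashlib

def respond(body, if_none_match):
    """Return (status, headers, body), honoring ETag revalidation."""
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {
        "ETag": etag,
        "Cache-Control": "public, s-maxage=30, stale-while-revalidate=30",
    }
    if if_none_match == etag:
        return 304, headers, b""  # client's copy is still good: no body sent
    return 200, headers, body

status, headers, _ = respond(b'{"items": []}', None)
print(status)                                      # 200
status, _, _ = respond(b'{"items": []}', headers["ETag"])
print(status)                                      # 304
```

The 304 path is what makes no-cache cheap in practice: the client re-checks every time, but only pays for a full download when the content actually changed.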
Side note on why APIs use short s-maxage + stale-while-revalidate
APIs often provide data that changes frequently, but not every single second. A short s-maxage (e.g., 30s) allows shared caches like CDNs to absorb most requests, while still ensuring data remains reasonably fresh. Adding stale-while-revalidate smooths the edges: even if the cache needs to fetch a new copy, it can instantly serve the slightly stale one while revalidating in the background. This keeps latency low for users. The combination offers a sweet spot: low origin load, fast responses, and data that is "fresh enough" for most real-world use cases.
Authenticated Dashboards & User-specific Pages
Goal: Prevent shared caching, but allow browser reuse.
Typical headers:
Cache-Control: private, no-cache
ETag: "ghi789"
Why:
- private ensures only the end-user’s browser caches the response.
- no-cache permits reuse, but mandates validation first.
- ETags prevent full downloads on every request.
Side note on the omission of max-age
For user-specific content, you cannot risk serving stale data. This is why the recipe employs private, no-cache but omits max-age.
no-cache signifies that the browser may retain a local copy, but must revalidate it with the origin before reusing it.
If you were to add max-age, you would be instructing the browser that it is safe to serve without checking – which could expose users to outdated account information or shopping cart contents.
Pairing no-cache with an ETag provides the best of both worlds: safety (always validated) and efficiency (cost-effective 304 Not Modified responses instead of re-downloading everything).
Side note on security
When handling or presenting sensitive data, you may wish to use private, no-store instead, so the browser never keeps a local copy at all. This reduces the likelihood of leaks on shared devices, for example.
Images & Media
Goal: Cache efficiently across devices, while serving the correct variant.
Typical headers:
Cache-Control: public, max-age=86400
Vary: Accept-Encoding, DPR, Width
Why:
- A one-day freshness window balances speed with flexibility – images can change, but not as frequently as HTML.
- Vary enables different versions to be cached for diverse devices or display densities.
- CDNs can normalize query parameters (e.g., ignore utm_*) and intelligently collapse variants to prevent fragmentation.
Side note on client hints
Modern browsers send Client Hints like DPR (device pixel ratio) and Width (intended display width) when requesting images. If your server or CDN supports responsive images, it can generate and return different variants – e.g., a high-resolution version for a retina phone, a smaller one for a low-resolution laptop.
By including Vary: DPR, Width, you are instructing caches: "Store separate copies depending on these hints." This ensures the correct variant is reused for future requests with identical device characteristics.
The catch? Every new DPR or Width value creates a new cache key. If you do not normalize (e.g., bucket widths into sensible breakpoints), your cache can fragment into hundreds of variants. CDNs often provide built-in rules to manage this.
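A sketch of bucketing client hints before they become part of the cache key; the breakpoints and DPR tiers here are illustrative:

```python
BREAKPOINTS = (320, 640, 960, 1280, 1920)  # illustrative width buckets

def bucket_width(width):
    """Snap a Width hint up to the nearest breakpoint."""
    for bp in BREAKPOINTS:
        if width <= bp:
            return bp
    return BREAKPOINTS[-1]

def bucket_dpr(dpr):
    """Collapse fractional DPRs (1.25, 2.6, ...) into 1x/2x/3x tiers."""
    return min(3, max(1, round(dpr)))

# 347px @ 2.6x and 360px @ 3.1x now share one cache entry: (640, 3)
print(bucket_width(347), bucket_dpr(2.6))  # 640 3
print(bucket_width(360), bucket_dpr(3.1))  # 640 3
```

Five width buckets times three DPR tiers caps each image at fifteen variants, instead of one per unique device.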
Beyond Headers: Browser Behaviors
HTTP headers establish the rules, but browsers possess their own layers of optimization that can mimic "caching" – or interfere with it. These behaviors do not adhere to the same rules as Cache-Control or ETag, and they frequently confuse developers during debugging.
Back/Forward Cache (BFCache)
What it is: A full-page snapshot (DOM, JavaScript state, scroll position) retained in memory when a user navigates away.
Why it matters: Navigating "back" or "forward" feels instantaneous because the browser restores the page without even touching HTTP caches.
Gotchas: Many pages are not BFCache-eligible. Common blockers include unload handlers, long-lived connections, or the use of certain browser APIs. Another subtle but important factor is Cache-Control: no-store on the document itself – this instructs the browser not to retain any copy, which extends to BFCache. Chrome has recently introduced a small set of exceptions where no-store pages can still enter BFCache in safe cases, but for the most part, if BFCache eligibility is desired, you should avoid no-store on documents.
Side note on BFCache vs HTTP Cache
BFCache is akin to pausing and resuming a tab – the entire page state is frozen and restored. HTTP caching, conversely, only stores network resources. A page might fail BFCache but still load quickly thanks to HTTP cache hits (or vice versa).
Hard Refresh vs Soft Reload
- Soft reload (e.g., pressing the browser’s reload button): The browser will use cached responses if they are still fresh. If stale, it revalidates.
- Hard refresh (e.g., opening DevTools and right-clicking the reload button for a fuller reload, or enabling the “disable cache” option): The browser bypasses the cache, re-fetching all resources from the origin.
Gotcha: Users may assume that "refresh" always retrieves new content – but unless it is a hard refresh, caches still apply.
Speculation Rules & Link Relations
Browsers offer developers tools to (pre)load resources before the user explicitly requests them. These mechanisms do not alter how caching works, but they can influence what ends up in the cache ahead of time.
- Prefetch: The browser may speculatively fetch resources and place them in cache, but only for a short duration. If not used quickly, they will be evicted.
- Preload: Resources are fetched early and inserted into the cache so they are ready by the time the parser requires them.
- Prerender: The entire page and its subresources are loaded and cached in advance. When a user navigates, everything loads directly from the cache rather than the network.
Speculation rules API: Eviction, freshness, and validation generally adhere to normal caching rules – but prerendering introduces some exceptions. For example, Chrome may prerender a page even if it is marked with Cache-Control: no-store or no-cache. In such cases, the prerendered copy resides in a temporary store that is not part of the standard HTTP cache and is discarded once the prerender session concludes (though this behavior may vary by browser).
The key takeaway: speculation rules pertain to cache timing, not cache policy. They front-load work to warm the cache, but freshness and expiry remain governed by your headers.
Signed Exchanges (SXG)
Signed exchanges also do not alter cache mechanics, but they change who can serve cached content while preserving origin authenticity. An SXG is a package containing a response and a cryptographic signature from the origin. Intermediaries (such as Google Search) can store and serve this package from their own caches. When the browser receives it, it can trust the content as if it originated from your domain, while still applying your headers for freshness and validation.
The catch: SXGs have their own signature expiry in addition to your normal caching headers. Even if your Cache-Control permits reuse, the SXG may be discarded once its signature is outdated. SXGs also support varying by cookie, meaning they can package and serve different signed variants based on cookie values. While this enables personalized experiences to be cached and distributed via SXG, it heavily fragments the cache – every cookie combination creates a new variant.
Key takeaway: SXG introduces an additional clock (signature lifetime) and, if cookie variation is used, another source of cache fragmentation. Your headers still govern freshness, but these extra layers can shorten reuse windows and multiply cache entries.
CDNs in Practice: Cloudflare
Thus far, we have concentrated on how browsers handle caching and the directives that control freshness and validation. However, for most modern websites, the first and most important cache your traffic will encounter is not the browser – it is the CDN.
Cloudflare is one of the most widely used CDNs, fronting millions of sites. It serves as an excellent example of how shared caches do not simply passively adhere to your headers. They introduce defaults, overrides, and proprietary features that can fundamentally alter how caching operates in practice. Understanding these peculiarities is essential for aligning your origin headers with your CDN’s behavior.
Defaults and HTML Caching
By default, Cloudflare does not cache HTML at all. Static assets like CSS, JavaScript, and images are happily stored at the edge, but documents are always passed through to the origin unless you explicitly enable “Cache Everything.” This default often catches site owners by surprise: they assume Cloudflare is protecting their servers, when in reality their most expensive requests – the HTML pages themselves – are still hitting the backend every time.
The temptation then arises to activate “Cache Everything.” However, this blunt instrument applies indiscriminately, even to pages that vary by cookie or authentication state. In such scenarios, Cloudflare can end up serving cached private dashboards or logged-in user data to the wrong individuals.
The safer approach is more nuanced: bypass the cache when a session cookie is present, but cache aggressively when the user is anonymous. This strategy ensures that public pages benefit from edge caching, while private content is always fetched fresh from the origin.
Side note on Cloudflare’s APO addon
Cloudflare’s Automatic Platform Optimization (APO) addon integrates with WordPress websites and rewrites caching behavior so HTML can be cached safely while respecting logged-in cookies. It’s a good example of CDNs layering platform-specific heuristics on top of standard HTTP logic.
Edge vs Browser Lifetimes
Your origin headers – such as Cache-Control and Expires – define how long a browser should retain a resource. However, CDNs like Cloudflare introduce another layer of control with their own settings, such as “Edge Cache TTL” and s-maxage. These apply only to what Cloudflare stores at its edge servers, and they can override what the origin specifies without altering browser behavior.
This separation is both powerful and confusing. From the browser’s perspective, you might observe max-age=60 and assume the content is cached for just a minute. Meanwhile, Cloudflare could continue serving the same cached copy for ten minutes because its edge cache TTL is set to 600 seconds. The result is a split reality: browsers refresh often, but Cloudflare still shields the origin from repetitive requests.
Cache Keys and Fragmentation
Cloudflare uses the full URL as its cache key. This means every distinct query parameter – whether it is a tracking token like ?utm_source=… or something trivial like ?v=123 – generates a separate cache entry. Left unchecked, this behavior quickly fragments your cache into hundreds of nearly identical variants, each consuming space while simultaneously reducing the hit rate.
It is important to note that canonical URLs are not helpful here. Cloudflare does not regard what your HTML declares as the “true” version of a page; it caches based on the literal request URL it receives. To prevent fragmentation, you need to explicitly normalize or disregard unnecessary parameters in Cloudflare’s configuration, ensuring that trivial differences do not splinter your cache.
Side note on normalising cache keys
Cloudflare allows you to define which query parameters to ignore or how to collapse variants. Stripping out analytics parameters, for example, can dramatically improve cache hit ratios.
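Cloudflare exposes this via cache rules, but the underlying transformation is easy to sketch: drop tracking parameters and sort the remainder so equivalent URLs collapse into one key. (The tracking-parameter list is illustrative.)

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

TRACKING_PREFIXES = ("utm_", "fbclid", "gclid")  # illustrative

def normalize_cache_key(url):
    """Strip tracking params and sort the remainder deterministically."""
    parts = urlsplit(url)
    kept = sorted(
        (k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if not k.startswith(TRACKING_PREFIXES)
    )
    return urlunsplit(parts._replace(query=urlencode(kept)))

a = normalize_cache_key("https://example.com/post?utm_source=x&b=2&a=1")
b = normalize_cache_key("https://example.com/post?a=1&b=2&utm_campaign=y")
print(a == b)  # True: both collapse to .../post?a=1&b=2
```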
Device and Geography Splits
Cloudflare also enables you to customize cache keys by including request headers, such as User-Agent or geo-based values. In theory, this facilitates fine-grained caching – one version of a page for mobile devices, another for desktop, or distinct versions for visitors in different countries.
However, in practice, unless you aggressively normalize these inputs, it can lead to massive fragmentation. Caching by raw User-Agent means every browser and version string generates its own entry, instead of collapsing them into a simple “mobile vs desktop” split. The same issue arises with geographic rules: caching by full Accept-Language headers, for example, can create thousands of variants when only a handful of languages are truly necessary.
When managed carefully, device and geography splits allow you to serve tailored content from the cache. When managed carelessly, they decimate your hit rate and multiply origin load.
Cache Tags
Cloudflare also supports tagging cached objects with labels – for example, tagging every page of a blog post with blog-post-123. These tags enable you to purge or revalidate entire groups of resources at once, rather than expiring them individually.
For CMS-driven sites, this is a powerful tool: when an article is updated, the site can trigger a purge for its tag and instantly invalidate every related URL. However, over-tagging – attaching too many labels to too many resources – is common and can undermine efficiency, making purge operations slower or less predictable.
Other Caching Layers in the Stack
So far, we have concentrated on browser caches, HTTP directives, and CDNs like Cloudflare. However, many sites incorporate even more layers between the user and the origin. Reverse proxies, application caches, and database caches all contribute to what a "cached" response actually means.
These layers do not always communicate via HTTP – Redis does not concern itself with Cache-Control, and Varnish can readily override your origin headers. Yet, they profoundly shape the user experience, infrastructure load, and the challenges of cache invalidation. To comprehend caching in a real-world context, you must understand how these components stack and interact.
Application & Database Caches
Within the application tier, technologies like Redis and Memcached are frequently employed to store session data, fragments of rendered pages, or precomputed query results. An e-commerce site, for instance, might cache its "Top 10 Products" list in Redis for sixty seconds, saving hundreds of database queries each time a page loads. This is remarkably efficient – until it isn't.
One common failure mode occurs when the database updates, but the Redis key is not cleared at the opportune moment. In such a scenario, the HTTP layer happily serves "fresh" pages that are already outdated because they are pulling from stale Redis data underneath.
The inverse problem happens just as often. Imagine the application has correctly refreshed Redis with a new product price, but the CDN or reverse proxy still holds an HTML page cached with the old price. The origin had informed the outer cache that the page was valid for five minutes, so until the TTL expires (or someone manually purges it), users continue to see stale HTML – even though Redis already contains the update.
In essence: sometimes HTTP caches appear fresh while Redis is stale, and sometimes Redis is fresh while HTTP caches are stale. Both failure modes stem from the same core issue – multiple caching layers, each with its own logic, falling out of sync.
Reverse Proxy Caches
One layer closer to the edge, reverse proxies such as Varnish or NGINX frequently sit in front of application servers, caching entire responses. In principle, they respect HTTP headers, but in practice, they are typically configured to enforce their own rules. A Varnish configuration might, for example, impose a five-minute lifetime on all HTML pages, regardless of what the origin headers dictate. This is excellent for resilience during a traffic spike but dangerous if the content is time-sensitive. Developers frequently encounter this mismatch: they open DevTools, inspect the origin’s headers, and assume they understand what is occurring – not realizing that Varnish is rewriting the rules one hop earlier.
Service Workers
Service Workers introduce another cache layer within the browser, situated between the network and the page. Unlike the built-in HTTP cache, which merely follows headers, the Service Worker Cache API is programmable. This means developers can intercept requests and, using JavaScript, decide whether to serve from cache, fetch from the network, or perform an entirely different action.
This capability is powerful: a Service Worker can precache assets during installation, create custom caching strategies (stale-while-revalidate, network-first, cache-first), or even rewrite responses before delivering them back to the page. It forms the foundation of Progressive Web Apps (PWAs) and offline support.
However, it comes with pitfalls. Because Service Workers can disregard origin headers and devise their own logic, they can drift out of sync with the HTTP caching layer. For example, you might set Cache-Control: max-age=60 on an API, but a Service Worker programmed to “cache forever” will happily serve stale results long after they should have expired. Debugging also becomes more intricate: responses might appear cacheable in DevTools but actually be served from a Service Worker’s script.
The key takeaway: Service Workers do not replace HTTP caching – they build upon it. They provide developers with fine-grained control but also add another layer where issues can arise if caching strategies conflict.
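Service Worker strategies are written in JavaScript, but the decision logic of stale-while-revalidate is language-independent. A Python sketch of the core idea, with the network fetch stubbed out as a plain callable:

```python
import time

class SWRCache:
    """Serve cached values instantly; refresh stale ones on access.

    In a real Service Worker the refresh would be an async fetch()
    running in the background; here it is synchronous to keep the
    sketch self-contained.
    """
    def __init__(self, fetch, max_age=60):
        self.fetch, self.max_age = fetch, max_age
        self.store = {}  # url -> (value, stored_at)

    def get(self, url, now=None):
        now = time.time() if now is None else now
        if url in self.store:
            value, stored_at = self.store[url]
            if now - stored_at >= self.max_age:
                # stale: trigger a refresh, but answer with the old copy
                self.store[url] = (self.fetch(url), now)
            return value
        self.store[url] = (self.fetch(url), now)
        return self.store[url][0]

cache = SWRCache(fetch=lambda url: f"body@{url}", max_age=60)
cache.get("/api", now=0)    # miss: fetched fresh
cache.get("/api", now=30)   # fresh hit, no fetch
cache.get("/api", now=100)  # stale: served old copy, refreshed behind it
```

Notice that the HTTP headers never appear in this logic: that is exactly the drift risk described above, since the worker's max_age can silently disagree with Cache-Control.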
Layer Interactions
The true complexity emerges when all these layers interact. A single request might traverse the browser cache, then Cloudflare, then Varnish, and finally Redis. Each layer possesses its own rules regarding freshness and invalidation, and they do not always align perfectly. You might purge the CDN and believe you have resolved an issue, only for the reverse proxy to continue serving its stale copy. Or you might flush Redis and repopulate the data, only to discover the CDN is still delivering the “old” version it cached earlier. These types of mismatches are the fundamental cause of many mysterious “cache bugs” that manifest in production.
Debugging & Verification
With numerous caching layers in play – browsers, CDNs, reverse proxies, application stores – the most challenging aspect of working with caching is often determining which cache served a response and why. Debugging caching is not about examining a single header; it involves tracing requests through the stack and verifying the behavior of each layer.
Inspecting Headers
The initial step is to meticulously examine the headers. Standard fields like Cache-Control, Age, ETag, Last-Modified, and Expires convey the origin’s intent. However, they do not reveal what the caches actually did. For that, you need the debugging signals added along the path:
- Age shows how long a response has resided in a shared cache. If it’s 0, the response likely came directly from the origin. If it’s 300, you know a cache has been serving the same object for five minutes.
- X-Cache (used by many proxies) or cf-cache-status (Cloudflare) indicate whether a cache hit or miss occurred.
- Cache-Status is the emerging standard, adopted by CDNs like Fastly, which reports not only HIT/MISS but also why a decision was made.
Collectively, these headers form the breadcrumb trail that reveals the response’s journey.
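A small helper that reads these breadcrumb headers off a response and summarizes which layer likely answered. Header semantics vary by provider, so treat this as a heuristic sketch rather than a universal decoder:

```python
def cache_verdict(headers):
    """Summarize where a response likely came from, given its headers."""
    h = {k.lower(): v for k, v in headers.items()}
    status = (h.get("cf-cache-status") or h.get("x-cache")
              or h.get("cache-status") or "").upper()
    age = int(h.get("age", 0))
    if "HIT" in status:
        return f"served by a shared cache (object is {age}s old)"
    if "MISS" in status or "BYPASS" in status:
        return "went to the origin (cache miss or bypass)"
    if age > 0:
        return f"some shared cache held it for {age}s"
    return "no cache signals: likely straight from the origin"

print(cache_verdict({"cf-cache-status": "HIT", "Age": "300"}))
# served by a shared cache (object is 300s old)
```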
Using Browser DevTools
The Network panel in Chrome or Firefox’s DevTools is indispensable for observing cache behavior from the user’s perspective. It shows whether a resource originated from disk cache, memory cache, or over the network.
- Memory cache hits are nearly instantaneous but short-lived, surviving only within the current tab/session.
- Disk cache hits persist across sessions but may be evicted.
- 304 Not Modified responses indicate that the browser successfully revalidated the cached copy with the origin.
It is also beneficial to test with different reload types. A normal reload (Ctrl+R) may utilize cached entries, while a hard reload (Ctrl+Shift+R) completely bypasses them. Understanding which type of reload you are performing helps avoid false assumptions about the cache’s actions.
CDN Logs and Headers
If you are using a CDN, its logs and headers are often the most reliable source of truth. Cloudflare’s cf-cache-status, Akamai’s X-Cache, and Fastly’s Cache-Status headers all disclose edge decisions. Most providers also offer logs or dashboards where you can observe hit/miss ratios and TTL behavior at scale.
For example, if you consistently see cf-cache-status: MISS or BYPASS on every request, it typically means Cloudflare is not storing your HTML at all – either because it is following defaults (no HTML caching), or because a cookie is bypassing the cache. Debugging at the edge often boils down to correlating what your origin sent, what the CDN reports it did, and what the browser ultimately received.
Reverse Proxies and Custom Headers
Reverse proxies like Varnish or NGINX can be more opaque. Many deployments add custom headers such as X-Cache: HIT or X-Cache: MISS to reveal proxy behavior. If these are unavailable, logs are your fallback: Varnish’s varnishlog and NGINX’s access logs can both indicate whether a request was served from cache or passed through.
The tricky aspect is remembering that reverse proxies can silently override headers. If you observe Cache-Control: no-cache from the origin but a five-minute TTL in Varnish, the headers in DevTools will not tell the complete story. You require the proxy’s own debugging signals for verification.
Following the Request Path
When in doubt, systematically trace the request chain:
- Browser → Check DevTools: was it memory, disk, or network?
- CDN → Inspect cf-cache-status, Cache-Status, or X-Cache.
- Proxy → Look for custom headers or logs to confirm whether the request hit the local cache.
- Application → Determine if Redis/Memcached served the data.
- Database → If all else fails, confirm the query executed.
Walking through layer by layer helps isolate where the stale copy resides. It is rarely the case that “the cache is broken.” More often, one cache is misaligned while the others are functioning perfectly.
Common Debugging Mistakes
Developers repeatedly fall into a few traps:
- Only looking at browser headers: These indicate what the origin intended, not what the CDN actually did.
- Assuming 304 Not Modified means no caching: In fact, it signifies that the cache did store the response and successfully revalidated it.
- Forgetting about cookies: A stray cookie can cause a CDN to bypass caching entirely.
- Testing with hard reloads: A hard reload bypasses the cache, so it does not reflect normal user experience. The same applies if you enable the “Disable cache” checkbox in DevTools – that setting forces every request to skip caching entirely while DevTools is open. Both are useful for troubleshooting but provide an artificial view of performance that real users will never encounter.
- Ignoring multi-layer conflicts: Purging the CDN but forgetting to clear Varnish, or clearing Redis but leaving a stale copy at the edge.
Effective debugging is less about clever tricks and more about being systematic: check each layer, verify its decision, and compare it against your expectations from the headers.
Caching in the AI-mediated Web
Until now, we have treated caching as a dialogue between websites, browsers, and CDNs. However, increasingly, the consumers of your site are not human users at all – they are search engine crawlers, LLM training pipelines, and agentic assistants. These systems heavily rely on caching, and your headers can shape not only performance but also how your brand and content are represented in machine-mediated contexts.
Crawl & Scrape Efficiency
Search engines and scrapers depend on HTTP caching to avoid re-downloading the entire web daily. Misconfigured caching can cause crawlers to unnecessarily burden your origin, or worse, lead them to abandon deeper pages if revalidation proves too costly. Well-tuned headers maintain efficient crawling and ensure that fresh updates are rapidly discovered.
Training Data Freshness
LLMs and recommendation systems ingest web content at scale. If your resources are consistently marked no-store or no-cache, they may be re-fetched inconsistently, resulting in patchy or outdated snapshots of your site within training corpora. Conversely, stable cache policies help ensure that the data incorporated into these models is consistent and representative.
Agentic Consumption
In an AI-mediated web, agents may act on behalf of users – shopping bots, research assistants, travel planners. For these agents, speed and reliability are paramount signals. A site with poor caching may appear slower or less consistent than its competitors, potentially biasing agents against recommending it. In this sense, caching is not merely about performance for humans – it is about competitiveness in machine-driven decision-making.
Fragmentation Risks
If caches serve inconsistent or fragmented variants – split by query strings, cookies, or geography – that noise propagates into machine understanding. A crawler or model might encounter dozens of subtly different versions of the same page. The outcome is not just poor cache efficiency; it is a fractured representation of your brand in training data and agent outputs.
Wrapping up: Caching as Strategy
Caching is often treated as a technical detail, an afterthought, or a temporary fix for performance issues. However, the truth is more profound: caching is infrastructure. It is the nervous system that keeps the web responsive under load, shields brittle origins, and shapes how both humans and machines experience your brand.
When poorly configured, caching makes sites slower, more fragile, and more expensive. It fragments the user experience, confuses crawlers, and contaminates the data well for AI systems already struggling to comprehend the web. When configured effectively, it is invisible – things simply feel fast, resilient, and trustworthy.
That is why caching cannot be left to chance or to defaults. It demands a deliberate strategy, as fundamental to digital performance as security or accessibility. A strategy that spans multiple layers – browser, CDN, proxy, application, database. A strategy that understands not just how to shave milliseconds for a single user, but how to present a coherent, consistent version of your site to millions of users, crawlers, and agents simultaneously.
The web is not becoming simpler. It is becoming faster, more distributed, more automated, and more machine-mediated. In this evolving world, caching is not a relic of old performance playbooks. It is the bedrock upon which your site will scale, how it will be perceived, and how it will compete.
Caching is not an optimization. It’s a strategy.