Exploring the usage of prefetch headers

Lion Ralfs — Posted on

On the web, prefetching describes the practice of downloading something before the user actually needs it, so that it can be displayed faster once it is needed. This can be anything from images, scripts or stylesheets to HTML documents themselves. Especially for documents, it might be wise to somehow indicate that a certain request is not a "real" user request, but instead a prefetch request. This can in turn instruct advertising or analytics software to filter out those requests, which are otherwise indistinguishable from regular ones. This post explores how browsers mark requests as prefetch requests.

link rel=prefetch

Prefetch links look like this: <link rel="prefetch" href="/about.html" /> and are a standardized way to tell browsers about a resource that might be required soon.[1][2] The browser uses its own heuristics to decide when to send the request (preferably while it is idle). The response is then stored in a cache for later use. Unfortunately, browsers tag prefetch requests in different ways.

To the best of my knowledge, Firefox was the first browser to document their use of a specific header that is sent along with a prefetch request. They chose to use X-Moz: prefetch, at least since 2002[3]. It has not changed since then.

In 2010, Safari added an X-Purpose: prefetch header to all <link rel="prefetch"> requests[4][5]. They considered using Mozilla's already established X-Moz: prefetch header, but decided against it[6]. In the discussion, they also mentioned that the HTML5 working group should probably standardize and register the header[7].

However, only 3 months later, Safari dropped the X- prefix[8], based on an IETF draft outlining the deprecation of X- headers[9].

Side note: Who knew that X- stands for eXperimental/eXtension? Not me. It makes sense to deprecate a syntax for experimental headers when servers either use them anyway or have to keep supporting them for backwards compatibility.

Today, <link rel="prefetch"> is disabled by default in Safari, sitting behind a feature flag. When enabled, the Purpose: prefetch header is used.

Google Chrome currently uses Purpose: prefetch. This has been the case since 2018[10] as part of NoStatePrefetch, which also powers <link rel="prerender">[11]. Before that, just like Safari, Chrome had used the X-Purpose: prefetch header since 2010[12].

Because adding the "Purpose" header can cause CORS issues, the Chrome team is looking to revisit the name of this header, considering both "Sec-Purpose" and "Sec-Fetch-Purpose"[13][14].

Since Edge is just a fancy Chromium these days, it also uses Purpose: prefetch.

To summarize, this is what browsers currently use:

| Browser | Prefetch Header | Prerender Header |
|---------|-----------------|------------------|
| Firefox | X-Moz: prefetch | prerender not supported[15] |
| Chrome | Purpose: prefetch | Purpose: prefetch |
| Safari | Purpose: prefetch | prerender not supported[15] |
| Edge | Purpose: prefetch | Purpose: prefetch |

I've added prerendering headers (originating from <link rel="prerender">) to the table even though this post is about prefetching, to highlight that they generally use the same headers.
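To make the table concrete, here is a minimal sketch of how a server could check for these headers. The type `HeaderMap` and the function `isPrefetchRequest` are hypothetical names chosen for illustration, and including `sec-purpose` is an assumption, since that name is only described above as being under consideration.

```typescript
// Header values as they commonly appear on a Node.js-style request object.
type HeaderMap = Record<string, string | string[] | undefined>;

// Names from the summary table, plus X-Purpose (used by older Chrome/Safari).
// "sec-purpose" is an assumption: the text only says the name is being considered.
const PREFETCH_HEADER_NAMES: readonly string[] = [
  "x-moz",
  "purpose",
  "x-purpose",
  "sec-purpose",
];

function isPrefetchRequest(headers: HeaderMap): boolean {
  for (const rawName of Object.keys(headers)) {
    // Header names are case-insensitive, so compare in lowercase.
    if (PREFETCH_HEADER_NAMES.indexOf(rawName.toLowerCase()) === -1) continue;

    const rawValue = headers[rawName];
    if (rawValue === undefined) continue;

    const values = Array.isArray(rawValue) ? rawValue : [rawValue];
    for (const value of values) {
      // Split on "," and ";" so a value carrying extra tokens still matches.
      const tokens = value.toLowerCase().split(/[;,]/).map((t) => t.trim());
      if (tokens.indexOf("prefetch") !== -1) return true;
    }
  }
  return false;
}
```

An analytics filter could call `isPrefetchRequest(req.headers)` before counting a page view. Values other than `prefetch` are deliberately not matched, so whether to treat them the same way is left to the caller.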

Previews

Chrome offers an "instant search" feature, where it preemptively fetches websites as a user types into the address bar. This, in combination with autocomplete and suggestions, offers a great user experience, as the website is already downloaded. For this feature, an X-Purpose: preview header is apparently used[16]. Safari's "Top Sites" also made use of this header[17].

Facebook's in-app browser/renderer is also capable of downloading HTML as a link appears in a user's viewport. They add an X-Purpose: preview header, claiming it is "standard industry practice"[18].

Additional thoughts

From what I've gathered, browsers paid attention to server-side tracking and gave developers a way to distinguish requests based on certain headers. Unfortunately, these headers are not standardized, so browsers have different opinions on which one to use.

A problem that you might encounter is the following: if you track web traffic but filter out requests based on the prefetch headers, your "real" requests disappear as well. When a resource is prefetched and the user eventually uses it, no extra request is fired, so the server has no knowledge of whether prefetched resources were actually used. Ignoring the fact that tracking purely based on access logs screams "error-prone", Mozilla outlines an approach to avoid this issue[19]:

  1. For a prefetch request, respond with a Cache-Control: must-revalidate header.
  2. When a browser uses the prefetched resource, it is allowed to load it from cache, but it must first send a validation request to the server. These are conditional requests, where the browser sends an If-Modified-Since or If-None-Match header (the latter carrying an ETag), to which the server can respond differently (200 vs. 304 status code for example).
  3. If the validation request results in a 304, the browser uses the prefetched version. If not, it uses the response from the validation request.

This adds a bit of extra delay, as validation requests have to use the network. Also, it is not always feasible to add must-revalidate to HTTP responses. However, the validation requests show up in our logs and allow us to continue using server-side traffic analysis.
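The server side of this flow can be sketched roughly as follows. This is only an illustration under assumptions: `decideResponse` and `RevalidationDecision` are hypothetical names, the ETag comparison is a simplified exact match, and only the If-None-Match variant is handled.

```typescript
interface RevalidationDecision {
  status: 200 | 304;
  headers: Record<string, string>;
  // Whether an analytics layer should count this request as a real visit.
  countAsVisit: boolean;
}

function decideResponse(
  requestHeaders: Record<string, string | undefined>,
  currentEtag: string
): RevalidationDecision {
  // Header names are case-insensitive, so normalize them first.
  const lower: Record<string, string> = {};
  for (const name of Object.keys(requestHeaders)) {
    const value = requestHeaders[name];
    if (value !== undefined) lower[name.toLowerCase()] = value;
  }

  const isPrefetch = ["purpose", "x-purpose", "x-moz"].some(
    (name) => (lower[name] || "").toLowerCase() === "prefetch"
  );

  // Step 1: ask the browser to revalidate before reusing the cached copy.
  // Assumption: depending on freshness rules, an extra directive such as
  // max-age=0 may be needed for the copy to be treated as stale right away.
  const headers: Record<string, string> = {
    ETag: currentEtag,
    "Cache-Control": "must-revalidate",
  };

  // A prefetch gets the full response but is not counted as a visit.
  if (isPrefetch) return { status: 200, headers, countAsVisit: false };

  // Steps 2 and 3: a matching validator means the cached copy is still good,
  // and the validation request itself is the signal of real usage.
  if (lower["if-none-match"] === currentEtag) {
    return { status: 304, headers, countAsVisit: true };
  }

  // No validator or a stale one: send the full response and count the visit.
  return { status: 200, headers, countAsVisit: true };
}
```

Keeping the decision in a pure function separates the counting logic from any particular HTTP framework; a real handler would still need to write the status, headers, and body itself.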

From my own testing, it looks like Safari and Firefox follow this approach. Chrome/Edge seem to not send a validation request and serve straight from the (prefetch) cache.

Just a few things to consider when employing prefetching in conjunction with server-side traffic analysis.

Also, if you're interested in adding prefetching to your site, quicklink is a fantastic little library[20] for automatically prefetching links in a user's viewport.

Update Nov. 28, 2021: Added prerender headers.

References