Highlights of the HTTP Archive Web Almanac


I spent my Sunday reading the HTTP Archive Web Almanac and shared the surprising and interesting pieces in a Twitter thread. I like to own my data – so here's the thread on my own site. Enjoy!


Spending my Sunday morning reading the Web Almanac, which shares internet stats and analyzes HTTP Archive data for 2019.

I'll share facts and stats that I think are interesting in a thread. 👇 :)

#

Only ~80% of sites compress their JavaScript files. 😲

Chart showing how many sites compress their JavaScript files using gzip (~65%) or brotli (~15%)

Source

Edited: I tweeted initially that it's 65% because I missed the fact that gzip and brotli should count together. 🙈

#

jQuery still powers 85% of the crawled sites

It always feels like React/Vue/Angular are all over the internet – they're not... jQuery still powers 85% of the crawled sites. 😲

Table showing the usage of libraries – jQuery is in first position, included on 85% of the crawled sites

Source

#

React powers 5% of the crawled sites

The numbers for sites using "cutting-edge" frameworks are relatively low. React is the most popular one at ~5% on desktop.

Distribution showing how many sites are using React, Angular or others. Only ~5% of sites use React.

Source

#

ES modules are barely used

Even though browser support for ES modules is quite good these days, they are barely used.

~1% is surprisingly low because you can already ship modules to supporting browsers today and fall back to a single bundle for everyone else using the `nomodule` attribute.

Chart showing the usage of ES Modules. Only roughly 1% of crawled sites use modules

Source
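
To make the fallback strategy a bit more concrete, here's a minimal sketch of the module/`nomodule` pattern. The helper `moduleWithFallback` and the file names `/app.mjs` and `/app.legacy.js` are made up for illustration; they're not from the Almanac.

```typescript
// Hypothetical helper: builds the two script tags for the module/nomodule pattern.
function moduleWithFallback(moduleSrc: string, legacySrc: string): string {
  return [
    // Browsers that support ES modules load this one and skip `nomodule` scripts.
    `<script type="module" src="${moduleSrc}"></script>`,
    // Older browsers don't know type="module", ignore the tag above and run this bundle.
    `<script nomodule src="${legacySrc}"></script>`,
  ].join("\n");
}

// File names are assumptions for this example.
const scriptTags = moduleWithFallback("/app.mjs", "/app.legacy.js");
console.log(scriptTags);
```

Both tags live in the same document, so every browser picks exactly one of the two scripts.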

#

Low usage of source maps in production

Only ~20% of sites use source maps? 😲

Chart showing the usage of source maps – roughly 20%

Source

#

CSS flexbox and grid usage

Roughly 50% of sites use flexbox – only 2% use grid.

Chart showing that 50% of crawled sites use flexbox

Graphic showing that only 2% use CSS grid

Source

#

The highest found z-index value

Incredibly high z-index value with dozens of 9s and an !important

Source

#

Usage of responsive images

Only 20% of sites make use of responsive images...

Chart showing the usage of responsive image markup: `sizes` 18%, `srcset` 21% and the picture element 2% usage

Source

#

Usage of the alt attribute on images

No surprise here, but yeah... image alt attributes are not used as much as they should be. :/

Paragraph from the almanac highlighting the following sentence: Only 39% of images use alt text that is longer than six characters.

Source

Edited: As Boris Schapira pointed out, images can be hidden from assistive technology by providing an empty alt attribute (alt=""). This fact was not taken into consideration by the Almanac and makes the statistic meaningless.
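
Here's a rough sketch of the distinction Boris is making. The function `classifyAlt` and its three categories are my own naming, and the `string | null` input assumes you read the value the way `getAttribute("alt")` returns it (`null` when the attribute is absent).

```typescript
// Hypothetical categories for an image's alt attribute.
type AltStatus = "missing" | "decorative" | "described";

function classifyAlt(alt: string | null): AltStatus {
  // No alt attribute at all: assistive technology has nothing to work with.
  if (alt === null) return "missing";
  // alt="" deliberately hides a decorative image from assistive technology.
  if (alt === "") return "decorative";
  // Anything else is treated as a text alternative (its quality is not judged here).
  return "described";
}

console.log(classifyAlt(null));           // missing
console.log(classifyAlt(""));             // decorative
console.log(classifyAlt("Company logo")); // described
```

A statistic that counts short alt texts as bad would put the second case in the wrong bucket.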

#

Usage of font-display

26% of the pages use font-display. 😲 That's surprisingly high in my opinion because the support is not super-duper yet. I wonder how big Google Fonts' influence is in this trend. 🙈

Graphic: 26% of pages use font-display

Source

#

HTTPS adoption

Honestly, I expected fewer sites to be served over a secure connection. 80% of sites ship with HTTPS these days.

Chart showing that 80% of sites are served via HTTPS (mobile and desktop)

Source

#

HSTS adoption

12 to 14% of sites use HSTS to ensure that supporting browsers only access them via HTTPS. This is also higher than I expected. 😲

Table showing the usage of HSTS: 12 (mobile) / 14 (desktop) percent use the `max-age` directive, 3 percent use `includeSubDomains` and 2 percent use `preload`

Source
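
For reference, this is roughly how the three directives from the table fit together in a `Strict-Transport-Security` header value. The `HstsOptions` type, the `buildHstsHeader` helper and the one-year `max-age` are assumptions for this sketch, not something the Almanac prescribes.

```typescript
// Hypothetical options object for this example.
interface HstsOptions {
  maxAgeSeconds: number;
  includeSubDomains?: boolean;
  preload?: boolean;
}

// Builds the value of the Strict-Transport-Security response header.
function buildHstsHeader(options: HstsOptions): string {
  const parts: string[] = [`max-age=${options.maxAgeSeconds}`];
  if (options.includeSubDomains) parts.push("includeSubDomains");
  if (options.preload) parts.push("preload");
  return parts.join("; ");
}

// One year in seconds (an assumed value, pick what fits your site).
const oneYear = 60 * 60 * 24 * 365;
const hstsValue = buildHstsHeader({ maxAgeSeconds: oneYear, includeSubDomains: true, preload: true });
console.log(hstsValue); // max-age=31536000; includeSubDomains; preload
```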

#

CSP adoption

I looked up this statistic myself recently, but it's still sooooo low. 😿

Only roughly 5% of crawled sites use Content-Security-Policy (CSP).

Paragraph from the article: We find that only 5.51% of desktop pages include a CSP and only 4.73% of mobile pages include a CSP, likely due to the complexity of deployment.

Source
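
To give an idea of what a policy looks like, here's a small sketch that serializes a `Content-Security-Policy` header value. The `buildCsp` helper, the chosen directives and the host `https://images.example.com` are all assumptions for illustration. A real policy is usually longer, which is probably part of the deployment complexity mentioned above.

```typescript
// Hypothetical helper: each entry is a directive name plus its allowed sources.
function buildCsp(directives: Array<[string, string[]]>): string {
  return directives
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

// Example policy (assumed values): same-origin by default, images from one extra host.
const cspValue = buildCsp([
  ["default-src", ["'self'"]],
  ["img-src", ["'self'", "https://images.example.com"]],
]);
console.log(cspValue);
// default-src 'self'; img-src 'self' https://images.example.com
```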

#

The state of contrast issues

4 of 5 sites ship with color contrast issues. I really wish we'd get better at this. :/

Paragraph of the article: Only 22.04% of sites gave all of their text sufficient color contrast. Or in other words: 4 out of every 5 sites have text which easily blends into the background, making it unreadable

Source
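
If you wonder what "sufficient" means in numbers: assuming the WCAG 2.x definition, the contrast ratio is computed from the relative luminance of the two colors, and level AA asks for at least 4.5:1 for normal-size text. Below is a minimal sketch of that calculation. The names `Rgb`, `relativeLuminance` and `contrastRatio` are mine.

```typescript
// A color as [red, green, blue], each channel 0..255.
type Rgb = [number, number, number];

// Relative luminance as defined in WCAG 2.x.
function relativeLuminance([r, g, b]: Rgb): number {
  const linearize = (channel: number): number => {
    const s = channel / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

const white: Rgb = [255, 255, 255];
console.log(contrastRatio([0, 0, 0], white).toFixed(2));       // 21.00
console.log(contrastRatio([118, 118, 118], white).toFixed(2)); // 4.54, just passes AA
console.log(contrastRatio([119, 119, 119], white).toFixed(2)); // 4.48, just fails AA
```

The two grays show how easy it is to end up slightly below the threshold without noticing.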

#

The often missing language attribute

26% of the pages don't specify the language of their content. This can cause trouble for text-to-speech technology like screen readers.

Paragraph of the article: Of the pages analyzed, 26.13% do not specify a language with the lang attribute.

Source

#

Accessible forms

4 of 5 pages don't provide labels for all of their form inputs. :/ I'm used to these bad numbers, but well... filling out forms can be tough for everybody (even tech people), we really have to get better at this. :/

Paragraph of the article with highlighted text: Sadly, only 22.33% of pages provide labels for all their form inputs, meaning 4 out of every 5 pages have forms that may be very difficult to fill out.

Source

#

Missing headings

10% of sites ship without headings at all. 😲

Paragraph of the article with highlighted text: Despite the importance of headings, 10.67% of pages have no heading tags at all.

Source

#

Too short title elements

Google shows 50-60 characters in their search results. Generally speaking, titles across the web are much shorter than that and not optimal (at least for Google).

Graph showing the distribution of title length: the median is 20 characters and the 25th percentile is 10 characters

Source

#

Service worker adoption

Service workers are mainstream, right? 🙈 Not really... Only 0.44% of the crawled sites register a service worker.

Graphic showing that 0.44% of the crawled sites register a service worker

Source

#

The jumpy state of websites

How often do we click the wrong thing because something moved around? Too often... Jumpy pages are the standard. :/

2 of 3 pages have a huge content shift while loading.

CLS stands for Cumulative Layout Shift – more info.

Paragraph of the article: Nearly two out of every three sites (65.32%) have medium or large CLS (Cumulative Layout Shift) for 50% or more of all user experiences.

Source

#

Big enough touch targets

Speaking of tapping the wrong thing: only 34% of the pages include big enough buttons and links...

Graphic showing sufficient tap targets... With explanation: As of now, 34.43% of sites have sufficiently sized tap targets. So we have quite a ways to go until 'fat fingering' is a thing of the past

Source

#

WordPress' CMS domination

WordPress usage is still massive. 75% of sites using a CMS are running on WordPress.

Graphic showing CMS distribution: WordPress is on top with 75% followed by Drupal and Joomla (both below 10%)

Source

#

Page weight and number of requests of CMS sites

CMS pages are heavy and make many requests... I did WordPress development in the past, and it makes sense when you think of the audience and users of e.g. WordPress. "Just install another plugin"...

Resource consumption of CMS sites: median page weight is 2.3MB and median request count is ~85

Source

#

CDN adoption

HTML is mainly served from its origin server (80%). The most used CDN is Cloudflare (10%). 😲

Distribution of CDN usage for HTML: - 80% origin - 9.61% Cloudflare - 5.54% Google

Source

#

Overall page weight

I thought the median value for page weight would be higher these days. :D On desktop it's 1.9MB and on mobile, it's 1.7MB. It's still fairly high though imo. 🙈 (and median is clearly only one piece of the puzzle)

Tables showing the distribution of page weight across mobile and desktop: - median page weight is 1.9MB on desktop and 1.7MB on mobile - the 90th percentile is 6.9MB on desktop and 6.2MB on mobile

Source
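
To show why the median is only one piece of the puzzle, here's a small sketch that computes percentiles with the nearest-rank method. The page weights in `sampleWeightsMb` are made up for the example (they're not HTTP Archive data), and I don't know which percentile method the Almanac uses, so treat `percentile` as one possible definition.

```typescript
// Nearest-rank percentile: the smallest value with at least p% of the data at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("need at least one value");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Made-up page weights in MB, only for illustration.
const sampleWeightsMb = [0.4, 0.9, 1.2, 1.7, 1.9, 2.3, 3.1, 4.8, 6.9, 12.5];

console.log(percentile(sampleWeightsMb, 50)); // 1.9
console.log(percentile(sampleWeightsMb, 90)); // 6.9
```

The same data set looks fine at the median and pretty heavy at the 90th percentile.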


And that's it. I highly recommend checking it out! It's a fascinating read about the state of the internet. :)
