About The Author

Chris Ashton is a Developer working for the Government Digital Service. When he’s not busy coding, he enjoys singing as a tenor in the BBC Symphony Chorus.
More about
Chris
Ashton

Data can be prohibitly expensive, especially in developing countries. Chris Ashton puts himself in the shoes of someone on a tight data budget and offers practical tips for reducing our websites’ data footprint.

This article is part of a series in which I attempt to use the web under various constraints, representing a given demographic of user. I hope to raise the profile of difficulties faced by real people, which are avoidable if we design and develop in a way that is sympathetic to their needs.

Last time, I navigated the web for a day using Internet Explorer 8. This time, I browsed the web for a day on a 50 MB budget.

Why 50 MB?

Many of us are lucky enough to be on mobile plans which allow several gigabytes of data transfer per month. Failing that, we are usually able to connect to home or public WiFi networks that are on fast broadband connections and have effectively unlimited data.

But there are parts of the world where mobile data is prohibitively expensive, and where there is little or no broadband infrastructure.

People often buy data packages of just tens of megabytes at a time, making a gigabyte a relatively large and therefore expensive amount of data to buy.

— Dan Howdle, consumer telecoms analyst at Cable.co.uk

Just how expensive are we talking?

The Cost Of Mobile Data

A 2018 study by cable.co.uk found that Zimbabwe was the most expensive country in the world for mobile data, where 1 GB cost an average of $75.20, ranging from $12.50 to $138.46. The enormous range in price is due to smaller amounts of data being very expensive, getting proportionally cheaper the bigger the data plan you commit to. You can read the study methodology for more information.

Zimbabwe is by no means a one-off. Equatorial Guinea, Saint Helena and the Falkland Islands are next in line, with 1 GB of data costing $65.83, $55.47 and $47.39 respectively. These countries generally have a combination of poor technical infrastructure and low adoption, meaning data is both costly to deliver and doesn’t have the economy of scale to drive costs down.

Data is expensive in parts of Europe too. A gigabyte of data in Greece will set you back $32.71; in Switzerland, $20.22. For comparison, the same amount of data costs $6.66 in the UK, or $12.37 in the USA. On the other end of the scale, India is the cheapest place in the world for data, at an average cost of $0.26. Kyrgyzstan, Kazakhstan and Ukraine follow at $0.27, $0.49 and $0.51 per GB respectively.

The speed of mobile networks, too, varies considerably between countries. Perhaps surprisingly, users experience faster speeds over a mobile network than WiFi in at least 30 countries worldwide, including Australia and France. South Korea has the fastest mobile download speed, averaging 52.4 Mbps, but Iraq has the slowest, averaging 1.6 Mbps download and 0.7 Mbps upload. The USA ranks 40th in the world for mobile download speeds, at around 34 Mbps, and is at risk of falling further behind as the world moves towards 5G.

As for mobile network connection type, 84.7% of user connections in the UK are on 4G, compared to 93% in the USA, and 97.5% in South Korea. This compares with less than 50% in Uzbekistan and less than 60% in Algeria, Ecuador, Nepal and Iraq.

The Cost Of Broadband Data

Meanwhile, a study of the cost of broadband in 2018 shows that a broadband connection in Niger costs $263 ‘per megabit per month’. This metric is a little difficult to comprehend, so here’s an example: if the average cost of broadband packages in a country is $22, and the average download speed offered by the packages is 10 Mbps, then the cost ‘per megabit per month’  would be $2.20.

It’s an interesting metric, and one that acknowledges that broadband speed is as important a factor as the data cap. A cost of $263 suggests a combination of extremely slow and extremely expensive broadband. For reference, the metric is $1.19 in the UK and $1.26 in the USA.

What’s perhaps easier to comprehend is the average cost of a broadband package. Note that this study was looking for the cheapest broadband packages on offer, ignoring whether or not these packages had a data cap, so provides a useful ballpark figure rather than the cost of data per se.

On package cost alone, Mauritania has the most expensive broadband in the world, at an average of $768.16 (a range of $307.26 to $1,368.72). This enormous cost includes building physical lines to the property, since few already exist in Mauritania. At 0.7 Mbps, Mauritania also has one of the slowest broadband networks in the world.

Taiwan has the fastest broadband in the world, at a mean speed of 85 Mbps. Yemen has the slowest, at 0.38 Mbps. But even countries with good established broadband infrastructure have so-called ‘not-spots’. The United Kingdom is ranked 34th out of 207 countries for broadband speed, but in July 2019 there was still a school in the UK without broadband.

The average cost of a broadband package in the UK is $39.58, and in the USA is $67.69. The cheapest average in the world is Ukraine’s, at just $5, although the cheapest broadband deal of them all was found in Kyrgystan ($1.27 — against the country average of $108.22).

Zimbabwe was the most costly country for mobile data, and the statistics aren’t much better for its broadband, with an average cost of $128.71 and a ‘per megabit per month’ cost of $6.89.

Absolute Cost vs Cost In Real Terms

All of the costs outlined so far are the absolute costs in USD, based on the exchange rates at the time of the study. These costs have not been accounted for cost of living, meaning that for many countries the cost is actually far higher in real terms.

I’m going to limit my browsing today to 50 MB, which in Zimbabwe would cost around $3.67 on a mobile data tariff. That may not sound like much, but teachers in Zimbabwe were striking this year because their salaries had fallen to just $2.50 a day.

For comparison, $3.67 is around half the $7.25 minimum wage in the USA. As a Zimbabwean, I’d have to work for around a day and a half to earn the money to buy this 50MB data, compared to just half an hour in the USA. It’s not easy to compare cost of living between countries, but on wages alone the $3.67 cost of 50 MB of data in Zimbabwe would feel like $52 to an American on minimum wage.

Setting Up The Experiment

I launched Chrome and opened the dev tools, where I throttled the network to a slow 3G connection. I wanted to simulate a slow connection like those experienced by users in Uzbekistan, to see what kind of experience websites would give me. I also throttled my CPU to simulate being on a lower end device.

I opted to throttle my network to Slow 3G and my CPU to 6x slowdown. (Large preview)

I installed ModHeader and set the ‘Save-Data’ header to let websites know I want to minimise my data usage. This is also the header set by Chrome for Android’s ‘Lite mode’, which I’ll cover in more detail later.

I downloaded TripMode; an application for Mac which gives you control over which apps on your Mac can access the internet. Any other application’s internet access is automatically blocked.

Screenshot of Tripmode settings; Chrome is enabled, Mail is disabled
You can enable/disable individual apps from connecting to the internet with TripMode. I enabled Chrome. (Large preview)

How far do I predict my 50 MB budget will take me? With the average weight of a web page being almost 1.7 MB, that suggests I’ve got around 29 pages in my budget, although probably a few more than that if I’m able to stay on the same sites and leverage browser caching.

Throughout the experiment I will suggest performance tips to speed up the first contentful paint and perceived loading time of the page. Some of these tips may not affect the amount of data transferred directly, but do generally involve deferring the download of less important resources, which on slow connections may mean the resources are never downloaded and data is saved.

The Experiment

Without any further ado, I loaded google.com, using 402 KB of my budget and spending $0.03 (around 1% of my Zimbabwe budget).

402 KB transferred, 1.1 MB resources, 24 requests
402 KB transferred, 1.1 MB resources, 24 requests. (Large preview)

All in all, not a bad page size, but I wondered where those 24 network requests were coming from and whether or not the page could be made any lighter.

Google Homepage — DOM

Chrome devtools screenshot of the DOM, where I’ve expanded one inline style tag. (Large preview)

Looking at the page markup, there are no external stylesheets — all of the CSS is inline.

Performance Tip #1: Inline Critical CSS

This is good for performance as it saves the browser having to make an additional network request in order to fetch an external stylesheet, so the styles can be parsed and applied immediately for the first contentful paint. There’s a trade-off to be made here, as external stylesheets can be cached but inline ones cannot (unless you get clever with JavaScript).

The general advice is for your critical styles (anything above-the-fold) to be inline, and for the rest of your styling to be external and loaded asynchronously. Asynchronous loading of CSS can be achieved in one remarkably clever line of HTML:

The devtools show a prettified version of the DOM. If you want to see what was actually downloaded to the browser, switch to the Sources tab and find the document.

A wall of minified code.
Switching to Sources and finding the index shows the ‘raw’ HTML that was delivered to the browser. What a mess! (Large preview)

You can see there is a LOT of inline JavaScript here. It’s worth noting that it has been uglified rather than merely minified.

Performance Tip #2: Minify And Uglify Your Assets

Minification removes unnecessary spaces and characters, but uglification actually ‘mangles’ the code to be shorter. The tell-tale sign is that the code contains short, machine-generated variable names rather than untouched source code. This is good as it means the script is smaller and quicker to download.

Even so, inline scripts look to be roughly 120 KB of the 210 KB page resource (about half the 60 KB gzipped size). In addition, there are five external JavaScript files amounting to 291 KB of the 402 KB downloaded:

Network tab of DevTools showing the external javascript files
Five external JavaScript files in the Network tab of the devtools. (Large preview)

This means that JavaScript accounts for about 80 percent of the overall page weight.

This isn’t useless JavaScript; Google has to have some in order to display suggestions as you type. But I suspect a lot of it is tracking code and advertising setup.

For comparison, I disabled JavaScript and reloaded the page:

DevTools showing only 5 network requests
The disabled JS version of Google search was only 102 KB and had just 5 network requests. (Large preview)

The JS-disabled version of Google search is just 102 KB, as opposed to 402 KB. Although Google can’t provide autosuggestions under these conditions, the site is still functional, and I’ve just cut my data usage down to a quarter of what it was. If I really did have to limit my data usage in the long term, one of the first things I’d do is disable JavaScript. It’s not as bad as it sounds.

Performance Tip #3: Less Is More

Inlining, uglifying and minifying assets is all well and good, but the best performance comes from not sending down the assets in the first place.

  • Before adding any new features, do you have a performance budget in place?
  • Before adding JavaScript to your site, can your feature be accomplished using plain HTML? (For example, HTML5 form validation).
  • Before pulling a large JavaScript or CSS library into your application, use something like bundlephobia.com to measure how big it is. Is the convenience worth the weight? Can you accomplish the same thing using vanilla code at a much smaller data size?

Analysing The Resource Info

There’s a lot to unpack here, so let’s get cracking. I’ve only got 50 MB to play with, so I’m going to milk every bit of this page load. Settle in for a short Chrome Devtools tutorial.

402 KB transferred, but 1.1 MB of resources: what does that actually mean?

It means 402 KB of content was actually downloaded, but in its compressed form (using a compression algorithm such as gzip or brotli). The browser then had to do some work to unpack it into something meaningful. The total size of the unpacked data is 1.1 MB.

This unpacking isn’t free — there are a few milliseconds of overhead in decompressing the resources. But that’s a negligible overhead compared to sending 1.1MB down the wire.

Performance Tip #4: Compress Text-based Assets

As a general rule, always compress your assets, using something like gzip. But don’t use compression on your images and other binary files — you should optimize these in advance at source. Compression could actually end up making them bigger.

And, if you can, avoid compressing files that are 1500 bytes or smaller. The smallest TCP packet size is 1500 bytes, so by compressing to, say, 800 bytes, you save nothing, as it’s still transmitted in the same byte packet. Again, the cost is negligible, but wastes some compression CPU time on the server and decompression CPU time on the client.

Now back to the Network tab in Chrome: let’s dig into those priorities. Notice that resources have priority “Highest” to “Lowest” — these are the browser’s best guess as to what are the more important resources to download. The higher the priority, the sooner the browser will try to download the asset.

Performance Tip #5: Give Resource Hints To The Browser

The browser will guess at what the highest priority assets are, but you can provide a resource hint using the  tag, instructing the browser to download the asset as soon as possible. It’s a good idea to preload fonts, logos and anything else that appears above the fold.

Let’s talk about caching. I’m going to hold ALT and right-click to change my column headers to unlock some more juicy information. We’re going to check out Cache-Control.

Screenshot showing how to display cache-control information
There are lots of interesting fields tucked away behind ALT. (Large preview)

Cache-Control denotes whether or not a resource can be cached, how long it can be cached for, and what rules it should follow around revalidating. Setting proper cache values is crucial to keeping the data cost of repeat visits down.

Performance Tip #6: Set cache-control Headers On All Cacheable Assets

Note that the cache-control value begins with a directive of public or private, followed by an expiration value (e.g. max-age=31536000). What does the directive mean, and why the oddly specific max-age value?

Screenshot of Google network tab with cache-control column visible
A mixture of max-age values and public/private. (Large preview)

The value 31536000 is the number of seconds there are in a year, and is the theoretical maximum value allowed by the cache-control specification. It is common to see this value applied to all static assets and effectively means “this resource isn’t going to change”. In practice, no browser is going to cache for an entire year, but it will cache the asset for as long as makes sense.

To explain the public/private directive, we must explain the two main caches that exist off the server. First, there is the traditional browser cache, where the resource is stored on the user’s machine (the ‘client’). And then there is the CDN cache, which sits between the client and the server; resources are cached at the CDN level to prevent the CDN from requesting the resource from the origin server over and over again.

Diagram showing how caches interact with the server
Source. (Large preview)

A Cache-Control directive of public allows the resource to be cached in both the client and the CDN. A value of private means only the client can cache it; the CDN is not supposed to. This latter value is typically used for pages or assets that exist behind authentication, where it is fine to be cached on the client but we wouldn’t want to leak private information by caching it in the CDN and delivering it to other users.

Screenshot of Google logo cache-control setting: private, max-age=31536000
A mixture of max-age values and public/private. (Large preview)

One thing that got my attention was that the Google logo has a cache control of “private”. Other images on the page do have a public cache, and I don’t know why the logo would be treated any differently. If you have any ideas, let me know in the comments!

I refreshed the page and most of the resources were served from cache, apart from the page itself, which as you’ve seen already is private, max-age=0, meaning it cannot be cached. This is normal for dynamic web pages where it is important that the user always gets the very latest page when they refresh.

It was at this point I accidentally clicked on an ‘Explanation’ URL in the devtools, which took me to the network analysis reference, costing me about 5 MB of my budget. Oops.

Google Dev Docs

4.2 MB of this new 5 MB page was down to images; specifically SVGs. The weightiest of these was 186 KB, which isn’t particularly big — there were just so many of them, and they all downloaded at once.

Gif scrolling down the very long dev docs page
This is a loooong page. All the images downloaded on page load. (Large preview)

That 5 MB page load was 10% of my budget for today. So far I’ve used 5.5 MB, including the no-JavaScript reload of the Google homepage, and spent $0.40. I didn’t even mean to open this page.

What would have been a better user experience here?

Performance Tip #7: Lazy-load Your Images

Ordinarily, if I accidentally clicked on a link, I would hit the back button in my browser. I’d have received no benefit whatsoever from downloading those images — what a waste of 4.2 MB!

Apart from video, where you generally know what you’re getting yourself into, images are by far the biggest culprit to data usage on the web. A study of the world’s top 500 websites found that images take up to 53% of the average page weight. “This means they have a big impact on page-loading times and subsequently overall performance”.

Instead of downloading all of the images on page load, it is good practice to lazy-load the images so that only users who are engaged with the page pay the cost of downloading them. Users who choose not to scroll below the fold therefore don’t waste any unnecessary bandwidth downloading images they’ll never see.

There’s a great css-tricks.com guide to rolling out lazy-loading for images which offers a good balance between those on good connections, those on poor connections, and those with JavaScript disabled.

If this page had implemented lazy loading as per the guide above, each of the 38 SVGs would have been represented by a 1 KB placeholder image by default, and only loaded into view on scroll.

Performance Tip #8: Use The Right Format For Your Images

I thought that Google had missed a trick by not using WebP, which is an image format that is 26% smaller in size compared to PNGs (with no loss in quality) and 25-34% smaller in size compared to JPEGs (and of a comparable quality). I thought I’d have a go at converting SVG to WebP.

Converting to WebP did bring one of the SVGs down from 186 KB to just 65 KB, but actually, looking at the images side by side, the WebP came out grainy:

Comparison of the two images
The SVG (left) is noticeably crisper than the WebP (right). (Large preview)

I then tried converting one of the PNGs to WebP, which is supposed to be lossless and should come out smaller. However, the WebP output was *heavier* (127 KB, from 109 KB)!

Comparison of the two images
The PNG (left) is a similar quality to the WebP (right) but is smaller at 109 KB compared to 127 KB. (Large preview)

This surprised me. WebP isn’t necessarily the silver bullet we think it is, and even Google have neglected to use it on this page.

So my advice would be: where possible, experiment with different image formats on a per-image basis. The format that keeps the best quality for the smallest size may not be the one you expect.

Now back to the DOM. I came across this:

Screenshot of code
(Large preview)

Notice the async keyword on the Google analytics script?

Screenshot of performance analysis output of devtools
Google analytics has ‘low’ priority. (Large preview)

Despite being one of the first things in the head of the document, this was given a low priority, as we’ve explicitly opted out of being a blocking request by using the async keyword.

A blocking request is one that stops the rendering of the page. A

error: Content is protected !!