To subset or not subset fonts

One of the niche things you can do to improve the performance on your website or web app is to subset your fonts.

If you are not familiar, subsetting is the act of removing glyphs and other associated information from a font file. You can cherry-pick individual glyphs, as well as specify entire ranges of information. For example, you could take a copy of Arial and yank out the glyphs used for box drawing.

A selection of glyphs from a typeface, with the diamond, spade, and club glyphs removed.

Removing only a few characters will have a near-imperceptible effect on performance. However, removing large swaths of information from a typeface can have a dramatic impact. A common example of where you’d want to do this is removing support for different languages from a typeface.

Languages

Most workhorse typefaces include glyphs that let them express not only English, but multiple other languages. Not every font covers every language, so it’s something you’ll need to consider if you’re thinking about implementing localization.

Noto

Google’s Noto project is an admirable attempt to create typefaces that capture every written language, to ensure that language can be expressed digitally in a well-designed way.

It is also a very popular typeface: at the time this post was published, Noto Sans claimed 409,037,860,189 total views since being added to Google Fonts.

Noto Sans also clocks in at 12.2 MB. That’s ~591% of the median desktop website size in 2020. Now before you go panicking, know that the version of Noto served up on Google Fonts is a lot different than the font you download on the Noto project page.

Hosting and optimization

There is a ton of optimization that goes on behind the scenes for cloud-hosted fonts, things like dynamic subsetting and other tricks people far smarter than me have set up. These features ensure small font files are dispatched quickly to the browsers requesting them, without developers having to think too hard about it.

By my calculation, Google Fonts’ version of Noto weighs in at ~32 KB when served to an evergreen desktop browser in the Northeast United States with a system language set to English. That’s not bad at all.
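To make the dynamic subsetting idea a little more concrete, here is a rough sketch of the kind of CSS a font service could hand to a browser. The `unicode-range` descriptor is a real part of `@font-face`, but the file names and the simplified ranges below are illustrative assumptions, not Google Fonts' actual output.

```css
/* Hypothetical file names; ranges are simplified for illustration. */

/* Basic Latin and Latin-1 Supplement */
@font-face {
  font-family: "Noto Sans";
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url("noto-sans-latin.woff2") format("woff2");
  unicode-range: U+0000-00FF;
}

/* Cyrillic */
@font-face {
  font-family: "Noto Sans";
  font-style: normal;
  font-weight: 400;
  font-display: swap;
  src: url("noto-sans-cyrillic.woff2") format("woff2");
  unicode-range: U+0400-04FF;
}
```

A browser only requests a file when the page contains a character inside that file's range, so an English-only page would fetch the Latin file and skip the Cyrillic one.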

Locally hosted fonts

Hosting your fonts is usually faster than using a cloud service. However, one of the tradeoffs that comes with this approach is that you sacrifice all that smart optimization magic that cloud-hosted font engineers have done behind the scenes. This means the served font size usually bumps back up.

Choosing to host your fonts locally means that you need to pay close attention to the total file size created by every font you add. This includes each weight, as well as its italic counterpart.

Variable fonts have a lot of potential for this situation, provided the variable font file is smaller than the combined size of the one or two carefully chosen weights it would replace. Variable fonts may also result in net-fewer HTTP requests, another important performance aspect to consider.
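For comparison, here is a minimal sketch of how a single variable font file could be declared. The file name and the 100 to 900 weight range are assumptions, so check what the font you are using actually supports.

```css
/* Hypothetical file name. One file covers every weight in the declared range. */
@font-face {
  font-family: "Noto Sans";
  font-style: normal;
  font-weight: 100 900; /* a range, instead of one value per file */
  src: url("noto-sans-variable.woff2") format("woff2");
}

/* Both rules below draw from the same downloaded file. */
body {
  font-family: "Noto Sans", sans-serif;
  font-weight: 400;
}

strong {
  font-weight: 700;
}
```

Italics would usually still need their own file and their own `@font-face` rule, so the request count only drops if the single file replaces several static ones.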

Inclusive Design

Responsive Design is designing for an unknown browser. Inclusive Design is designing for an unknown user.

I don’t know who is visiting my website, why, or what they’ll do to it while they’re there. As a responsible host, I choose to include those foreign language glyphs so that if the page is translated it can be read.

This is also why I’m wary of approaches that subset fonts based on what characters are visible on your website or web app. I want my content to look the way it’s intended to, regardless of who is visiting my site and what languages they speak. I especially don’t want a situation where content is replaced with tofu.

Screenshot of the Netflix homepage. Each letter has been replaced by a square tofu glyph character.
Oh no.

Budgeting

One of the reasons I feel comfortable serving non-subsetted fonts is that I maintain a performance budget for my site. Because I serve lightweight HTML and CSS and minimal JavaScript, I have the opportunity to spend more bandwidth points loading nice-looking fonts.

Update: I now serve a local font stack for even more fabulous savings.

My site is predominately written words, so this prioritization makes sense. It is an intentional choice that emphasizes typography.
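A local font stack relies on typefaces already installed on the visitor's device, so no font files are downloaded at all. The stack below is a generic example of the approach, offered as an assumption about what such a stack can look like rather than a record of any particular site's CSS.

```css
/* Generic system font stack: each name is tried in order until one is installed. */
body {
  font-family:
    system-ui,
    -apple-system,
    "Segoe UI",
    Roboto,
    "Helvetica Neue",
    Arial,
    sans-serif;
}
```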

That being said

An approach that is built from a disciplined performance budget is great for small projects where I have a lot of control over the environment. I’m aware of the consequences of decisions at scale, so I try to be more flexible for clients I consult with about performance decisions.

Fabulous savings

For larger clients, removing glyphs can have a dramatic impact not only on load times, but on hosting bills. This can translate to tens, if not hundreds of thousands of dollars in savings.

Larger clients also tend to have more infrastructure for these kinds of concerns, especially if they have localization set up. I feel a lot more comfortable making the subsetting call in this sort of situation, in that I’m more confident there will be developers who can understand and maintain something like this after I leave.

Inclusive Design, revisited

In some situations subsetting might not be ideal. But in other circumstances it could be a net-better approach. What I’m thinking of here is low- and no-income and unhoused populations.

Client websites and apps tend to be, er, more heavy and fragile than what I make on my own time. I might not be able to convince them to move away from a SPA-based architecture and yank out three generations’ worth of framework bloat, but I can get them to replace a handful of files.

Every visit to a website or web app eats away at someone’s data plan, so every little bit you can lower your payload size counts. Again, since you can’t know who is visiting your site or why, you shouldn’t be making assumptions about whether your content is the kind of content these populations will be using.

To be more blunt about this: a poor person should be able to look at a luxury good on their phone without it killing their data plan, and it’s none of our damn business as to why they’re there.

Future state

The CSS Media Queries Level 5 Spec includes a new user query I’m quite excited about: prefers-reduced-data.

Much as someone can currently express a desire for a reduced motion experience, we’ll soon be able to conditionally target people who want to save on data.

Here's some pseudocode of what that might look like, although the actual implementation might be a lot more complicated:

/* Only load Noto Sans if data saver mode is not enabled */
@media (prefers-reduced-data: no-preference) {
  @font-face {
    font-family: "Noto Sans";
    font-weight: 400;
    src:
      url("noto-sans.woff2") format("woff2"),
      url("noto-sans.woff") format("woff");
  }
}

/*
  Use Noto Sans if it's available, otherwise
  fall back to another available sans-serif font
*/

body {
  font-family: "Noto Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
}

This means that the barrier to being kinder to the people using our website or web app is lowered. Configuring a server is tricky. I can convince a client to let me add a few lines of CSS the same way I’m able to swap some font files out.
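One thing to watch for with the pseudocode above: a browser that doesn't recognize `prefers-reduced-data` treats the whole media query as false, so it would never load Noto Sans. A possible alternative, sketched here with the same hypothetical file name, is to declare the font as usual and only opt out when someone asks for reduced data.

```css
/* Declare the font as usual, so browsers without support still get it. */
@font-face {
  font-family: "Noto Sans";
  font-weight: 400;
  src: url("noto-sans.woff2") format("woff2");
}

body {
  font-family: "Noto Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
}

/* When data saver mode is on, never reference the web font. */
@media (prefers-reduced-data: reduce) {
  body {
    font-family: "Helvetica Neue", Helvetica, Arial, sans-serif;
  }
}
```

Browsers generally only download a web font once something on the page uses it, so dropping it from the `font-family` list should skip the request. Any other rule that names "Noto Sans" would need the same override.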

Why we need this user query in the first place, as well as questioning our defaults, is a Whole Other Thing.

It depends

Tech teaches you to think in binary. The real world is anything but.

Technical decisions aren’t made in a vacuum, and there are many parameters to consider. The most important of these is the people who will be affected by them.