Why Does Webpack Mangle My Filenames?

I've been writing about Webpack for well over a week now, and we haven't talked about the elephant in the room.

Why does Webpack take nice, human-readable filenames like index.js and kitty-lolz.jpg and change them to main-a572965581558bddf992.js and 16ce1537430d8823fcd6.jpg?

Those long strings of hexadecimal digits are hashes. They're deterministic strings, based on the content of the file. If the content of a file changes, then its hash changes. If the content of the file does not change, its hash remains the same between builds.

Why's that useful? Remember that your browser is highly optimized for loading and displaying content. When the browser loads a resource, it will save a cached version for a period of time. If it encounters a request for that resource again, it will load the cached version instead of making a request to the server and re-downloading it.
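To make "deterministic, based on the content" concrete, here's a minimal sketch in Node. It is not Webpack's actual implementation: the helper names are made up for illustration, and SHA-256 truncated to 20 hex digits is an assumption chosen only to mimic the look of the filenames above.

```javascript
const crypto = require('crypto');

// Hypothetical helper: hash the file content and keep the first 20 hex digits.
// (SHA-256 is an assumption here, not necessarily what Webpack uses.)
function contentHash(content) {
  return crypto.createHash('sha256').update(content).digest('hex').slice(0, 20);
}

// Hypothetical helper: build a name shaped like "main-<hash>.js".
function hashedName(base, ext, content) {
  return `${base}-${contentHash(content)}.${ext}`;
}

const before = hashedName('main', 'js', 'console.log("hello");');
const same = hashedName('main', 'js', 'console.log("hello");');
const after = hashedName('main', 'js', 'console.log("hello!");');

console.log(before === same);  // true: same content, same filename
console.log(before === after); // false: one character changed, new filename
```

The point of the sketch is the last two lines: an unchanged file keeps its name across builds, and even a one-character edit produces a different name.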

<head>
  <script src="/main-a572965581558bddf992.js" defer></script>
  <link href="/styles-30d8823fcd616ce15374.css" rel="stylesheet">
</head>

Here's a hypothetical fragment of HTML. When you visit this page for the first time, your browser will download the HTML, but also the JavaScript and CSS files that are linked here. When you return to this page later in the day, your browser will serve the script and stylesheet from cache, saving your poor overtaxed ISP a few kilobytes.

Now say you make an update to your script and redeploy. The next time your browser loads this page, it sees a different filename in the script tag, but the same filename for the stylesheet:

<head>
  <script src="/main-1558b6558f992dda5729.js" defer></script>
  <link href="/styles-30d8823fcd616ce15374.css" rel="stylesheet">
</head>

The browser will download the new script, but load the stylesheet from cache.

That's pretty cool, but do we really need to force the user to download a whole new main bundle, which could be multiple megabytes for a single page app, just because we've made some tiny change?

No. No we do not.

Webpack gives us the ability to split our bundles into what are known as chunks. There are many strategies and opinions about the right way of chunking your app, but a common way to do it is to have a main bundle that holds all of your custom application code and a vendor bundle that holds all of the code you install from other sources (e.g. npm), since vendor code will change much less frequently. This way, a small change to one file will only trigger a new download of the chunk that contains that file, rather than the entire application. Tomorrow we'll look at some of these strategies.
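As a small preview, the main/vendor split described above can be sketched with Webpack's splitChunks option. This is one common setup under the assumption of Webpack 4 or later, not the only way to do it; the chunk name "vendor" is just a label picked for this example.

```javascript
// A minimal sketch of a webpack.config.js with a separate vendor chunk.
const config = {
  output: {
    // Each chunk gets its own content hash, so "main" and "vendor"
    // change filenames independently of each other.
    filename: '[name]-[contenthash].js',
  },
  optimization: {
    splitChunks: {
      cacheGroups: {
        vendor: {
          // Match any module that lives under node_modules
          // (the character class handles both / and \ path separators).
          test: /[\\/]node_modules[\\/]/,
          // Assumed chunk name for this example.
          name: 'vendor',
          chunks: 'all',
        },
      },
    },
  },
};

module.exports = config;
```

The design choice is simply to group code by how often it changes: a tweak to your own code should give the main chunk a new filename while the vendor chunk keeps its old name and stays cached.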

For those of you who've made it to the end, I'll leave you with a conundrum to ponder: We can't append a hash to index.html, because that has to be reachable by a user typing it in directly. And yet the browser caches that HTML document. On a subsequent visit, if there is a new JS file called main-d2e3a6d3b2e3ef.js, how does the browser even see that updated script tag? It has to know somehow to download the updated HTML document to get it, but the HTML doc hasn't changed its filename. How does your browser know that the HTML has changed and should be re-downloaded?

Email me back if you have a theory.


