There is a format called 'mhtml' that inlines assets

This post is part of my Today I learned series in which I share all my learnings regarding web development.

Last week I attended my first IndieWebCamp in Berlin. In case you don't know IndieWeb: their site describes it as "a people-focused alternative to the 'corporate web'".

But what does that mean? It focuses on three things:

  • owning your own data
  • connecting websites without third parties
  • giving control to people doing things on the web

These three points are huge, so if you're interested in learning more, check their site. IndieWeb has very interesting ideas and technologies that are definitely worth a look.

But back to the topic... at the event I learned about MHTML. What is that?

MHTML stands for MIME Encapsulation of Aggregate HTML Documents and its purpose is to archive a particular page. The idea of this format is to include all needed resources in a single document so that you really have a complete snapshot when saving it as .mhtml. This way you can, for example, document the progress or the evolution of a site quite nicely.

I gave it a shot, and the first thing I discovered is that .mhtml files are actually saved in the same format as HTML emails (a MIME multipart message), which is interesting.

From: 
X-Snapshot-Version: 1.0
X-Snapshot-Title: =?utf-8?Q?Stefan's web dev journey?=
X-Snapshot-Content-Location: https://www.stefanjudis.com/
Subject: Stefan's web dev journey
Date: Mon, 12 Nov 2017 13:17:49 -0000
MIME-Version: 1.0
Content-Type: multipart/related;
    type="text/html";
    boundary="----MultipartBoundary--YzyWG99RMsqjUz8wY8LIZ2io3lmh8mmIKJAd7bejyV----"

------MultipartBoundary--YzyWG99RMsqjUz8wY8LIZ2io3lmh8mmIKJAd7bejyV----
Content-Type: text/html
Content-ID: 
Content-Transfer-Encoding: quoted-printable
Content-Location: https://www.stefanjudis.com/

<!DOCTYPE html><html data-n-head=3D"lang" lang=3D"en" class=3D"gr__stefanju=
dis_com">...</html>
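Because the file is a regular MIME message, it should be readable with any email parser. Here is a minimal sketch using Python's standard `email` module. The `SAMPLE` document, its `example.com` URLs, and the `list_parts` helper are made up for illustration; a real snapshot would be read from the saved .mhtml file instead.

```python
from email import message_from_string

# Hand-written, MHTML-like sample (hypothetical, not a real browser snapshot).
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/related; type="text/html"; boundary="----demo"\n'
    "\n"
    "------demo\n"
    "Content-Type: text/html\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "Content-Location: https://example.com/\n"
    "\n"
    '<html lang=3D"en"><img src=3D"https://example.com/a.png"></html>\n'
    "------demo\n"
    "Content-Type: image/png\n"
    "Content-Transfer-Encoding: base64\n"
    "Content-Location: https://example.com/a.png\n"
    "\n"
    "aGVsbG8=\n"
    "------demo--\n"
)


def list_parts(mhtml_text):
    """Return (content type, Content-Location) for every non-container part."""
    message = message_from_string(mhtml_text)
    return [
        (part.get_content_type(), part["Content-Location"])
        for part in message.walk()
        if not part.is_multipart()  # skip the multipart/related wrapper itself
    ]


parts = list_parts(SAMPLE)
for content_type, location in parts:
    print(content_type, location)
```

Each boundary line starts a new part with its own headers, which is exactly what you see in the snapshot above: one `text/html` part for the page, followed by more parts for its assets.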

Going further I checked how images are included in the document.

<!-- I have no idea how this works -->
<img src=3D"https://images.contentful.=
com/f20lfrunubsq/qZrrGql6VwaoE2YU8CUuE/a4276082432402bd90933ab3de335bf7/sto=
ck-photo-227882431.jpg?w=3D150" alt=3D"Stefan Judis" class=3D"c-person__ima=
ge" width=3D"142" height=3D"142">
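The odd-looking `=3D` and the `=` at the end of each line come from the quoted-printable encoding declared in the part's `Content-Transfer-Encoding` header: `=3D` is an escaped `=`, and a trailing `=` is a soft line break that is dropped when decoding. A small sketch with Python's standard `quopri` module, using an abridged version of the image element above:

```python
import quopri

# Abridged from the snapshot above: "=3D" is an escaped "=", and a "=" at the
# end of a line is a soft line break that disappears after decoding.
encoded = (
    b'<img alt=3D"Stefan Judis" class=3D"c-person__ima=\n'
    b'ge" width=3D"142" height=3D"142">'
)

decoded = quopri.decodestring(encoded).decode("utf-8")
print(decoded)
```

After decoding, the class name is `c-person__image` again and the attributes use plain `=` signs.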

I expected images to be inlined in the image elements, but they are not. Instead, each image is stored as its own base64-encoded part further down in the file.

------MultipartBoundary--YzyWG99RMsqjUz8wY8LIZ2io3lmh8mmIKJAd7bejyV----
Content-Type: image/jpeg
Content-Transfer-Encoding: base64
Content-Location: https://images.contentful.com/f20lfrunubsq/qZrrGql6VwaoE2YU8CUuE/a4276082432402bd90933ab3de335bf7/stock-photo-227882431.jpg?w=150

/9j/4AAQSkZJRgABAQEASABIAAD/4gxYSUNDX1BST0ZJTEUAAQEAAAxITGlubwIQAABtbnRyUkdC...
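As far as I can tell, the link between the two is the URL: the decoded `src` of the image element is the same as the `Content-Location` header of the image part. The sketch below builds a lookup table from `Content-Location` to the decoded bytes of each part. The `SAMPLE` document, its `example.com` URLs, and the `build_resource_index` helper are hypothetical, and this only illustrates the idea, not how a browser implements it.

```python
from email import message_from_string

# Hand-written, MHTML-like sample (hypothetical, not a real browser snapshot).
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/related; type="text/html"; boundary="----demo"\n'
    "\n"
    "------demo\n"
    "Content-Type: text/html\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "Content-Location: https://example.com/\n"
    "\n"
    '<img src=3D"https://example.com/a.png">\n'
    "------demo\n"
    "Content-Type: image/png\n"
    "Content-Transfer-Encoding: base64\n"
    "Content-Location: https://example.com/a.png\n"
    "\n"
    "aGVsbG8=\n"
    "------demo--\n"
)


def build_resource_index(mhtml_text):
    """Map each part's Content-Location to its decoded payload bytes."""
    message = message_from_string(mhtml_text)
    return {
        part["Content-Location"]: part.get_payload(decode=True)
        for part in message.walk()
        if not part.is_multipart() and part["Content-Location"]
    }


resources = build_resource_index(SAMPLE)

# The URL from the <img src="..."> attribute doubles as the lookup key.
image_bytes = resources["https://example.com/a.png"]
print(len(image_bytes), "bytes")
```

`get_payload(decode=True)` undoes the transfer encoding (base64 for the image, quoted-printable for the HTML), so the values in the table are the raw bytes of each resource.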

I don't really understand how this works technically, but I think it's good to know that you can save a complete website in a single document. Maybe I'll start archiving my own site every month? :D

If you want to start using .mhtml, check the Wikipedia entry. It describes how you can use it in your browser of choice.
