There is a format called 'mhtml' that inlines assets
Last week I attended my first IndieWebCamp in Berlin. In case you don't know IndieWeb: their site describes it as "a people-focused alternative to the 'corporate web'".
But what does that mean? It focuses on three things:
- owning your own data
- connecting websites without third parties
- giving control to people doing things on the web
These three points are huge, so if you're interested in learning more, check out their site. IndieWeb has very interesting ideas and technologies that are definitely worth a look.
But back to the topic... at the event I learned about MHTML. What is that?
MHTML stands for MIME Encapsulation of Aggregate HTML Documents, and its purpose is to archive a particular site. The idea of this format is to include all needed resources in a single document so that you really have a complete state when saving it as an .mhtml file. This way you can, for example, document the progress or the evolution of a site quite nicely.
I gave it a shot, and the first thing I discovered is that .mhtml files are actually saved as HTML emails, which is interesting.
    From: <Saved by Blink>
    X-Snapshot-Version: 1.0
    X-Snapshot-Title: =?utf-8?Q?Stefan's web dev journey?=
    X-Snapshot-Content-Location: https://www.stefanjudis.com/
    Subject: Stefan's web dev journey
    Date: Mon, 12 Nov 2017 13:17:49 -0000
    MIME-Version: 1.0
    Content-Type: multipart/related; type="text/html"; boundary="----MultipartBoundary--YzyWG99RMsqjUz8wY8LIZ2io3lmh8mmIKJAd7bejyV----"

    ------MultipartBoundary--YzyWG99RMsqjUz8wY8LIZ2io3lmh8mmIKJAd7bejyV----
    Content-Type: text/html
    Content-ID: <firstname.lastname@example.org>
    Content-Transfer-Encoding: quoted-printable
    Content-Location: https://www.stefanjudis.com/

    <!DOCTYPE html><html data-n-head=3D"lang" lang=3D"en" class=3D"gr__stefanju=
    dis_com">...</html>
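Since the file is laid out like an email, a general-purpose MIME parser should be able to read it. Here is a minimal Python sketch, assuming the standard library's `email` package is happy with this kind of snapshot. The `SAMPLE` string is a shortened, made-up stand-in for a real file, and `list_parts` is a helper name invented for this illustration.

```python
from email import message_from_string

# Shortened, hypothetical stand-in for a saved snapshot (not a real file).
SAMPLE = """\
From: <Saved by Blink>
Subject: Example page
MIME-Version: 1.0
Content-Type: multipart/related; type="text/html"; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/html
Content-Transfer-Encoding: quoted-printable
Content-Location: https://example.com/

<!DOCTYPE html><html lang=3D"en"><body>Hello</body></html>
--BOUNDARY--
"""


def list_parts(mhtml_text):
    """Return (content type, Content-Location) for every non-container part."""
    message = message_from_string(mhtml_text)
    return [
        (part.get_content_type(), part["Content-Location"])
        for part in message.walk()
        if not part.is_multipart()
    ]


for content_type, location in list_parts(SAMPLE):
    print(content_type, location)
```

Each resource of the page should show up as one entry in that list, with the page itself first.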
Going further I checked how images are included in the document.
    <!-- I have no idea how this works -->
    <img src=3D"https://images.contentful.=
    com/f20lfrunubsq/qZrrGql6VwaoE2YU8CUuE/a4276082432402bd90933ab3de335bf7/sto=
    ck-photo-227882431.jpg?w=3D150" alt=3D"Stefan Judis" class=3D"c-person__ima=
    ge" width=3D"142" height=3D"142">
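The `=3D` sequences and the `=` signs that split words apart look like quoted-printable encoding, which is what the `Content-Transfer-Encoding` header of the HTML part announces: `=3D` is an escaped `=`, and a `=` right before a line break is a soft line break that disappears on decoding. A small Python sketch using the standard `quopri` module on a shortened fragment of the markup above (the fragment is trimmed for illustration):

```python
import quopri

# Shortened fragment in the style of the snapshot; "=\n" is a soft line break.
encoded = b'<img alt=3D"Stefan Judis" class=3D"c-person__ima=\nge" width=3D"142">'

# Decoding turns =3D back into "=" and joins the soft-wrapped lines.
decoded = quopri.decodestring(encoded).decode("utf-8")
print(decoded)
```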
I expected images to be inlined in the image elements but they are not. You can find the images inlined at the end of the file.
    ------MultipartBoundary--YzyWG99RMsqjUz8wY8LIZ2io3lmh8mmIKJAd7bejyV----
    Content-Type: image/jpeg
    Content-Transfer-Encoding: base64
    Content-Location: https://images.contentful.com/f20lfrunubsq/qZrrGql6VwaoE2YU8CUuE/a4276082432402bd90933ab3de335bf7/stock-photo-227882431.jpg?w=150

    /9j/4AAQSkZJRgABAQEASABIAAD/4gxYSUNDX1BST0ZJTEUAAQEAAAxITGlubwIQAABtbnRyUkdC...
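This layout hints at how the lookup might work: the image element keeps its original URL, and the archive carries a part whose `Content-Location` header is that same URL. Below is a hedged Python sketch of that matching, again with the standard `email` package. The URL, the placeholder bytes, and the helper name `resource_for` are all assumptions made up for the example, and a browser may well resolve resources differently.

```python
import base64
from email import message_from_string

IMAGE_URL = "https://example.com/photo.jpg"  # hypothetical URL
IMAGE_BYTES = b"\xff\xd8\xff\xe0fake-jpeg-data"  # placeholder bytes, not a real image

# Tiny made-up archive: one HTML part and one base64-encoded image part.
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/related; type="text/html"; boundary="BOUNDARY"\n'
    "\n"
    "--BOUNDARY\n"
    "Content-Type: text/html\n"
    "Content-Location: https://example.com/\n"
    "\n"
    f'<img src="{IMAGE_URL}">\n'
    "--BOUNDARY\n"
    "Content-Type: image/jpeg\n"
    "Content-Transfer-Encoding: base64\n"
    f"Content-Location: {IMAGE_URL}\n"
    "\n"
    f"{base64.b64encode(IMAGE_BYTES).decode('ascii')}\n"
    "--BOUNDARY--\n"
)


def resource_for(mhtml_text, url):
    """Return the decoded bytes of the part stored for `url`, or None."""
    message = message_from_string(mhtml_text)
    for part in message.walk():
        if part["Content-Location"] == url:
            # decode=True undoes the Content-Transfer-Encoding (base64 here).
            return part.get_payload(decode=True)
    return None


image = resource_for(SAMPLE, IMAGE_URL)
print(image == IMAGE_BYTES)
```

Note that in the real snapshot the `src` attribute is quoted-printable encoded (`?w=3D150`) while the header is not (`?w=150`), so the two only match once the HTML part has been decoded.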
I don't really understand how this works technically, but I think it's good to know that you can save a complete website in a single document. Maybe I'll start archiving my own site every month? :D
If you want to start using MHTML, check the Wikipedia entry. It describes how you can use it in your browser of choice.