Zig Reproduced Without Binaries
I decided to bootstrap Zig without using the binaries that are checked into the
repository, and to answer whether the zig1.wasm in the latest Zig release
(0.13.0) is the same as the one bootstrapped without using those binaries.
TLDR: yes, they are the same:
$ sha256sum code/zig{,2}/stage1/zig1.wasm
127909fb8c9610ce3f296d8a48014546c0f85055115002fb3aba4d865dcdbb27 code/zig/stage1/zig1.wasm
127909fb8c9610ce3f296d8a48014546c0f85055115002fb3aba4d865dcdbb27 code/zig2/stage1/zig1.wasm
I can now confidently say (and you can also check, you don’t need to trust
me) that there is nothing hiding in zig1.wasm that hasn’t been checked in as a
source file.
Many, many thanks to Hilton Chain for reasons that will become clear later. The rest of this post walks through how I arrived at this claim.
Official zig1.wasm
The steps to acquire the official incarnation of zig1.wasm are straightforward:
download Zig, build zig3 using the official instructions, and use it to run
update-zig1:
git clone https://github.com/ziglang/zig; cd zig
git checkout 0.13.0
mkdir build; pushd build
cmake ..
make -j$(nproc) install
popd
build/stage3/bin/zig build update-zig1
This results in an updated code/zig/stage1/zig1.wasm:
$ git diff --stat
stage1/zig1.wasm | Bin 2675178 -> 2800926 bytes
1 file changed, 0 insertions(+), 0 deletions(-)
We will be comparing this file to the one bootstrapped in the next section.
Binary-free zig1.wasm
Building Zig 0.13.0 without binaries is tricky, because to build Zig 0.13.0, we
need a zig1.wasm, which has been checked in and continuously updated since
late 2022:
commit 20d86d9c63476b6312b87dc5b0e4aa4822eb7717
Author: Andrew Kelley <andrew@ziglang.org>
Date: 2022-11-13T01:35:20+02:00
add zig1.wasm.zst
This commit adds a 637 KB binary file to the source repository. This
commit does nothing else, so it should be replaced with a different
commit before this branch is merged to avoid bloating the git
repository.
stage1/zig1.wasm.zst | Bin 0 -> 652012 bytes
1 file changed, 0 insertions(+), 0 deletions(-)
Andrew’s motivation is reasonable from a Zig developer’s perspective. However, checked-in binary blobs have trust issues, regardless of what we think about the author.
The last commit that can[1] be built without using binary blobs is the parent of this one:
commit 28514476ef8c824c3d189d98f23d0f8d23e496ea
Author: Andrew Kelley <andrew@ziglang.org>
Date: 2022-11-01T05:29:55+02:00
remove `-fstage1` option
After this commit, the self-hosted compiler does not offer the option to
use stage1 as a backend anymore.
After this, Zig is required to build Zig. This is a cyclic dependency, which
the Zig core team breaks by continuously checking in a Zig compiler in wasm,
the zig1.wasm file, which is used to build the compiler.
Andrew suggests that a motivated third party could implement a Zig interpreter in a non-Zig language, which would break this chain. While that would certainly be ideal, nobody has built it yet 🤷.
The steps to build “trusted”[2] Zig are roughly:
1. Build Zig from the C++ implementation of the commit above (with hacks and tricks to make it actually compile).
2. Use the result of the previous step to build the first self-hosted Zig.
3. Proceed to the next step. When the updated Zig does not build, find creative ways to build it anyway (or, when really stuck, ask @mlugg).
4. Go to step 2, 45+ times.
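The loop described in the steps above can be sketched as a dry run in shell. This is only an illustration, not the actual run script or the Guix recipe: the function name plan_bootstrap, the seed label and the commit labels are hypothetical, and the sketch prints a plan instead of building anything.

```shell
# Dry-run sketch of the bootstrap chain. It only prints what would happen.
# Assumptions (not from the post): the function name and all labels passed in.

# plan_bootstrap SEED COMMIT...
#   SEED   label for the compiler built from the C++ implementation
#   COMMIT later revisions, each built with the result of the previous one
plan_bootstrap() {
    prev=$1
    shift
    for commit in "$@"; do
        echo "checkout $commit"
        echo "delete checked-in zig1.wasm blob"   # never rely on the blob
        echo "build $commit with $prev"
        prev="zig@$commit"                        # this result builds the next one
    done
    echo "final compiler: $prev"
}
```

Calling plan_bootstrap zig-cxx commit-a commit-b prints three lines per commit and ends with "final compiler: zig@commit-b". The real chain needs per-commit patches and workarounds, which this sketch deliberately leaves out.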
After reaching 0.11.0-1894-gb92e30ff0b, which is two zig1.wasm updates away
from 0.12.0, I received an email from Hilton Chain, titled “Thank you for the
work on bootstrapping Zig!”. They had taken my PoC, re-created all of it in
Guix DSL and run it all the way to 0.13.0[3]. I was flabbergasted.
I audited their script to see if it really deletes zig1.wasm at every
checkout, then ran it to produce the zig1.wasm of 0.13.0 myself:
$ ./pre-inst-env guix build zig@0.13
< ... a few hours ... >
/gnu/store/mz95707dd7qmycpr1f0ndxhkmx3vdy1c-zig-0.13.0
/gnu/store/kqwq8sjgwi561sp78vfi6xkgm9i3wysk-zig-0.13.0-zig1
$ ls -l /gnu/store/kqwq8sjgwi561sp78vfi6xkgm9i3wysk-zig-0.13.0-zig1/bin/zig1.wasm
-r--r--r-- 5 root root 2661492 Jan 1 1970 /gnu/store/kqwq8sjgwi561sp78vfi6xkgm9i3wysk-zig-0.13.0-zig1/bin/zig1.wasm
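The audit idea mentioned above (no checked-in zig1.wasm may survive in a checkout before building) can also be checked mechanically. The following is a minimal sketch, assuming a POSIX shell and find; the helper name has_zig1_blob is hypothetical and is not part of the Guix recipe or of my scripts.

```shell
# has_zig1_blob DIR
#   Exits 0 if DIR still contains a zig1.wasm or zig1.wasm.zst file,
#   non-zero otherwise. Hypothetical helper, for illustration only.
has_zig1_blob() {
    [ -n "$(find "$1" -type f \( -name 'zig1.wasm' -o -name 'zig1.wasm.zst' \) -print 2>/dev/null)" ]
}
```

For example, `if has_zig1_blob code/zig; then echo "blob still present"; fi` would flag a checkout where the blob has not been deleted. Both file names are covered because the blob was first checked in as zig1.wasm.zst.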
Once I had the zig1.wasm of 0.13.0, I did the same as for the official
zig1.wasm: built zig3, used it to build zig1.wasm, and voilà, the hashes of
the official zig1.wasm and the one built here match.
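For anyone repeating the comparison in a script rather than by eye, here is a small sketch. It assumes sha256sum from GNU coreutils; the function name same_sha256 is hypothetical.

```shell
# same_sha256 FILE_A FILE_B
#   Exits 0 only if both files are readable and have the same SHA-256 digest.
same_sha256() {
    # Reading from stdin keeps the file name out of sha256sum's output.
    a=$(sha256sum < "$1" | cut -d' ' -f1)
    b=$(sha256sum < "$2" | cut -d' ' -f1)
    # An empty digest means the file could not be read: treat it as a mismatch.
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

With the paths used in this post, `same_sha256 code/zig/stage1/zig1.wasm code/zig2/stage1/zig1.wasm && echo identical` gives a yes/no answer that is easy to use in CI.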
Conclusions and open questions
I am looking forward to Hilton landing this in Guix, so anyone can audit the build script and reproduce this exercise themselves with an otherwise bootstrappable system. If you don’t trust Guix, what and whom do you trust?
If anyone can trace the origins of zig1.wasm by producing an identical version
themselves, perhaps it’s not too bad to trust it and have it checked in?
[1] Not exactly. Some reverts and code movement are necessary. See the run script for details.
[2] We trust no one except ourselves and our little machine on our desk.
[3] Their work is on a branch in the Guix repository, which has zig in the title. I will not link it here, as it will be removed when it lands, but it should be easy to find for determined readers before it does.