Reproducibility & Verification
InterGenOS is built from source, and the build is designed so that what you install can be checked rather than taken on faith. This page explains what the system verifies today, what reproducibility means here, and how you can confirm an install is the artifact it claims to be.
The guiding posture is simple: security is not first. It is only. A reproducible, verifiable build is what lets you run a machine you understand, can modify, and can trust.
This page describes InterGenOS 1.0-dev (build id v1.0-dev1). Some of the reproducibility work below is shipped today; the full byte-identical-rebuild milestone is a stated goal of the 1.x series and is marked as such throughout. Nothing here is presented as finished unless it is.
What “verification” covers today
A from-source distribution earns trust in layers. These are the verification properties InterGenOS ships now:
- Pinned, checksum-verified sources. Every upstream source is pinned to a specific version and verified against a recorded SHA-256 before it is used. A source that does not match its recorded hash stops the build; it is never silently substituted.
- A signed Secure Boot chain. The bootloader and the unified kernel images (UKIs) are signed at a hardware-token signing step. Signing only appends a signature; it never alters the payload, and the signed outputs are verified after signing.
- dm-verity integrity for the system image. A verified-integrity hash tree is generated over the read-only system image. Each kernel image’s command line carries the verity root hash of that image, so with Secure Boot enabled, a tampered system image cannot boot under a validly signed kernel.
- A signed package index. Packages are published to the mirror with a fully signed index. Every publish regenerates and re-signs the complete index, and clients verify the entire index against one signature. Data transfer is incremental, but the signed index is never partial, and the live repository is promoted by an atomic swap so clients never see a half-published state.
Together these mean a booted InterGenOS system can prove its boot chain is signed, its system image is integrity-checked, and its packages came from a signed index — without trusting the network path they arrived over.
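As an illustration of the atomic swap described for the package index, here is a minimal Python sketch. It assumes the live repository is exposed through a symlink that is repointed with a single rename; the page does not say how the swap is implemented, so the `promote` function, the `live` link name, and the directory layout are all hypothetical.

```python
import os
from pathlib import Path


def promote(repo_root: Path, new_tree: str, live_name: str = "live") -> None:
    """Repoint repo_root/live_name at new_tree using one atomic rename.

    Hypothetical layout: each published tree is a sibling directory and
    'live' is a symlink naming the current one.
    """
    live = repo_root / live_name
    # Create the new link under a temporary name in the same directory so
    # the final rename never crosses a filesystem boundary.
    tmp = repo_root / f".{live_name}.tmp.{os.getpid()}"
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    os.symlink(new_tree, tmp)
    # On POSIX, rename replaces the old link in one step, so a reader sees
    # either the old tree or the new one, never a mixture.
    os.replace(tmp, live)
```

A publisher following this pattern would finish writing and signing the new tree first and call `promote` only as the last step.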
The build pipeline, and where trust is established
The image is produced by an ordered pipeline of phases. It is worth seeing where in that pipeline each trust property is actually established, because the answer is “at a specific, named phase, with a fail-closed check” rather than “somewhere in a large opaque build.” Sources are hash-checked before anything is compiled; the package index is signed before the image is sealed; the system image’s integrity hash is sealed into the signed kernel image; and only then is a bootable ISO emitted.
flowchart TB
SRC[Upstream sources<br/>each pinned to a version] --> VS{verify-sources}
VS -.->|" hash mismatch "| HALT[Build halts]
VS ==>|" SHA-256 matches "| TIERS["Build from source in a clean chroot<br/>seven tiers, built in order:<br/>toolchain → core → base → kernel → desktop → ai → extra"]
TIERS --> BL[bootloader<br/>signed GRUB + shim chain + SBAT]
BL --> IMG[image<br/>assemble the root tree]
IMG --> MAN[manifest<br/>index signed by the release key]
MAN --> SQ[squashfs<br/>read-only system image]
SQ --> UV[ukis-verity<br/>dm-verity root hash sealed into the signed UKI]
UV --> ISO[(Signed ISO image)]
The dashed branch is the point of the whole design: a source that does not match its recorded hash does not get patched around or downloaded from a fallback — the build stops. The same fail-closed posture runs through the sealing phases, where a missing signature or a verity mismatch halts the release rather than shipping it.
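The fail-closed source check can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's verify-sources implementation: the `pins` mapping (file name to recorded SHA-256), the `SourceMismatch` exception, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


class SourceMismatch(Exception):
    """Raised when a source file does not match its recorded hash."""


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sources(pins: dict[str, str], source_dir: Path) -> None:
    """Check every pinned source; raise on the first mismatch.

    There is deliberately no fallback branch: a mismatch raises
    SourceMismatch and a missing file raises FileNotFoundError, so the
    caller cannot continue with an unverified input.
    """
    for name, expected in sorted(pins.items()):
        actual = sha256_of(source_dir / name)
        if actual != expected.lower():
            raise SourceMismatch(f"{name}: expected {expected}, got {actual}")
```

The point of the shape is that the only two outcomes are "every hash matched" and "an exception stopped the run".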
A worked example: the boot anchor’s bill of materials
The Secure Boot shim is the binary the entire boot chain is verified through, so its provenance is published as a concrete, inspectable artifact rather than described only in prose. The source tree carries an SPDX 2.3 JSON SBOM for the shim under docs/sboms/, generated by scripts/shim-sbom-gen.py. It records the exact inputs that go into shimx64.efi: the pinned upstream shim source at a specific commit, the digest-pinned base build image and its package snapshot, the embedded InterGenOS Secure Boot CA certificate (in both DER and PEM form), the SBAT vendor entry, and the shim binary’s own SHA-256 and size.
Two properties make it evidence rather than a label:
- It is deterministic. The generator reads the repo-resident inputs from the committed git tree — not the working copy — and emits byte-identical SPDX JSON for an unchanged input set. Anyone can re-run it and diff the result against the published file; a match confirms the recorded provenance, and a mismatch is a real signal. The generator can additionally emit a detached signature at release time, anchored to the hardware signing key.
- It is inspectable. It is plain SPDX 2.3 JSON, readable with ordinary tools, listing every input alongside its hash — no special tooling required to audit what the boot anchor is built from.
Coverage today is the shim specifically: the root of trust the rest of the chain depends on, recorded as a machine-readable artifact you can read and reproduce.
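Because the SBOM is plain SPDX 2.3 JSON, a short Python sketch is enough to audit it. The example below relies only on standard SPDX 2.3 JSON fields (`packages`, `files`, `checksums`, `algorithm`, `checksumValue`); the function names are hypothetical, and which entries the InterGenOS generator records as packages versus files is an assumption.

```python
import json


def list_inputs(spdx_doc: dict) -> list[tuple[str, str, str]]:
    """Return (name, algorithm, checksum) for every hashed entry in the SBOM."""
    rows = []
    for package in spdx_doc.get("packages", []):
        for checksum in package.get("checksums", []):
            rows.append((package.get("name", "?"),
                         checksum["algorithm"],
                         checksum["checksumValue"]))
    for entry in spdx_doc.get("files", []):
        for checksum in entry.get("checksums", []):
            rows.append((entry.get("fileName", "?"),
                         checksum["algorithm"],
                         checksum["checksumValue"]))
    return sorted(rows)


def matches_published(regenerated: bytes, published: bytes) -> bool:
    """A deterministic generator should reproduce the published bytes exactly."""
    return regenerated == published


def load_sbom(path: str) -> dict:
    """Read an SPDX JSON document from disk."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)
```

Comparing raw bytes rather than parsed JSON is intentional here: the claim being checked is byte-identical output, so key order and whitespace are part of what must match.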
What reproducibility means here
A reproducible build means that starting from the same source code, the same toolchain, and the same documented build environment, two independent builders produce byte-identical output: bit-for-bit identical package archives and, ultimately, a bit-for-bit identical filesystem image, with cryptographic hashes that match exactly.
Why this matters: a reproducible build is a security primitive. If an independent party fetches the published source, applies the published recipes, and rebuilds, they should get an archive byte-identical to the one on the mirror. If it does not match, then either the published recipe does not describe the real build, the infrastructure was tampered with, or there is a non-determinism bug. All three are findable with reproducibility and invisible without it. This is the shift from “we signed the binary” to “the binary is independently verifiable from source.”
The InterGenOS commitment, stated precisely
Reproducibility is layered, and the layers land on different timelines:
- Per-package archives (1.x target). Given the same pinned source, the same compiled toolchain, and the same build environment (same chroot, same source date, same locale and timezone), two builders produce byte-identical package archives — same content, same modes, owner and group normalized to root, normalized timestamps, normalized archive metadata.
- The signed index (follows from per-package). Given the same archive set, the published index is byte-identical across builders, because the index generator is already deterministic.
- The ISO image (later 1.x target). A byte-identical final ISO layers on top of the above and additionally requires deterministic system-image compression and bootable-image metadata. This is scoped now so the per-package work does not preclude it, and it is a later 1.x target rather than an initial 1.0 claim.
- Bootstrap toolchain (separate, longer effort). Making the build toolchain itself reproducible is a bootstrap problem addressed as its own separate effort beyond the per-package work. The posture is that every package above the bootstrap layer is reproducible, while the bootstrap is held fixed at known-good versions and verified by hash.
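To make the "signed index follows from per-package" layer concrete, here is a minimal Python sketch of what a deterministic index generator has to do: derive every field from archive content and serialize in a fixed order. The index format shown (a JSON object with `name`, `sha256`, and `size` fields) is hypothetical and is not the InterGenOS index format.

```python
import hashlib
import json


def build_index(archives: dict[str, bytes]) -> bytes:
    """Serialize an index whose bytes depend only on the archive set."""
    entries = [
        {
            "name": name,
            "sha256": hashlib.sha256(data).hexdigest(),
            "size": len(data),
        }
        for name, data in archives.items()
    ]
    # Sort explicitly so the order in which a builder discovered the
    # archives cannot leak into the output.
    entries.sort(key=lambda entry: entry["name"])
    # Fixed key order and separators remove the remaining serializer freedom.
    text = json.dumps({"packages": entries}, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8") + b"\n"
```

With a generator of this shape, two builders holding byte-identical archives necessarily produce byte-identical indexes, which is why the index layer depends only on the per-package layer.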
Current state, honestly
The build is designed for reproducibility by construction: a fixed source date and deterministic phase ordering are part of the lifecycle, and pinned, checksum-verified sources are enforced today. Full byte-identical per-package output across independent builders is the documented 1.x goal and is partially in place:
- Source-date handling is honored in the manifest path.
- The cargo vendoring pipeline already produces reproducible vendor tarballs using the standard normalized-tar recipe, which demonstrates the technique works in this tree.
- The general package-archive emitter and full source-date propagation into the build environment are the remaining work for the per-package milestone.
The reproducibility recipe
InterGenOS follows the well-established reproducible-builds approach (the same envelope used across the major distributions), because third-party verifiers already know what to look for. The work that is specific to InterGenOS is the integration layer — where these settings enter the pipeline and what the audit checks validate. The recipe suppresses every input that varies between builders:
- A canonical source date consumed by build systems and compilers, so embedded build timestamps do not leak live clock values.
- Normalized archives — sorted entries, owner and group set to root, normalized timestamps, and a long-path-safe archive format — with the compression layer written so it carries no embedded filename or timestamp.
- Build-path normalization so absolute build paths do not get baked into binaries, debug info, or assertion messages.
- Pinned locale and timezone so locale-dependent ordering and timezone-shifted timestamps cannot vary.
- Deterministic link order, which most modern build systems and linkers already provide, confirmed per package by audit rather than assumed.
The principle behind all of it: the recipe declares the inputs that should matter, and everything else is pinned, normalized, or stripped.
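The archive-normalization part of the recipe can be sketched with the Python standard library. This is an illustration, not the InterGenOS archive emitter: it assumes the canonical source date arrives as an integer of epoch seconds (the `SOURCE_DATE_EPOCH` convention from the reproducible-builds project), uses the pax format as the long-path-safe format, and uses gzip as a stand-in for whatever compressor the project actually uses.

```python
import gzip
import io
import tarfile
from pathlib import Path


def normalized_tar_gz(root: Path, source_date_epoch: int) -> bytes:
    """Pack the tree under root so the bytes depend only on its contents."""

    def normalize(info: tarfile.TarInfo) -> tarfile.TarInfo:
        # Owner and group set to root; timestamps replaced by the source date.
        info.uid = info.gid = 0
        info.uname = info.gname = "root"
        info.mtime = int(source_date_epoch)
        return info

    # Sort by archive path so directory-listing order cannot vary the output.
    members = sorted(root.rglob("*"),
                     key=lambda p: p.relative_to(root).as_posix())

    raw = io.BytesIO()
    # An empty filename and mtime=0 keep the gzip header free of
    # builder-specific data.
    with gzip.GzipFile(filename="", fileobj=raw, mode="wb", mtime=0) as gz:
        with tarfile.open(fileobj=gz, mode="w",
                          format=tarfile.PAX_FORMAT) as tar:
            for path in members:
                tar.add(path,
                        arcname=path.relative_to(root).as_posix(),
                        recursive=False,
                        filter=normalize)
    return raw.getvalue()
```

File modes are taken from the tree as-is, so they must already match between builders. The sketch also does not cover build-path, locale, or timezone pinning, which belong to the build environment rather than to the archive step.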
How a build proves itself
Reproducibility is one half of trust; the other half is that a candidate is never trusted on the strength of a clean build alone. A build whose packages all compiled is not a build that is known-good. The defects that matter most in a from-source distribution compile fine, package fine, and only surface when the artifact is installed and booted on real hardware — a locked-down service unit with no writable path, a directory missing from a package’s file list, a wrong hardware-detection heuristic, a first-boot timing race.
So the path to a trusted release always runs the full chain: build the candidate, sign the boot chain, assemble the ISO, boot the live image on real hardware, install it, reboot into the installed system, and read every log on every boot to confirm that there are no failed units and that the trust record is clean. A fix is finished only when it lives in the source tree and a clean from-scratch build reproduces the corrected behavior with zero manual steps. A fix that works only because of a hand edit on a running box is a note about a fix, not a fix.
A candidate becomes a stable, golden release only when a full from-scratch cycle runs end to end with zero triggers — nothing required a fix — validated on representative lower-end hardware, with timing-sensitive items cleared over several consecutive cold boots.
Verifying your own install
What you can check today on a running or freshly installed system:
- Confirm the boot chain. The system boots through a signed bootloader and signed kernel images under Secure Boot, with the system image’s integrity root hash carried in the signed kernel command line.
- Confirm package provenance. Packages come from a mirror whose entire index is signed; the package manager (pkm) verifies the index signature before trusting any entry.
- Confirm a clean boot. A trusted install shows zero failed units and a clean integrity record on every boot.
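For the boot-chain item above, one way to inspect the running kernel command line is sketched below in Python. The parameter name `roothash` is an assumption (the page does not name the parameter that carries the verity root hash), and the expected value would have to come from a release record you already trust; both function names are hypothetical.

```python
import shlex


def parse_cmdline(cmdline: str) -> dict[str, str]:
    """Split a kernel command line into key/value pairs.

    Flags without '=' map to an empty string; shlex handles quoted values.
    """
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else ""
    return params


def root_hash_matches(cmdline: str, expected: str, key: str = "roothash") -> bool:
    """Compare the root hash on the command line with an expected value.

    'key' is an assumed parameter name; adjust it to whatever the
    installed kernel images actually use.
    """
    actual = parse_cmdline(cmdline).get(key, "")
    return actual != "" and actual.lower() == expected.lower()
```

On a running Linux system the input string would typically be read from `/proc/cmdline`. A match here is only a consistency check; the security property itself comes from Secure Boot verifying the signed kernel image that carries the command line.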
What is coming with the per-package reproducibility milestone: the planned local build-from-source mode lets you rebuild a package on your own machine and compare the hash of the result against the published hash for that package. Until per-package reproducibility is verified, that mode is advisory: it proves the source compiles, not that it produces the identical artifact a trusted builder would. Once reproducibility lands, a matching hash means your local rebuild is byte-identical to the published binary, and you have independently confirmed that the recipe produces it, without trusting the build infrastructure at all.
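The comparison step of such a rebuild check might look like the following Python sketch. It is not the planned pkm mode: the function name, the report keys, and the idea of passing hashes as two plain dictionaries (package name to hex digest) are all hypothetical.

```python
def compare_rebuild(published: dict[str, str],
                    rebuilt: dict[str, str]) -> dict[str, list[str]]:
    """Classify each published package by how its local rebuild compares."""
    report: dict[str, list[str]] = {"match": [], "mismatch": [], "not_rebuilt": []}
    for name in sorted(published):
        if name not in rebuilt:
            report["not_rebuilt"].append(name)
        elif rebuilt[name].lower() == published[name].lower():
            report["match"].append(name)
        else:
            # A mismatch does not say why the bytes differ: a recipe that
            # does not describe the real build, tampering, and
            # non-determinism all look the same at this level.
            report["mismatch"].append(name)
    return report
```

Until the per-package milestone lands, entries under `mismatch` are expected and carry no security meaning; afterwards, each one is a finding worth investigating.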