Stream: cull-os

Topic: Realizer Protocol

RGBCube (Oct 24 2024 at 16:18):

Content Addressed Derivations

This won't be an explicit feature in the protocol, but something that will be implementable with the following feature.

Graph Mutation

A realizer will be able to communicate changes to the derivation graph back to the client. So instead of:

fn realize(&Graph<Derivation>, &[Realizer]) -> Result<(), RealizationError>

The interface for telling a realizer to realize a derivation will be similar to:

fn realize(&mut Graph<Derivation>, &[Realizer]) -> Result<(), RealizationError>

Where the realizer will be able to modify the graph to change paths / files / whatever.
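
A minimal sketch of what the mutable variant allows, assuming hypothetical `Graph`, `Derivation`, `Realizer` and `RealizationError` types (none of these definitions come from the protocol, and the `/store/realized-` prefix is made up for illustration). The point is only that `&mut` lets the realizer rewrite store paths in place:

```rust
/// Hypothetical stand-in for a derivation: a name plus its current store path.
#[derive(Debug, Clone, PartialEq)]
pub struct Derivation {
    pub name: String,
    pub store_path: String,
}

/// Hypothetical graph type; a flat list is enough for this sketch.
#[derive(Debug, Default)]
pub struct Graph<T> {
    pub nodes: Vec<T>,
}

/// Hypothetical realizer handle.
#[derive(Debug)]
pub struct Realizer {
    pub name: String,
}

#[derive(Debug, PartialEq)]
pub enum RealizationError {
    NoRealizers,
}

/// Because the graph is taken by `&mut`, the realizer can report changes
/// back to the caller simply by editing the graph.
pub fn realize(
    graph: &mut Graph<Derivation>,
    realizers: &[Realizer],
) -> Result<(), RealizationError> {
    if realizers.is_empty() {
        return Err(RealizationError::NoRealizers);
    }
    for drv in graph.nodes.iter_mut() {
        // Stand-in for real work: pretend realization changed the path.
        drv.store_path = format!("/store/realized-{}", drv.name);
    }
    Ok(())
}
```

With the `&Graph` signature the loop body would not compile, which is exactly the capability the change adds.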

This will also allow a theoretical realizer to:

Receive a derivation graph.
Start realizing from the roots.
After every derivation realization:
- If the realization references itself, we replace every self-reference with a fixed magic value of the same size, calculate the hash, rename the store path to that hash, and finally replace the magic value with the hash.
- If not, we simply rename the store path to the hash of the contents.
Now we have a completely content-addressed realization. A trivially content-addressed one can be verified by hashing its contents and comparing the result with the path. A non-trivial one must have its self-references replaced with our magic value before its contents are hashed.
We now modify the realization graph of the client to reflect our model.
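
The hashing steps above can be sketched as follows. Everything here is an assumption made for illustration: the `MAGIC` value, the function names, and the use of 64-bit FNV-1a (chosen only because it fits in a few lines; it is not a cryptographic hash and a real store would use one). The sketch also assumes the temporary self-reference, the magic value and the hash text all have the same length.

```rust
/// Hypothetical fixed placeholder, 16 bytes long to match the hash text.
const MAGIC: &str = "@@SELF-REF-HASH@";

/// 64-bit FNV-1a rendered as 16 hex digits. A stand-in for a real
/// cryptographic hash, used only to keep the sketch dependency-free.
fn fnv1a_hex(bytes: &[u8]) -> String {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x100000001b3);
    }
    format!("{:016x}", h)
}

/// Content-addresses `contents`. `self_ref` is the temporary identifier the
/// output used to refer to itself while being realized.
/// Returns `(hash, final contents)`.
pub fn content_address(contents: &str, self_ref: &str) -> (String, String) {
    assert_eq!(self_ref.len(), MAGIC.len(), "sizes must match");
    // 1. Replace self-references with the magic value (no-op if there are none).
    let normalized = contents.replace(self_ref, MAGIC);
    // 2. Hash the normalized contents.
    let hash = fnv1a_hex(normalized.as_bytes());
    // 3. Replace the magic value with the hash.
    let finalized = normalized.replace(MAGIC, &hash);
    (hash, finalized)
}

/// Verifies a realization: put the magic value back in place of the hash,
/// then hash and compare. For outputs without self-references the
/// replacement does nothing, so this covers the trivial case too.
pub fn verify(contents: &str, hash: &str) -> bool {
    let normalized = contents.replace(hash, MAGIC);
    fnv1a_hex(normalized.as_bytes()) == hash
}
```

Renaming the store path to the returned hash and updating the client's graph are left out, since both depend on store details the discussion does not specify.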

Derivation Recalling From Mutated Realizations

If our realizer mutates the graph, we cannot know the store path of the final realization without realizing it. This is catastrophic for caching, as fetching by store path will not work.

Unless we also link every input-addressed store path to its content-addressed equivalent. This way, we also ensure that an input-addressed derivation cannot have more than a single canonical content-addressed version.
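
One way to picture that link is a map from input-addressed paths to content-addressed ones that rejects a second, different target. This is only a sketch: `PathLinks`, `LinkError` and the method names are hypothetical, and the discussion does not say where such a table would live.

```rust
use std::collections::HashMap;

/// Hypothetical table linking input-addressed store paths to their
/// content-addressed equivalents.
#[derive(Debug, Default)]
pub struct PathLinks {
    links: HashMap<String, String>,
}

#[derive(Debug, PartialEq)]
pub enum LinkError {
    /// The input-addressed path already has a different canonical target.
    Conflict { existing: String },
}

impl PathLinks {
    /// Records the link. Re-linking to the same target is fine; linking to
    /// a different target is refused, which keeps the mapping canonical.
    pub fn link(&mut self, input_addressed: &str, content_addressed: &str) -> Result<(), LinkError> {
        if let Some(existing) = self.links.get(input_addressed) {
            return if existing.as_str() == content_addressed {
                Ok(())
            } else {
                Err(LinkError::Conflict { existing: existing.clone() })
            };
        }
        self.links
            .insert(input_addressed.to_string(), content_addressed.to_string());
        Ok(())
    }

    /// Looks up the content-addressed path without realizing anything,
    /// which is what makes fetching from a cache possible again.
    pub fn resolve(&self, input_addressed: &str) -> Option<&str> {
        self.links.get(input_addressed).map(String::as_str)
    }
}
```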

None of this will be exposed through the realizer protocol, though. It is simply an implementation detail; the real API will look more like this:

fn fetch(&mut Graph<Derivation>, &[Cacher]) -> Result<(), FetchError>

Did you notice how similar this is to the realize function above? Yes, we did too. And so we can just unify these, making "realizing" and "caching" the exact same concept.

The only difference between a "cacher" and a general "realizer" will be that the cacher refuses to actually realize derivations, instead opting to only serve already realized ones.
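
As a rough illustration of that unification, both kinds can sit behind one interface. The `Realize` trait, the `Builder` and `Cacher` types and the error variant are hypothetical names, and a set of strings stands in for a real store:

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
pub enum RealizationError {
    /// The derivation is not available and this realizer will not build it.
    NotRealized(String),
}

/// Hypothetical shared interface for "realizers" and "cachers".
pub trait Realize {
    fn realize(&mut self, derivation: &str) -> Result<(), RealizationError>;
}

/// A general realizer: builds whatever it is asked for.
#[derive(Debug, Default)]
pub struct Builder {
    pub store: HashSet<String>,
}

impl Realize for Builder {
    fn realize(&mut self, derivation: &str) -> Result<(), RealizationError> {
        // Stand-in for actually running the build.
        self.store.insert(derivation.to_string());
        Ok(())
    }
}

/// A cacher: same interface, but it only serves what it already has.
#[derive(Debug, Default)]
pub struct Cacher {
    pub store: HashSet<String>,
}

impl Realize for Cacher {
    fn realize(&mut self, derivation: &str) -> Result<(), RealizationError> {
        if self.store.contains(derivation) {
            Ok(())
        } else {
            Err(RealizationError::NotRealized(derivation.to_string()))
        }
    }
}
```

A client holding a `&mut dyn Realize` does not need to know which of the two it is talking to.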

Communication Between Realizers

No realizer will see the whole build graph at once. Instead, the client will send each realizer only the derivations it wants that realizer to realize, one by one.

Every derivation that has already been realized (for example, a dependency of a derivation that is ready to be realized) will be accompanied by a realizer that the realizing realizer can fetch it from. If that realizer is the same one we are sending the derivation to, the dependency was built on that realizer and does not need to be fetched.

This way, we will not mess up our state.
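
A possible shape for such a message, sketched with hypothetical names (`RealizeRequest`, `RealizedInput`, `RealizerId`); the actual wire format is not described here:

```rust
/// Hypothetical identifier for a realizer.
#[derive(Debug, Clone, PartialEq)]
pub struct RealizerId(pub String);

/// An already-realized dependency together with a realizer to fetch it from.
#[derive(Debug)]
pub struct RealizedInput {
    pub path: String,
    pub source: RealizerId,
}

/// What the client sends: a single derivation, never the whole graph.
#[derive(Debug)]
pub struct RealizeRequest {
    pub derivation: String,
    pub inputs: Vec<RealizedInput>,
}

/// Inputs the receiving realizer still has to fetch. An input whose source
/// is the receiver itself was built there, so it is skipped.
pub fn inputs_to_fetch<'a>(
    request: &'a RealizeRequest,
    receiver: &RealizerId,
) -> Vec<&'a RealizedInput> {
    request
        .inputs
        .iter()
        .filter(|input| &input.source != receiver)
        .collect()
}
```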

RGBCube (Oct 24 2024 at 16:21):

Note: The client will never store anything at all. The way we will ensure that the realization result is copied locally (if configured to) is by telling our local realizer to realize the final derivation. This is why we can conflate caching and realizing.

RGBCube (Oct 25 2024 at 13:26):

System? What's that?

When you think about it, the system field of Nix doesn't make sense, as it's just a feature.

Thus, the realizer protocol will not be special cased for Arch + OS combinations.

It will work as follows: every realizer will append its Arch and OS to its feature set. For example, if I configure my realizer to support WASM execution, it will advertise its feature set as [ "x86_64", "linux", "wasm" ].
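
Treating Arch and OS as ordinary features makes matching a plain subset check. The sketch below assumes hypothetical helper names and assumes a derivation lists the features it requires, which is not stated above:

```rust
use std::collections::HashSet;

/// Builds the advertised feature set: Arch and OS are appended like any
/// other feature, with no special casing.
pub fn advertised_features(arch: &str, os: &str, extra: &[&str]) -> HashSet<String> {
    let mut features: HashSet<String> = extra.iter().map(|f| f.to_string()).collect();
    features.insert(arch.to_string());
    features.insert(os.to_string());
    features
}

/// A realizer qualifies when it advertises every required feature.
pub fn can_realize(required: &[&str], advertised: &HashSet<String>) -> bool {
    required.iter().all(|feature| advertised.contains(*feature))
}
```

Under this model a derivation that would set `system` to `x86_64-linux` in Nix would simply require the two features `x86_64` and `linux`.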

Last updated: Jul 09 2025 at 08:26 UTC