Stream: cull-os

Topic: Where is the Language in the Computability Hierarchy?


view this post on Zulip Tim DeHerrera (Jul 09 2024 at 17:24):

One thing I have come to realize as I study Nix's architecture more, and after years of experience and just dealing with the pain points, is that Nix may have actually missed a critical opportunity for itself by being turing-complete.

In order to address some of the inefficiency concerns (changing hashes over added whitespace, reformatting, etc), the most formal solution would be to use a total programming language that can be reliably reduced to normal form (since it's not turing complete). You can then compute the hash over this normal form and use it to verify the "semantic integrity" of the code (as Dhall puts it).

This hash could also equally function as a cache key, for more granular and correct expression caching. Also keeps the complexity of expressions capped, which puts a cap on the overall complexity of evaluation (which keeps getting worse and worse in nixpkgs, for example).

Nix has already missed the boat on this, since any changes of this magnitude would break all backward compatibility, but when starting a new project completely from scratch you have the opportunity to design a language with these things in mind up front, before you get locked in.

For that reason, I just wanted to suggest, if you haven't thought of it already, specifying the language as a "total" programming language, which would give you a compelling advantage over Nix. Yes there is already Dhall, but that lang has some of its own problems (too rigid of typing, mostly).

view this post on Zulip RGBCube (Jul 09 2024 at 17:42):

  1. When cab eval is run, it hashes the file. If that hash already exists in the hash table that maps file hashes to structure hashes, it skips to step 3. If not, step 2.
  2. It evaluates the structure of the file (since the language is reproducible and does not do IO, the same expression will always evaluate to the same thing) and places it inside the hash table that maps structure hashes to results.
  3. It then returns the result.

This way, changing whitespace won't do anything as it does not affect the AST. You don't need the language to not be turning complete to achieve this, either.

view this post on Zulip RGBCube (Jul 09 2024 at 17:43):

This could be performed for every single thunk evaluation, but initial evaluation would be quite super slow, so we need to find a middle ground.

view this post on Zulip RGBCube (Jul 09 2024 at 17:44):

Top-level is also quite bad, which is the way Nix does it. I'm thinking of doing it per island, which is either a git repository, a zip file, etc. Like flake inputs, one might say

view this post on Zulip Tim DeHerrera (Jul 09 2024 at 17:44):

Makes sense, but the benefit of reducing to normal form and then hashing would be that refactors would still maintain the same hash, basically any two expressions that are semantically equivalent would have the same hash, also I think a turing-complete language is a bit overkill for what should have been a DSL, no?

view this post on Zulip RGBCube (Jul 09 2024 at 17:45):

"reducing to normal form" is evaluation, though. And the result of that would still be the same derivation files, so you wouldn't end up building anything

view this post on Zulip RGBCube (Jul 09 2024 at 17:46):

I don't think we are going to have any performance issues, as I'm planning to implement the standard library in Rust, instead of Nix. That is what takes the most time in Nix to evaluate modules. Nix itself is just slow and shouldn't be used to create huge things, only use them as glue.

view this post on Zulip Tim DeHerrera (Jul 09 2024 at 17:49):

reducing to normal form would be the first step in eval sure. That isn't even technically always possible in turing complete languages though, due to arbitrary complexity, the point is that it is always possible and should be computationally fairly inexpensive in a total language.

The point about two expressions building the same derivation anyway is also true, but if you wanted to cache that evaluation, you'd have know way of knowing they are equivalent beforehand unless you first reduced them to normal form.

Anyway, just wanted to share my thoughts with you since you are doing this, as it's something I've been thinking about lately in my effort to understand how one might improve Nix. Ultimately I am looking for something that is backwards compatible, but for someone in your position it is more feasible.

view this post on Zulip Tim DeHerrera (Jul 09 2024 at 17:50):

if my ideas are fuzzy or useless, that is also fine, I didn't fully flesh it out

view this post on Zulip RGBCube (Jul 09 2024 at 19:29):

It should also be possible to do ast-hashing in Nix. Not sure why it doesn't do it though.

view this post on Zulip RGBCube (Jul 09 2024 at 19:29):

Maybe because in C++, you can't just #[derive(Hash)]?


Last updated: Oct 12 2024 at 22:45 UTC