Nix Hash: Difference between revisions

imported>Goorzhel
Explain SRI, which I had to Google to decode
I don't believe we need a "hashes in nix" heading here. Again, sections are folded by default on mobile, so the "0th section" better has some content.
Line 1: Line 1:
== Hashes in Nix ==


[https://en.wikipedia.org/wiki/Cryptographic_hash_function Cryptographic hashes] play an essential role in many parts of the Nix ecosystem. To use a hash correctly, you need to know two things: the '''algorithm''' that produced it and the '''encoding''' it is written in (and, to some extent, ''what'' was hashed).
Line 7: Line 6:
A hash, which is simply a sequence of bytes, is usually encoded so that it can be represented as a string. Common encodings are <code>base16</code> (commonly called "hex"), <code>base32</code> and <code>base64</code>. Note that Nix's base32 is a [https://github.com/NixOS/nix/blob/master/src/libutil/hash.cc#L83-L107 '''custom variant'''] that is neither documented nor standardized! If possible, use the provided hashing tools to convert hashes to it (see below). Nix uses base32 in many places because it is shorter than hex but can still safely be part of a file path (unlike base64, it contains no slashes).
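As an illustration, the following Python sketch prints one SHA-256 digest in all three encodings. The name <code>nix_base32</code> is hypothetical: the function is a port written from a reading of the linked <code>hash.cc</code> routine, not an official API, so treat it as an assumption and prefer the Nix tools for real work.

```python
import base64
import hashlib

# Alphabet assumed from the linked hash.cc: digits plus lowercase
# letters, with e, o, t and u left out.
NIX_BASE32_CHARS = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32(digest: bytes) -> str:
    """Encode raw digest bytes the way Nix's custom base32 appears to."""
    length = (len(digest) * 8 - 1) // 5 + 1
    out = []
    # Characters are produced from the highest 5-bit group down to the lowest.
    for n in range(length - 1, -1, -1):
        byte_index, bit_offset = divmod(n * 5, 8)
        c = digest[byte_index] >> bit_offset
        if byte_index < len(digest) - 1:
            # Pull in the bits that spill over into the next byte.
            c |= digest[byte_index + 1] << (8 - bit_offset)
        out.append(NIX_BASE32_CHARS[c & 0x1F])
    return "".join(out)


def encodings(data: bytes) -> dict:
    """Return the SHA-256 of data in base16, Nix base32 and base64."""
    digest = hashlib.sha256(data).digest()
    return {
        "base16": digest.hex(),
        "base32": nix_base32(digest),
        "base64": base64.b64encode(digest).decode("ascii"),
    }


if __name__ == "__main__":
    for name, value in encodings(b"hello").items():
        print(f"{name}: {value}")
```

For a 32-byte SHA-256 digest this gives 64 characters in base16, 52 in base32 and 44 in base64, which shows why base32 is the shorter path-safe choice.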


=== Usage ===
== Usage ==


Many derivations are so-called ''fixed-output'' derivations, meaning that you need to know and specify the hash of the output in advance. As an example, let's look at the nixpkgs function <code>fetchurl</code>:
Line 20: Line 19:
The format of the hash follows the [https://www.w3.org/TR/SRI/#introduction SRI (Subresource Integrity)] specification: the name of the algorithm, a hyphen, and the base64-encoded digest.
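A minimal Python sketch of that layout, assuming the plain <code>algorithm-base64</code> form without any of the options the SRI specification also allows. The helper names <code>to_sri</code> and <code>from_sri</code> are made up for this example.

```python
import base64
import hashlib
from typing import Tuple


def to_sri(algorithm: str, digest: bytes) -> str:
    """Build an SRI-style string from an algorithm name and raw digest bytes."""
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def from_sri(sri: str) -> Tuple[str, bytes]:
    """Split an SRI-style string back into algorithm name and raw digest bytes."""
    # Standard base64 never contains "-", so the first hyphen is the separator.
    algorithm, _, encoded = sri.partition("-")
    return algorithm, base64.b64decode(encoded)


if __name__ == "__main__":
    sri = to_sri("sha256", hashlib.sha256(b"hello").digest())
    print(sri)
    algorithm, digest = from_sri(sri)
    print(algorithm, digest.hex())
```

Decoding the string again yields the same bytes that the hex form encodes, so converting between SRI and base16 loses nothing.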


=== Updating Packages ===
== Updating Packages ==


[https://nixos.org/manual/nixpkgs/stable/#chap-pkgs-fetchers-caveats Using TOFU to get the new hash]


=== What exactly is hashed ===
== What exactly is hashed ==


Some content can be hashed either "flat" or "recursively". "Flat" hashing (sometimes also called "file") simply takes the hash of the file, byte by byte, and gives the same result as, for example, <code>sha256sum -b myfile.zip</code>. "Recursive" (sometimes "path") hashing takes multiple files, path names and metadata (attributes) into account. It works by serializing the input into a NAR (Nix ARchive) before hashing.
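The difference can be sketched in Python. The flat variant hashes the file bytes directly. The recursive variant first builds a simplified NAR serialization, whose layout here is an assumption based on the published description of the format (length-prefixed strings padded to 8 bytes, directory entries sorted by name). All function names are hypothetical, and the output has not been checked against <code>nix-hash</code>.

```python
import hashlib
import os
import stat
import struct


def flat_sha256(path) -> str:
    """Hash the raw bytes of one file, like `sha256sum -b`."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def nar_string(data: bytes) -> bytes:
    """64-bit little-endian length, the bytes, then zero padding to 8 bytes."""
    padding = (8 - len(data) % 8) % 8
    return struct.pack("<Q", len(data)) + data + b"\0" * padding


def nar_node(path) -> bytes:
    """Serialize one file, symlink or directory (assumed NAR layout)."""
    path = os.fsencode(path)
    mode = os.lstat(path).st_mode
    out = nar_string(b"(") + nar_string(b"type")
    if stat.S_ISLNK(mode):
        out += nar_string(b"symlink") + nar_string(b"target")
        out += nar_string(os.readlink(path))
    elif stat.S_ISDIR(mode):
        out += nar_string(b"directory")
        for name in sorted(os.listdir(path)):
            out += nar_string(b"entry") + nar_string(b"(")
            out += nar_string(b"name") + nar_string(name)
            out += nar_string(b"node") + nar_node(os.path.join(path, name))
            out += nar_string(b")")
    else:
        out += nar_string(b"regular")
        if mode & stat.S_IXUSR:
            # Only the executable bit is recorded, not owner or timestamps.
            out += nar_string(b"executable") + nar_string(b"")
        with open(path, "rb") as f:
            out += nar_string(b"contents") + nar_string(f.read())
    return out + nar_string(b")")


def recursive_sha256(path) -> str:
    """Hash the NAR serialization of a path instead of its raw bytes."""
    nar = nar_string(b"nix-archive-1") + nar_node(path)
    return hashlib.sha256(nar).hexdigest()
```

Even for a single regular file the two functions return different digests, because the recursive hash also covers the NAR framing around the contents. This is why a flat and a recursive hash of the same file cannot be used interchangeably.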
Line 34: Line 33:
The motivation is that sometimes the content stays the same while the archive around it changes. Zip files are not guaranteed to be reproducible and are often generated automatically, so a regenerated archive can have a different hash although its content is identical. <code>recursiveHash</code> works around that.
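The effect is easy to reproduce with Python's standard <code>zipfile</code> module. The sketch below builds two archives that hold the same file but record different modification times. The helper names <code>make_zip</code> and <code>unpacked</code> and the file name <code>hello.txt</code> are made up for the example.

```python
import hashlib
import io
import zipfile


def make_zip(content: bytes, timestamp) -> bytes:
    """Build an in-memory zip with one file and the given modification time."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        info = zipfile.ZipInfo("hello.txt", date_time=timestamp)
        zf.writestr(info, content)
    return buf.getvalue()


def unpacked(archive: bytes) -> bytes:
    """Return the bytes of the single file stored in the archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read("hello.txt")


if __name__ == "__main__":
    first = make_zip(b"same content", (2020, 1, 1, 0, 0, 0))
    second = make_zip(b"same content", (2021, 6, 15, 12, 0, 0))
    # The archives differ because the timestamp is stored in the zip headers.
    print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())
    # The unpacked contents are byte-for-byte identical.
    print(unpacked(first) == unpacked(second))
```

A flat hash of the archive changes with every regeneration, while a hash taken over the unpacked content stays stable.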


=== Tools ===
== Tools ==


[https://nixos.org/manual/nix/stable/command-ref/nix-hash.html nix-hash]
Line 42: Line 41:
When dealing with remote files, <code>nix-prefetch-url</code> offers a handy shortcut for downloading the file into the Nix store and printing out its hash. (<code>nix-prefetch-url --unpack</code> is its <code>fetchzip</code> equivalent.)


=== Libraries ===
== Libraries ==


* [https://github.com/NixOS/nix/blob/master/src/libutil/hash.cc#L83 Original C++ implementation]