This would disable only weekly snapshots on the given filesystem.

== Installing NixOS on a ZFS root filesystem ==

Another guide titled "Encrypted ZFS mirror with mirrored boot on NixOS" is available at https://elis.nu/blog/2019/08/encrypted-zfs-mirror-with-mirrored-boot-on-nixos/.
An OpenZFS document for NixOS Root on ZFS is also available:
https://openzfs.github.io/openzfs-docs/Getting%20Started/NixOS/Root%20on%20ZFS.html

This guide is based on the above OpenZFS guide and the NixOS installation instructions in the [https://nixos.org/manual/nixos/stable/index.html#sec-installation NixOS manual].
=== Pool Layout Considerations ===
<syntaxhighlight lang="none">
rpool/
  nixos/
    nix         mounted to /nix
  userdata/
    root        mounted to /
    home        mounted to /home
</syntaxhighlight>
The names of <code>nixos</code> and <code>userdata</code> can be changed, but the two must remain peer datasets.
ZFS can take consistent and atomic snapshots recursively down a dataset's hierarchy. Since Nix is good at being Nix, most users will want their server's ''data'' backed up, and don't mind reinstalling NixOS and then restoring data. If this is sufficient, only snapshot and back up the <code>userdata</code> hierarchy. Users who want to be able to restore a service with only ZFS snapshots will want to snapshot the entire tree, at the significant expense of also snapshotting the Nix store.
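A sketch of the data-only backup workflow described above (the snapshot name is illustrative, and the command is echoed rather than executed, since it needs a real pool):

```shell
# Sketch: one atomic, recursive snapshot of only the userdata hierarchy.
# "zfs snapshot -r" snapshots the dataset and all of its descendants in a
# single transaction, so the home datasets stay mutually consistent.
snap="rpool/userdata@backup-$(date +%Y-%m-%d)"
# Echoed here because this sketch has no pool to act on; drop the echo on
# a real system.
echo sudo zfs snapshot -r "$snap"
```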
=== Dataset Properties ===

The following is a list of recommended dataset properties which have no drawbacks under regular uses:
* <code>compression=lz4</code> (<code>zstd</code> for higher-end machines)
* <code>xattr=sa</code> for Journald
* <code>acltype=posixacl</code> also for Journald
* <code>relatime=on</code> for reduced stress on SSDs
The following is a list of dataset properties which are often useful, but do have drawbacks:

</syntaxhighlight>
=== Environment Setup ===

For convenience, set a shell variable with the paths to your disk(s).

For multiple disks:

<syntaxhighlight lang="console">
$ disk=(/dev/disk/by-id/foo /dev/disk/by-id/bar)
</syntaxhighlight>

For a single disk:

<syntaxhighlight lang="console">
$ disk=/dev/disk/by-id/foo
</syntaxhighlight>
=== Partitioning the disks ===

<syntaxhighlight lang="bash">
# Multiple disks
for x in "${disk[@]}"; do
  sudo parted "$x" -- mklabel gpt
  sudo parted "$x" -- mkpart primary 512MiB -8GiB
  sudo parted "$x" -- mkpart primary linux-swap -8GiB 100%
  sudo parted "$x" -- mkpart ESP fat32 1MiB 512MiB
  sudo parted "$x" -- set 3 esp on
  sudo mkswap -L swap "${x}-part2"
  sudo mkfs.fat -F 32 -n EFI "${x}-part3"
done

# Single disk
sudo parted "$disk" -- mklabel gpt
sudo parted "$disk" -- mkpart primary 512MiB -8GiB
sudo parted "$disk" -- mkpart primary linux-swap -8GiB 100%
sudo parted "$disk" -- mkpart ESP fat32 1MiB 512MiB
sudo parted "$disk" -- set 3 esp on
sudo mkswap -L swap "${disk}-part2"
sudo mkfs.fat -F 32 -n EFI "${disk}-part3"
</syntaxhighlight>
=== Laying out the filesystem hierarchy ===

In this guide, we will be using a <code>tmpfs</code> for <code>/</code>, since no system state will be stored outside of the ZFS datasets we will create.

<syntaxhighlight lang="console">
$ sudo mount -t tmpfs none /mnt
</syntaxhighlight>
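Once the system is installed, the generated hardware configuration will record this <code>tmpfs</code> root as something like the following (the <code>size</code> and <code>mode</code> options shown are common choices, not output of the generator):

```nix
# Illustrative shape of a tmpfs root entry in hardware-configuration-zfs.nix.
fileSystems."/" =
  { device = "none";
    fsType = "tmpfs";
    options = [ "defaults" "size=2G" "mode=755" ];
  };
```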
==== Create the ZFS pool ====

<syntaxhighlight lang="console">
$ sudo zpool create \
    -o ashift=12 \
    -o autotrim=on \
    -R /mnt \
    -O canmount=off \
    -O mountpoint=none \
    -O acltype=posixacl \
    -O compression=zstd \
    -O dnodesize=auto \
    -O normalization=formD \
    -O relatime=on \
    -O xattr=sa \
    -O encryption=aes-256-gcm \
    -O keylocation=prompt \
    -O keyformat=passphrase \
    rpool \
    mirror \
    "${disk[@]/%/-part1}"
</syntaxhighlight>
For a single disk, remove <code>mirror</code> and specify just <code>"${disk}-part1"</code> as the device.

If you do not want the entire pool to be encrypted, remove the <code>encryption</code>, <code>keylocation</code>, and <code>keyformat</code> options.
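The <code>"${disk[@]/%/-part1}"</code> argument uses bash's pattern-substitution expansion to append <code>-part1</code> to every element of the array, so the pool is built from the first partition of each disk. A quick demonstration with the sample names from earlier:

```shell
# Demonstrate the array expansion used in the zpool create command above:
# ${disk[@]/%/-part1} replaces the empty match at the end of each element
# with "-part1".
disk=(/dev/disk/by-id/foo /dev/disk/by-id/bar)
printf '%s\n' "${disk[@]/%/-part1}"
# → /dev/disk/by-id/foo-part1
# → /dev/disk/by-id/bar-part1
```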
==== Create the ZFS datasets ====

Because ZFS is a copy-on-write filesystem, it needs free disk space even to delete files, so running out of space entirely should be avoided. Fortunately, disk space can be reserved for a dataset to prevent this.
<syntaxhighlight lang="bash">
sudo zfs create -o refreservation=1G -o mountpoint=none rpool/reserved
</syntaxhighlight>
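Should the pool ever fill up completely anyway, the reservation can be released to give ZFS the space it needs to delete files, and restored once space has been reclaimed. A sketch (the commands are echoed rather than executed, since they act on a real pool):

```shell
# Sketch: free the emergency reservation so deletions can proceed again,
# then restore it after space has been reclaimed. Echoed, not executed.
release="zfs set refreservation=none rpool/reserved"
restore="zfs set refreservation=1G rpool/reserved"
echo "sudo $release"
echo "sudo $restore"
```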
Create the datasets for the operating system.
<syntaxhighlight lang="bash">
sudo zfs create -o canmount=off -o mountpoint=/ rpool/nixos
sudo zfs create -o canmount=on rpool/nixos/nix
sudo zfs create -o canmount=on rpool/nixos/etc
sudo zfs create -o canmount=on rpool/nixos/var
sudo zfs create -o canmount=on rpool/nixos/var/lib
sudo zfs create -o canmount=on rpool/nixos/var/log
sudo zfs create -o canmount=on rpool/nixos/var/spool
</syntaxhighlight>
Create datasets for user home directories. If you opted not to encrypt the entire pool, you can encrypt just the user data by specifying the same ZFS encryption properties when creating <code>rpool/userdata</code>; the child datasets will then also be encrypted.
<syntaxhighlight lang="bash">
sudo zfs create -o canmount=off -o mountpoint=/ rpool/userdata
sudo zfs create -o canmount=on rpool/userdata/home
sudo zfs create -o canmount=on -o mountpoint=/root rpool/userdata/home/root
# Create child datasets of home for users' home directories.
sudo zfs create -o canmount=on rpool/userdata/home/alice
sudo zfs create -o canmount=on rpool/userdata/home/bob
sudo zfs create -o canmount=on rpool/userdata/home/...
</syntaxhighlight>
==== Mount /boot ====

We are going to use the default NixOS bootloader systemd-boot, which can install to only one device. You will want to periodically rsync <code>/mnt/boot</code> to <code>/mnt/boot2</code> so that you can still boot your system if either disk fails.
<syntaxhighlight lang="bash">
sudo mkdir /mnt/boot /mnt/boot2
sudo mount "${disk[0]}-part3" /mnt/boot
sudo mount "${disk[1]}-part3" /mnt/boot2
</syntaxhighlight>
Or for single-disk systems:
<syntaxhighlight lang="bash">
sudo mkdir /mnt/boot
sudo mount "${disk}-part3" /mnt/boot
</syntaxhighlight>
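On the installed two-disk system, the periodic sync of <code>/boot</code> to <code>/boot2</code> can be automated with a simple timer. A sketch for <code>configuration.nix</code> (the service name is made up, and <code>pkgs</code> must be available in the module's arguments):

```nix
# Hypothetical oneshot service that mirrors the primary ESP to the backup
# ESP once a week.
systemd.services.mirror-boot = {
  serviceConfig.Type = "oneshot";
  startAt = "weekly";
  script = "${pkgs.rsync}/bin/rsync -a --delete /boot/ /boot2/";
};
```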
=== Configure the NixOS system ===

Generate the base NixOS configuration files.

<syntaxhighlight lang="console">
$ nixos-generate-config --root /mnt
</syntaxhighlight>
Open <code>/mnt/etc/nixos/configuration.nix</code> in a text editor and change <code>imports</code> to include <code>hardware-configuration-zfs.nix</code> instead of the default <code>hardware-configuration.nix</code>. We will be editing this file later.

Now add the following block of code anywhere (how you organise your <code>configuration.nix</code> is up to you):
<syntaxhighlight lang="nix">
# ZFS boot settings.
boot.supportedFilesystems = [ "zfs" ];
boot.zfs.devNodes = "/dev/";
</syntaxhighlight>
Now set <code>networking.hostName</code> and <code>networking.hostId</code>. The host ID must be an eight-digit hexadecimal value. You can derive it from <code>/etc/machine-id</code> by taking the first eight characters; from the hostname, by taking the first eight characters of the hostname's md5sum,
<syntaxhighlight lang="console">
$ hostname | md5sum | head -c 8
</syntaxhighlight>

or by taking eight hexadecimal characters from <code>/dev/urandom</code>,
<syntaxhighlight lang="console">
$ tr -dc 0-9a-f < /dev/urandom | head -c 8
</syntaxhighlight>
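Whichever source you use, the result must be exactly eight hexadecimal characters. A quick sanity check of the machine-id approach (the value below is a made-up sample; read <code>/etc/machine-id</code> on a real system):

```shell
# Hypothetical sample value; a real /etc/machine-id is 32 hex characters.
machine_id="6c0452d17b0a4a0f8a1e0b3d9c2f4e5a"
host_id="${machine_id:0:8}"   # first eight characters
echo "$host_id"
# → 6c0452d1
```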
Now add some ZFS maintenance settings:

<syntaxhighlight lang="nix">
# ZFS maintenance settings.
services.zfs.trim.enable = true;
services.zfs.autoScrub.enable = true;
services.zfs.autoScrub.pools = [ "rpool" ];
</syntaxhighlight>
You may wish to also add <code>services.zfs.autoSnapshot.enable = true;</code> and set the ZFS property <code>com.sun:auto-snapshot</code> to <code>true</code> on <code>rpool/userdata</code> to have automatic snapshots. (See [[#How to use the auto-snapshotting service]] earlier on this page.)
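Because <code>com.sun:auto-snapshot</code> is inherited by child datasets, setting it once on <code>rpool/userdata</code> covers every home directory beneath it. A sketch (the command is echoed rather than executed, since it acts on a real pool):

```shell
# Sketch: enable auto-snapshots for the whole userdata hierarchy with one
# command; child datasets inherit the property. Echoed, not executed.
cmd="zfs set com.sun:auto-snapshot=true rpool/userdata"
echo "sudo $cmd"
```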
Now open <code>/mnt/etc/nixos/hardware-configuration-zfs.nix</code>.

* Add <code>options = [ "zfsutil" ];</code> to every ZFS <code>fileSystems</code> block.
* Add <code>options = [ "X-mount.mkdir" ];</code> to <code>fileSystems."/boot"</code> and <code>fileSystems."/boot2"</code>.
* Replace <code>swapDevices</code> with the following, replacing <code>foo</code> and <code>bar</code> with the names of your disks.
<syntaxhighlight lang="nix">
swapDevices = [
  { device = "/dev/disk/by-id/foo-part2";
    randomEncryption = true;
  }
  { device = "/dev/disk/by-id/bar-part2";
    randomEncryption = true;
  }
];
</syntaxhighlight>
For single-disk installs, remove the second entry of this array.
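After these edits, a ZFS entry in <code>hardware-configuration-zfs.nix</code> should look roughly like the following (the mount point and dataset come from whatever <code>nixos-generate-config</code> detected on your system):

```nix
# Example shape of one edited fileSystems entry with the zfsutil option added.
fileSystems."/nix" =
  { device = "rpool/nixos/nix";
    fsType = "zfs";
    options = [ "zfsutil" ];
  };
```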
==== Optional additional setup for encrypted ZFS ====

===== Unlock encrypted ZFS via SSH on boot =====
If you want to unlock a machine remotely (e.g. after an update and reboot), having an SSH service in the initrd for the password prompt is handy:
* If your network card isn't started, you'll need to add the corresponding kernel module to the initrd as well, e.g. <code>boot.initrd.kernelModules = [ "r8169" ];</code>
===== Import and unlock multiple encrypted pools/datasets at boot =====

If you have not just one encrypted pool/dataset but several, and you want to import and unlock them at boot so that they can be automounted using <code>hardware-configuration.nix</code>, you can simply amend the <code>boot.initrd.network.postCommands</code> option.
When you log in by SSH, or when you have physical access to the machine itself, you will be prompted to supply the unlock password for your zroot and tankXXX pools.
=== Install NixOS ===

<syntaxhighlight lang="console">
$ nixos-install --show-trace --root /mnt
</syntaxhighlight>

<code>--show-trace</code> will show you where exactly things went wrong if <code>nixos-install</code> fails. To take advantage of all cores on your system, also specify <code>--max-jobs n</code>, replacing <code>n</code> with the number of cores on your machine.
== ZFS Trim Support for SSDs ==