ZFS: Difference between revisions
Update notes on using SWAP and add warning about hibernation |
→Take snapshots automatically: add services.zfs.autoSnapshot |
(64 intermediate revisions by 19 users not shown) | |||
Line 1: | Line 1: | ||
[https://zfsonlinux.org/ {{PAGENAME}}] ([[wikipedia:en:{{PAGENAME}}]]), also known as [https://openzfs.org/ OpenZFS] ([[wikipedia:en:OpenZFS]]), is a modern filesystem which is well supported on [[NixOS]].
[[category:filesystem]]
Besides the {{nixos:package|zfs}} package (''ZFS Filesystem Linux Kernel module'') itself, many other packages from the ZFS ecosystem are available.
ZFS integrates into NixOS via the {{nixos:option|boot.zfs}} and {{nixos:option|services.zfs}} options.
== Limitations ==
==== Latest Kernel compatible with ZFS ====
ZFS often does not support the latest Kernel versions. It is recommended to use an LTS Kernel version whenever possible; the NixOS default Kernel is generally suitable. See [[Linux kernel|Linux Kernel]] for more information about configuring a specific Kernel version.
If your config specifies a Kernel version that is not officially supported by upstream ZFS, the ZFS module will fail to evaluate with an error that the ZFS package is "broken". Since version 2.3, upstream ZFS also refuses to build against an unsupported Kernel by default, regardless of whether Nixpkgs marks the package as broken (or whether that marking is ignored).
===== Selecting the latest ZFS-compatible Kernel =====
{{Warning|This will often result in the Kernel version going backwards as Kernel versions become end-of-life and are removed from Nixpkgs. If you need more control over the Kernel version due to hardware requirements, consider simply pinning a specific version rather than calculating it as below.}}
To use the latest ZFS-compatible Kernel currently available, the following configuration may be used.
<syntaxhighlight lang="nix">
{
  config,
  lib,
  pkgs,
  ...
}:
let
  zfsCompatibleKernelPackages = lib.filterAttrs (
    name: kernelPackages:
    (builtins.match "linux_[0-9]+_[0-9]+" name) != null
    && (builtins.tryEval kernelPackages).success
    && (!kernelPackages.${config.boot.zfs.package.kernelModuleAttribute}.meta.broken)
  ) pkgs.linuxKernel.packages;
  latestKernelPackage = lib.last (
    lib.sort (a: b: (lib.versionOlder a.kernel.version b.kernel.version)) (
      builtins.attrValues zfsCompatibleKernelPackages
    )
  );
in
{
  # Note this might jump back and forth as kernels are added or removed.
  boot.kernelPackages = latestKernelPackage;
}
</syntaxhighlight>
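As the warning above suggests, pinning a Kernel series may be simpler than calculating one. The following is a minimal sketch; <code>pkgs.linuxPackages_6_6</code> is only an assumed example, since Kernel attributes are removed from Nixpkgs as versions reach end-of-life, so check that the series you pick exists in your Nixpkgs revision and is supported by your ZFS version.

<syntaxhighlight lang="nix">
{ pkgs, ... }:
{
  # Pin one specific Kernel series instead of computing the newest compatible one.
  # linuxPackages_6_6 is an illustrative choice, not a recommendation.
  boot.kernelPackages = pkgs.linuxPackages_6_6;
}
</syntaxhighlight>

With a pinned series, the Kernel version only changes when you edit this line, at the cost of having to bump it manually once the series is dropped.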
===== Using unstable, pre-release ZFS =====
{{Warning|Pre-release ZFS versions may be less well-tested, and may have critical bugs that may cause data loss.}}
{{Warning|Running ZFS with a Kernel unsupported by upstream “is considered EXPERIMENTAL by the OpenZFS project. Even if it appears to build and run correctly, there may be bugs that can cause SERIOUS DATA LOSS.”}}
In some cases, a pre-release version of ZFS may be available that supports a newer Kernel. Use it with <code>boot.zfs.package = pkgs.zfs_unstable;</code>. Using zfs_unstable may allow the use of an unsupported Kernel; as warned above, [https://github.com/openzfs/zfs/blob/6a2f7b38442b42f4bc9a848f8de10fc792ce8d76/config/kernel.m4#L473-L487 upstream considers this experimental].
==== Partial support for swap on ZFS ====
ZFS does not support swapfiles; swap devices can be used instead. Additionally, hibernation is disabled by default due to a [https://github.com/NixOS/nixpkgs/pull/208037 high risk] of data corruption. Note that even if that pull request is merged, it does not fully mitigate the risk. If you wish to enable hibernation regardless and have made sure that swapfiles on ZFS are not used, set <code>boot.zfs.allowHibernation = true</code>.
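A hedged sketch of such a hibernation setup follows. The swap device path is a hypothetical placeholder, and the two <code>forceImport</code> lines reflect an assertion that, to the best of our knowledge, the NixOS ZFS module applies when hibernation is allowed; verify both against your NixOS release. Note that a swap device using <code>randomEncryption</code> cannot hold a hibernation image across reboots.

<syntaxhighlight lang="nix">
{
  # Swap on a plain partition, not on a zvol or a file inside a ZFS dataset.
  # The device path below is a placeholder for illustration only.
  swapDevices = [ { device = "/dev/disk/by-partlabel/swap"; } ];

  # Opt in to hibernation despite the data-corruption risk described above.
  boot.zfs.allowHibernation = true;

  # Assumption: the module refuses allowHibernation while force-importing is enabled.
  boot.zfs.forceImportRoot = false;
  boot.zfs.forceImportAll = false;
}
</syntaxhighlight>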
==== Zpool not found ====
If NixOS fails to import the zpool on reboot, you may need to add <syntaxhighlight lang="nix" inline>boot.zfs.devNodes = "/dev/disk/by-path";</syntaxhighlight> or <syntaxhighlight lang="nix" inline>boot.zfs.devNodes = "/dev/disk/by-partuuid";</syntaxhighlight> to your configuration.nix file.
To compare the options, run <code>zpool import -d /dev/disk/by-id</code> (substituting the directory you want to test) in an environment where no pools have been discovered yet, e.g. a live ISO.
==== ZFS conflicting with systemd ====
ZFS will manage mounting non-legacy ZFS filesystems, but NixOS tries to manage mounting with systemd. ZFS native mountpoints are not managed as part of the system configuration (but better support hibernation with a separate swap partition). This can lead to conflicts if the ZFS mount service is also enabled for the same datasets.
Disable the mount service with <code>systemd.services.zfs-mount.enable = false;</code> or remove the <code>fileSystems</code> entries in hardware-configuration.nix. Otherwise, use legacy mountpoints (created with e.g. <code>zfs create -o mountpoint=legacy</code>). Legacy mountpoints must be declared with <code>fileSystems."/mount/point" = {};</code> or generated with <code>nixos-generate-config</code>.
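For the legacy-mountpoint route, a minimal sketch of the matching <code>fileSystems</code> entry might look as follows. The dataset name <code>zpool/data</code> and the mount path are placeholders, not names used elsewhere on this page.

<syntaxhighlight lang="nix">
{
  # Assumes the dataset was created with:
  #   zfs create -o mountpoint=legacy zpool/data
  # Because the mountpoint is "legacy", systemd mounts it and no "zfsutil" option is needed.
  fileSystems."/mount/point" = {
    device = "zpool/data";
    fsType = "zfs";
  };
}
</syntaxhighlight>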
== Guides ==
=== Root on ZFS with disko ===
[https://github.com/nix-community/disko/blob/master/example/zfs.nix disko] can partition disks declaratively and handle mount points at install time.
Don't follow the Root on ZFS guide found in the OpenZFS documentation. It was abandoned and has not been updated in years. See the commit log of the openzfs-docs repo for details.
=== Simple NixOS ZFS on root installation ===
Start from here in the NixOS manual: [https://nixos.org/manual/nixos/stable/#sec-installation-manual].
Under manual partitioning [https://nixos.org/manual/nixos/stable/#sec-installation-manual-partitioning] do this instead:
==== Partition the disk ====
We need the following partitions:
* 1G for the boot partition, with "boot" as the partition label (also called name in some tools) and ef00 as the partition code
* 4G for a swap partition, with "swap" as the partition label and 8200 as the partition code. We will encrypt this with a random secret on each boot.
* The rest of the disk space for ZFS, with "root" as the partition label and 8300 as the partition code (the default code)
Reason for the swap partition: ZFS uses its own caching mechanism (the ARC), which is separate from the normal Linux cache infrastructure.
In low-memory situations, ZFS might therefore take a bit longer to free up memory from its cache. The swap partition helps with that.
Example with gdisk using <code>/dev/nvme0n1</code> as the device (use <code>lsblk</code> to find the device):
<syntaxhighlight lang="bash">
sudo gdisk /dev/nvme0n1
GPT fdisk (gdisk) version 1.0.10
...
# boot partition
Command (? for help): n
Partition number (1-128, default 1):
First sector (2048-1000215182, default = 2048) or {+-}size{KMGTP}:
Last sector (2048-1000215182, default = 1000215175) or {+-}size{KMGTP}: +1G
Current type is 8300 (Linux filesystem)
Hex code or GUID (L to show codes, Enter = 8300): ef00
Changed type of partition to 'EFI system partition'
# Swap partition
Command (? for help): n
Partition number (2-128, default 2):
First sector (2099200-1000215182, default = 2099200) or {+-}size{KMGTP}:
Last sector (2099200-1000215182, default = 1000215175) or {+-}size{KMGTP}: +4G
Current type is 8300 (Linux filesystem)
Hex code or GUID (L to show codes, Enter = 8300): 8200
Changed type of partition to 'Linux swap'
# root partition
Command (? for help): n
Partition number (3-128, default 3):
First sector (10487808-1000215182, default = 10487808) or {+-}size{KMGTP}:
Last sector (10487808-1000215182, default = 1000215175) or {+-}size{KMGTP}:
Current type is 8300 (Linux filesystem)
Hex code or GUID (L to show codes, Enter = 8300):
Changed type of partition to 'Linux filesystem'
# write changes
Command (? for help): w
Final checks complete. About to write GPT data. THIS WILL OVERWRITE EXISTING
PARTITIONS!!
Do you want to proceed? (Y/N): y
OK; writing new GUID partition table (GPT) to /dev/nvme0n1.
The operation has completed successfully.
</syntaxhighlight>
Final partition table (<code>gdisk -l /dev/nvme0n1</code>):
<syntaxhighlight lang=bash>
Number  Start (sector)    End (sector)  Size         Code  Name
   1            2048         2099199    1024.0 MiB   EF00  EFI system partition
   2         2099200        10487807    4.0 GiB      8200  Linux swap
   3        10487808      1000215175    471.9 GiB    8300  Linux filesystem
</syntaxhighlight>
'''Let's use variables from now on for simplicity.''' Get the device ID in <code>/dev/disk/by-id/</code> (using {{ic|blkid}}); in our case it is <code>nvme-SKHynix_HFS512GDE9X081N_FNB6N634510106K5O</code>.
<syntaxhighlight lang=bash>
BOOT=/dev/disk/by-id/nvme-SKHynix_HFS512GDE9X081N_FNB6N634510106K5O-part1
SWAP=/dev/disk/by-id/nvme-SKHynix_HFS512GDE9X081N_FNB6N634510106K5O-part2
DISK=/dev/disk/by-id/nvme-SKHynix_HFS512GDE9X081N_FNB6N634510106K5O-part3
</syntaxhighlight>
{{note|It is often recommended to specify the drive using the device ID/UUID to prevent incorrect configuration, but it is also possible to use the device name (e.g. /dev/sda). See also: [[#Zpool created with bus-based disk names]], [https://wiki.archlinux.org/title/Persistent_block_device_naming Persistent block device naming - ArchWiki]}}
==== Make a ZFS pool with encryption and mount points ====
{{Note|zpool config can significantly affect performance (especially the ashift option) so you may want to do some research. The ZFS tuning cheatsheet or ArchWiki is a good place to start.}}
<syntaxhighlight lang="bash">
zpool create -O encryption=on -O keyformat=passphrase -O keylocation=prompt -O compression=zstd -O mountpoint=none -O xattr=sa -O acltype=posixacl -o ashift=12 zpool $DISK
# enter the password to decrypt the pool at boot
Enter new passphrase:
Re-enter new passphrase:
# Create datasets
zfs create zpool/root
zfs create zpool/nix
zfs create zpool/var
zfs create zpool/home
# Mount root
mkdir -p /mnt
mount -t zfs zpool/root /mnt -o zfsutil
# Mount nix, var, home
mkdir /mnt/nix /mnt/var /mnt/home
mount -t zfs zpool/nix /mnt/nix -o zfsutil
mount -t zfs zpool/var /mnt/var -o zfsutil
mount -t zfs zpool/home /mnt/home -o zfsutil
</syntaxhighlight>
Line 100: | Line 187: | ||
</syntaxhighlight>
==== Format boot partition and enable swap ====
<syntaxhighlight lang="bash">
mkfs.fat -F 32 -n boot $BOOT
</syntaxhighlight>
<syntaxhighlight lang="bash">
mkswap -L swap $SWAP
swapon $SWAP
</syntaxhighlight>
==== Installation ====
<syntaxhighlight lang="bash">
# Mount boot
mkdir -p /mnt/boot
mount $BOOT /mnt/boot
# Generate the nixos config
nixos-generate-config --root /mnt
...
writing /mnt/etc/nixos/hardware-configuration.nix...
writing /mnt/etc/nixos/configuration.nix...
For more hardware-specific settings, see https://github.com/NixOS/nixos-hardware.
</syntaxhighlight>
Now edit the configuration.nix that was just created in <code>/mnt/etc/nixos/configuration.nix</code> and make sure it contains at least the following.
{{file|/mnt/etc/nixos/configuration.nix|diff|3=
{
...
  # Boot loader config for configuration.nix:
  boot.loader.systemd-boot.enable = true;
  # for local disks that are not shared over the network, we don't need this to be random
  # without this, "ZFS requires networking.hostId to be set" will be raised
+ networking.hostId = "8425e349";
...
}
}}
Now check the hardware-configuration.nix in <code>/mnt/etc/nixos/hardware-configuration.nix</code> and add what's missing, e.g. <code>options = [ "zfsutil" ]</code> for all filesystems except boot and <code>randomEncryption = true;</code> for the swap partition. Also change the generated swap device to the partition we created, in this case <code>/dev/disk/by-id/nvme-SKHynix_HFS512GDE9X081N_FNB6N634510106K5O-part2</code>, and use <code>/dev/disk/by-id/nvme-SKHynix_HFS512GDE9X081N_FNB6N634510106K5O-part1</code> for boot.
{{file|/mnt/etc/nixos/hardware-configuration.nix|diff|3=
{
...
  fileSystems."/" = {
    device = "zpool/root";
    fsType = "zfs";
    # the zfsutil option is needed when mounting zfs datasets without "legacy" mountpoints
+   options = [ "zfsutil" ];
  };
  fileSystems."/nix" = {
    device = "zpool/nix";
    fsType = "zfs";
+   options = [ "zfsutil" ];
  };
  fileSystems."/var" = {
    device = "zpool/var";
    fsType = "zfs";
+   options = [ "zfsutil" ];
  };
  fileSystems."/home" = {
    device = "zpool/home";
    fsType = "zfs";
+   options = [ "zfsutil" ];
  };
  fileSystems."/boot" = {
    device = "/dev/disk/by-id/nvme-SKHynix_HFS512GDE9X081N_FNB6N634510106K5O-part1";
    fsType = "vfat";
  };
  swapDevices = [{
+   device = "/dev/disk/by-id/nvme-SKHynix_HFS512GDE9X081N_FNB6N634510106K5O-part2";
+   randomEncryption = true;
  }];
}
}}
Now you may install NixOS with <code>nixos-install</code>.
== Importing on boot ==
Line 182: | Line 288: | ||
};
</syntaxhighlight>
=== Zpool created with bus-based disk names ===
If you used bus-based disk names in the <syntaxhighlight inline>zpool create</syntaxhighlight> command, e.g., <syntaxhighlight inline>/dev/sda</syntaxhighlight>, NixOS may run into issues importing the pool if the names change. Even if the pool can be imported (with <syntaxhighlight lang="nix" inline>boot.zfs.devNodes = "/dev/disk/by-partuuid";</syntaxhighlight> set), this may manifest as a <syntaxhighlight inline>FAULTED</syntaxhighlight> disk and a <syntaxhighlight inline>DEGRADED</syntaxhighlight> pool reported by <syntaxhighlight inline>zpool status</syntaxhighlight>. The fix is to re-import the pool using disk IDs:
<syntaxhighlight lang="console">
# zpool export zpool_name
# zpool import -d /dev/disk/by-id zpool_name
</syntaxhighlight>
The import setting is reflected in <syntaxhighlight inline="" lang="bash">/etc/zfs/zpool.cache</syntaxhighlight>, so it should persist through subsequent boots.
=== Zpool created with disk IDs ===
If you used disk IDs to refer to disks in the <code>zpool create</code> command, e.g., paths under <code>/dev/disk/by-id</code>, then NixOS may consistently fail to import the pool unless <code>boot.zfs.devNodes = "/dev/disk/by-id"</code> is also set.
== Mount datasets at boot ==
Line 214: | Line 333: | ||
You can tweak the interval (defaults to once a week) and which pools should be scrubbed (defaults to all).
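As a hedged illustration of those two knobs, the sketch below uses the <code>services.zfs.autoScrub</code> options; the pool name <code>zpool</code> and the monthly interval are assumptions chosen for the example, not recommendations.

<syntaxhighlight lang="nix">
{
  services.zfs.autoScrub = {
    enable = true;
    # A systemd.time(7) calendar expression; "monthly" is only an example value.
    interval = "monthly";
    # Restrict scrubbing to the listed pools; an empty list scrubs all pools.
    pools = [ "zpool" ];
  };
}
</syntaxhighlight>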
== Remote unlock ==
=== Unlock encrypted ZFS via SSH on boot ===
{{note|As of 22.05, rebuilding your config with the below directions may result in a situation where, if you want to revert the changes, you may need to do some pretty hairy nix-store manipulation to be able to successfully rebuild, see https://github.com/NixOS/nixpkgs/issues/101462#issuecomment-1172926129}}
Line 249: | Line 366: | ||
</syntaxhighlight>
* In order to use DHCP in the initrd, network manager must not be enabled and <syntaxhighlight lang="nix" inline>networking.useDHCP = true;</syntaxhighlight> must be set.
* If your network card isn't started, you'll need to add the corresponding Kernel module to the Kernel and initrd as well, e.g. <syntaxhighlight lang="nix">
boot.kernelModules = [ "r8169" ];
boot.initrd.kernelModules = [ "r8169" ];</syntaxhighlight>To find out which Kernel modules are needed, run <code>nix shell nixpkgs#pciutils --command lspci -v | grep -iA8 'network\|ethernet'</code>.
After that you can unlock your datasets using the following ssh command:
Line 302: | Line 419: | ||
== Take snapshots automatically ==
See the {{nixos:option|services.zfs.autoSnapshot}} or {{nixos:option|services.sanoid}} sections in <code>man configuration.nix</code>.
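A minimal sketch of the <code>services.zfs.autoSnapshot</code> route follows. The retention counts are illustrative values only; tune them to your storage budget.

<syntaxhighlight lang="nix">
{
  services.zfs.autoSnapshot = {
    enable = true;
    # Number of snapshots to keep per schedule (example values).
    frequent = 4; # taken every 15 minutes
    hourly = 24;
    daily = 7;
    weekly = 4;
    monthly = 12;
  };
}
</syntaxhighlight>

To the best of our knowledge, only datasets that carry the <code>com.sun:auto-snapshot=true</code> property are snapshotted, so it likely needs to be set on each dataset you want covered, e.g. with <code>zfs set com.sun:auto-snapshot=true zpool/home</code>.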
== NFS share ==
Line 317: | Line 434: | ||
</syntaxhighlight>
Only this line is needed. Configure the firewall if necessary, as described in the [[NFS]] article.
{{warning|<code>zfs share</code> and <code>sharenfs</code> do not work if the <code>mountpoint</code> is set to <code>legacy</code> (or <code>none</code>, of course). This behaviour does not appear to be documented. The likely reason: <code>sharenfs</code> controls what is written into <code>/etc/exports</code>, and if ZFS does not know the mountpoint, as is the case with <code>legacy</code> or <code>none</code>, the contents of <code>/etc/exports</code> would be wrong.}}
Then, set the <code>sharenfs</code> property:
<syntaxhighlight lang="console">
zfs set sharenfs="ro=192.168.1.0/24,all_squash,anonuid=70,anongid=70" rpool/myData
</syntaxhighlight>
For more options, see <code>man 5 exports</code>.
Line 326: | Line 447: | ||
Todo: sharesmb property for Samba.
== Mail notifications (ZFS Event Daemon) ==
ZFS Event Daemon (zed) monitors events generated by the ZFS Kernel module and runs configured tasks. It can be configured to send an email when a pool scrub is finished or a disk has failed. See the [https://search.nixos.org/options?query=services.zfs.zed zed options].
First, we need to configure a mail transfer agent, the program that sends email:
<syntaxhighlight lang="nix">
{
  age.secrets.msmtp = {
    file = "${inputs.self.outPath}/secrets/msmtp.age";
  };
  # for zed enableMail, enable sendmailSetuidWrapper
  services.mail.sendmailSetuidWrapper.enable = true;
  programs.msmtp = {
    enable = true;
Line 340: | Line 466: | ||
    defaults = {
      aliases = "/etc/aliases";
      port = 587;
      auth = "plain";
      tls = "on";
      tls_starttls = "on";
    };
    accounts = {
      default = {
        host = "smtp.mail.example.com";
        passwordeval = "cat ${config.age.secrets.msmtp.path}";
        user = "myname@example.com";
        from = "myname@example.com";
      };
    };
Line 360: | Line 485: | ||
Then, configure an alias for the root account. With this alias configured, all mails sent to root, such as cron job results and failed sudo login events, will be redirected to the configured email account.
<syntaxhighlight lang="nix">
{
  environment.etc.aliases.text = ''
    root: admin@example.com
  '';
}
</syntaxhighlight>
Finally, enable zed mail notification:
<syntaxhighlight lang="nix">
{
  services.zfs.zed = {
    enableMail = true;
    settings = {
      ZED_EMAIL_ADDR = [ "root" ];
      # send notification if scrub succeeds
      ZED_NOTIFY_VERBOSE = true;
    };
  };
}
</syntaxhighlight>
Line 391: | Line 512: | ||
</syntaxhighlight>
[[Category:Guide]]