ZFS: Difference between revisions

Tie-ling (talk | contribs)
update mail notification
ZFS integrates into NixOS via the {{nixos:option|boot.zfs}} and {{nixos:option|services.zfs}} options.


An uninterruptible power supply (UPS) provides near-instantaneous
protection from input power interruptions by switching to energy
stored in battery packs, supercapacitors or flywheels. The on-battery
run-times of most UPSs are relatively short (only a few minutes) but
sufficient to "buy time" for initiating a standby power source or
properly shutting down the protected equipment. (source: Wikipedia)

Network UPS Tools (NUT) is a collection of software for managing power
devices, mainly UPS units. This article describes the configuration of
NUT for a simple server with a single power supply, with no local
users and no additional equipment.

== Limitations ==

==== Latest Kernel compatible with ZFS ====
ZFS often does not support the latest Kernel versions. It is recommended to use an LTS Kernel version whenever possible; the NixOS default Kernel is generally suitable. See [[Linux kernel|Linux Kernel]] for more information about configuring a specific Kernel version.


If your config specifies a Kernel version that is not officially supported by upstream ZFS, the ZFS module will fail to evaluate with an error that the ZFS package is "broken". Since version 2.3, upstream ZFS also refuses to build against such a Kernel by default, regardless of whether Nixpkgs marks the package as broken or that marking is ignored.
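If a specific Kernel version has to be pinned, a minimal sketch could look like the following. The attribute <code>linuxPackages_6_12</code> is an assumption chosen for illustration; which attributes exist depends on the Nixpkgs revision in use, and the chosen series still has to be one that upstream ZFS supports.

<syntaxhighlight lang="nix">
{ pkgs, ... }:
{
  # Hypothetical example of pinning an LTS Kernel series;
  # replace the attribute with one available in your Nixpkgs revision.
  boot.kernelPackages = pkgs.linuxPackages_6_12;
}
</syntaxhighlight>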
This article is mostly adapted from the excellent NUT ConfigExamples
book, version 3.0, by Roger
Price. [https://github.com/networkupstools/ConfigExamples/releases/latest/download/ConfigExamples.pdf]


= Compatible Hardware =
Compatible hardware is listed on the NUT website
[https://networkupstools.org/stable-hcl.html].  For best results,
choose a model with good driver support and battery replacement
notification.  The battery pack in a UPS is designed to be replaced every
few years.  For example, the Eaton Ellipse ECO 650
[https://www.eaton.com/content/dam/eaton/products/backup-power-ups-surge-it-power-distribution/backup-power-ups/eaton-ellipse-eco/eaton-ellipse-eco-userguides-en.pdf]
needs a new battery every four years under optimal operating
conditions.

= Components of NUT Software =
One or more UPSs are attached to the <strong>attachment
daemon</strong> <code>upsd</code> via a UPS-specific <strong>driver
daemon</strong>.

The attachment daemon maintains an abstract image of the UPS in
memory.  The attachment daemon can be queried by the <code>upsc</code>
command. The driver daemon talks to the hardware and the attachment
daemon.  The driver daemon can be controlled by the
<code>upsdrvctl</code> command.

The <strong>management daemon</strong> <code>upsmon</code> is a client
of upsd.  It runs permanently, checks the status of the UPS, and reacts to
status changes, for example by initiating a shutdown.

= Configuration files in use =
In this simple standalone server setup, the following configuration
files are generated:
* ups.conf: declares UPS-specific driver information (power.ups.ups.* options)
* upsd.conf: controls access to upsd (power.ups.upsd option)
* upsd.users: adds a user with access to upsd (power.ups.users option)
* upsmon.conf: connects to upsd (power.ups.upsmon.monitor section)
* upsmon.conf: sets how upsmon should react to status changes (power.ups.upsmon.settings section)
* delayed UPS shutdown systemd unit: makes the "Restore Power on AC Return" BIOS option functional (systemd.services.nut-delayed-ups-shutdown section)

= Declare UPS units =
Corresponds to file ups.conf
<syntaxhighlight lang="nix">
  power.ups = {
    enable = true;
    mode = "standalone";
    # section: The upsd UPS declarations: ups.conf
    # this UPS device is named UPS-1.
    ups."UPS-1" = {
      description = "Eaton Ellipse ECO 650 with 12V 7Ah Batt";

      # driver name from https://networkupstools.org/stable-hcl.html
      driver = "usbhid-ups";

      # the usbhid-ups driver always uses the value "auto"
      port = "auto";

      directives = [
        # The "Restore power on AC" BIOS option needs power to be cut for a few seconds to work;
        # this is achieved by the offdelay and ondelay directives.

        # In the last stages of system shutdown, "upsdrvctl shutdown" is called to tell the UPS that
        # after offdelay seconds, the UPS power must be cut, even if
        # wall power returns.
        "offdelay = 60"

        # UPS power is now cut regardless of wall power. After (ondelay minus offdelay) seconds,
        # if wall power returns, turn on UPS power. The system has now been disconnected for a minimum of (ondelay minus offdelay) seconds,
        # so "Restore power on AC" should now power on the system.
        # For the reasons described above, the ondelay value must be larger than the offdelay value.
        "ondelay = 70"

        # Set the value for battery.charge.low;
        # upsmon initiates shutdown once this threshold is reached.
        "lowbatt = 40"
      ];
    };
  };
</syntaxhighlight>

===== Selecting the latest ZFS-compatible Kernel =====
{{Warning|This will often result in the Kernel version going backwards as Kernel versions become end-of-life and are removed from Nixpkgs. If you need more control over the Kernel version due to hardware requirements, consider simply pinning a specific version rather than calculating it as below.}}
To use the latest ZFS-compatible Kernel currently available, the following configuration may be used.
<syntaxhighlight lang="nix">
{
  config,
  lib,
  pkgs,
  ...
}:

let
  zfsCompatibleKernelPackages = lib.filterAttrs (
    name: kernelPackages:
    (builtins.match "linux_[0-9]+_[0-9]+" name) != null
    && (builtins.tryEval kernelPackages).success
    && (!kernelPackages.${config.boot.zfs.package.kernelModuleAttribute}.meta.broken)
  ) pkgs.linuxKernel.packages;
  latestKernelPackage = lib.last (
    lib.sort (a: b: (lib.versionOlder a.kernel.version b.kernel.version)) (
      builtins.attrValues zfsCompatibleKernelPackages
    )
  );
in
{
  # Note this might jump back and forth as kernels are added or removed.
  boot.kernelPackages = latestKernelPackage;
}
</syntaxhighlight>


===== Using unstable, pre-release ZFS =====
{{Warning|Pre-release ZFS versions may be less well-tested, and may have critical bugs that may cause data loss.}}{{Warning|Running ZFS with a Kernel unsupported by upstream “is considered EXPERIMENTAL by the OpenZFS project. Even if it appears to build and run correctly, there may be bugs that can cause SERIOUS DATA LOSS.”}}
In some cases, a pre-release version of ZFS may be available that supports a newer Kernel. Use it with <code>boot.zfs.package = pkgs.zfs_unstable;</code>. Using zfs_unstable may allow the use of an unsupported Kernel; as warned above, [https://github.com/openzfs/zfs/blob/6a2f7b38442b42f4bc9a848f8de10fc792ce8d76/config/kernel.m4#L473-L487 upstream considers this experimental].

==== Partial support for swap on ZFS ====
ZFS does not support swapfiles. Swap devices can be used instead. Additionally, hibernation is disabled by default due to a [https://github.com/NixOS/nixpkgs/pull/208037 high risk] of data corruption. Note that even if that pull request is merged, it does not fully mitigate the risk. If you wish to enable hibernation regardless and have made sure that swapfiles on ZFS are not used, set <code>boot.zfs.allowHibernation = true</code>.

= Declare upsd listening ports =
Corresponds to file upsd.conf. This file declares which addresses and ports the upsd daemon will listen on.

<syntaxhighlight lang="nix">
  power.ups = {
    # section: The upsd daemon access control; upsd.conf
    upsd = {
      listen = [
        {
          address = "127.0.0.1";
          port = 3493;
        }
        {
          address = "::1";
          port = 3493;
        }
      ];
    };
  };
</syntaxhighlight>
= Declare users with access to UPS =
Corresponds to file upsd.users. This file declares a virtual user (not related to /etc/passwd users) with write access to the UPS. A password is also declared.

<syntaxhighlight lang="nix">
  power.ups = {
    # section: Users that can access upsd. The upsd daemon user
    # declarations. upsd.users
    users."nut-admin" = {
      passwordFile = ../resources/ups-passwd.txt;
      upsmon = "primary";
    };
  };
</syntaxhighlight>

= Connect upsmon to upsd =
Corresponds to upsmon.conf.  This file declares how upsmon should connect to upsd.
<syntaxhighlight lang="nix">
  power.ups = {
    # section: The upsmon daemon configuration: upsmon.conf
    upsmon.monitor."UPS-1" = {
      system = "UPS-1@localhost";
      powerValue = 1;
      user = "nut-admin";
      passwordFile = ../resources/ups-passwd.txt;
      type = "primary";
    };
  };
</syntaxhighlight>

= Declare how upsmon should react to status changes =
Corresponds to upsmon.conf.  This file declares how upsmon is to handle NOTIFY events.

<syntaxhighlight lang="nix">
  power.ups = {
    upsmon.settings = {
      # This configuration file declares how upsmon is to handle
      # NOTIFY events.

      # POWERDOWNFLAG and SHUTDOWNCMD are provided by NixOS default
      # values

      # values provided by ConfigExamples 3.0 book
      NOTIFYMSG = [
        [ "ONLINE" ''"UPS %s: On line power."'' ]
        [ "ONBATT" ''"UPS %s: On battery."'' ]
        [ "LOWBATT" ''"UPS %s: Battery is low."'' ]
        [ "REPLBATT" ''"UPS %s: Battery needs to be replaced."'' ]
        [ "FSD" ''"UPS %s: Forced shutdown in progress."'' ]
        [ "SHUTDOWN" ''"Auto logout and shutdown proceeding."'' ]
        [ "COMMOK" ''"UPS %s: Communications (re-)established."'' ]
        [ "COMMBAD" ''"UPS %s: Communications lost."'' ]
        [ "NOCOMM" ''"UPS %s: Not available."'' ]
        [ "NOPARENT" ''"upsmon parent dead, shutdown impossible."'' ]
      ];
      NOTIFYFLAG = [
        [ "ONLINE" "SYSLOG+WALL" ]
        [ "ONBATT" "SYSLOG+WALL" ]
        [ "LOWBATT" "SYSLOG+WALL" ]
        [ "REPLBATT" "SYSLOG+WALL" ]
        [ "FSD" "SYSLOG+WALL" ]
        [ "SHUTDOWN" "SYSLOG+WALL" ]
        [ "COMMOK" "SYSLOG+WALL" ]
        [ "COMMBAD" "SYSLOG+WALL" ]
        [ "NOCOMM" "SYSLOG+WALL" ]
        [ "NOPARENT" "SYSLOG+WALL" ]
      ];
      # every RBWARNTIME seconds, upsmon will generate a replace
      # battery NOTIFY event
      RBWARNTIME = 216000;
      # every NOCOMMWARNTIME seconds, upsmon will generate a UPS
      # unreachable NOTIFY event
      NOCOMMWARNTIME = 300;
      # after sending the SHUTDOWN NOTIFY event to warn users, upsmon
      # waits FINALDELAY seconds before executing SHUTDOWNCMD.
      # Some UPSs don't give much warning for low battery and will
      # require a value of 0 here for a safe shutdown.
      FINALDELAY = 0;
    };
  };
</syntaxhighlight>

==== Zpool not found ====
If NixOS fails to import the zpool on reboot, you may need to add <syntaxhighlight lang="nix" inline>boot.zfs.devNodes = "/dev/disk/by-path";</syntaxhighlight> or <syntaxhighlight lang="nix" inline>boot.zfs.devNodes = "/dev/disk/by-partuuid";</syntaxhighlight> to your configuration.nix file.

The differences can be tested by running <code>zpool import -d /dev/disk/by-id</code> when none of the pools are discovered, e.g. from a live ISO.

==== ZFS conflicting with systemd ====
ZFS will manage mounting non-legacy ZFS filesystems, but NixOS tries to manage mounting with systemd. ZFS native mountpoints are not managed as part of the system configuration (but better support hibernation with a separate swap partition). This can lead to conflicts if the ZFS mount service is also enabled for the same datasets.

Disable the mount service with <code>systemd.services.zfs-mount.enable = false;</code> or remove the <code>fileSystems</code> entries in hardware-configuration.nix. Otherwise, use legacy mountpoints (created with e.g. <code>zfs create -o mountpoint=legacy</code>). Mountpoints must be specified with <code>fileSystems."/mount/point" = {};</code> or with <code>nixos-generate-config</code>.
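For the legacy-mountpoint approach to ZFS datasets mentioned under "ZFS conflicting with systemd", a <code>fileSystems</code> entry might look like the sketch below. The dataset name <code>rpool/data</code> and the mount point <code>/data</code> are hypothetical placeholders, and the dataset is assumed to have been created with <code>mountpoint=legacy</code>.

<syntaxhighlight lang="nix">
{
  # Hypothetical dataset and mount point, for illustration only.
  # With mountpoint=legacy, NixOS (via systemd) mounts the dataset
  # instead of the ZFS mount service.
  fileSystems."/data" = {
    device = "rpool/data";
    fsType = "zfs";
  };
}
</syntaxhighlight>

Running <code>nixos-generate-config</code> on a system where the dataset is already mounted should produce an equivalent entry in hardware-configuration.nix.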


== Guides ==