Swap: Difference between revisions
m →Zswap swap cache: add note of needing to enable systemd for using lz4 |
Axodentally (talk | contribs) Clarify swap on zvol situation. Add notes to zram (recommendation to use userspace oom-killer, zswap instead of zram with writeback). Expand zswap section. |
||
| (30 intermediate revisions by 6 users not shown) | |||
| Line 1: | Line 1: | ||
[[Category:Configuration]] | [[Category:Configuration]] | ||
Swap | Swap allows "cold" pages of virtual memory to be stored somewhere other than physical RAM, effectively allowing more pages to be kept. This can be accomplished by using space on disk, such as a [[#Swap file|swap file]] or [[#Swap partition|swap partition]], or through compression-based methods like [[#Zram swap|zram]]. Additionally, [[#Zswap swap cache|zswap]] can act as a RAM-based compressed cache sitting in front of a traditional disk-based swap device. | ||
= Configuration = | = Configuration = | ||
| Line 23: | Line 23: | ||
swapDevices = [{ | swapDevices = [{ | ||
device = "/var/lib/swapfile"; | device = "/var/lib/swapfile"; | ||
size = 16*1024; # 16 | size = 16*1024; # 16 GiB | ||
}]; | }]; | ||
</nowiki> | </nowiki> | ||
}} | }} | ||
This will create a 16GB swapfile at <code>/var/lib/swapfile</code>. The <code>size</code> value [https://search.nixos.org/options?show=swapDevices.*.size is specified in | This will create a 16 GiB swap file at <code>/var/lib/swapfile</code>. The <code>size</code> value [https://search.nixos.org/options?show=swapDevices.*.size is specified in mebibytes]. NixOS generates the swap file and adds a matching entry to <code>/etc/fstab</code>. | ||
== Swap partition == | == Swap partition == | ||
Swap partitions are typically created during the initial disk partitioning phase of a NixOS installation. For instructions on creating swap partitions, see the relevant NixOS manual sections for [https://nixos.org/manual/nixos/stable/#sec-installation-manual-partitioning-UEFI UEFI]/[https://nixos.org/manual/nixos/stable/#sec-installation-manual-partitioning-MBR MBR] partition schemes and [https://nixos.org/manual/nixos/stable/#sec-installation-manual-partitioning-formatting formatting]. | Swap partitions are typically created during the initial disk partitioning phase of a NixOS installation. For instructions on creating swap partitions, see the relevant NixOS manual sections for [https://nixos.org/manual/nixos/stable/#sec-installation-manual-partitioning-UEFI UEFI]/[https://nixos.org/manual/nixos/stable/#sec-installation-manual-partitioning-MBR MBR] partition schemes and [https://nixos.org/manual/nixos/stable/#sec-installation-manual-partitioning-formatting formatting]. | ||
Swap partitions can be defined in <code>configuration.nix</code> in the same way as the swap file above or, on GPT disks, be discovered automatically by <code>systemd-gpt-auto-generator(8)</code>. Defining them explicitly gives you control over the swap mount options and lets you enable features such as encrypted swap. | |||
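As a minimal sketch, an explicitly declared swap partition could look like the following. The partition label <code>swap</code> and the priority value are assumptions made for illustration; substitute whatever identifies your own partition.

<syntaxhighlight lang="nix">
swapDevices = [{
  # Hypothetical label; any stable path under /dev/disk/by-* works here
  device = "/dev/disk/by-label/swap";
  # Optional: swap devices with a higher priority are used first
  priority = 10;
}];
</syntaxhighlight>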
== Zram swap == | == Zram swap == | ||
| Line 45: | Line 47: | ||
It is an alternative or complementary approach to swap disks, suitable for systems with enough RAM. In the event the system needs to swap it will move uncompressed RAM contents into the compressed area, saving RAM space while effectively increasing the available RAM at the cost of computational power for compression and decompression. | It is an alternative or complementary approach to swap disks, suitable for systems with enough RAM. In the event the system needs to swap it will move uncompressed RAM contents into the compressed area, saving RAM space while effectively increasing the available RAM at the cost of computational power for compression and decompression. | ||
{{Note|When using zram for swap, it is highly recommended to enable a userspace OOM killer such as systemd-oomd (via {{nixos:option|systemd.oomd.enable}}). Because zram acts as a block device with a hard capacity limit in RAM, the kernel's native OOM killer can sometimes fail to trigger in time under heavy memory pressure, leading to severe system lockups.<ref name="ChrisDown-zswapVsZram">https://chrisdown.name/2026/03/24/zswap-vs-zram-when-to-use-what.html</ref>}} | |||
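A minimal sketch of combining zram swap with systemd-oomd might look like this. Enabling monitoring of user slices is an assumption about what a desktop system would want, not a requirement stated above.

<syntaxhighlight lang="nix">
zramSwap.enable = true;

systemd.oomd = {
  enable = true;
  # Assumption: also let oomd act on user sessions, where most
  # memory-hungry desktop applications run
  enableUserSlices = true;
};
</syntaxhighlight>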
See [https://search.nixos.org/options?query=zramSwap zramSwap] for a full list of available options and their descriptions. | See [https://search.nixos.org/options?query=zramSwap zramSwap] for a full list of available options and their descriptions. | ||
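For orientation, a sketch using a few of the most commonly adjusted options is shown below. The values are illustrative assumptions rather than recommendations.

<syntaxhighlight lang="nix">
zramSwap = {
  enable = true;
  algorithm = "zstd";   # compression algorithm used by the zram device
  memoryPercent = 50;   # upper bound on stored data, as a percentage of RAM
  priority = 5;         # swap priority relative to other swap devices
};
</syntaxhighlight>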
| Line 51: | Line 55: | ||
Zram supports writeback functionality, allowing idle or incompressible pages to be moved to a backing storage device rather than keeping them in memory. Currently, writeback can only use block storage devices (such as partitions) and does not support swap files. The backing partition must be manually created first, but does not require formatting. | Zram supports writeback functionality, allowing idle or incompressible pages to be moved to a backing storage device rather than keeping them in memory. Currently, writeback can only use block storage devices (such as partitions) and does not support swap files. The backing partition must be manually created first, but does not require formatting. | ||
{{Note|While zram writeback allows moving incompressible or idle pages to disk, it is not fully integrated into the kernel's memory management subsystem. If your goal is to have compressed RAM that automatically spills over to a physical disk when full, [[#Zswap swap cache|zswap]] is generally the better and more robust tool for the job <ref name="ChrisDown-zswapVsZram"/>.}} | |||
An example configuration: | An example configuration: | ||
| Line 59: | Line 65: | ||
enable = true; | enable = true; | ||
writebackDevice = "/dev/sda1"; | writebackDevice = "/dev/sda1"; | ||
}; | |||
</nowiki> | </nowiki> | ||
}} | }} | ||
| Line 67: | Line 74: | ||
cat /sys/block/zram0/backing_dev | cat /sys/block/zram0/backing_dev | ||
</syntaxhighlight> | </syntaxhighlight> | ||
If you see an error entry like | |||
<pre> | |||
Jul 08 17:14:50 COMPUTER zram-generator[3056]: Error: Failed to configure write-back device into /sys/block/zram0/backing_dev | |||
Jul 08 17:14:50 COMPUTER zram-generator[3056]: Caused by: | |||
Jul 08 17:14:50 COMPUTER zram-generator[3056]: Device or resource busy (os error 16) | |||
</pre> | |||
This is probably because the writeback device is already in use elsewhere (e.g. activated as swap). To avoid this, follow the [[#Disable swap]] section and make sure the writeback device is not being activated as swap (this can happen due to <code>systemd-gpt-auto-generator(8)</code>). Note that zram writeback does ''not'' respect the swap on-disk format and will destroy any existing swap header on the device. | |||
== Zswap swap cache == | == Zswap swap cache == | ||
[https://docs.kernel.org/admin-guide/mm/zswap.html Zswap] is a | [https://docs.kernel.org/admin-guide/mm/zswap.html Zswap] is a compressed RAM cache for swap pages. It acts as a middle layer between system memory and a traditional disk-based swap device, storing compressed pages in RAM before optionally writing them out to disk-based swap if necessary. Because zswap integrates directly into the kernel's memory management subsystem, it automatically and dynamically evicts the coldest pages to your backing disk when memory pressure rises. This graceful degradation provides a significant architectural advantage over zram for systems that have physical disk swap. | ||
Unlike zram, zswap requires a disk-based swap device to back it. | Unlike zram, zswap requires a disk-based swap device or file to back it. | ||
Zswap is controlled by kernel parameters and can be enabled in your NixOS configuration by setting appropriate options through <code>boot.kernelParams</code>. | Zswap is controlled by kernel parameters and can be enabled in your NixOS configuration by setting appropriate options through <code>boot.kernelParams</code>. | ||
| Line 82: | Line 98: | ||
"zswap.compressor=lz4" # compression algorithm | "zswap.compressor=lz4" # compression algorithm | ||
"zswap.max_pool_percent=20" # maximum percentage of RAM that zswap is allowed to use | "zswap.max_pool_percent=20" # maximum percentage of RAM that zswap is allowed to use | ||
"zswap.shrinker_enabled=1" # whether to shrink the pool proactively on high memory pressure | |||
]; | ]; | ||
</nowiki> | </nowiki> | ||
| Line 89: | Line 106: | ||
You can verify zswap's runtime status via <code>cat /sys/module/zswap/parameters/enabled</code> and inspect usage statistics with <code># grep -r . /sys/kernel/debug/zswap/</code> | You can verify zswap's runtime status via <code>cat /sys/module/zswap/parameters/enabled</code> and inspect usage statistics with <code># grep -r . /sys/kernel/debug/zswap/</code> | ||
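Because zswap needs a backing swap device, a complete setup pairs the kernel parameters with a <code>swapDevices</code> entry. The following is a minimal sketch; the swap file path and its 8 GiB size are assumptions chosen for illustration, and the compressor is left at the kernel default.

<syntaxhighlight lang="nix">
# Disk-based swap that zswap evicts cold pages to
swapDevices = [{
  device = "/var/lib/swapfile";
  size = 8*1024; # in mebibytes, i.e. 8 GiB
}];

boot.kernelParams = [
  "zswap.enabled=1"           # turn on the compressed cache
  "zswap.max_pool_percent=20" # cap the pool at 20% of RAM
];
</syntaxhighlight>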
A proper zswap configuration module is [https://github.com/NixOS/nixpkgs/pull/470366 currently under review]. | |||
== Disable swap == | == Disable swap == | ||
| Line 98: | Line 117: | ||
</syntaxhighlight> | </syntaxhighlight> | ||
If you are using GPT partitioning tables, <code>systemd-gpt-auto-generator(8)</code> will still mount your swap partition automatically. You must therefore turn on attribute 63 on | If you are using GPT partitioning tables, <code>systemd-gpt-auto-generator(8)</code> will still mount your swap partition automatically. You must therefore turn on attribute 63 ("no-auto") on ''each'' swap partition in the partition table. This can be done with gptfdisk or similar: | ||
<syntaxhighlight lang="console"> | <syntaxhighlight lang="console"> | ||
| Line 108: | Line 127: | ||
<enter> | <enter> | ||
w | w | ||
</syntaxhighlight> | |||
Alternatively, <code>systemd-gpt-auto-generator(8)</code> for swap can be disabled globally through a kernel cmdline <code>systemd.swap=0</code>: | |||
<syntaxhighlight lang="nix"> | |||
boot.kernelParams = [ "systemd.swap=0" ]; | |||
</syntaxhighlight> | </syntaxhighlight> | ||
= Tips and Tricks = | = Tips and Tricks = | ||
== | == Mount options == | ||
=== discard === | |||
Solid state drives have fast random access times, which makes them great for swap if you ignore their limited lifespan. Enabling TRIM (discard) on the swap device can help avoid unnecessary copy operations on the SSD, reducing wear and potentially improving performance. | |||
<syntaxhighlight lang="nix"> | <syntaxhighlight lang="nix"> | ||
swapDevices = [{ | swapDevices = [{ | ||
device = "/dev/sdXY"; | device = "/dev/sdXY"; | ||
options = [ "discard" ]; # equivalent to swapon --discard | |||
}]; | |||
</syntaxhighlight> | |||
A lower-impact option is <code>"discard=once"</code>, which runs discard exactly once when the swap is enabled, but does not continually issue discard commands as pages are being overwritten. This could make more sense depending on your hardware. | |||
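A sketch of the one-time variant, reusing the placeholder device from the example above:

<syntaxhighlight lang="nix">
swapDevices = [{
  device = "/dev/sdXY"; # placeholder, as above
  # Discard the whole swap area once at swapon time only
  options = [ "discard=once" ];
}];
</syntaxhighlight>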
<code>systemd-gpt-auto-generator(8)</code> does not automatically enable <code>discard</code>. Also, never enable <code>discard</code> on mdadm RAID setups, as ArchWiki reports that it causes lockups. | |||
== Encrypt swap with random key == | |||
Because data from memory is evicted into swap, any secret data in memory can also end up in swap. Because the disks backing the swap are often nonvolatile (data is not lost after a power cut), this is another way for data to end up in the wrong hands if your computer is seized. | |||
By encrypting the swap with a random key kept in memory, we make sure that the contents of the swap become unreadable as soon as the data in memory has been lost. NixOS provides an option that does this for you, generating a new key on each boot: | |||
<syntaxhighlight lang="nix">swapDevices = [{ | |||
device = "/dev/disk/by-partuuid/aaaaaaaaa-bbbb-cccc-dddd-0123456789ab"; | |||
randomEncryption.enable = true; | randomEncryption.enable = true; | ||
}];</syntaxhighlight> | |||
The selected device will have all its content made unusable at every boot. Using a partuuid or partlabel is recommended because it is less subject to change when the overall partition scheme changes. | |||
If you want to use TRIM, set <code>randomEncryption.allowDiscards</code> in addition to the <code>options</code>. This has two security implications: | |||
* it tells whoever gets hold of your swap drive which parts are actually in use (bad), | |||
* it tells your SSD not to return the data in unused parts and not to preserve it during garbage collection (good). | |||
You will need to weigh the two against each other. | |||
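If you decide that the trade-off is acceptable, the combination could be sketched as follows. The device path is the same placeholder used in the example above.

<syntaxhighlight lang="nix">
swapDevices = [{
  device = "/dev/disk/by-partuuid/aaaaaaaaa-bbbb-cccc-dddd-0123456789ab";
  randomEncryption = {
    enable = true;
    allowDiscards = true; # let TRIM requests pass through the encryption layer
  };
  options = [ "discard" ]; # have swapon actually issue the discards
}];
</syntaxhighlight>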
'''Warning:''' On some NixOS versions, if <code>randomEncryption.enable = true</code> and the <code>swap</code> is a file (rather than a partition) located on an encrypted LUKS partition, [https://discourse.nixos.org/t/swap-file-on-luks-partition/72234 the system can freeze as soon as the swap is used.] | |||
Using a random key makes hibernation impossible. If you want to use hibernation, use regular [[Full Disk Encryption]] with a fixed key. Alternatively, you can encrypt the swap partition separately, as described in the next section. | |||
== Encrypt swap partition with password or fixed key == | |||
If you prefer to encrypt the swap partition individually, first create an unformatted partition of the desired size, for example using <code>gparted</code>. In the following, the partition is <code>/dev/sdXY</code>. Then run:<syntaxhighlight lang="bash"> | |||
sudo cryptsetup luksFormat /dev/sdXY --label lb_luks_swap | |||
sudo cryptsetup luksOpen /dev/disk/by-label/lb_luks_swap swap | |||
sudo mkswap /dev/mapper/swap -L lb_swap | |||
</syntaxhighlight>When asked, provide a password for unlocking the partition. | |||
These commands: | |||
* create a LUKS container with label <code>lb_luks_swap</code> on the unformatted partition, | |||
* open it and map it to <code>/dev/mapper/swap</code>, | |||
* format it as swap with label <code>lb_swap</code>. | |||
If all is correct, block devices should look similar to:<syntaxhighlight lang="bash"> | |||
$ lsblk -o +LABEL | |||
... | |||
└─sdXY 259:16 0 128G 0 part lb_luks_swap | |||
└─swap 254:0 0 128G 0 crypt [SWAP] lb_swap | |||
... | |||
</syntaxhighlight>To tell NixOS to use this partition for swap, add to <code>hardware-configuration.nix</code>:<syntaxhighlight lang="nix"> | |||
swapDevices = [{ | |||
device = "/dev/disk/by-label/lb_swap"; | |||
encrypted = { | |||
enable = true; | |||
label = "swap"; | |||
blkDev = "/dev/disk/by-label/lb_luks_swap"; | |||
}; | |||
}]; | }]; | ||
</syntaxhighlight>This automatically adds the swap partition to <code>boot.initrd.luks.devices</code> so that <code>initrd</code> will ask for a password on reboot. The initrd will automatically try the same password on any other LUKS volumes listed in <code>boot.initrd.luks.devices</code>, so if you use the same password for other volumes you will only have to type it once. If all went well, the swap partition should be available at <code>/dev/mapper/swap</code> and <code>/dev/disk/by-label/lb_swap</code>. | |||
It is also possible to specify a key file using the <code>--key-file</code> argument to <code>luksFormat</code> and <code>luksOpen</code>. Be aware that the system needs access to this file during boot, so if the key itself is stored on an encrypted volume, it may be tricky to get the unlock sequencing right. | |||
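On the NixOS side, the key file can be referenced through the <code>encrypted.keyFile</code> option. The following is a sketch only: the key file location is hypothetical, and the <code>/mnt-root</code> prefix assumes the traditional (non-systemd) initrd, where file systems needed for boot are mounted under that path when the key is read.

<syntaxhighlight lang="nix">
swapDevices = [{
  device = "/dev/disk/by-label/lb_swap";
  encrypted = {
    enable = true;
    label = "swap";
    blkDev = "/dev/disk/by-label/lb_luks_swap";
    # Hypothetical path; the volume holding it must be unlocked first
    keyFile = "/mnt-root/root/.swapkey";
  };
}];
</syntaxhighlight>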
== Adjusting swap usage behaviour == | == Adjusting swap usage behaviour == | ||
[https://docs.kernel.org/admin-guide/sysctl/vm.html#swappiness Swappiness] controls how | [https://docs.kernel.org/admin-guide/sysctl/vm.html#swappiness Swappiness] controls how aggressively swap space is used, specifically how the kernel chooses between swapping out anonymous memory and dropping caches when it needs to free memory. By default, Linux uses a swappiness value of 60. Higher values make the kernel prefer swapping out idle processes over dropping caches. Conversely, lower values make it avoid swapping as much as possible, keeping processes in RAM unless absolutely necessary. The optimal value is workload-dependent and requires experimentation. | ||
{{file|/etc/nixos/configuration.nix|nix| | {{file|/etc/nixos/configuration.nix|nix| | ||
| Line 135: | Line 222: | ||
}} | }} | ||
You can see your current swappiness level by <code>cat /proc/sys/vm/swappiness</code>. | You can see your current swappiness level by <code>cat /proc/sys/vm/swappiness</code>. The lowest accepted value is 0 while the maximum value is 200. The lowest sane value is 1 (with 0, the kernel does not swap out anonymous pages, i.e. process memory that is not backed by a file, until free and file-backed memory falls below the high watermark). | ||
For more on tuning the swap, start with [https://wiki.archlinux.org/title/Swap#Swappiness ArchWiki]'s description. | |||
== ZFS and swap == | == ZFS and swap == | ||
OpenZFS does not support swap on zvols | OpenZFS does not support swap files on a ZFS dataset. Swap on zvols is technically supported, but it does not support resume from hibernation and can lead to system lockups under high memory pressure, and is thus not recommended (see [https://github.com/openzfs/zfs/issues/7734 OpenZFS issue #7734] and the [https://wiki.archlinux.org/title/ZFS#Swap_volume ArchWiki section on swap volumes in ZFS]). | ||
Instead you should set up a swap partition or swap file on a non-ZFS filesystem.<ref>https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSForSwapMyViews</ref> | Instead, you should set up a swap partition or swap file on a non-ZFS filesystem.<ref>https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSForSwapMyViews</ref> | ||
== Using swap files on Btrfs == | == Using swap files on Btrfs == | ||