NVIDIA: Difference between revisions

From NixOS Wiki
imported>Samuela


= CUDA and using your GPU for compute =
NixOS supports using NVIDIA GPUs for pure computing purposes, not just for graphics. For example, many users rely on NixOS for machine learning both locally and on cloud instances. These use cases are supported by the [https://github.com/orgs/NixOS/teams/cuda-maintainers @NixOS/cuda-maintainers team] on GitHub. If you have an issue using your NVIDIA GPU for computing purposes, [https://github.com/nixos/nixpkgs/issues/new open an issue] on GitHub and tag @NixOS/cuda-maintainers.
See the [[CUDA]] wiki page!
 
Note that you may need to adjust your driver version to use "data center" GPUs like V100/A100s. See [https://discourse.nixos.org/t/how-to-use-nvidia-v100-a100-gpus/17754 this thread] for more info.
 
== CUDA ==
 
The CUDA toolkit is available in a [https://search.nixos.org/packages?channel=unstable&from=0&size=50&buckets=%7B%22package_attr_set%22%3A%5B%22cudaPackages%22%5D%2C%22package_license_set%22%3A%5B%5D%2C%22package_maintainers_set%22%3A%5B%5D%2C%22package_platforms%22%3A%5B%5D%7D&sort=relevance&type=packages&query=cudatoolkit number of different versions]. Please use the latest major version. You can see where they're defined in nixpkgs [https://github.com/NixOS/nixpkgs/blob/4d5b1d6b273fc4acd5dce966d2e9c0ca197b6df2/pkgs/development/compilers/cudatoolkit/default.nix here].
 
Several "CUDA-X" libraries are packaged as well. In particular,
* cuDNN is packaged [https://github.com/NixOS/nixpkgs/blob/634141959076a8ab69ca2cca0f266852256d79ee/pkgs/development/libraries/science/math/cudnn/default.nix here].
* cuTENSOR is packaged [https://github.com/NixOS/nixpkgs/blob/634141959076a8ab69ca2cca0f266852256d79ee/pkgs/development/libraries/science/math/cutensor/default.nix here].
 
'''Note that these examples have not been updated in a while (as of 2022-03-12) and may not be the best solution. The CUDA sample packaging code [https://github.com/NixOS/nixpkgs/blob/634141959076a8ab69ca2cca0f266852256d79ee/pkgs/test/cuda/cuda-library-samples/generic.nix here] is likely a better resource.'''
 
There are several ways to set up a CUDA development environment on NixOS:
 
* By making a FHS user env
 
{{file|cuda-fhs.nix|nix|<nowiki>
{ pkgs ? import <nixpkgs> {} }:
 
let
  fhs = pkgs.buildFHSUserEnv {
    name = "cuda-env";
    targetPkgs = pkgs: with pkgs; [  
      git
      gitRepo
      gnupg
      autoconf
      curl
      procps
      gnumake
      util-linux
      m4
      gperf
      unzip
      cudatoolkit
      linuxPackages.nvidia_x11
      libGLU libGL
      xorg.libXi xorg.libXmu freeglut
      xorg.libXext xorg.libX11 xorg.libXv xorg.libXrandr zlib
      ncurses5
      stdenv.cc
      binutils
    ];
    multiPkgs = pkgs: with pkgs; [ zlib ];
    runScript = "bash";
    profile = ''
      export CUDA_PATH=${pkgs.cudatoolkit}
      # export LD_LIBRARY_PATH=${pkgs.linuxPackages.nvidia_x11}/lib
      export EXTRA_LDFLAGS="-L/lib -L${pkgs.linuxPackages.nvidia_x11}/lib"
      export EXTRA_CCFLAGS="-I/usr/include"
    '';
  };
in pkgs.stdenv.mkDerivation {
  name = "cuda-env-shell";
  nativeBuildInputs = [ fhs ];
  shellHook = "exec cuda-env";
}
</nowiki>}}
 
 
* By making a nix-shell
{{file|cuda-shell.nix|nix|<nowiki>
{ pkgs ? import <nixpkgs> {} }:
 
pkgs.stdenv.mkDerivation {
  name = "cuda-env-shell";
  buildInputs = with pkgs; [
    git gitRepo gnupg autoconf curl
    procps gnumake util-linux m4 gperf unzip
    cudatoolkit linuxPackages.nvidia_x11
    libGLU libGL
    xorg.libXi xorg.libXmu freeglut
    xorg.libXext xorg.libX11 xorg.libXv xorg.libXrandr zlib
    ncurses5 stdenv.cc binutils
  ];
  shellHook = ''
      export CUDA_PATH=${pkgs.cudatoolkit}
      # export LD_LIBRARY_PATH=${pkgs.linuxPackages.nvidia_x11}/lib:${pkgs.ncurses5}/lib
      export EXTRA_LDFLAGS="-L/lib -L${pkgs.linuxPackages.nvidia_x11}/lib"
      export EXTRA_CCFLAGS="-I/usr/include"
  '';         
}
</nowiki>}}
 
== See also ==
 
* [https://github.com/grahamc/nixos-cuda-example nixos-cuda-example]
* [https://github.com/grahamc/nixos-cuda-example/pull/2 nix-shell envs for Cuda]
* [https://discourse.nixos.org/t/cuda-setup-on-nixos/1118 CUDA setup on NixOS]
* [https://github.com/NixOS/nixpkgs/issues/131608 eGPU with nvidia-docker on intel-xserver]


= Using your GPU for graphics =

Revision as of 19:18, 31 March 2022

Installing NVIDIA official drivers on NixOS

If you're using NixOS, installing and using the official NVIDIA drivers is as simple as:

/etc/nixos/configuration.nix
{
  # NVIDIA drivers are unfree.
  nixpkgs.config.allowUnfree = true;

  services.xserver.videoDrivers = [ "nvidia" ];
  hardware.opengl.enable = true;

  # Optionally, you may need to select the appropriate driver version for your specific GPU.
  hardware.nvidia.package = config.boot.kernelPackages.nvidiaPackages.stable;
  ...
}

As noted in the final comment, you may need to choose the driver version that matches your card. For "legacy" cards, consult NVIDIA's official legacy driver list. The set of available options can be found in the source here.

Using GPUs on non-NixOS

If you're using Nix-packaged software on a non-NixOS system, you'll need a workaround to get everything up and running. The nixGL project provides a wrapper to use GL drivers on non-NixOS systems. You need to have GPU drivers installed on your distro (for kernel modules). With nixGL installed, you'll run nixGL foobar instead of foobar.

Note that nixGL is not specific to NVIDIA GPUs, and should work with just about any GPU.

CUDA and using your GPU for compute

See the CUDA wiki page!

Using your GPU for graphics

Please note that, if you are setting up PRIME offloading, you must set services.xserver.videoDrivers to the single value "nvidia", even though it would be conceptually more correct to also include the driver for your other GPU. Doing otherwise causes a broken xorg.conf to be generated, because NixOS does not handle multiple GPUs / GPU drivers properly, as per https://github.com/NixOS/nixpkgs/issues/108018.

Determining the type of your GPU

  • MXM / output-providing card (shows as VGA Controller in lspci), i.e. graphics card in desktop computer or in some laptops
  • muxless/non-MXM Optimus cards have no display outputs and show as 3D Controller in lspci output, seen in most modern consumer laptops

MXM cards allow you to use the Nvidia card standalone, in Non-Optimus mode. Non-MXM cards require Optimus, Nvidia's integrated-vs-discrete GPU switching technology.
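
As a rough sketch of the check described above, the following shell function reads lspci output and labels each NVIDIA device. The function name classify_nvidia_gpu and the label strings are illustrative assumptions, and the matching relies on the usual lspci class names ("VGA compatible controller" and "3D controller").

```shell
# Hypothetical helper: label NVIDIA devices found in `lspci` output (stdin).
classify_nvidia_gpu() {
  grep -i 'nvidia' | while IFS= read -r line; do
    addr=${line%% *}   # first field is the bus address, e.g. 01:00.0
    case $line in
      *"VGA compatible controller"*) echo "$addr output-providing (MXM)" ;;
      *"3D controller"*)             echo "$addr muxless (Optimus required)" ;;
    esac
  done
}

# Example:
#   lspci | classify_nvidia_gpu
```

Other NVIDIA functions on the same card, such as its audio device, match neither pattern and are skipped.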

Non-Optimus mode

You need an MXM card (see above) for Non-Optimus mode. Follow the NVIDIA Graphics Cards section in the official manual.

On a laptop, you may also need to use a BIOS option to select which card drives the internal display.

Optimus

Mostly useful for laptops. There are currently two solutions available under NixOS, described in detail below:

  • Official solution: Nvidia PRIME (in on-demand "offload" mode, and always-on "sync" mode)
  • Previous open-source solution: Bumblebee (now deprecated)

Nvidia PRIME

The official solution from Nvidia. Currently, reverse PRIME does not work. As a consequence, if your laptop exposes its external display ports only through the dedicated GPU, running in offload mode will not let you use those ports for external monitors. If you wish to use external monitors in that case, you have to use sync mode.

offload mode

Available since 20.09 (see #66601).

In this mode the Nvidia card is only activated on demand. However, completely powering off the card requires an Nvidia card of the Turing generation or newer and an Intel Coffee Lake chipset (see discussion).

Offload mode is enabled by running your program(s) with specific environment variables. For example, here is a sample script called nvidia-offload that you can wrap around your executable, as in nvidia-offload glxgears:

nvidia-offload
export __NV_PRIME_RENDER_OFFLOAD=1
export __NV_PRIME_RENDER_OFFLOAD_PROVIDER=NVIDIA-G0
export __GLX_VENDOR_LIBRARY_NAME=nvidia
export __VK_LAYER_NV_optimus=NVIDIA_only
exec "$@"
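
The same variables can also be set for a single command from an interactive shell, which can be handy for checking that offloading works once the configuration below is in place. This is only a sketch: the function name nvidia_offload is made up here, and the check in the comment assumes the glxinfo tool is available.

```shell
# Hypothetical shell function equivalent of the nvidia-offload script:
# run one command with the PRIME render offload variables set.
nvidia_offload() {
  env \
    __NV_PRIME_RENDER_OFFLOAD=1 \
    __NV_PRIME_RENDER_OFFLOAD_PROVIDER=NVIDIA-G0 \
    __GLX_VENDOR_LIBRARY_NAME=nvidia \
    __VK_LAYER_NV_optimus=NVIDIA_only \
    "$@"
}

# Example (needs a working offload setup and glxinfo):
#   nvidia_offload glxinfo | grep "OpenGL renderer"
```

If offloading works, the renderer line should name the Nvidia card rather than the integrated GPU.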

To configure offload mode, you first need to enable the proprietary Nvidia driver:

/etc/nixos/configuration.nix
{
  services.xserver.videoDrivers = [ "nvidia" ];
  ...
}

Note that on certain laptops and/or if you are using a custom kernel version, you may have issues with your NixOS system finding the primary display. In this case you should use hardware.nvidia.modesetting.enable, i.e.:

/etc/nixos/configuration.nix
 	
{
  hardware.nvidia.modesetting.enable = true;
  services.xserver.videoDrivers = [ "nvidia" ];
  ...
}

Then you need to set up the bus IDs of the cards as shown below.

Note: Bus ID is important and needs to be formatted properly

The Nvidia driver expects the bus ID in decimal format. There are two ways to get the bus IDs: with lspci or with lshw. Both print the bus, device, and function numbers in hexadecimal, so any value above 9 has to be converted to decimal before it is used in the NixOS configuration.

lspci

You can convert the value by:

  • Converting each number (bus, device, function) from hexadecimal to decimal, which also strips any leading zeros.
  • Replacing any full stops with colons.
  • Prefixing the final value with "PCI".

For example:

Output from lspci

09:1f.0 VGA compatible controller: NVIDIA Corporation Device 1f91 (rev a1)

Converted and correct format

PCI:9:31:0
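
The conversion steps above can be sketched as a small shell function. The name to_pci_bus_id is hypothetical; the function treats every field as hexadecimal and relies on printf accepting a 0x prefix for %d, which bash and other POSIX shells do.

```shell
# Hypothetical helper: convert an lspci-style address ("bus:device.function",
# hexadecimal, optionally prefixed with "pci@" and/or a PCI domain) into the
# decimal "PCI:bus:device:function" form.
to_pci_bus_id() {
  addr=${1##*@}                    # drop a leading "pci@" if present
  case $addr in
    *:*:*) addr=${addr#*:} ;;      # drop the PCI domain, e.g. "0000:"
  esac
  bus=${addr%%:*}
  rest=${addr#*:}
  dev=${rest%%.*}
  fn=${rest#*.}
  # The 0x prefix makes printf read each field as hexadecimal.
  printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

# Example:
#   to_pci_bus_id 09:1f.0
```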

lshw

If you don't have lshw installed, you can get it temporarily in an ephemeral shell by running:

nix-shell -p lshw --run "lshw -c display"

The two bus IDs will be in the corresponding "bus info" field.

For example:

bus info: pci@0000:01:00.0

and

bus info: pci@0000:00:02.0

Now take everything after the first colon and replace the . with another colon:

01:00:0

and

00:02:0
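
As a sketch, the extraction can also be scripted by filtering the lshw output. The function name lshw_bus_ids is made up for illustration, and the fields are assumed to be hexadecimal; the output is already in the decimal form with the "PCI" prefix that the configuration expects.

```shell
# Hypothetical helper: read `lshw -c display` output on stdin and print one
# "PCI:bus:device:function" line per "bus info" entry.
lshw_bus_ids() {
  sed -n 's/^ *bus info: pci@[0-9a-fA-F]*:\([0-9a-fA-F]*\):\([0-9a-fA-F]*\)\.\([0-9a-fA-F]*\)$/\1 \2 \3/p' |
  while read -r bus dev fn; do
    # The 0x prefix makes printf read each field as hexadecimal.
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
  done
}

# Example:
#   nix-shell -p lshw --run "lshw -c display" | lshw_bus_ids
```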

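Finally, strip the leading zeros from each field (converting to decimal where needed) and prefix the value with "PCI", which gives PCI:1:0:0 and PCI:0:2:0. A possible configuration: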

/etc/nixos/configuration.nix
{ pkgs, ... }:

let
  nvidia-offload = pkgs.writeShellScriptBin "nvidia-offload" ''
    export __NV_PRIME_RENDER_OFFLOAD=1
    export __NV_PRIME_RENDER_OFFLOAD_PROVIDER=NVIDIA-G0
    export __GLX_VENDOR_LIBRARY_NAME=nvidia
    export __VK_LAYER_NV_optimus=NVIDIA_only
    exec "$@"
  '';
in
{
  environment.systemPackages = [ nvidia-offload ];

  services.xserver.videoDrivers = [ "nvidia" ];
  hardware.nvidia.prime = {
    offload.enable = true;

    # Bus ID of the Intel GPU. You can find it using lspci, either under 3D or VGA
    intelBusId = "PCI:0:2:0";

    # Bus ID of the NVIDIA GPU. You can find it using lspci, either under 3D or VGA
    nvidiaBusId = "PCI:1:0:0";
  };
}

booting with an external display

Most Optimus laptops have the HDMI port for an external display wired directly to the Nvidia chip, in which case you need a configuration to use the Nvidia driver directly without offload mode. Fortunately, NixOS has an amazing feature called specialisations which allows you to do this easily. Here is an example configuration:

/etc/nixos/configuration.nix
  specialisation = {
    external-display.configuration = {
      system.nixos.tags = [ "external-display" ];
      hardware.nvidia.prime.offload.enable = lib.mkForce false;
      hardware.nvidia.powerManagement.enable = lib.mkForce false;
    };
  };

Once you rebuild your configuration, an extra external-display configuration will be built and placed in your boot menu.

To use this, boot your laptop with the lid open, choose the external-display configuration in the boot menu, and continue to keep the lid open until your desktop appears on the external display. At this point you can close the lid.

offloading steam

First, add this to your ~/.bashrc:

export XDG_DATA_HOME="$HOME/.local/share"


For NixOS Steam run:

mkdir -p ~/.local/share/applications
sed 's/^Exec=/&nvidia-offload /' /run/current-system/sw/share/applications/steam.desktop > ~/.local/share/applications/steam.desktop


For Flatpak Steam run:

mkdir -p ~/.local/share/applications
sed 's/^Exec=/&nvidia-offload /' /var/lib/flatpak/exports/share/applications/com.valvesoftware.steam.desktop > ~/.local/share/applications/com.valvesoftware.steam.desktop


Then restart your graphical environment session.
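
To see what the sed expression does before touching real files, it can be run against a small sample entry. The function name add_offload_prefix and the sample content in the comment are made up for illustration.

```shell
# Hypothetical helper: prefix every Exec= line of a desktop entry (stdin)
# with nvidia-offload, using the same sed expression as above.
add_offload_prefix() {
  sed 's/^Exec=/&nvidia-offload /'
}

# Example with a made-up two-line entry:
#   printf 'Name=Steam\nExec=steam %%U\n' | add_offload_prefix
```

Lines that do not start with Exec= pass through unchanged, so the rest of the desktop entry stays as it was.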

sync mode

In this mode the Nvidia card is turned on constantly, which has an impact on laptop battery life and health.

Possible issues:

  • Hangs of applications after resume from suspend
  • Wrong DPI calculation (in this case, set the DPI manually with services.xserver.dpi = 96;)
  • Black screen after system upgrade (e.g. nixos-rebuild switch; use nixos-rebuild boot instead and reboot)
  • No video playback acceleration available (vaapi)
Example for NixOS 20.03
/etc/nixos/configuration.nix
{
  hardware.nvidia.modesetting.enable = true;
  services.xserver.videoDrivers = [ "nvidia" ];
  hardware.nvidia.optimus_prime = {
    enable = true;

    # Bus ID of the NVIDIA GPU. You can find it using lspci, either under 3D or VGA
    nvidiaBusId = "PCI:1:0:0";

    # Bus ID of the Intel GPU. You can find it using lspci, either under 3D or VGA
    intelBusId = "PCI:0:2:0";
  };
}
Example for NixOS 20.09/unstable
/etc/nixos/configuration.nix
{
  services.xserver.videoDrivers = [ "nvidia" ];

  hardware.nvidia.prime = {
    sync.enable = true;

    # Bus ID of the NVIDIA GPU. You can find it using lspci, either under 3D or VGA
    nvidiaBusId = "PCI:1:0:0";

    # Bus ID of the Intel GPU. You can find it using lspci, either under 3D or VGA
    intelBusId = "PCI:0:2:0";
  };
}

Bumblebee

Deprecated solution. You should use offload mode instead.

Use option

hardware.bumblebee.enable = true;


Troubleshooting

Fix screen tearing

You may often encounter screen tearing or artifacts when using the proprietary Nvidia drivers. You can fix that by forcing the full composition pipeline.

Note: This has been reported to reduce the performance of some OpenGL applications and may produce issues in WebGL. It also drastically increases the time the driver needs to clock down after load.
/etc/nixos/configuration.nix
services.xserver.screenSection = ''
  Option         "metamodes" "nvidia-auto-select +0+0 {ForceFullCompositionPipeline=On}"
  Option         "AllowIndirectGLXProtocol" "off"
  Option         "TripleBuffer" "on"
'';
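
Before making the setting permanent, the same metamode can be tried in a running X session with nvidia-settings. This is a sketch that assumes nvidia-settings is installed and a single display is in use; the helper name full_pipeline_metamode is hypothetical.

```shell
# Hypothetical helper: build a metamode string that forces the full
# composition pipeline for the given mode (default: nvidia-auto-select).
full_pipeline_metamode() {
  printf '%s +0+0 { ForceFullCompositionPipeline = On }\n' "${1:-nvidia-auto-select}"
}

# Apply it to the running X session (not persistent across restarts):
#   nvidia-settings --assign CurrentMetaMode="$(full_pipeline_metamode)"
```

If the tearing goes away without unacceptable side effects, the screenSection option above makes the change permanent.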

Fix app flickering with Picom

~/.config/picom/picom.conf
unredir-if-possible = false;
backend = "xrender"; # try "glx" if xrender doesn't help
vsync = true;

Fix graphical corruption on suspend/resume

By default only a small portion of VRAM is saved when suspending the system [1]. This can cause graphical issues in some applications when resuming from suspend. To fix it, enable systemd-based suspend, which will save and restore all of VRAM:

/etc/nixos/configuration.nix
hardware.nvidia.powerManagement.enable = true;

If you have a modern Nvidia GPU (Turing [2] or later), you may also want to investigate the hardware.nvidia.powerManagement.finegrained option: [3]