CUDA: Difference between revisions

Bachp (talk | contribs)
Update fhs example to fix warning: evaluation warning: 'buildFHSUserEnv' has been renamed to 'buildFHSEnv' and will be removed in 25.11
Smudgebun (talk | contribs)
Enabling CUDA In Packages: Improved tips on installing large CUDA packages without a cache
 
(17 intermediate revisions by 6 users not shown)
Line 1: Line 1:
NixOS supports using NVIDIA GPUs for pure computing purposes, not just for graphics. For example, many users rely on NixOS for machine learning both locally and on cloud instances. These use cases are supported by the [https://github.com/orgs/NixOS/teams/cuda-maintainers @NixOS/cuda-maintainers team] on GitHub ([https://github.com/orgs/NixOS/projects/27 project board]). If you have an issue using your NVIDIA GPU for computing purposes, [https://github.com/NixOS/nixpkgs/issues/new/choose open an issue] on GitHub and tag <code>@NixOS/cuda-maintainers</code>.
NixOS supports using NVIDIA GPUs for pure computing purposes, not just for graphics. For example, many users rely on NixOS for machine learning both locally and on cloud instances. These use cases are supported by the [https://github.com/orgs/NixOS/teams/cuda-maintainers @NixOS/cuda-maintainers team] on GitHub ([https://github.com/orgs/NixOS/projects/27 project board]). If you have an issue using your NVIDIA GPU for computing purposes, [https://github.com/NixOS/nixpkgs/issues/new/choose open an issue] on GitHub and tag <code>@NixOS/cuda-maintainers</code>.


{{tip|1='''Cache''': Using the [https://app.cachix.org/cache/nix-community nix-community cache] is recommended! It will save you valuable time and electrons. Getting set up should be as simple as <code>cachix use nix-community</code>. Click [[#Setting up CUDA Binary Cache|here]] for more details.}}
{{tip|1='''Cache''': Using the binary cache is recommended! It will save you valuable time and electrons. Click [[#Setting up CUDA Binary Cache|here]] for more details.}}


{{tip|1='''Data center GPUs''': Note that you may need to adjust your driver version to use "data center" GPUs like V100/A100s. See [https://discourse.nixos.org/t/how-to-use-nvidia-v100-a100-gpus/17754 this thread] for more info.}}
{{tip|1='''Data center GPUs''': Note that you may need to adjust your driver version to use "data center" GPUs like V100/A100s. See [https://discourse.nixos.org/t/how-to-use-nvidia-v100-a100-gpus/17754 this thread] for more info.}}
== Driver Installation ==
Assuming you've followed the [[NVIDIA]] page correctly and have a CUDA-compatible GPU, you shouldn't need to do any further configuration. You can check the highest CUDA version supported by your installed driver by running the following command in your terminal.
{{code|lang=console|line=no|<nowiki>nvidia-smi | grep CUDA</nowiki>}}


== <code>cudatoolkit</code>, <code>cudnn</code>, and related packages ==
== <code>cudatoolkit</code>, <code>cudnn</code>, and related packages ==
 
{{outdated|scope=section|date=July 2024|reason=Note that these examples have been updated more recently (as of 2024-07-30). May not be the best solution. A better resource is likely the packaging CUDA sample code [https://github.com/NixOS/nixpkgs/tree/master/pkgs/development/cuda-modules/cutensor here].}}
The CUDA toolkit is available in a [https://search.nixos.org/packages?channel=unstable&from=0&size=50&buckets=%7B%22package_attr_set%22%3A%5B%22cudaPackages%22%5D%2C%22package_license_set%22%3A%5B%5D%2C%22package_maintainers_set%22%3A%5B%5D%2C%22package_platforms%22%3A%5B%5D%7D&sort=relevance&type=packages&query=cudatoolkit number of different versions]. Please use the latest major version. You can see where they're defined in nixpkgs [https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/cuda-modules/cudatoolkit/releases.nix here].
The CUDA toolkit is available in a [https://search.nixos.org/packages?channel=unstable&from=0&size=50&buckets=%7B%22package_attr_set%22%3A%5B%22cudaPackages%22%5D%2C%22package_license_set%22%3A%5B%5D%2C%22package_maintainers_set%22%3A%5B%5D%2C%22package_platforms%22%3A%5B%5D%7D&sort=relevance&type=packages&query=cudatoolkit number of different versions]. Please use the latest major version. You can see where they're defined in nixpkgs [https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/cuda-modules/cudatoolkit/releases.nix here].


Line 12: Line 18:
* cuDNN is packaged [https://github.com/NixOS/nixpkgs/tree/master/pkgs/development/cuda-modules/cudnn here].
* cuDNN is packaged [https://github.com/NixOS/nixpkgs/tree/master/pkgs/development/cuda-modules/cudnn here].
* cuTENSOR is packaged [https://github.com/NixOS/nixpkgs/tree/master/pkgs/development/cuda-modules/cutensor here].
* cuTENSOR is packaged [https://github.com/NixOS/nixpkgs/tree/master/pkgs/development/cuda-modules/cutensor here].
{{warning|1=Note that these examples have been updated more recently (as of 2024-07-30). May not be the best solution. A better resource is likely the packaging CUDA sample code [https://github.com/NixOS/nixpkgs/tree/master/pkgs/development/cuda-modules/cutensor here].}}


A development environment using CUDA can be set up on NixOS in the following ways:
A development environment using CUDA can be set up on NixOS in the following ways:
Line 19: Line 23:
* By making an FHS user env
* By making an FHS user env


{{file|cuda-fhs.nix|nix|<nowiki>
<syntaxhighlight lang="nix" line="1" start="1">
# Run with `nix-shell cuda-fhs.nix`
# Run with `nix-shell cuda-fhs.nix`
{ pkgs ? import </nowiki><nixpkgs><nowiki> {} }:
{ pkgs ? import <nixpkgs> {} }:
let
  # Change according to the driver used: stable, beta
  nvidiaPackage = pkgs.linuxPackages.nvidiaPackages.stable;
in
(pkgs.buildFHSEnv {
(pkgs.buildFHSEnv {
   name = "cuda-env";
   name = "cuda-env";
Line 37: Line 45:
     unzip
     unzip
     cudatoolkit
     cudatoolkit
     linuxPackages.nvidia_x11
     nvidiaPackage
     libGLU libGL
     libGLU libGL
     xorg.libXi xorg.libXmu freeglut
     xorg.libXi xorg.libXmu freeglut
Line 49: Line 57:
   profile = ''
   profile = ''
     export CUDA_PATH=${pkgs.cudatoolkit}
     export CUDA_PATH=${pkgs.cudatoolkit}
     # export LD_LIBRARY_PATH=${pkgs.linuxPackages.nvidia_x11}/lib
     # export LD_LIBRARY_PATH=${nvidiaPackage}/lib
     export EXTRA_LDFLAGS="-L/lib -L${pkgs.linuxPackages.nvidia_x11}/lib"
     export EXTRA_LDFLAGS="-L/lib -L${nvidiaPackage}/lib"
     export EXTRA_CCFLAGS="-I/usr/include"
     export EXTRA_CCFLAGS="-I/usr/include"
   '';
   '';
}).env
}).env
</nowiki>}}
</syntaxhighlight>




* By making a nix-shell
* By making a nix-shell
{{file|cuda-shell.nix|nix|<nowiki>
<syntaxhighlight lang="nix" line="1" start="1">
# Run with `nix-shell cuda-shell.nix`
# Run with `nix-shell cuda-shell.nix`
{ pkgs ? import </nowiki><nixpkgs><nowiki> {} }:
{ pkgs ? import <nixpkgs> {} }:
let
  nvidiaPackage = pkgs.linuxPackages.nvidiaPackages.stable;
in
pkgs.mkShell {
pkgs.mkShell {
   name = "cuda-env-shell";
   name = "cuda-env-shell";
Line 66: Line 77:
     git gitRepo gnupg autoconf curl
     git gitRepo gnupg autoconf curl
     procps gnumake util-linux m4 gperf unzip
     procps gnumake util-linux m4 gperf unzip
     cudatoolkit linuxPackages.nvidia_x11
     cudatoolkit nvidiaPackage
     libGLU libGL
     libGLU libGL
     xorg.libXi xorg.libXmu freeglut
     xorg.libXi xorg.libXmu freeglut
Line 74: Line 85:
   shellHook = ''
   shellHook = ''
       export CUDA_PATH=${pkgs.cudatoolkit}
       export CUDA_PATH=${pkgs.cudatoolkit}
       # export LD_LIBRARY_PATH=${pkgs.linuxPackages.nvidia_x11}/lib:${pkgs.ncurses5}/lib
       # export LD_LIBRARY_PATH=${nvidiaPackage}/lib:${pkgs.ncurses}/lib
       export EXTRA_LDFLAGS="-L/lib -L${pkgs.linuxPackages.nvidia_x11}/lib"
       export EXTRA_LDFLAGS="-L/lib -L${nvidiaPackage}/lib"
       export EXTRA_CCFLAGS="-I/usr/include"
       export EXTRA_CCFLAGS="-I/usr/include"
   '';           
   '';           
}
}
</nowiki>}}
</syntaxhighlight>


* By making a flake.nix<syntaxhighlight lang="nix" line="1" start="1">
* By making a flake.nix
# flake.nix, run with `nix develop`
<syntaxhighlight lang="nix" line="1" start="1"># flake.nix, run with `nix develop`
{
{
   description = "CUDA development environment";
   description = "CUDA development environment";
Line 96: Line 107:
       config.cudaVersion = "12";
       config.cudaVersion = "12";
     };
     };
    # Change according to the driver used: stable, beta
    nvidiaPackage = pkgs.linuxPackages.nvidiaPackages.stable;
   in {
   in {
     # alejandra is a nix formatter with a beautiful output
     # alejandra is a nix formatter with a beautiful output
Line 105: Line 118:
         cudaPackages.cuda_cudart
         cudaPackages.cuda_cudart
         cudatoolkit
         cudatoolkit
         linuxPackages.nvidia_x11
         nvidiaPackage
         cudaPackages.cudnn
         cudaPackages.cudnn
         libGLU
         libGLU
Line 117: Line 130:
         xorg.libXrandr
         xorg.libXrandr
         zlib
         zlib
         ncurses5
         ncurses
         stdenv.cc
         stdenv.cc
         binutils
         binutils
Line 124: Line 137:


       shellHook = ''
       shellHook = ''
         export LD_LIBRARY_PATH="${pkgs.linuxPackages.nvidia_x11}/lib:$LD_LIBRARY_PATH"
         export LD_LIBRARY_PATH="${nvidiaPackage}/lib:$LD_LIBRARY_PATH"
         export CUDA_PATH=${pkgs.cudatoolkit}
         export CUDA_PATH=${pkgs.cudatoolkit}
         export EXTRA_LDFLAGS="-L/lib -L${pkgs.linuxPackages.nvidia_x11}/lib"
         export EXTRA_LDFLAGS="-L/lib -L${nvidiaPackage}/lib"
         export EXTRA_CCFLAGS="-I/usr/include"
         export EXTRA_CCFLAGS="-I/usr/include"
         export CMAKE_PREFIX_PATH="${pkgs.fmt.dev}:$CMAKE_PREFIX_PATH"
         export CMAKE_PREFIX_PATH="${pkgs.fmt.dev}:$CMAKE_PREFIX_PATH"
Line 133: Line 146:
     };
     };
   };
   };
}
}</syntaxhighlight>
</syntaxhighlight>


== Setting up CUDA Binary Cache ==
== Setting up CUDA Binary Cache ==


The [https://nix-community.org/cache/ Nix-community cache] contains pre-built CUDA packages. By adding it to your system, Nix will fetch these packages instead of building them, saving valuable time and processing power.
The binary cache contains pre-built CUDA packages. By adding it to your system, Nix will fetch these packages instead of building them, saving valuable time and processing power.


For more information, refer to the [[Binary Cache#Using a binary cache|Using a binary cache]] page.
For more information, refer to the [[Binary Cache#Using a binary cache|Using a binary cache]] page.


{{warning|1=You need to rebuild your system at least once after adding the cache, before it can be used.}}
{{info|1=You need to rebuild your system at least once after adding the cache, before it can be used.}}


=== NixOS ===
=== NixOS ===
Add the cache to <code>substituters</code> and <code>trusted-public-keys</code> inside your system configuration:
Add the cache to <code>substituters</code> and <code>trusted-public-keys</code> inside your system configuration:


{{file|/etc/nixos/configuration.nix|nix|<nowiki>
{{file|3=<nowiki>
nix.settings = {
nix.settings = {
   substituters = [
   substituters = [
     "https://nix-community.cachix.org"
     "https://cache.nixos-cuda.org"
   ];
   ];
   trusted-public-keys = [
   trusted-public-keys = [
     "nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs="
     "cache.nixos-cuda.org:74DUi4Ye579gUqzH4ziL9IyiJBlDpMRn9MBN8oNan9M="
   ];
   ];
};
};
</nowiki>}}
</nowiki>|name=/etc/nixos/configuration.nix|lang=nix}}


=== Non-NixOS ===
=== Non-NixOS ===


If you have [https://www.cachix.org cachix] installed and set up, all you need to do is run:

<syntaxHighlight lang="console">
$ cachix use nix-community
</syntaxHighlight>

Otherwise, you have to add <code>substituters</code> and <code>trusted-public-keys</code> to <code>/etc/nix/nix.conf</code>:

{{file|/etc/nix/nix.conf|nix|<nowiki>
trusted-public-keys = nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs=
trusted-substituters = https://nix-community.cachix.org
trusted-users = root @wheel
</nowiki>}}

If your user is in <code>trusted-users</code>, you can also add the cache in your home directory:

{{file|~/.config/nix/nix.conf|nix|<nowiki>
substituters = https://nix-community.cachix.org
</nowiki>}}

You have to add <code>substituters</code> and <code>trusted-public-keys</code> to <code>/etc/nix/nix.conf</code>:

{{file|3=<nowiki>
trusted-public-keys = cache.nixos-cuda.org:74DUi4Ye579gUqzH4ziL9IyiJBlDpMRn9MBN8oNan9M=
trusted-substituters = https://cache.nixos-cuda.org
trusted-users = root @wheel
</nowiki>|name=/etc/nix/nix.conf|lang=nix}}

If your user is in <code>trusted-users</code>, you can also add the cache in your home directory:

{{file|3=<nowiki>
trusted-public-keys = cache.nixos-cuda.org:74DUi4Ye579gUqzH4ziL9IyiJBlDpMRn9MBN8oNan9M=
trusted-substituters = https://cache.nixos-cuda.org
</nowiki>|name=~/.config/nix/nix.conf|lang=nix}}

== Enabling CUDA In Packages ==
By default, software packaged in source code form has CUDA support disabled, because of the unfree license. There are multiple options to solve this.

You can enable builds with CUDA support with a nixpkgs-wide configuration.
<syntaxhighlight lang="nix">
nixpkgs.config.cudaSupport = true;
</syntaxhighlight>

Or you can override individual packages.
<syntaxhighlight lang="nix">
environment.systemPackages = with pkgs; [
  (mlt.override { config.cudaSupport = true; })
];
</syntaxhighlight>

Or you can use binary-packaged versions of CUDA compatible software, such as [https://github.com/edolstra/nix-warez/tree/master/blender blender-bin] for Blender.

{{info|If you will be using <code>cudaSupport</code> in packages, it is recommended you utilize a [[#Setting up CUDA Binary Cache|CUDA binary cache]].}}

Without a [[#Setting up CUDA Binary Cache|CUDA cache]], any CUDA compatible package installed with <code>cudaSupport</code> will be compiled from source. This is because the NixOS Foundation does not build (and therefore [https://cache.nixos.org/ cache.nixos.org] does not cache) CUDA packages.

For larger programs like Blender, that process can be very resource-intensive. If you are installing large CUDA-enabled packages that are not cached, or you are not using a cache, it is recommended (especially on older or weaker hardware) to reduce the number of cores and/or jobs the build may use, to prevent the system from freezing when it runs out of resources. This can be done with the <code>--max-jobs</code> / <code>-j</code> and <code>--cores</code> flags; for more details see the [https://github.com/NixOS/nix/blob/master/doc/manual/source/advanced-topics/cores-vs-jobs.md Tuning Cores & Jobs] manual page.

If you don't want to deal with the increased compilation time when <code>--max-jobs</code> / <code>-j</code> and <code>--cores</code> are set below maximum, you can also try simply closing other running processes to see if that frees up enough resources for compilation to succeed.

&rarr; For specifics on setting up Blender with CUDA (and OptiX) see: [[Blender#CUDA & OptiX]].
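
The <code>--max-jobs</code> and <code>--cores</code> limits discussed for large CUDA builds can also be set persistently instead of per command. The following is a minimal sketch, assuming a NixOS system configured through <code>/etc/nixos/configuration.nix</code>; the numbers are illustrative assumptions, not recommendations, and should be adjusted to your hardware.

<syntaxhighlight lang="nix">
# /etc/nixos/configuration.nix (fragment); values are illustrative only
nix.settings = {
  # Build at most one derivation at a time (same meaning as --max-jobs / -j)
  max-jobs = 1;
  # Let each build use up to 4 CPU cores (same meaning as --cores)
  cores = 4;
};
</syntaxhighlight>

Lower values trade longer build times for lower peak memory and CPU usage, so it may take some experimentation to find a setting that keeps the system responsive.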


== Some things to keep in mind when setting up CUDA in NixOS ==
== Some things to keep in mind when setting up CUDA in NixOS ==
* Some GPUs, like the Tesla K80, don't work with the latest drivers, so you must select an older driver through the option <code>hardware.nvidia.package</code>, taking the value from your selected kernel packages, for example <code>config.boot.kernelPackages.nvidia_x11_legacy470</code>. You can check which driver version your GPU supports on the [https://www.nvidia.com/Download/index.aspx nvidia site].
* Some GPUs, like the Tesla K80, don't work with the latest drivers, so you must select an older driver through the option <code>hardware.nvidia.package</code>, taking the value from your selected kernel packages, for example <code>config.boot.kernelPackages.nvidia_x11_legacy470</code>. You can check which driver version your GPU supports on the [https://www.nvidia.com/Download/index.aspx nvidia site].
* Even with the drivers correctly installed, some software, like Blender, may not see the CUDA GPU. Make sure your system configuration has the option <code>hardware.opengl.enable</code> enabled.
* Even with the drivers correctly installed, some software, like Blender, may not see the CUDA GPU. Make sure your system configuration has the option <code>hardware.graphics.enable</code> enabled.
* By default, software packaged in source code form has CUDA support disabled, because of the unfree license. To solve this, you can enable builds with CUDA support with a nixpkgs wide configuration, or use binary packaged CUDA compatible software such as [https://github.com/edolstra/nix-warez/tree/master/blender blender-bin].
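
As a minimal sketch of the driver and graphics options mentioned in this list, a configuration fragment could look like the following. It assumes a GPU that needs the 470 legacy driver branch and a NixOS release that provides <code>hardware.graphics.enable</code> (older releases use <code>hardware.opengl.enable</code>); the rest of the driver setup from the [[NVIDIA]] page is assumed to be in place already.

<syntaxhighlight lang="nix">
# /etc/nixos/configuration.nix (fragment); assumes the 470 legacy branch fits your GPU
{ config, ... }:
{
  # Pin the driver to a legacy branch that still supports the GPU
  hardware.nvidia.package = config.boot.kernelPackages.nvidia_x11_legacy470;

  # Needed so that applications such as Blender can see the CUDA GPU
  hardware.graphics.enable = true;
}
</syntaxhighlight>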


== CUDA under WSL ==
== CUDA under WSL ==