NVIDIA: Difference between revisions
Focusgraph (talk | contribs) mNo edit summary |
MaxWolf-01 (talk | contribs) Add MPS section: systemd service config + Docker usage, document silent PATH failure |
| Line 210: | Line 210: | ||
=== CUDA and using your GPU for compute ===
See the [[CUDA]] wiki page.
=== Multi-Process Service (MPS) ===
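[https://docs.nvidia.com/deploy/mps/index.html NVIDIA Multi-Process Service (MPS)] allows multiple CUDA processes to share a single GPU context. NixOS does not provide a dedicated module for MPS, so a custom systemd service is required. The unit below depends on <code>nvidia-persistenced.service</code>, which NixOS only defines when <code>hardware.nvidia.nvidiaPersistenced</code> is enabled: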
{{file|configuration.nix|nix|<nowiki>
{ config, pkgs, ... }:
{
  systemd.services.nvidia-mps = {
    description = "NVIDIA CUDA Multi-Process Service";
    after = [ "nvidia-persistenced.service" ];
    requires = [ "nvidia-persistenced.service" ];
    wantedBy = [ "multi-user.target" ];
    path = [ config.hardware.nvidia.package.bin ];
    serviceConfig = {
      Type = "forking";
      ExecStart = "${config.hardware.nvidia.package.bin}/bin/nvidia-cuda-mps-control -d";
      ExecStop = "${pkgs.writeShellScript "nvidia-mps-stop" ''
        echo quit | ${config.hardware.nvidia.package.bin}/bin/nvidia-cuda-mps-control
      ''}";
      Restart = "on-failure";
      RestartSec = 5;
    };
  };
}
</nowiki>}}
{{Warning|The <code>path</code> option is required. The MPS control daemon uses <code>execlp</code> to spawn <code>nvidia-cuda-mps-server</code>, so that binary must be on the service's <code>PATH</code>. Without it, the daemon appears to start normally but silently fails to spawn the server process, and CUDA clients fail with error 805 (<code>cudaErrorMpsConnectionFailed</code>).}}
To use MPS from [[Docker]] containers, the MPS pipe directory (<code>/tmp/nvidia-mps</code> by default) must be mounted into the container and the host IPC namespace must be shared. In a Compose file:
<syntaxhighlight lang="yaml">
services:
  gpu-worker:
    ipc: host
    volumes:
      - /tmp/nvidia-mps:/tmp/nvidia-mps
    environment:
      CUDA_MPS_PIPE_DIRECTORY: /tmp/nvidia-mps
</syntaxhighlight>
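If the container is declared in the NixOS configuration rather than in a Compose file, the same three settings can probably be expressed with <code>virtualisation.oci-containers</code>. The following is only a sketch under assumptions: the image name is a placeholder that does not come from this page, and GPU access itself still has to be granted to the container separately.

<syntaxhighlight lang="nix">
{
  virtualisation.oci-containers = {
    backend = "docker";
    containers.gpu-worker = {
      # Placeholder image name, assumed for illustration only.
      image = "example/gpu-worker:latest";
      # Expose the host's MPS pipe directory inside the container.
      volumes = [ "/tmp/nvidia-mps:/tmp/nvidia-mps" ];
      environment.CUDA_MPS_PIPE_DIRECTORY = "/tmp/nvidia-mps";
      # Share the host IPC namespace, mirroring "ipc: host" in Compose.
      extraOptions = [ "--ipc=host" ];
    };
  };
}
</syntaxhighlight>

The <code>extraOptions</code> entries are passed through to the container runtime's <code>run</code> command, which is why the IPC setting appears as a command-line flag instead of a dedicated option.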
=== Running Specific NVIDIA Driver Versions ===