K3s

K3s is a simplified version of Kubernetes. It bundles all components of a Kubernetes cluster into a few small binaries.

Single node setup

{

  networking.firewall.allowedTCPPorts = [
    6443 # k3s: required so that pods can reach the API server (running on port 6443 by default)
    # 2379 # k3s, etcd clients: required if using a "High Availability Embedded etcd" configuration
    # 2380 # k3s, etcd peers: required if using a "High Availability Embedded etcd" configuration
  ];
  networking.firewall.allowedUDPPorts = [
    # 8472 # k3s, flannel: required if using multi-node for inter-node networking
  ];
  services.k3s.enable = true;
  services.k3s.role = "server";
  services.k3s.extraFlags = toString [
    # "--kubelet-arg=v=4" # Optionally add additional args to k3s
  ];
  environment.systemPackages = [ pkgs.k3s ];
}

After enabling, you can access your cluster through sudo k3s kubectl, e.g. sudo k3s kubectl cluster-info, or by using the generated kubeconfig file in /etc/rancher/k3s/k3s.yaml.
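To use a plain kubectl without sudo, one option is to point KUBECONFIG at the generated file. The following is a minimal sketch, not a recommended default: it assumes you are comfortable making the kubeconfig readable by every local user (it contains cluster-admin credentials), and it relies on the k3s server flag --write-kubeconfig-mode. If you already set services.k3s.extraFlags elsewhere, merge the flag into that list instead of defining the option twice.

```nix
{ pkgs, ... }:
{
  services.k3s.extraFlags = toString [
    # Assumption: world-readable kubeconfig is acceptable on this machine
    "--write-kubeconfig-mode=644"
  ];
  # Lets kubectl and other clients find the k3s kubeconfig by default
  environment.variables.KUBECONFIG = "/etc/rancher/k3s/k3s.yaml";
  environment.systemPackages = [ pkgs.kubectl ];
}
```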

Multi-node setup

It is simple to create a cluster of multiple nodes in a highly available setup, where all nodes are in the control plane and are part of the etcd cluster.

The first node is configured like this:

{
  services.k3s = {
    enable = true;
    role = "server";
    token = "<randomized common secret>";
    clusterInit = true;
  };
}

Any subsequent nodes can be added with a slightly different config:

{
  services.k3s = {
    enable = true;
    role = "server";
    token = "<randomized common secret>";
    serverAddr = "https://<ip of first node>:6443";
  };
}

For this to work, you need to open the aforementioned API (6443/TCP), etcd (2379/TCP and 2380/TCP), and flannel (8472/UDP) ports in the firewall. Note that it is recommended to use an odd number of nodes in such a cluster.
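As a sketch, the firewall part of such a server node could look like this. It simply uncomments the ports already listed in the single node example; adjust it if your setup uses non-default ports.

```nix
{
  networking.firewall.allowedTCPPorts = [
    6443 # k3s API server
    2379 # etcd clients
    2380 # etcd peers
  ];
  networking.firewall.allowedUDPPorts = [
    8472 # flannel, inter-node networking
  ];
}
```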

Alternatively, see this real world example. You might want to ignore some parts of it, e.g. the monitoring, as this is specific to our setup. The K3s server needs to import modules/k3s/server.nix and an agent needs to import modules/k3s/agent.nix.

Tip: You might run into issues with coredns not being reachable from agent nodes. Right now, we disable the NixOS firewall altogether until we find a better solution.
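For orientation, a minimal agent (worker-only) node might be configured roughly as follows. This is a sketch under assumptions, not the configuration from the linked example: it uses the services.k3s.tokenFile option instead of token so that the secret does not end up in the world-readable Nix store, and the path /run/secrets/k3s-token is a hypothetical location that you have to provision yourself (for example with a secrets management tool).

```nix
{
  services.k3s = {
    enable = true;
    role = "agent";
    serverAddr = "https://<ip of first node>:6443";
    # Hypothetical path; the file must contain the same secret the servers use
    tokenFile = "/run/secrets/k3s-token";
  };
}
```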

ZFS support

K3s's builtin containerd does not support the zfs snapshotter. However, it is possible to configure it to use an external containerd:

  virtualisation.containerd = {
    enable = true;
    settings =
      let
        fullCNIPlugins = pkgs.buildEnv {
          name = "full-cni";
          paths = with pkgs;[
            cni-plugins
            cni-plugin-flannel
          ];
        };
      in {
        plugins."io.containerd.grpc.v1.cri".cni = {
          bin_dir = "${fullCNIPlugins}/bin";
          conf_dir = "/var/lib/rancher/k3s/agent/etc/cni/net.d/";
        };
      };
  };
  # TODO describe how to enable zfs snapshotter in containerd
  services.k3s.extraFlags = toString [
    "--container-runtime-endpoint unix:///run/containerd/containerd.sock"
  ];

Network policies

The current k3s derivation doesn't include the ipset package, which is required by the network policy controller. In that case, the k3s logs contain the following warning:

level=warning msg="Skipping network policy controller start, ipset unavailable: ipset utility not found"

There is an open pull request to fix this: https://github.com/NixOS/nixpkgs/pull/176520#pullrequestreview-1304593562. Until then, the package can be added to k3s's path as follows:

  systemd.services.k3s.path = [ pkgs.ipset ];

Nvidia support

To use an Nvidia GPU in the cluster, nvidia-container-runtime and runc are needed. To get both components, it suffices to add the following to the configuration:

virtualisation.docker = {
  enable = true;
  enableNvidia = true;
};
environment.systemPackages = with pkgs; [ docker runc ];

Note that using Docker here is a workaround: it installs nvidia-container-runtime and makes it accessible via /run/current-system/sw/bin/nvidia-container-runtime. Currently, it is not directly accessible in nixpkgs.

You now need to create a new file at /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl with the following content:

{{ template "base" . }}

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
  privileged_without_host_devices = false
  runtime_engine = ""
  runtime_root = ""
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
  BinaryName = "/run/current-system/sw/bin/nvidia-container-runtime"

Note that the nvidia runtime points to "/run/current-system/sw/bin/nvidia-container-runtime", the path provided by the Docker workaround.

Now apply the following runtime class to the k3s cluster:

apiVersion: node.k8s.io/v1
handler: nvidia
kind: RuntimeClass
metadata:
  labels:
    app.kubernetes.io/component: gpu-operator
  name: nvidia

Following k8s-device-plugin, install the Helm chart with runtimeClassName: nvidia set. To pass the Nvidia card through to the container, your deployment's spec must contain:

- runtimeClassName: nvidia
- env:

   - name: NVIDIA_VISIBLE_DEVICES
     value: all
   - name: NVIDIA_DRIVER_CAPABILITIES
     value: all

To test that it is working, exec into a pod and run nvidia-smi. For more configurability of Nvidia-related matters in k3s, see the k3s-docs.

Troubleshooting

Raspberry Pi not working

The k3s.service/k3s server may fail to start with the error FATA[0000] failed to find memory cgroup (v2). See the GitHub issue: https://github.com/k3s-io/k3s/issues/2067.

To fix the problem, add the following kernel parameters to your configuration.nix:

  boot.kernelParams = [
    "cgroup_enable=cpuset" "cgroup_memory=1" "cgroup_enable=memory"
  ];