K3s: Difference between revisions

[https://k3s.io/ K3s] is a simplified version of [[Kubernetes]] that bundles the Kubernetes cluster components into a few small binaries optimized for Edge and IoT devices.


== Single node setup ==


NixOS's K3s documentation is available at:

https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/networking/cluster/k3s/README.md

<syntaxHighlight lang=nix>
{
  networking.firewall.allowedTCPPorts = [
    6443 # k3s: required so that pods can reach the API server (running on port 6443 by default)
    # 2379 # k3s, etcd clients: required if using a "High Availability Embedded etcd" configuration
    # 2380 # k3s, etcd peers: required if using a "High Availability Embedded etcd" configuration
  ];
  networking.firewall.allowedUDPPorts = [
    # 8472 # k3s, flannel: required if using multi-node for inter-node networking
  ];
  services.k3s.enable = true;
  services.k3s.role = "server";
  services.k3s.extraFlags = toString [
    # "--kubelet-arg=v=4" # Optionally add additional args to k3s
  ];
  environment.systemPackages = [ pkgs.k3s ];
}
</syntaxHighlight>


After enabling the service, you can access your cluster through <code>sudo k3s kubectl</code>, e.g. <code>sudo k3s kubectl cluster-info</code>, or by using the generated kubeconfig file at <code>/etc/rancher/k3s/k3s.yaml</code>.
[[Category:Container]]
 
== Multi-node setup ==
 
It is simple to create a cluster of multiple nodes in a highly available setup (all nodes are in the control plane and are part of the etcd cluster).
 
The first node is configured like this:
 
<syntaxHighlight lang=nix>
{
  services.k3s = {
    enable = true;
    role = "server";
    token = "<randomized common secret>";
    clusterInit = true;
  };
}
</syntaxHighlight>
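
Because the token is a shared secret, it may be preferable to keep it out of the configuration itself. The following is a minimal sketch of the same first node, assuming the <code>services.k3s.tokenFile</code> option of the NixOS module; the path <code>/var/lib/secrets/k3s-token</code> is a hypothetical example and has to be created on the node by other means.

<syntaxHighlight lang=nix>
{
  services.k3s = {
    enable = true;
    role = "server";
    # Assumption: this file exists on the node and contains only the shared token.
    # The path is hypothetical; any location readable by root should do.
    tokenFile = "/var/lib/secrets/k3s-token";
    clusterInit = true;
  };
}
</syntaxHighlight>

The same <code>tokenFile</code> setting would then replace <code>token</code> on the other nodes as well.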
 
Any subsequent nodes can be added with a slightly different config:
 
<syntaxHighlight lang=nix>
{
  services.k3s = {
    enable = true;
    role = "server";
    token = "<randomized common secret>";
    serverAddr = "https://<ip of first node>:6443";
  };
}
</syntaxHighlight>
 
For this to work you need to open the aforementioned API, etcd, and flannel ports in the firewall. Note that it is [https://etcd.io/docs/v3.3/faq/#why-an-odd-number-of-cluster-members recommended] to use an odd number of nodes in such a cluster.
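
As a sketch of what this means in practice, the ports that are commented out in the single node example can be opened on every server node as follows. The port numbers are the ones listed in that example; whether this alone is sufficient for a given cluster is not verified here.

<syntaxHighlight lang=nix>
{
  networking.firewall.allowedTCPPorts = [
    6443 # k3s: API server
    2379 # k3s, etcd clients
    2380 # k3s, etcd peers
  ];
  networking.firewall.allowedUDPPorts = [
    8472 # k3s, flannel: inter-node networking
  ];
}
</syntaxHighlight>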
 
Or see this [https://github.com/Mic92/doctor-cluster-config/tree/master/modules/k3s real world example]. You might want to ignore some parts of it, e.g. the monitoring, as these are specific to our setup.
The K3s server needs to import <code>modules/k3s/server.nix</code> and an agent needs to import <code>modules/k3s/agent.nix</code>.
Tip: You might run into issues with CoreDNS not being reachable from agent nodes. Right now, we disable the NixOS firewall altogether until we find a better solution.
 
== ZFS support ==
 
K3s's built-in containerd does not support the ZFS snapshotter. However, it is possible to configure K3s to use an external containerd:
 
<syntaxHighlight lang=nix>
  virtualisation.containerd = {
    enable = true;
    settings =
      let
        fullCNIPlugins = pkgs.buildEnv {
          name = "full-cni";
          paths = with pkgs;[
            cni-plugins
            cni-plugin-flannel
          ];
        };
      in {
        plugins."io.containerd.grpc.v1.cri".cni = {
          bin_dir = "${fullCNIPlugins}/bin";
          conf_dir = "/var/lib/rancher/k3s/agent/etc/cni/net.d/";
        };
      };
  };
  # TODO describe how to enable zfs snapshotter in containerd
  services.k3s.extraFlags = toString [
    "--container-runtime-endpoint unix:///run/containerd/containerd.sock"
  ];
</syntaxHighlight>
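
Regarding the TODO in the block above, one possible approach is to select the snapshotter in containerd's CRI plugin settings. This is an untested sketch: it assumes that the <code>snapshotter</code> key of the CRI plugin is honoured by the packaged containerd version, and that a ZFS dataset is mounted where containerd's zfs snapshotter expects it (by default <code>/var/lib/containerd/io.containerd.snapshotter.v1.zfs</code>).

<syntaxHighlight lang=nix>
{
  # Assumption: containerd's CRI plugin reads its snapshotter from this key.
  virtualisation.containerd.settings.plugins."io.containerd.grpc.v1.cri".containerd.snapshotter = "zfs";
}
</syntaxHighlight>

If this is kept in the same file as the block above, the line belongs inside the existing <code>settings</code> attribute set, next to the <code>cni</code> entry, written as <code>plugins."io.containerd.grpc.v1.cri".containerd.snapshotter = "zfs";</code>.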
 
== Network policies ==
 
The current k3s derivation doesn't include the <code>ipset</code> package, which is required by the network policy controller.
 
When it is missing, the k3s logs contain:
<syntaxHighlight lang=text>
level=warning msg="Skipping network policy controller start, ipset unavailable: ipset utility not found"
</syntaxHighlight>
 
There is an open pull request to fix it: https://github.com/NixOS/nixpkgs/pull/176520#pullrequestreview-1304593562. Until then, the package can be added to k3s's path as follows:
<syntaxHighlight lang=nix>
  systemd.services.k3s.path = [ pkgs.ipset ];
</syntaxHighlight>
 
== Nvidia support ==
To use an Nvidia GPU in the cluster, nvidia-container-runtime and runc are needed. To get both components, it suffices to add the following to the configuration:
 
<syntaxHighlight lang=nix>
virtualisation.docker = {
  enable = true;
  enableNvidia = true;
};
environment.systemPackages = with pkgs; [ docker runc ];
</syntaxHighlight>
 
Note that using Docker here is a workaround: it installs nvidia-container-runtime and makes it accessible at <code>/run/current-system/sw/bin/nvidia-container-runtime</code>. Currently it is not directly accessible in nixpkgs.
 
You now need to create a new file at <code>/var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl</code> with the following content:
 
<syntaxHighlight lang=toml>
{{ template "base" . }}
 
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
  privileged_without_host_devices = false
  runtime_engine = ""
  runtime_root = ""
  runtime_type = "io.containerd.runc.v2"
 
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
  BinaryName = "/run/current-system/sw/bin/nvidia-container-runtime"
</syntaxHighlight>
 
Note that here we are pointing the nvidia runtime to <code>/run/current-system/sw/bin/nvidia-container-runtime</code>.
 
Now apply the following runtime class to the k3s cluster:
 
<syntaxHighlight lang=yaml>
apiVersion: node.k8s.io/v1
handler: nvidia
kind: RuntimeClass
metadata:
  labels:
    app.kubernetes.io/component: gpu-operator
  name: nvidia
</syntaxHighlight>
 
Following [https://github.com/NVIDIA/k8s-device-plugin#deployment-via-helm k8s-device-plugin], install the helm chart with <code>runtimeClassName: nvidia</code> set. In order to pass the Nvidia card through into the container, your deployment's spec must contain:
- runtimeClassName: nvidia
- env:
    - name: NVIDIA_VISIBLE_DEVICES
      value: all
    - name: NVIDIA_DRIVER_CAPABILITIES
      value: all
To test that it is working, exec into a pod and run <code>nvidia-smi</code>. For more configuration options for Nvidia in k3s, see the [https://docs.k3s.io/advanced#nvidia-container-runtime-support k3s docs].
 
== Troubleshooting ==
 
=== Raspberry Pi not working ===
 
If the k3s.service/k3s server does not start and gives you the error <code>FATA[0000] failed to find memory cgroup (v2)</code>, see the GitHub issue: https://github.com/k3s-io/k3s/issues/2067.
 
To fix the problem, you can add the following to your configuration.nix:
 
<source lang="nix">
  boot.kernelParams = [
    "cgroup_enable=cpuset" "cgroup_memory=1" "cgroup_enable=memory"
  ];
</source>
 
 
[[Category:Applications]]
[[Category:Server]]
[[Category:orchestration]]

Latest revision as of 21:54, 18 June 2024
