[https://k3s.io/ K3s] is a simplified [[Kubernetes]] distribution that bundles the components of a Kubernetes cluster into a few small binaries, optimized for edge and IoT devices.

NixOS's K3s documentation is available at: https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/networking/cluster/k3s/README.md

== Single node setup ==

<syntaxHighlight lang=nix>
{
  networking.firewall.allowedTCPPorts = [
    6443 # k3s: required so that pods can reach the API server (running on port 6443 by default)
    # 2379 # k3s, etcd clients: required if using a "High Availability Embedded etcd" configuration
    # 2380 # k3s, etcd peers: required if using a "High Availability Embedded etcd" configuration
  ];
  networking.firewall.allowedUDPPorts = [
    # 8472 # k3s, flannel: required if using multi-node for inter-node networking
  ];
  services.k3s.enable = true;
  services.k3s.role = "server";
  services.k3s.extraFlags = toString [
    # "--kubelet-arg=v=4" # Optionally add additional args to k3s
  ];
  environment.systemPackages = [ pkgs.k3s ];
}
</syntaxHighlight>

After enabling, you can access your cluster through <code>sudo k3s kubectl</code>, e.g. <code>sudo k3s kubectl cluster-info</code>, or by using the generated kubeconfig file at <code>/etc/rancher/k3s/k3s.yaml</code>.
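
For convenience you can also install a standalone <code>kubectl</code> and point it at the generated kubeconfig. A minimal sketch; note that <code>/etc/rancher/k3s/k3s.yaml</code> is only readable by root unless you relax its mode (e.g. via the k3s flag <code>--write-kubeconfig-mode</code>):

<syntaxHighlight lang=nix>
{
  # Standalone kubectl that talks to the local k3s cluster.
  environment.systemPackages = [ pkgs.kubectl ];
  # Point kubectl at the kubeconfig generated by k3s (root-only by default).
  environment.variables.KUBECONFIG = "/etc/rancher/k3s/k3s.yaml";
}
</syntaxHighlight>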

== Multi-node setup ==

It is simple to create a cluster of multiple nodes in a highly available setup, where all nodes are part of the control plane and members of the etcd cluster.

The first node is configured like this:

<syntaxHighlight lang=nix>
{
  services.k3s = {
    enable = true;
    role = "server";
    token = "<randomized common secret>";
    clusterInit = true;
  };
}
</syntaxHighlight>

Any other subsequent node can be added with a slightly different configuration:

<syntaxHighlight lang=nix>
{
  services.k3s = {
    enable = true;
    role = "server";
    token = "<randomized common secret>";
    serverAddr = "https://<ip of first node>:6443";
  };
}
</syntaxHighlight>
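
A token set via <code>services.k3s.token</code> ends up in the world-readable Nix store. If that is a concern, the k3s module also provides a <code>tokenFile</code> option; a sketch, assuming the secret file is deployed out of band (e.g. with sops-nix or agenix; the path below is a placeholder):

<syntaxHighlight lang=nix>
{
  services.k3s = {
    enable = true;
    role = "server";
    # Read the shared secret from a file instead of the Nix store;
    # /run/secrets/k3s-token is a placeholder path.
    tokenFile = "/run/secrets/k3s-token";
    serverAddr = "https://<ip of first node>:6443";
  };
}
</syntaxHighlight>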

For this to work you need to open the aforementioned API, etcd, and flannel ports in the firewall on every node, as sketched below. Note that it is [https://etcd.io/docs/v3.3/faq/#why-an-odd-number-of-cluster-members recommended] to use an odd number of nodes in such a cluster.
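
This corresponds to uncommenting the ports from the single-node example:

<syntaxHighlight lang=nix>
{
  networking.firewall.allowedTCPPorts = [
    6443 # Kubernetes API server
    2379 # etcd clients (high-availability embedded etcd)
    2380 # etcd peers (high-availability embedded etcd)
  ];
  networking.firewall.allowedUDPPorts = [
    8472 # flannel VXLAN traffic between nodes
  ];
}
</syntaxHighlight>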

Alternatively, see this [https://github.com/Mic92/doctor-cluster-config/tree/master/modules/k3s real-world example]. You might want to ignore some parts of it, e.g. the monitoring, as they are specific to that setup. There, a K3s server needs to import <code>modules/k3s/server.nix</code> and an agent <code>modules/k3s/agent.nix</code>.

Tip: You might run into issues with coredns not being reachable from agent nodes. Right now, that configuration disables the NixOS firewall altogether until a better solution is found.

== ZFS support ==

K3s's builtin containerd does not support the zfs snapshotter. However, it is possible to configure K3s to use an external containerd:

<syntaxHighlight lang=nix>
  virtualisation.containerd = {
    enable = true;
    settings =
      let
        fullCNIPlugins = pkgs.buildEnv {
          name = "full-cni";
          paths = with pkgs; [
            cni-plugins
            cni-plugin-flannel
          ];
        };
      in {
        plugins."io.containerd.grpc.v1.cri".cni = {
          bin_dir = "${fullCNIPlugins}/bin";
          conf_dir = "/var/lib/rancher/k3s/agent/etc/cni/net.d/";
        };
        # Optionally set private registry credentials here instead of using /etc/rancher/k3s/registries.yaml
        # plugins."io.containerd.grpc.v1.cri".registry.configs."registry.example.com".auth = {
        #  username = "";
        #  password = "";
        # };
      };
  };
  # TODO describe how to enable zfs snapshotter in containerd
  services.k3s.extraFlags = toString [
    "--container-runtime-endpoint unix:///run/containerd/containerd.sock"
  ];
</syntaxHighlight>
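
A possible way to address the TODO above, as an untested sketch: containerd ships a zfs snapshotter that can be selected in the same <code>settings</code> attrset; it requires a ZFS dataset mounted at containerd's zfs state directory:

<syntaxHighlight lang=nix>
{
  # Untested sketch: select the zfs snapshotter in the external containerd.
  # Requires a ZFS dataset mounted at
  # /var/lib/containerd/io.containerd.snapshotter.v1.zfs
  # (containerd's default zfs snapshotter directory).
  virtualisation.containerd.settings.plugins."io.containerd.grpc.v1.cri".containerd.snapshotter = "zfs";
}
</syntaxHighlight>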

== Network policies ==

The current k3s derivation doesn't include the <code>ipset</code> package, which is required by the network policy controller. k3s logs:

<syntaxHighlight lang=text>
level=warning msg="Skipping network policy controller start, ipset unavailable: ipset utility not found"
</syntaxHighlight>

There is an open pull request to fix it: https://github.com/NixOS/nixpkgs/pull/176520#pullrequestreview-1304593562. Until then, the package can be added to k3s's path as follows:

<syntaxHighlight lang=nix>
systemd.services.k3s.path = [ pkgs.ipset ];
</syntaxHighlight>

== Nvidia support ==

To use an Nvidia GPU in the cluster, nvidia-container-runtime and runc are needed. To get these two components, it suffices to add the following to the configuration:

<syntaxHighlight lang=nix>
virtualisation.docker = {
  enable = true;
  enableNvidia = true;
};
environment.systemPackages = with pkgs; [ docker runc ];
</syntaxHighlight>
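
On recent NixOS releases <code>virtualisation.docker.enableNvidia</code> is marked deprecated; if your nixpkgs version provides it, the dedicated container-toolkit module is an alternative way to obtain the runtime (option name taken from recent nixpkgs, verify it exists on your release):

<syntaxHighlight lang=nix>
{
  # Alternative to docker + enableNvidia on recent nixpkgs (assumed option;
  # check your release). Installs the NVIDIA Container Toolkit runtime.
  hardware.nvidia-container-toolkit.enable = true;
}
</syntaxHighlight>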

Note: using docker here is a workaround. It installs nvidia-container-runtime, which makes it accessible at <code>/run/current-system/sw/bin/nvidia-container-runtime</code>; currently it is not directly accessible in nixpkgs.

You now need to create a new file at <code>/var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl</code> with the following content:

<syntaxHighlight lang=toml>
{{ template "base" . }}

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
  privileged_without_host_devices = false
  runtime_engine = ""
  runtime_root = ""
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
  BinaryName = "/run/current-system/sw/bin/nvidia-container-runtime"
</syntaxHighlight>

Note that here we point the nvidia runtime to <code>/run/current-system/sw/bin/nvidia-container-runtime</code>.

Now apply the following RuntimeClass to the k3s cluster:

<syntaxHighlight lang=yaml>
apiVersion: node.k8s.io/v1
handler: nvidia
kind: RuntimeClass
metadata:
  labels:
    app.kubernetes.io/component: gpu-operator
  name: nvidia
</syntaxHighlight>

Following [https://github.com/NVIDIA/k8s-device-plugin#deployment-via-helm k8s-device-plugin], install the Helm chart with <code>runtimeClassName: nvidia</code> set. In order to pass the Nvidia card through to the container, your deployment's spec must contain:

<syntaxHighlight lang=yaml>
runtimeClassName: nvidia
env:
  - name: NVIDIA_VISIBLE_DEVICES
    value: all
  - name: NVIDIA_DRIVER_CAPABILITIES
    value: all
</syntaxHighlight>

To test that it works, exec into a pod and run <code>nvidia-smi</code>. For more configurability of Nvidia-related matters in k3s, see the [https://docs.k3s.io/advanced#nvidia-container-runtime-support k3s docs].

== Storage ==

=== Longhorn ===

NixOS configuration required for Longhorn:

<syntaxHighlight lang=nix>
environment.systemPackages = [ pkgs.nfs-utils ];
services.openiscsi = {
  enable = true;
  name = "hostname-initiatorhost";
};
</syntaxHighlight>
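
The iSCSI initiator name should be unique per node. A sketch deriving it from the hostname (the IQN prefix here is an arbitrary example, not something Longhorn mandates):

<syntaxHighlight lang=nix>
{ config, ... }:
{
  services.openiscsi = {
    enable = true;
    # IQN format: iqn.<yyyy-mm>.<reversed-domain>:<unique-name>; only
    # per-node uniqueness matters, the prefix is an arbitrary assumption.
    name = "iqn.2020-08.org.example:${config.networking.hostName}";
  };
}
</syntaxHighlight>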

The Longhorn container has trouble with the NixOS path. The solution is to override the PATH environment variable, such as:

<syntaxHighlight lang=bash>
PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/run/wrappers/bin:/nix/var/nix/profiles/default/bin:/run/current-system/sw/bin
</syntaxHighlight>

==== Kyverno Policy for Fixing Longhorn Container ====

<syntaxHighlight lang=yaml>
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: longhorn-nixos-path
  namespace: longhorn-system
data:
  PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/run/wrappers/bin:/nix/var/nix/profiles/default/bin:/run/current-system/sw/bin
---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: longhorn-add-nixos-path
  annotations:
    policies.kyverno.io/title: Add Environment Variables from ConfigMap
    policies.kyverno.io/subject: Pod
    policies.kyverno.io/category: Other
    policies.kyverno.io/description: >-
      Longhorn invokes executables on the host system, and needs
      to be aware of the host system's PATH. This modifies all
      deployments such that the PATH is explicitly set to support
      NixOS-based systems.
spec:
  rules:
    - name: add-env-vars
      match:
        resources:
          kinds:
            - Pod
          namespaces:
            - longhorn-system
      mutate:
        patchStrategicMerge:
          spec:
            initContainers:
              - (name): "*"
                envFrom:
                  - configMapRef:
                      name: longhorn-nixos-path
            containers:
              - (name): "*"
                envFrom:
                  - configMapRef:
                      name: longhorn-nixos-path
---
</syntaxHighlight>

=== NFS ===

NixOS configuration required for NFS:

<syntaxHighlight lang=nix>
boot.supportedFilesystems = [ "nfs" ];
services.rpcbind.enable = true;
</syntaxHighlight>

== Troubleshooting ==

=== Raspberry Pi not working ===

If the k3s.service/k3s server does not start and gives you the error <code>FATA[0000] failed to find memory cgroup (v2)</code>, see the GitHub issue: https://github.com/k3s-io/k3s/issues/2067

To fix the problem, add the following to your configuration.nix:

<syntaxHighlight lang=nix>
boot.kernelParams = [
  "cgroup_enable=cpuset" "cgroup_memory=1" "cgroup_enable=memory"
];
</syntaxHighlight>

[[Category:Applications]]
[[Category:Server]]
[[Category:Container]]
[[Category:orchestration]]