[https://k3s.io/ K3s] is a simplified Kubernetes version that bundles Kubernetes cluster components into a few small binaries optimized for Edge and IoT devices.

NixOS's K3s documentation is available at https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/networking/cluster/k3s/README.md

== Single node setup ==

<syntaxHighlight lang=nix>
{
  networking.firewall.allowedTCPPorts = [
    6443 # k3s: required so that pods can reach the API server (running on port 6443 by default)
    # 2379 # k3s, etcd clients: required if using a "High Availability Embedded etcd" configuration
    # 2380 # k3s, etcd peers: required if using a "High Availability Embedded etcd" configuration
  ];
  networking.firewall.allowedUDPPorts = [
    # 8472 # k3s, flannel: required if using multi-node for inter-node networking
  ];
  services.k3s.enable = true;
  services.k3s.role = "server";
  services.k3s.extraFlags = toString [
    # "--kubelet-arg=v=4" # Optionally add additional args to k3s
  ];
  environment.systemPackages = [ pkgs.k3s ];
}
</syntaxHighlight>

After enabling the service, you can access your cluster with <code>sudo k3s kubectl</code>, e.g. <code>sudo k3s kubectl cluster-info</code>, or by using the generated kubeconfig file at <code>/etc/rancher/k3s/k3s.yaml</code>.
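
If you prefer a standalone <code>kubectl</code> over the bundled one, a minimal optional sketch could look like the following. It assumes the default kubeconfig path shown above, which is readable only by root unless you relax its mode (e.g. with the k3s <code>--write-kubeconfig-mode</code> flag):

<syntaxHighlight lang=nix>
{
  # Optional sketch: use a standalone kubectl against the kubeconfig generated by k3s.
  environment.systemPackages = [ pkgs.kubectl ];
  # The kubeconfig is owned by root; either run kubectl with sudo or relax the
  # file mode, e.g. services.k3s.extraFlags = toString [ "--write-kubeconfig-mode 644" ].
  environment.variables.KUBECONFIG = "/etc/rancher/k3s/k3s.yaml";
}
</syntaxHighlight>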

== Multi-node setup ==

It is simple to create a cluster of multiple nodes in a highly available setup (all nodes are in the control plane and are part of the etcd cluster).

The first node is configured like this:

<syntaxHighlight lang=nix>
{
  services.k3s = {
    enable = true;
    role = "server";
    token = "<randomized common secret>";
    clusterInit = true;
  };
}
</syntaxHighlight>

Any subsequent node can be added with a slightly different config:

<syntaxHighlight lang=nix>
{
  services.k3s = {
    enable = true;
    role = "server";
    token = "<randomized common secret>";
    serverAddr = "https://<ip of first node>:6443";
  };
}
</syntaxHighlight>

For this to work you need to open the aforementioned API, etcd, and flannel ports in the firewall. Note that it is [https://etcd.io/docs/v3.3/faq/#why-an-odd-number-of-cluster-members recommended] to use an odd number of nodes in such a cluster.
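
For example, a sketch of the firewall configuration on each node, using the port numbers from the comments in the single-node example above (adjust to your topology):

<syntaxHighlight lang=nix>
{
  networking.firewall.allowedTCPPorts = [
    6443 # Kubernetes API server
    2379 # etcd clients
    2380 # etcd peers
  ];
  networking.firewall.allowedUDPPorts = [
    8472 # flannel VXLAN inter-node traffic
  ];
}
</syntaxHighlight>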

Or see this [https://github.com/Mic92/doctor-cluster-config/tree/master/modules/k3s real world example]. You might want to ignore some parts of it, e.g. the monitoring, as that is specific to the example setup. The K3s server needs to import <code>modules/k3s/server.nix</code> and an agent <code>modules/k3s/agent.nix</code>.

Tip: You might run into issues with coredns not being reachable from agent nodes. Right now, that setup disables the NixOS firewall altogether until a better solution is found.

== ZFS support ==

K3s's builtin containerd does not support the zfs snapshotter. However, it is possible to configure it to use an external containerd:

<syntaxHighlight lang=nix>
virtualisation.containerd = {
  enable = true;
  settings =
    let
      fullCNIPlugins = pkgs.buildEnv {
        name = "full-cni";
        paths = with pkgs; [
          cni-plugins
          cni-plugin-flannel
        ];
      };
    in
    {
      plugins."io.containerd.grpc.v1.cri".cni = {
        bin_dir = "${fullCNIPlugins}/bin";
        conf_dir = "/var/lib/rancher/k3s/agent/etc/cni/net.d/";
      };
      # Optionally set private registry credentials here instead of using /etc/rancher/k3s/registries.yaml
      # plugins."io.containerd.grpc.v1.cri".registry.configs."registry.example.com".auth = {
      #   username = "";
      #   password = "";
      # };
    };
};
# TODO describe how to enable zfs snapshotter in containerd
services.k3s.extraFlags = toString [
  "--container-runtime-endpoint unix:///run/containerd/containerd.sock"
];
</syntaxHighlight>
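
The snippet above only points K3s at the external containerd; it does not yet select the zfs snapshotter. As a hedged sketch (an assumption, not from the original article): containerd's CRI plugin can select its built-in zfs snapshotter through its settings, which requires <code>/var/lib/containerd/io.containerd.snapshotter.v1.zfs</code> to be a mounted ZFS dataset:

<syntaxHighlight lang=nix>
# Sketch only: select containerd's zfs snapshotter for CRI-managed containers.
# Assumes /var/lib/containerd/io.containerd.snapshotter.v1.zfs is a ZFS dataset.
virtualisation.containerd.settings.plugins."io.containerd.grpc.v1.cri".containerd.snapshotter = "zfs";
</syntaxHighlight>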

== Nvidia support ==

To use an Nvidia GPU in the cluster, nvidia-container-runtime and runc are needed. To get the two components, it suffices to add the following to the configuration:

<syntaxHighlight lang=nix>
virtualisation.docker = {
  enable = true;
  enableNvidia = true;
};
environment.systemPackages = with pkgs; [ docker runc ];
</syntaxHighlight>

Note that using docker here is a workaround: it installs nvidia-container-runtime and makes it accessible at <code>/run/current-system/sw/bin/nvidia-container-runtime</code>; currently it is not directly accessible in nixpkgs.

You now need to create a new file at <code>/var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl</code> with the following content:

<syntaxHighlight lang=toml>
{{ template "base" . }}

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
privileged_without_host_devices = false
runtime_engine = ""
runtime_root = ""
runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
BinaryName = "/run/current-system/sw/bin/nvidia-container-runtime"
</syntaxHighlight>

Note that this points the nvidia runtime to <code>/run/current-system/sw/bin/nvidia-container-runtime</code>.

Now apply the following RuntimeClass to the k3s cluster:

<syntaxHighlight lang=yaml>
apiVersion: node.k8s.io/v1
handler: nvidia
kind: RuntimeClass
metadata:
  labels:
    app.kubernetes.io/component: gpu-operator
  name: nvidia
</syntaxHighlight>

Following [https://github.com/NVIDIA/k8s-device-plugin#deployment-via-helm k8s-device-plugin], install the helm chart with <code>runtimeClassName: nvidia</code> set. In order to pass the Nvidia card through to a container, your deployment's pod spec must contain:

* <code>runtimeClassName: nvidia</code>
* an <code>env</code> entry <code>NVIDIA_VISIBLE_DEVICES</code> with value <code>all</code>
* an <code>env</code> entry <code>NVIDIA_DRIVER_CAPABILITIES</code> with value <code>all</code>

A minimal test pod is sketched below. To test that it is working, exec into a pod and run <code>nvidia-smi</code>. For more configurability of Nvidia-related matters in k3s, see the [https://docs.k3s.io/advanced#nvidia-container-runtime-support k3s docs].
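
For illustration, such a test pod might look like the following (a sketch; the pod name and image tag are only examples and should be adjusted to an image available to you):

<syntaxHighlight lang=yaml>
# Hypothetical test pod: runs nvidia-smi once using the nvidia runtime class.
apiVersion: v1
kind: Pod
metadata:
  name: nvidia-smi-test
spec:
  runtimeClassName: nvidia
  restartPolicy: Never
  containers:
    - name: nvidia-smi
      image: nvidia/cuda:12.3.1-base-ubuntu22.04  # example tag; pick any available CUDA base image
      command: [ "nvidia-smi" ]
      env:
        - name: NVIDIA_VISIBLE_DEVICES
          value: all
        - name: NVIDIA_DRIVER_CAPABILITIES
          value: all
</syntaxHighlight>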

== Storage ==

=== Longhorn ===

NixOS configuration required for Longhorn:

<syntaxHighlight lang=nix>
environment.systemPackages = [ pkgs.nfs-utils ];
services.openiscsi = {
  enable = true;
  name = "${config.networking.hostName}-initiatorhost";
};
</syntaxHighlight>

The Longhorn containers have trouble with the NixOS PATH. The solution is to override the PATH environment variable, for example:

<syntaxHighlight lang=bash>
PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/run/wrappers/bin:/nix/var/nix/profiles/default/bin:/run/current-system/sw/bin
</syntaxHighlight>

==== Kyverno Policy for Fixing Longhorn Containers on NixOS ====

<syntaxHighlight lang=yaml>
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: longhorn-nixos-path
  namespace: longhorn-system
data:
  PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/run/wrappers/bin:/nix/var/nix/profiles/default/bin:/run/current-system/sw/bin
---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: longhorn-add-nixos-path
  annotations:
    policies.kyverno.io/title: Add Environment Variables from ConfigMap
    policies.kyverno.io/subject: Pod
    policies.kyverno.io/category: Other
    policies.kyverno.io/description: >-
      Longhorn invokes executables on the host system, and needs
      to be aware of the host system's PATH. This modifies all
      deployments such that the PATH is explicitly set to support
      NixOS-based systems.
spec:
  rules:
    - name: add-env-vars
      match:
        resources:
          kinds:
            - Pod
          namespaces:
            - longhorn-system
      mutate:
        patchStrategicMerge:
          spec:
            initContainers:
              - (name): "*"
                envFrom:
                  - configMapRef:
                      name: longhorn-nixos-path
            containers:
              - (name): "*"
                envFrom:
                  - configMapRef:
                      name: longhorn-nixos-path
---
</syntaxHighlight>

=== NFS ===

NixOS configuration required for NFS:

<syntaxHighlight lang=nix>
boot.supportedFilesystems = [ "nfs" ];
services.rpcbind.enable = true;
</syntaxHighlight>

== Troubleshooting ==

=== Raspberry Pi not working ===

If the k3s.service/k3s server does not start and gives you the error <code>FATA[0000] failed to find memory cgroup (v2)</code>, see the GitHub issue: https://github.com/k3s-io/k3s/issues/2067

To fix the problem, add the following to your configuration.nix:

<syntaxHighlight lang=nix>
boot.kernelParams = [
  "cgroup_enable=cpuset" "cgroup_memory=1" "cgroup_enable=memory"
];
</syntaxHighlight>

[[Category:Applications]]
[[Category:Server]]
[[Category:Container]]
[[Category:orchestration]]