Kubernetes: Difference between revisions

From NixOS Wiki
imported>Iceychris
(+ master setup, stub for node)
 
(removed kube-nix which is unrelated to kubernetes but installs a kde groupware.)
 
(34 intermediate revisions by 15 users not shown)
[https://kubernetes.io/ Kubernetes] is an open-source container orchestration system for automating software deployment, scaling, and management.
This wiki article extends the documentation in the [https://nixos.org/manual/nixos/stable/#sec-kubernetes NixOS manual].
== [[wikipedia:en:KISS principle|KISS]] ==
If you are new to [[Kubernetes]], you might want to check out [[K3s]] first, as it is easier to set up (fewer moving parts).
== 1 Master and 1 Node ==


Assumptions:

* Master and Node are on the same network (in this example 10.1.1.0/24)
* IP of the Master: 10.1.1.2
* IP of the first Node: 10.1.1.3

Caveats:

* this was only tested on <code>20.09pre215024.e97dfe73bba (Nightingale)</code> (<code>unstable</code>)
* this is probably not best-practice
** for a production-grade cluster you shouldn't use <code>easyCerts</code>
* If pods cannot reach the service CIDR, disable the firewall via <code>networking.firewall.enable = false;</code> or otherwise make sure that it doesn't interfere with packet forwarding.
* Make sure to set <code>docker0</code> to promiscuous mode: <code>ip link set docker0 promisc on</code>


=== Master  ===
Add to your <code>configuration.nix</code>:


<syntaxhighlight lang=nix>
{ config, pkgs, ... }:
let
  # When using easyCerts = true, the IP address must resolve to the master at creation time,
  # so simply use 127.0.0.1 in that case. Otherwise you will get errors like https://github.com/NixOS/nixpkgs/issues/59364
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
in
{
  # resolve master hostname
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  # packages for administration tasks
  environment.systemPackages = with pkgs; [
    kompose
    kubectl
    kubernetes
  ];

  services.kubernetes = {
    roles = ["master" "node"];
    masterAddress = kubeMasterHostname;
    apiserverAddress = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
    easyCerts = true;
    apiserver = {
      securePort = kubeMasterAPIServerPort;
      advertiseAddress = kubeMasterIP;
    };

    # use coredns
    addons.dns.enable = true;

    # needed if you use swap
    kubelet.extraOpts = "--fail-swap-on=false";
  };
}
</syntaxhighlight>
Apply your config (e.g. <code>nixos-rebuild switch</code>).

Link your <code>kubeconfig</code> to your home directory:

<syntaxhighlight lang=bash>
ln -s /etc/kubernetes/cluster-admin.kubeconfig ~/.kube/config
</syntaxhighlight>
Now, executing <code>kubectl cluster-info</code> should yield something like this:

<syntaxhighlight lang=shell>
Kubernetes master is running at https://10.1.1.2
CoreDNS is running at https://10.1.1.2/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
</syntaxhighlight>

You should also see that the master is registered as a node when you run <code>kubectl get nodes</code>:

<syntaxhighlight lang=shell>
NAME       STATUS   ROLES    AGE   VERSION
direwolf   Ready    <none>   41m   v1.16.6-beta.0
</syntaxhighlight>


=== Node  ===
Add to your <code>configuration.nix</code>:
Add to your <code>configuration.nix</code>:


<syntaxhighlight lang=nix>
{ config, pkgs, ... }:
let
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
in
{
  # resolve master hostname
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  # packages for administration tasks
  environment.systemPackages = with pkgs; [
    kompose
    kubectl
    kubernetes
  ];

  services.kubernetes = let
    api = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
  in
  {
    roles = ["node"];
    masterAddress = kubeMasterHostname;
    easyCerts = true;

    # point kubelet and other services to kube-apiserver
    kubelet.kubeconfig.server = api;
    apiserverAddress = api;

    # use coredns
    addons.dns.enable = true;

    # needed if you use swap
    kubelet.extraOpts = "--fail-swap-on=false";
  };
}
</syntaxhighlight>


Apply your config (e.g. <code>nixos-rebuild switch</code>).
 
Make your Node join the cluster, following the approach used in the [https://github.com/NixOS/nixpkgs/blob/18ff53d7656636aa440b2f73d2da788b785e6a9c/nixos/tests/kubernetes/rbac.nix#L118 NixOS tests]:

On the master, read the API token:
<syntaxhighlight lang=bash>
cat /var/lib/kubernetes/secrets/apitoken.secret
</syntaxhighlight>
 
On the node, pipe the token into the join script:
<syntaxhighlight lang=bash>
echo TOKEN | nixos-kubernetes-node-join
</syntaxhighlight>
 
After that, you should see your new node using <code>kubectl get nodes</code>:
 
<syntaxhighlight lang=shell>
NAME      STATUS  ROLES    AGE    VERSION
direwolf  Ready    <none>  62m    v1.16.6-beta.0
drake      Ready    <none>  102m  v1.16.6-beta.0
</syntaxhighlight>
 
== N Masters (HA) ==


{{expansion|How to set this up?}}


== Troubleshooting ==

<syntaxhighlight lang=bash>
systemctl status kubelet
systemctl status kube-apiserver
kubectl get nodes
</syntaxhighlight>


=== Join Cluster not working ===


If you face issues while running the <code>nixos-kubernetes-node-join</code> script:
<syntaxhighlight lang=shell>
Restarting certmgr...
Job for certmgr.service failed because a timeout was exceeded.
See "systemctl status certmgr.service" and "journalctl -xe" for details.
</syntaxhighlight>
Go investigate with <code>journalctl -u certmgr</code>:
<syntaxhighlight lang=shell>
... certmgr: loading from config file /nix/store/gj7qr7lp6wakhiwcxdpxwbpamvmsifhk-certmgr.yaml
... manager: loading certificates from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d
... manager: loading spec from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d/flannelClient.json
... [ERROR] cert: failed to fetch remote CA: failed to parse rootCA certs
</syntaxhighlight>
In this case, <code>cfssl</code> could be overloaded.
Restarting cfssl on the <code>master</code> node should help: <code>systemctl restart cfssl</code>
Also, make sure that port <code>8888</code> is open on your master node.
=== DNS issues ===
Check if coredns is running via <code>kubectl get pods -n kube-system</code>:
<syntaxhighlight lang=shell>
NAME                      READY  STATUS    RESTARTS  AGE
coredns-577478d784-bmt5s  1/1    Running  2          163m
coredns-577478d784-bqj65  1/1    Running  2          163m
</syntaxhighlight>
Run a pod to check with <code>kubectl run curl --restart=Never --image=radial/busyboxplus:curl -i --tty</code>:
If you don't see a command prompt, try pressing enter.
<syntaxhighlight lang=shell>
[ root@curl:/ ]$
</syntaxhighlight>
<syntaxhighlight lang=bash>
nslookup google.com
</syntaxhighlight>
<syntaxhighlight lang=shell>
Server:    10.0.0.254
Address 1: 10.0.0.254 kube-dns.kube-system.svc.cluster.local
Name:      google.com
Address 1: 2a00:1450:4016:803::200e muc12s04-in-x0e.1e100.net
Address 2: 172.217.23.14 lhr35s01-in-f14.1e100.net
</syntaxhighlight>
If DNS is still not working, restarting the following services sometimes helps:
<syntaxhighlight lang=bash>
systemctl restart kube-proxy flannel kubelet
</syntaxhighlight>
=== Reset to a clean state ===
Sometimes it helps to start from a clean state on all instances:
* comment out the Kubernetes-related code in <code>configuration.nix</code>
* <code>nixos-rebuild switch</code>
* clean up filesystem
** <code>rm -rf /var/lib/kubernetes/ /var/lib/etcd/ /var/lib/cfssl/ /var/lib/kubelet/</code>
** <code>rm -rf /etc/kube-flannel/ /etc/kubernetes/</code>
* uncomment kubernetes-related code again
* <code>nixos-rebuild switch</code>
== Miscellaneous ==
=== Rook Ceph storage cluster ===
Chances are you want to set up a storage cluster using [https://rook.io/ rook].
A few changes were necessary to make this work (tested with <code>rook v1.2</code>):
* you need the <code>ceph</code> kernel module: <code>boot.kernelModules = [ "ceph" ];</code>
* change the root dir of the kubelet: <code>kubelet.extraOpts = "--root-dir=/var/lib/kubelet";</code>
* reboot all your nodes
* continue with [https://rook.io/docs/rook/v1.2/ceph-quickstart.html the official quickstart guide]
* in <code>operator.yaml</code>, help the CSI plugins find the hosts' ceph kernel modules by adding (or uncommenting -- they're in the example config) these entries:
  CSI_CEPHFS_PLUGIN_VOLUME: |
  - name: lib-modules
    hostPath:
      path: /run/current-system/kernel-modules/lib/modules/
  CSI_RBD_PLUGIN_VOLUME: |
  - name: lib-modules
    hostPath:
      path: /run/current-system/kernel-modules/lib/modules/
=== NVIDIA ===
You can use NVIDIA's [https://github.com/NVIDIA/k8s-device-plugin k8s-device-plugin].
Make <code>nvidia-docker</code> your default docker runtime:
<syntaxhighlight lang=nix>
virtualisation.docker = {
    enable = true;
    # use nvidia as the default runtime
    enableNvidia = true;
    extraOptions = "--default-runtime=nvidia";
};
</syntaxhighlight>
Apply their Daemonset:
<syntaxhighlight lang=bash>
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml
</syntaxhighlight>
=== <code>/dev/shm</code> ===
Some applications need enough shared memory to work properly.
Create a new volumeMount for your Deployment:
<syntaxhighlight lang=yaml>
volumeMounts:
- mountPath: /dev/shm
  name: dshm
</syntaxhighlight>
and mark its <code>medium</code> as <code>Memory</code>:
<syntaxhighlight lang=yaml>
volumes:
- name: dshm
  emptyDir:
    medium: Memory
</syntaxhighlight>
=== Arm64 ===
Nix might pull in <code>coredns</code> and <code>etcd</code> images that are incompatible with ARM. To resolve this, add the following to your master node's configuration:
==== etcd ====
<syntaxhighlight lang=nix>
  ...
  services.kubernetes = {...};
  systemd.services.etcd = {
    environment = {
      ETCD_UNSUPPORTED_ARCH = "arm64";
    };
  };
  ...
</syntaxhighlight>
==== coredns ====
<syntaxhighlight lang=nix>
  services.kubernetes = {
    ...
    # use coredns
    addons.dns = {
      enable = true;
      coredns = {
        finalImageTag = "1.10.1";
        imageDigest = "sha256:a0ead06651cf580044aeb0a0feba63591858fb2e43ade8c9dea45a6a89ae7e5e";
        imageName = "coredns/coredns";
        sha256 = "0c4vdbklgjrzi6qc5020dvi8x3mayq4li09rrq2w0hcjdljj0yf9";
      };
    };
  ...
  };
</syntaxhighlight>
== Tooling ==
There are various community projects aimed at facilitating working with Kubernetes combined with Nix:
* [https://github.com/saschagrunert/kubernix kubernix]: simple setup of development clusters using Nix
* [https://kubenix.org/ kubenix] - [https://github.com/hall/kubenix GitHub (updated 2023)]
* [https://github.com/justinas/nixos-ha-kubernetes nixos-ha-kubernetes]
== References ==
* [https://github.com/NixOS/nixpkgs/issues/39327 Issue #39327]: kubernetes support is missing some documentation
* [https://discourse.nixos.org/t/kubernetes-using-multiple-nodes-with-latest-unstable/3936 NixOS Discourse]: Using multiple nodes on unstable
* [https://kubernetes.io/docs/home/ Kubernetes docs]
* [https://github.com/NixOS/nixpkgs/tree/master/nixos/tests/kubernetes NixOS e2e kubernetes tests]: Node Joining etc.
* [https://logs.nix.samueldr.com/nixos-kubernetes/2018-09-07 IRC (2018-09)]: issues related to DNS
* [https://logs.nix.samueldr.com/nixos-kubernetes/2019-09-05 IRC (2019-09)]: discussion about <code>easyCerts</code> and general setup
[[Category:Applications]]
[[Category:Server]]
[[Category:Container]]
[[Category:NixOS Manual]]

Latest revision as of 11:03, 18 May 2024

Kubernetes is an open-source container orchestration system for automating software deployment, scaling, and management.

This wiki article extends the documentation in the NixOS manual.

KISS

If you are new to Kubernetes, you might want to check out K3s first, as it is easier to set up (fewer moving parts).

1 Master and 1 Node

Assumptions:

  • Master and Node are on the same network (in this example 10.1.1.0/24)
  • IP of the Master: 10.1.1.2
  • IP of the first Node: 10.1.1.3

Caveats:

  • this was only tested on 20.09pre215024.e97dfe73bba (Nightingale) (unstable)
  • this is probably not best-practice
    • for a production-grade cluster you shouldn't use easyCerts
  • If pods cannot reach the service CIDR, disable the firewall via networking.firewall.enable = false; or otherwise make sure that it doesn't interfere with packet forwarding.
  • Make sure to set docker0 to promiscuous mode: ip link set docker0 promisc on

Master

Add to your configuration.nix:

{ config, pkgs, ... }:
let
  # When using easyCerts = true, the IP address must resolve to the master at creation time,
  # so simply use 127.0.0.1 in that case. Otherwise you will get errors like https://github.com/NixOS/nixpkgs/issues/59364
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
in
{
  # resolve master hostname
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  # packages for administration tasks
  environment.systemPackages = with pkgs; [
    kompose
    kubectl
    kubernetes
  ];

  services.kubernetes = {
    roles = ["master" "node"];
    masterAddress = kubeMasterHostname;
    apiserverAddress = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
    easyCerts = true;
    apiserver = {
      securePort = kubeMasterAPIServerPort;
      advertiseAddress = kubeMasterIP;
    };

    # use coredns
    addons.dns.enable = true;

    # needed if you use swap
    kubelet.extraOpts = "--fail-swap-on=false";
  };
}

Apply your config (e.g. nixos-rebuild switch).
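
Since the API server address and the easyCerts setup both refer to the master by hostname, it can help to confirm that the name resolves to the expected IP before going further. The following is only a sketch: it assumes `getent` is available on the host, and the function name `resolves_to` is made up for this example.

```shell
# Succeeds if HOSTNAME resolves to EXPECTED_IP according to the system resolver
# (which includes the networking.extraHosts entry from the config above).
resolves_to() {
  hostname="$1"
  expected_ip="$2"
  # `getent hosts` prints "<ip> <name>..."; take the first field of the first line
  actual_ip="$(getent hosts "$hostname" | awk 'NR == 1 { print $1 }')"
  if [ "$actual_ip" = "$expected_ip" ]; then
    echo "$hostname resolves to $actual_ip"
  else
    echo "$hostname resolves to '${actual_ip}', expected $expected_ip" >&2
    return 1
  fi
}

# Example (values from the configuration above):
#   resolves_to api.kube 10.1.1.2
```

A non-zero exit status here would suggest checking the extraHosts entry before debugging Kubernetes itself.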

Link your kubeconfig to your home directory:

ln -s /etc/kubernetes/cluster-admin.kubeconfig ~/.kube/config
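
The plain `ln -s` above fails if `~/.kube` does not exist yet or if a link is already present. A slightly more defensive variant might look like this; the function name `link_kubeconfig` is hypothetical, and the default path is the one used above.

```shell
# Link the cluster-admin kubeconfig into ~/.kube, creating the directory first.
link_kubeconfig() {
  src="${1:-/etc/kubernetes/cluster-admin.kubeconfig}"
  mkdir -p "$HOME/.kube"
  # -f replaces an existing link, -n avoids descending into a linked directory
  ln -sfn "$src" "$HOME/.kube/config"
}

# Example:
#   link_kubeconfig
```

As an alternative to linking, kubectl also honours the `KUBECONFIG` environment variable, e.g. `export KUBECONFIG=/etc/kubernetes/cluster-admin.kubeconfig`.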

Now, executing kubectl cluster-info should yield something like this:

Kubernetes master is running at https://10.1.1.2
CoreDNS is running at https://10.1.1.2/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

You should also see that the master is registered as a node when you run kubectl get nodes:

NAME       STATUS   ROLES    AGE   VERSION
direwolf   Ready    <none>   41m   v1.16.6-beta.0

Node

Add to your configuration.nix:

{ config, pkgs, ... }:
let
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
in
{
  # resolve master hostname
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  # packages for administration tasks
  environment.systemPackages = with pkgs; [
    kompose
    kubectl
    kubernetes
  ];

  services.kubernetes = let
    api = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
  in
  {
    roles = ["node"];
    masterAddress = kubeMasterHostname;
    easyCerts = true;

    # point kubelet and other services to kube-apiserver
    kubelet.kubeconfig.server = api;
    apiserverAddress = api;

    # use coredns
    addons.dns.enable = true;

    # needed if you use swap
    kubelet.extraOpts = "--fail-swap-on=false";
  };
}

Apply your config (e.g. nixos-rebuild switch).

Make your Node join the cluster, following the approach used in the NixOS tests:

On the master, read the API token:

cat /var/lib/kubernetes/secrets/apitoken.secret

On the node, pipe the token into the join script:

echo TOKEN | nixos-kubernetes-node-join
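
Typing the token inline with `echo` leaves it in the shell history. One possible variant, assuming you have copied the token into a file on the node first, reads it from that file instead. The function name, the `JOIN_CMD` hook and the example path are all hypothetical; only `nixos-kubernetes-node-join` comes from the steps above.

```shell
# Join the cluster using a token stored in a file instead of typing it inline.
# JOIN_CMD is only a hook for this sketch; by default it is the join script
# from the text above.
join_with_token_file() {
  token_file="$1"
  if [ ! -s "$token_file" ]; then
    echo "token file '$token_file' is missing or empty" >&2
    return 1
  fi
  # the join script reads the token from stdin
  "${JOIN_CMD:-nixos-kubernetes-node-join}" < "$token_file"
}

# Example (hypothetical path):
#   join_with_token_file /root/apitoken.secret
```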

After that, you should see your new node using kubectl get nodes:

NAME       STATUS   ROLES    AGE    VERSION
direwolf   Ready    <none>   62m    v1.16.6-beta.0
drake      Ready    <none>   102m   v1.16.6-beta.0

N Masters (HA)

Troubleshooting

systemctl status kubelet
systemctl status kube-apiserver
kubectl get nodes
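
To get a quick overview instead of reading each status page, the checks can be wrapped in a small loop. This is a sketch only: `check_units` is a made-up name, and the unit list in the example is collected from names mentioned elsewhere on this page, so not every unit necessarily exists on every host.

```shell
# Print one line per systemd unit saying whether it is currently active.
check_units() {
  for unit in "$@"; do
    if systemctl is-active --quiet "$unit"; then
      echo "$unit: active"
    else
      echo "$unit: NOT active"
    fi
  done
}

# Example (which units exist depends on the roles of the host):
#   check_units kubelet kube-apiserver etcd flannel kube-proxy cfssl certmgr
```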

Join Cluster not working

If you face issues while running the nixos-kubernetes-node-join script:

Restarting certmgr...
Job for certmgr.service failed because a timeout was exceeded.
See "systemctl status certmgr.service" and "journalctl -xe" for details.

Go investigate with journalctl -u certmgr:

... certmgr: loading from config file /nix/store/gj7qr7lp6wakhiwcxdpxwbpamvmsifhk-certmgr.yaml
... manager: loading certificates from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d
... manager: loading spec from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d/flannelClient.json
... [ERROR] cert: failed to fetch remote CA: failed to parse rootCA certs

In this case, cfssl could be overloaded.

Restarting cfssl on the master node should help: systemctl restart cfssl

Also, make sure that port 8888 is open on your master node.
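
Whether the port is reachable can be tested from the node. The sketch below assumes an `nc` implementation that supports `-z` (connect without sending data) and `-w` (timeout in seconds); the function name `check_ports` is hypothetical.

```shell
# Report whether TCP ports on a host accept connections.
# Assumes an `nc` that supports -z (scan only) and -w (timeout in seconds).
check_ports() {
  host="$1"
  shift
  failed=0
  for port in "$@"; do
    if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
      echo "$host:$port reachable"
    else
      echo "$host:$port NOT reachable"
      failed=1
    fi
  done
  return "$failed"
}

# Example, run on the node (8888 as mentioned above, 6443 is the API server
# port from the master configuration):
#   check_ports api.kube 8888 6443
```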

DNS issues

Check if coredns is running via kubectl get pods -n kube-system:

NAME                       READY   STATUS    RESTARTS   AGE
coredns-577478d784-bmt5s   1/1     Running   2          163m
coredns-577478d784-bqj65   1/1     Running   2          163m

Run a pod to check with kubectl run curl --restart=Never --image=radial/busyboxplus:curl -i --tty:

If you don't see a command prompt, try pressing enter.

[ root@curl:/ ]$ nslookup google.com
Server:    10.0.0.254
Address 1: 10.0.0.254 kube-dns.kube-system.svc.cluster.local

Name:      google.com
Address 1: 2a00:1450:4016:803::200e muc12s04-in-x0e.1e100.net
Address 2: 172.217.23.14 lhr35s01-in-f14.1e100.net
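
The interactive check above can also be run as a one-off command that cleans up after itself. This is a sketch under assumptions: the pod name `dns-check` and the function name are made up, the image is the one used above, and `kubernetes.default` is the usual in-cluster name of the API service.

```shell
# Run a one-off pod that resolves a name through the cluster DNS and is
# removed again afterwards (--rm).
dns_check() {
  name="${1:-kubernetes.default}"
  kubectl run dns-check --rm -i --restart=Never \
    --image=radial/busyboxplus:curl -- nslookup "$name"
}

# Examples:
#   dns_check              # resolves the in-cluster API service name
#   dns_check google.com   # resolves an external name
```

Checking both an internal and an external name helps to tell a broken CoreDNS apart from a broken upstream resolver.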

If DNS is still not working, restarting the following services sometimes helps:

systemctl restart kube-proxy flannel kubelet

Reset to a clean state

Sometimes it helps to start from a clean state on all instances:

  • comment out the Kubernetes-related code in configuration.nix
  • nixos-rebuild switch
  • clean up filesystem
    • rm -rf /var/lib/kubernetes/ /var/lib/etcd/ /var/lib/cfssl/ /var/lib/kubelet/
    • rm -rf /etc/kube-flannel/ /etc/kubernetes/
  • uncomment kubernetes-related code again
  • nixos-rebuild switch
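
The filesystem cleanup step could be scripted roughly as follows. The sketch defaults to a dry run because the removal is destructive; the function name and the `KUBE_STATE_DIRS` variable are hypothetical, while the directory list is the one from the steps above.

```shell
# Directories listed in the steps above; override KUBE_STATE_DIRS to adapt.
KUBE_STATE_DIRS="${KUBE_STATE_DIRS:-/var/lib/kubernetes /var/lib/etcd /var/lib/cfssl /var/lib/kubelet /etc/kube-flannel /etc/kubernetes}"

# Without arguments this only prints what would be removed.
# Pass --force to actually delete (run as root, with the services disabled).
clean_kube_state() {
  for dir in $KUBE_STATE_DIRS; do
    if [ "${1:-}" = "--force" ]; then
      rm -rf "$dir"
      echo "removed $dir"
    else
      echo "would remove $dir"
    fi
  done
}

# Example:
#   clean_kube_state           # dry run
#   clean_kube_state --force   # delete the directories
```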

Miscellaneous

Rook Ceph storage cluster

Chances are you want to set up a storage cluster using rook.

A few changes were necessary to make this work (tested with rook v1.2):

  • you need the ceph kernel module: boot.kernelModules = [ "ceph" ];
  • change the root dir of the kubelet: kubelet.extraOpts = "--root-dir=/var/lib/kubelet";
  • reboot all your nodes
  • continue with the official quickstart guide
  • in operator.yaml, help the CSI plugins find the hosts' ceph kernel modules by adding (or uncommenting -- they're in the example config) these entries:
 CSI_CEPHFS_PLUGIN_VOLUME: |
 - name: lib-modules
   hostPath:
     path: /run/current-system/kernel-modules/lib/modules/
 CSI_RBD_PLUGIN_VOLUME: |
 - name: lib-modules
   hostPath:
     path: /run/current-system/kernel-modules/lib/modules/
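
After the reboot, the host-side prerequisites from the list above can be verified before deploying rook. This is a minimal sketch: `check_rook_prereqs` is a made-up name, and it relies on the assumption that a loaded kernel module appears as a directory under `/sys/module`.

```shell
# Check the two host-side prerequisites mentioned above.
# The optional arguments exist only so the paths can be overridden.
check_rook_prereqs() {
  sys_module_dir="${1:-/sys/module}"
  modules_dir="${2:-/run/current-system/kernel-modules/lib/modules}"
  status=0
  # a loaded kernel module shows up as a directory under /sys/module
  if [ -d "$sys_module_dir/ceph" ]; then
    echo "ceph kernel module: loaded"
  else
    echo "ceph kernel module: missing"
    status=1
  fi
  if [ -d "$modules_dir" ]; then
    echo "kernel modules path: present"
  else
    echo "kernel modules path: missing"
    status=1
  fi
  return "$status"
}

# Example, run on every node:
#   check_rook_prereqs
```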

NVIDIA

You can use NVIDIA's k8s-device-plugin.

Make nvidia-docker your default docker runtime:

virtualisation.docker = {
    enable = true;

    # use nvidia as the default runtime
    enableNvidia = true;
    extraOptions = "--default-runtime=nvidia";
};

Apply their Daemonset:

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml
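
Once the DaemonSet is running, nodes with a GPU should advertise the `nvidia.com/gpu` resource. A quick way to check this might be the following; the function name `gpu_resources` is hypothetical.

```shell
# List the lines of `kubectl describe nodes` that mention the GPU resource.
# nvidia.com/gpu is the resource name registered by the device plugin.
gpu_resources() {
  kubectl describe nodes | grep -F 'nvidia.com/gpu'
}

# Example:
#   gpu_resources || echo "no node advertises nvidia.com/gpu"
```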

/dev/shm

Some applications need enough shared memory to work properly. Create a new volumeMount for your Deployment:

volumeMounts:
- mountPath: /dev/shm
  name: dshm

and mark its medium as Memory:

volumes:
- name: dshm
  emptyDir:
    medium: Memory
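
To show where the two fragments sit in a complete manifest, the sketch below prints a minimal Deployment with a memory-backed /dev/shm. The function name, the deployment name and the image are placeholders chosen for this example; note that `medium` is nested under `emptyDir`.

```shell
# Print a Deployment manifest whose container gets a memory-backed /dev/shm.
# The name and image arguments are placeholders for your own workload.
shm_deployment() {
  name="$1"
  image="$2"
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${name}
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ${name}
  template:
    metadata:
      labels:
        app: ${name}
    spec:
      containers:
      - name: ${name}
        image: ${image}
        volumeMounts:
        - mountPath: /dev/shm
          name: dshm
      volumes:
      - name: dshm
        emptyDir:
          medium: Memory
EOF
}

# Example (hypothetical name and image):
#   shm_deployment shm-demo nginx | kubectl apply -f -
```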

Arm64

Nix might pull in coredns and etcd images that are incompatible with ARM. To resolve this, add the following to your master node's configuration:

etcd

  ...
  services.kubernetes = {...};
  systemd.services.etcd = {
    environment = {
      ETCD_UNSUPPORTED_ARCH = "arm64";
    };
  };
  ...

coredns

  services.kubernetes = {
    ...
    # use coredns
    addons.dns = {
      enable = true;
      coredns = {
        finalImageTag = "1.10.1";
        imageDigest = "sha256:a0ead06651cf580044aeb0a0feba63591858fb2e43ade8c9dea45a6a89ae7e5e";
        imageName = "coredns/coredns";
        sha256 = "0c4vdbklgjrzi6qc5020dvi8x3mayq4li09rrq2w0hcjdljj0yf9";
      };
    };
   ...
  };

Tooling

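There are various community projects that aim to make it easier to use Kubernetes together with Nix:

  • kubernix (https://github.com/saschagrunert/kubernix): simple setup of development clusters using Nix
  • kubenix (https://kubenix.org/), GitHub (updated 2023): https://github.com/hall/kubenix
  • nixos-ha-kubernetes (https://github.com/justinas/nixos-ha-kubernetes)

References

  • Issue #39327 (https://github.com/NixOS/nixpkgs/issues/39327): kubernetes support is missing some documentation
  • NixOS Discourse (https://discourse.nixos.org/t/kubernetes-using-multiple-nodes-with-latest-unstable/3936): Using multiple nodes on unstable
  • Kubernetes docs (https://kubernetes.io/docs/home/)
  • NixOS e2e kubernetes tests (https://github.com/NixOS/nixpkgs/tree/master/nixos/tests/kubernetes): Node Joining etc.
  • IRC (2018-09) (https://logs.nix.samueldr.com/nixos-kubernetes/2018-09-07): issues related to DNS
  • IRC (2019-09) (https://logs.nix.samueldr.com/nixos-kubernetes/2019-09-05): discussion about easyCerts and general setup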