Kubernetes

From NixOS Wiki

Latest revision as of 11:03, 18 May 2024

Kubernetes is an open-source container orchestration system for automating software deployment, scaling, and management.

This wiki article extends the documentation in the NixOS manual.

KISS

If you are new to Kubernetes, you might want to check out K3s first, as it is easier to set up (fewer moving parts).

1 Master and 1 Node

Assumptions:

  • Master and Node are on the same network (in this example 10.1.1.0/24)
  • IP of the Master: 10.1.1.2
  • IP of the first Node: 10.1.1.3

Caveats:

  • this was only tested on 20.09pre215024.e97dfe73bba (Nightingale) (unstable)
  • this is probably not best practice
    • for a production-grade cluster you shouldn't use easyCerts
  • If pods cannot reach the service CIDR, disable the firewall via networking.firewall.enable = false; or otherwise make sure that it doesn't interfere with packet forwarding.
  • Make sure to set docker0 in promiscuous mode: ip link set docker0 promisc on
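
If you would rather keep the firewall enabled, one option is to trust the container-facing interface so that forwarded pod traffic is not filtered. The following is a minimal, untested sketch. It assumes that docker0 (the bridge named in the caveat above) carries the pod traffic; interface names may differ on your hosts, so check them with ip link first.

```nix
{ ... }:
{
  networking.firewall = {
    enable = true;
    # Assumption: docker0 is the bridge carrying pod traffic on this host.
    # Add any overlay interface your setup creates (verify with `ip link`).
    trustedInterfaces = [ "docker0" ];
  };
}
```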

Master

Add to your configuration.nix:

{ config, pkgs, ... }:
let
  # When using easyCerts = true, the IP address must resolve to the master at creation time.
  # In that case, simply use 127.0.0.1. Otherwise you will get errors like https://github.com/NixOS/nixpkgs/issues/59364
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
in
{
  # resolve master hostname
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  # packages for administration tasks
  environment.systemPackages = with pkgs; [
    kompose
    kubectl
    kubernetes
  ];

  services.kubernetes = {
    roles = ["master" "node"];
    masterAddress = kubeMasterHostname;
    apiserverAddress = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
    easyCerts = true;
    apiserver = {
      securePort = kubeMasterAPIServerPort;
      advertiseAddress = kubeMasterIP;
    };

    # use coredns
    addons.dns.enable = true;

    # needed if you use swap
    kubelet.extraOpts = "--fail-swap-on=false";
  };
}

Apply your config (e.g. nixos-rebuild switch).

Link your kubeconfig to your home directory:

mkdir -p ~/.kube
ln -s /etc/kubernetes/cluster-admin.kubeconfig ~/.kube/config
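
As an alternative to the symlink, kubectl also honours the KUBECONFIG environment variable. The sketch below is not part of the original setup: it assumes you want the variable set system-wide and that the user running kubectl is allowed to read the file.

```nix
{ ... }:
{
  # Assumption: point kubectl at the admin kubeconfig generated by easyCerts
  # instead of symlinking it into ~/.kube/config.
  environment.variables.KUBECONFIG = "/etc/kubernetes/cluster-admin.kubeconfig";
}
```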

Now, executing kubectl cluster-info should yield something like this:

Kubernetes master is running at https://10.1.1.2
CoreDNS is running at https://10.1.1.2/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

kubectl get nodes should also show that the master is itself a node:

NAME       STATUS   ROLES    AGE   VERSION
direwolf   Ready    <none>   41m   v1.16.6-beta.0

Node

Add to your configuration.nix:

{ config, pkgs, ... }:
let
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
in
{
  # resolve master hostname
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  # packages for administration tasks
  environment.systemPackages = with pkgs; [
    kompose
    kubectl
    kubernetes
  ];

  services.kubernetes = let
    api = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
  in
  {
    roles = ["node"];
    masterAddress = kubeMasterHostname;
    easyCerts = true;

    # point kubelet and other services to kube-apiserver
    kubelet.kubeconfig.server = api;
    apiserverAddress = api;

    # use coredns
    addons.dns.enable = true;

    # needed if you use swap
    kubelet.extraOpts = "--fail-swap-on=false";
  };
}

Apply your config (e.g. nixos-rebuild switch).

According to the NixOS tests, make your Node join the cluster:

On the master, grab the apitoken:

cat /var/lib/kubernetes/secrets/apitoken.secret

On the node, join the cluster, replacing TOKEN with the token you just read on the master:

echo TOKEN | nixos-kubernetes-node-join

After that, you should see your new node using kubectl get nodes:

NAME       STATUS   ROLES    AGE    VERSION
direwolf   Ready    <none>   62m    v1.16.6-beta.0
drake      Ready    <none>   102m   v1.16.6-beta.0

N Masters (HA)

Troubleshooting

systemctl status kubelet
systemctl status kube-apiserver
kubectl get nodes

Join Cluster not working

If you face issues while running the nixos-kubernetes-node-join script:

Restarting certmgr...
Job for certmgr.service failed because a timeout was exceeded.
See "systemctl status certmgr.service" and "journalctl -xe" for details.

Go investigate with journalctl -u certmgr:

... certmgr: loading from config file /nix/store/gj7qr7lp6wakhiwcxdpxwbpamvmsifhk-certmgr.yaml
... manager: loading certificates from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d
... manager: loading spec from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d/flannelClient.json
... [ERROR] cert: failed to fetch remote CA: failed to parse rootCA certs

In this case, cfssl could be overloaded.

Restarting cfssl on the master node should help: systemctl restart cfssl

Also, make sure that port 8888 (cfssl) is open on your master node, since nodes pull their certificates through it during the join.
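
If the firewall is enabled on the master, the ports can be opened declaratively. This is a minimal sketch for the master's configuration.nix, assuming the port numbers used on this page (8888 for cfssl, 6443 for kubeMasterAPIServerPort); adjust them if your setup differs.

```nix
{ ... }:
{
  # On the master only.
  networking.firewall.allowedTCPPorts = [
    8888 # cfssl: nodes fetch their certificates from here during the join
    6443 # kube-apiserver (kubeMasterAPIServerPort in the master configuration)
  ];
}
```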

DNS issues

Check if coredns is running via kubectl get pods -n kube-system:

NAME                       READY   STATUS    RESTARTS   AGE
coredns-577478d784-bmt5s   1/1     Running   2          163m
coredns-577478d784-bqj65   1/1     Running   2          163m

Run a pod to check with kubectl run curl --restart=Never --image=radial/busyboxplus:curl -i --tty:

If you don't see a command prompt, try pressing enter.
[ root@curl:/ ]$ nslookup google.com
Server:    10.0.0.254
Address 1: 10.0.0.254 kube-dns.kube-system.svc.cluster.local

Name:      google.com
Address 1: 2a00:1450:4016:803::200e muc12s04-in-x0e.1e100.net
Address 2: 172.217.23.14 lhr35s01-in-f14.1e100.net

If DNS is still not working, I found that restarting these services sometimes helps:

systemctl restart kube-proxy flannel kubelet

Reset to a clean state

Sometimes it helps to have a clean state on all instances:

  • comment out the Kubernetes-related code in configuration.nix
  • nixos-rebuild switch
  • clean up the filesystem
    • rm -rf /var/lib/kubernetes/ /var/lib/etcd/ /var/lib/cfssl/ /var/lib/kubelet/
    • rm -rf /etc/kube-flannel/ /etc/kubernetes/
  • uncomment the Kubernetes-related code again
  • nixos-rebuild switch

Miscellaneous

Rook Ceph storage cluster

Chances are you want to set up a storage cluster using Rook.

To do so, I found it necessary to change a few things (tested with rook v1.2):

  • you need the ceph kernel module: boot.kernelModules = [ "ceph" ];
  • change the root dir of the kubelet: kubelet.extraOpts = "--root-dir=/var/lib/kubelet";
  • reboot all your nodes
  • continue with the official quickstart guide
  • in operator.yaml, help the CSI plugins find the hosts' ceph kernel modules by adding (or uncommenting -- they're in the example config) these entries:
 CSI_CEPHFS_PLUGIN_VOLUME: |
 - name: lib-modules
   hostPath:
     path: /run/current-system/kernel-modules/lib/modules/
 CSI_RBD_PLUGIN_VOLUME: |
 - name: lib-modules
   hostPath:
     path: /run/current-system/kernel-modules/lib/modules/
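
The two NixOS-side changes from the list above can be combined into one fragment. This is a sketch only: merge it into your existing configuration rather than adding a second extraOpts line. The --fail-swap-on=false flag is carried over from the node configuration earlier on this page, on the assumption that you still need it.

```nix
{ ... }:
{
  # Load the ceph kernel module on every node.
  boot.kernelModules = [ "ceph" ];

  # Move the kubelet root dir to /var/lib/kubelet.
  # --fail-swap-on=false comes from the earlier node configuration;
  # drop it if you do not use swap.
  services.kubernetes.kubelet.extraOpts =
    "--root-dir=/var/lib/kubelet --fail-swap-on=false";
}
```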

NVIDIA

You can use NVIDIA's k8s-device-plugin.

Make nvidia-docker your default docker runtime:

virtualisation.docker = {
    enable = true;

    # use nvidia as the default runtime
    enableNvidia = true;
    extraOptions = "--default-runtime=nvidia";
};

Apply its DaemonSet:

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml

/dev/shm

Some applications need enough shared memory to work properly. Create a new volumeMount for your Deployment:

volumeMounts:
- mountPath: /dev/shm
  name: dshm

and mark its medium as Memory:

volumes:
- name: dshm
  emptyDir:
    medium: Memory

Arm64

Nix might pull in coredns and etcd images that are incompatible with ARM. To resolve this, add the following to your master node's configuration:

etcd

  ...
  services.kubernetes = {...};
  systemd.services.etcd = {
    environment = {
      ETCD_UNSUPPORTED_ARCH = "arm64";
    };
  };
  ...

coredns

  services.kubernetes = {
    ...
    # use coredns
    addons.dns = {
      enable = true;
      coredns = {
        finalImageTag = "1.10.1";
        imageDigest = "sha256:a0ead06651cf580044aeb0a0feba63591858fb2e43ade8c9dea45a6a89ae7e5e";
        imageName = "coredns/coredns";
        sha256 = "0c4vdbklgjrzi6qc5020dvi8x3mayq4li09rrq2w0hcjdljj0yf9";
      };
    };
   ...
  };

Tooling

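There are various community projects aimed at facilitating working with Kubernetes combined with Nix:

  • kubernix (https://github.com/saschagrunert/kubernix): simple setup of development clusters using Nix
  • kubenix (https://kubenix.org/, source at https://github.com/hall/kubenix, updated 2023)
  • nixos-ha-kubernetes (https://github.com/justinas/nixos-ha-kubernetes)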

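References

  • IRC (2018-09) (https://logs.nix.samueldr.com/nixos-kubernetes/2018-09-07): issues related to DNS
  • IRC (2019-09) (https://logs.nix.samueldr.com/nixos-kubernetes/2019-09-05): discussion about easyCerts and general setup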