== [[wikipedia:en:KISS principle|KISS]] ==

If you are new to Kubernetes, you might want to check out [[K3s]] first, as it is easier to set up (fewer moving parts).

== 1 Master and 1 Node ==

Assumptions:
* Master and Node are on the same network (in this example <code>10.1.1.0/24</code>)
* IP of the Master: <code>10.1.1.2</code>
* IP of the first Node: <code>10.1.1.3</code>

Caveats:
* this was only tested on <code>20.09pre215024.e97dfe73bba</code> (Nightingale, unstable)
* this is probably not best practice
** for a production-grade cluster you should not use <code>easyCerts</code>
* if pods cannot reach the service CIDR, disable the firewall via <code>networking.firewall.enable = false;</code> or otherwise make sure that it does not interfere with packet forwarding
* make sure to put <code>docker0</code> into promiscuous mode: <code>ip link set docker0 promisc on</code> (see the sketch below)
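
Both caveats can also be handled declaratively. A rough sketch, not part of the original walkthrough (adjust as needed; <code>pkgs.iproute2</code> is called <code>iproute</code> on older releases):

<syntaxhighlight lang="nix">
{ pkgs, ... }:
{
  # keep the firewall from interfering with pod and service traffic
  networking.firewall.enable = false;

  # put docker0 into promiscuous mode once docker has created the bridge
  systemd.services.docker0-promisc = {
    description = "Set docker0 to promiscuous mode";
    after = [ "docker.service" ];
    wantedBy = [ "multi-user.target" ];
    serviceConfig.Type = "oneshot";
    script = "${pkgs.iproute2}/bin/ip link set docker0 promisc on";
  };
}
</syntaxhighlight>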

=== Master ===

Add to your <code>configuration.nix</code>:

<syntaxhighlight lang="nix">
{ config, pkgs, ... }:
let
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
in
{
  # resolve master hostname
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  # packages for administration tasks
  environment.systemPackages = with pkgs; [
    kompose
    kubectl
    kubernetes
  ];

  services.kubernetes = {
    roles = ["master" "node"];
    masterAddress = kubeMasterHostname;
    apiserverAddress = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
    easyCerts = true;
    apiserver = {
      securePort = kubeMasterAPIServerPort;
      advertiseAddress = kubeMasterIP;
    };

    # use coredns
    addons.dns.enable = true;

    # needed if you use swap
    kubelet.extraOpts = "--fail-swap-on=false";
  };
}
</syntaxhighlight>

Apply your config (e.g. <code>nixos-rebuild switch</code>).

Link your <code>kubeconfig</code> to your home directory:

<syntaxhighlight lang="bash">
ln -s /etc/kubernetes/cluster-admin.kubeconfig ~/.kube/config
</syntaxhighlight>
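
The symlink assumes that <code>~/.kube</code> already exists. Alternatively, <code>kubectl</code> can be pointed at the file via the <code>KUBECONFIG</code> environment variable:

<syntaxhighlight lang="bash">
# alternative to the symlink: reference the admin kubeconfig directly
export KUBECONFIG=/etc/kubernetes/cluster-admin.kubeconfig
</syntaxhighlight>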

Now, executing <code>kubectl cluster-info</code> should yield something like this:

<syntaxhighlight lang="shell">
Kubernetes master is running at https://10.1.1.2
CoreDNS is running at https://10.1.1.2/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
</syntaxhighlight>

Using <code>kubectl get nodes</code>, you should see that the master is also a node:

<syntaxhighlight lang="shell">
NAME       STATUS   ROLES    AGE   VERSION
direwolf   Ready    <none>   41m   v1.16.6-beta.0
</syntaxhighlight>
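
The empty <code>ROLES</code> column is purely cosmetic; <code>kubectl</code> derives it from <code>node-role.kubernetes.io/*</code> labels, which are not set here. If you want the master to show up as such, you can label it yourself (node name taken from the output above):

<syntaxhighlight lang="bash">
# optional: make ROLES show "master" in kubectl get nodes
kubectl label node direwolf node-role.kubernetes.io/master=""
</syntaxhighlight>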

=== Node ===

Add to your <code>configuration.nix</code>:

<syntaxhighlight lang="nix">
{ config, pkgs, ... }:
let
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443; # must match the master's apiserver.securePort
in
{
  # resolve master hostname
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  # packages for administration tasks
  environment.systemPackages = with pkgs; [
    kompose
    kubectl
    kubernetes
  ];

  services.kubernetes = let
    api = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
  in
  {
    roles = ["node"];
    masterAddress = kubeMasterHostname;
    easyCerts = true;

    # point kubelet and other services to kube-apiserver
    kubelet.kubeconfig.server = api;
    apiserverAddress = api;

    # use coredns
    addons.dns.enable = true;

    # needed if you use swap
    kubelet.extraOpts = "--fail-swap-on=false";
  };
}
</syntaxhighlight>

Apply your config (e.g. <code>nixos-rebuild switch</code>).

According to the [https://github.com/NixOS/nixpkgs/blob/18ff53d7656636aa440b2f73d2da788b785e6a9c/nixos/tests/kubernetes/rbac.nix#L118 NixOS tests], make your Node join the cluster:

On the master, grab the API token:

<syntaxhighlight lang="bash">
cat /var/lib/kubernetes/secrets/apitoken.secret
</syntaxhighlight>

On the node, join it to the cluster with:

<syntaxhighlight lang="bash">
echo TOKEN | nixos-kubernetes-node-join
</syntaxhighlight>
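
If you have SSH access to both machines, the two steps can be combined into a single pipeline (host names here are placeholders):

<syntaxhighlight lang="bash">
# pipe the token straight from the master into the join script on the node
ssh root@master 'cat /var/lib/kubernetes/secrets/apitoken.secret' \
  | ssh root@node 'nixos-kubernetes-node-join'
</syntaxhighlight>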

After that, you should see your new node using <code>kubectl get nodes</code>:

<syntaxhighlight lang="shell">
NAME       STATUS   ROLES    AGE    VERSION
direwolf   Ready    <none>   62m    v1.16.6-beta.0
drake      Ready    <none>   102m   v1.16.6-beta.0
</syntaxhighlight>


== N Masters (HA) ==

== Troubleshooting ==

<syntaxhighlight lang="bash">
systemctl status kubelet
systemctl status kube-apiserver
kubectl get nodes
</syntaxhighlight>
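
The corresponding unit logs are usually more informative than the status output, for example:

<syntaxhighlight lang="bash">
# follow the kubelet logs while reproducing the problem
journalctl -u kubelet -f
</syntaxhighlight>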

=== Join Cluster not working ===

If you face issues while running the <code>nixos-kubernetes-node-join</code> script:

<syntaxhighlight lang="shell">
Restarting certmgr...
Job for certmgr.service failed because a timeout was exceeded.
See "systemctl status certmgr.service" and "journalctl -xe" for details.
</syntaxhighlight>

Go investigate with <code>journalctl -u certmgr</code>:

<syntaxhighlight lang="shell">
... certmgr: loading from config file /nix/store/gj7qr7lp6wakhiwcxdpxwbpamvmsifhk-certmgr.yaml
... manager: loading certificates from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d
... manager: loading spec from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d/flannelClient.json
... [ERROR] cert: failed to fetch remote CA: failed to parse rootCA certs
</syntaxhighlight>

In this case, <code>cfssl</code> may be overloaded. Restarting it on the master node should help:

<syntaxhighlight lang="bash">
systemctl restart cfssl
</syntaxhighlight>

Also make sure that port 8888 is open on your master node, since that is where the nodes fetch their certificates from.
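
If you keep the firewall enabled on the master, the port can be opened in its <code>configuration.nix</code>, for example:

<syntaxhighlight lang="nix">
{
  # allow nodes to reach cfssl on the master for certificate signing (easyCerts)
  networking.firewall.allowedTCPPorts = [ 8888 ];
}
</syntaxhighlight>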

=== DNS issues ===

Check if coredns is running via <code>kubectl get pods -n kube-system</code>:

<syntaxhighlight lang="shell">
NAME                       READY   STATUS    RESTARTS   AGE
coredns-577478d784-bmt5s   1/1     Running   2          163m
coredns-577478d784-bqj65   1/1     Running   2          163m
</syntaxhighlight>

Run a pod to check with <code>kubectl run curl --restart=Never --image=radial/busyboxplus:curl -i --tty</code>:

<syntaxhighlight lang="shell">
If you don't see a command prompt, try pressing enter.
[ root@curl:/ ]$ nslookup google.com
Server:    10.0.0.254
Address 1: 10.0.0.254 kube-dns.kube-system.svc.cluster.local

Name:      google.com
Address 1: 2a00:1450:4016:803::200e muc12s04-in-x0e.1e100.net
Address 2: 172.217.23.14 lhr35s01-in-f14.1e100.net
</syntaxhighlight>

In case DNS is still not working, I found that restarting the services sometimes helps:

<syntaxhighlight lang="bash">
systemctl restart kube-proxy flannel kubelet
</syntaxhighlight>
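
If the CoreDNS pods are running but queries still fail, their logs usually point at the problem (the deployment name matches the pod names shown above):

<syntaxhighlight lang="bash">
# inspect the CoreDNS logs in the kube-system namespace
kubectl -n kube-system logs deploy/coredns --tail=50
</syntaxhighlight>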

=== Reset to a clean state ===

Sometimes it helps to have a clean state on all instances:
* comment out the Kubernetes-related code in <code>configuration.nix</code>
* <code>nixos-rebuild switch</code>
* clean up the filesystem:
** <code>rm -rf /var/lib/kubernetes/ /var/lib/etcd/ /var/lib/cfssl/ /var/lib/kubelet/</code>
** <code>rm -rf /etc/kube-flannel/ /etc/kubernetes/</code>
* uncomment the Kubernetes-related code again
* <code>nixos-rebuild switch</code>

== Miscellaneous ==

=== Rook Ceph storage cluster ===

Chances are you want to set up a storage cluster using Rook.

To do so, I found it necessary to change a few things (tested with <code>rook v1.2</code>):
* you need the <code>ceph</code> kernel module: <code>boot.kernelModules = [ "ceph" ];</code>
* change the root dir of the kubelet: <code>kubelet.extraOpts = "--root-dir=/var/lib/kubelet";</code> (see the combined sketch after this list)
* reboot all your nodes
* continue with [https://rook.io/docs/rook/v1.2/ceph-quickstart.html the official quickstart guide]
* in <code>operator.yaml</code>, set <code>CSI_FORCE_CEPHFS_KERNEL_CLIENT</code> to <code>false</code>
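
The NixOS side of the first two items might look like the following sketch. Note that <code>kubelet.extraOpts</code> is a single string, so if you also rely on <code>--fail-swap-on=false</code> from above, put both flags into it:

<syntaxhighlight lang="nix">
{
  # rook/ceph needs the ceph kernel module on every node
  boot.kernelModules = [ "ceph" ];

  services.kubernetes = {
    # rook expects kubelet state under /var/lib/kubelet
    kubelet.extraOpts = "--root-dir=/var/lib/kubelet";
  };
}
</syntaxhighlight>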

=== NVIDIA ===

You can use NVIDIA's [https://github.com/NVIDIA/k8s-device-plugin k8s-device-plugin].

Make <code>nvidia-docker</code> your default Docker runtime:

<syntaxhighlight lang="nix">
virtualisation.docker = {
  enable = true;

  # use nvidia as the default runtime
  enableNvidia = true;
  extraOptions = "--default-runtime=nvidia";
};
</syntaxhighlight>

Apply their DaemonSet:

<syntaxhighlight lang="bash">
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml
</syntaxhighlight>
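
Once the device plugin is running, workloads request GPUs through the <code>nvidia.com/gpu</code> resource. A minimal, hypothetical test pod could look like this:

<syntaxhighlight lang="yaml">
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test                  # hypothetical example name
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:10.2-base  # any CUDA-capable image works
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1         # request one GPU from the device plugin
</syntaxhighlight>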

=== /dev/shm ===

Some applications need enough shared memory to work properly. Create a new <code>volumeMount</code> for your Deployment:

<syntaxhighlight lang="yaml">
volumeMounts:
- mountPath: /dev/shm
  name: dshm
</syntaxhighlight>

and mark its <code>medium</code> as <code>Memory</code>:

<syntaxhighlight lang="yaml">
volumes:
- name: dshm
  emptyDir:
    medium: Memory
</syntaxhighlight>
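
If you want to cap how much memory the tmpfs may use, <code>emptyDir</code> also accepts an optional <code>sizeLimit</code> (enforcement details depend on your Kubernetes version):

<syntaxhighlight lang="yaml">
volumes:
- name: dshm
  emptyDir:
    medium: Memory
    sizeLimit: 1Gi   # optional upper bound for the shared memory volume
</syntaxhighlight>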

== Tooling ==

There are various community projects that aim to make working with Kubernetes and Nix easier:

== References ==

[[Category:Applications]]
[[Category:Servers]]
[[Category:orchestration]]