Kubernetes: Difference between revisions

Revision as of 05:32, 5 January 2022

KISS

If you are new to Kubernetes, you might want to check out K3s first, as it is easier to set up (fewer moving parts).

1 Master and 1 Node

Assumptions:

Master and Node are on the same network (in this example 10.1.1.0/24)
IP of the Master: 10.1.1.2
IP of the first Node: 10.1.1.3
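
The addressing assumed above could be pinned down in configuration.nix. This is only a minimal sketch: the interface name eth0 is an assumption (check ip link for the real name), and on the Node the address would be 10.1.1.3 instead.

```nix
{ config, pkgs, ... }:
{
  # Assumption: the network interface is called "eth0".
  # 10.1.1.2/24 is the Master address from the assumptions above.
  networking.interfaces.eth0.ipv4.addresses = [
    { address = "10.1.1.2"; prefixLength = 24; }
  ];
}
```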

Caveats:

this was only tested on 20.09pre215024.e97dfe73bba (Nightingale) (unstable)
this is probably not best-practice
- for a production-grade cluster you shouldn't use easyCerts
If pods cannot reach the service CIDR, disable the firewall via networking.firewall.enable = false; or otherwise make sure that it doesn't interfere with packet forwarding.
Make sure to set docker0 to promiscuous mode: ip link set docker0 promisc on

Master

Add to your configuration.nix:

{ config, pkgs, ... }:
let
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
in
{
  # resolve master hostname
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  # packages for administration tasks
  environment.systemPackages = with pkgs; [
    kompose
    kubectl
    kubernetes
  ];

  services.kubernetes = {
    roles = ["master" "node"];
    masterAddress = kubeMasterHostname;
    apiserverAddress = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
    easyCerts = true;
    apiserver = {
      securePort = kubeMasterAPIServerPort;
      advertiseAddress = kubeMasterIP;
    };

    # use coredns
    addons.dns.enable = true;

    # needed if you use swap
    kubelet.extraOpts = "--fail-swap-on=false";
  };
}

Apply your config (e.g. nixos-rebuild switch).

Link your kubeconfig to your home directory:

# create the target directory first; ln fails if ~/.kube does not exist
mkdir -p ~/.kube
ln -s /etc/kubernetes/cluster-admin.kubeconfig ~/.kube/config

Now, executing kubectl cluster-info should yield something like this:

Kubernetes master is running at https://10.1.1.2
CoreDNS is running at https://10.1.1.2/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

You should also see that the master is a node itself using kubectl get nodes:

NAME       STATUS   ROLES    AGE   VERSION
direwolf   Ready    <none>   41m   v1.16.6-beta.0

Node

Add to your configuration.nix:

{ config, pkgs, ... }:
let
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
in
{
  # resolve master hostname
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  # packages for administration tasks
  environment.systemPackages = with pkgs; [
    kompose
    kubectl
    kubernetes
  ];

  services.kubernetes = let
    api = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
  in
  {
    roles = ["node"];
    masterAddress = kubeMasterHostname;
    easyCerts = true;

    # point kubelet and other services to kube-apiserver
    kubelet.kubeconfig.server = api;
    apiserverAddress = api;

    # use coredns
    addons.dns.enable = true;

    # needed if you use swap
    kubelet.extraOpts = "--fail-swap-on=false";
  };
}

Apply your config (e.g. nixos-rebuild switch).

According to the NixOS tests, make your Node join the cluster:

On the master, grab the apitoken:

cat /var/lib/kubernetes/secrets/apitoken.secret

On the node, join the cluster, replacing TOKEN with the apitoken from the master:

echo TOKEN | nixos-kubernetes-node-join

After that, you should see your new node using kubectl get nodes:

NAME       STATUS   ROLES    AGE    VERSION
direwolf   Ready    <none>   62m    v1.16.6-beta.0
drake      Ready    <none>   102m   v1.16.6-beta.0

N Masters (HA)

This article or section needs to be expanded. Further information may be found in the related discussion page. Please consult the pedia article metapage for guidelines on contributing.

Troubleshooting

systemctl status kubelet

systemctl status kube-apiserver

kubectl get nodes

Join Cluster not working

If you face issues while running the nixos-kubernetes-node-join script:

Restarting certmgr...
Job for certmgr.service failed because a timeout was exceeded.
See "systemctl status certmgr.service" and "journalctl -xe" for details.

Go investigate with journalctl -u certmgr:

... certmgr: loading from config file /nix/store/gj7qr7lp6wakhiwcxdpxwbpamvmsifhk-certmgr.yaml
... manager: loading certificates from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d
... manager: loading spec from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d/flannelClient.json
... [ERROR] cert: failed to fetch remote CA: failed to parse rootCA certs

In this case, cfssl could be overloaded.

Restarting cfssl on the master node should help: systemctl restart cfssl

Also, make sure that port 8888 is open on your master node.
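
If the firewall is enabled on the master, opening the port might look like the fragment below. It is a sketch, not a verified setup: 8888 is the cfssl port mentioned above, and 6443 is the kubeMasterAPIServerPort from the Master example, added on the assumption that nodes must reach the API server as well.

```nix
{
  # On the master only.
  # 8888: cfssl, used by certmgr on the nodes to fetch certificates.
  # 6443: kube-apiserver (kubeMasterAPIServerPort in the Master example).
  networking.firewall.allowedTCPPorts = [ 6443 8888 ];
}
```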

DNS issues

Check if coredns is running via kubectl get pods -n kube-system:

NAME                       READY   STATUS    RESTARTS   AGE
coredns-577478d784-bmt5s   1/1     Running   2          163m
coredns-577478d784-bqj65   1/1     Running   2          163m

Run a pod to check with kubectl run curl --restart=Never --image=radial/busyboxplus:curl -i --tty:

If you don't see a command prompt, try pressing enter.
[ root@curl:/ ]$ nslookup google.com
Server:    10.0.0.254
Address 1: 10.0.0.254 kube-dns.kube-system.svc.cluster.local

Name:      google.com
Address 1: 2a00:1450:4016:803::200e muc12s04-in-x0e.1e100.net
Address 2: 172.217.23.14 lhr35s01-in-f14.1e100.net

If DNS is still not working, restarting the services sometimes helps:

systemctl restart kube-proxy flannel kubelet

Reset to a clean state

Sometimes it helps to have a clean state on all instances:

comment out the Kubernetes-related code in configuration.nix
nixos-rebuild switch
clean up filesystem
- rm -rf /var/lib/kubernetes/ /var/lib/etcd/ /var/lib/cfssl/ /var/lib/kubelet/
- rm -rf /etc/kube-flannel/ /etc/kubernetes/
uncomment kubernetes-related code again
nixos-rebuild switch
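
As an alternative to commenting the code out by hand, the Kubernetes block could be wrapped in a switch. This is a hypothetical sketch rather than part of the tested setup: enableKube is a made-up name, and the option values are copied from the Node example.

```nix
{ config, pkgs, lib, ... }:
let
  # Hypothetical switch: set to false and rebuild, clean up the
  # filesystem, then set back to true and rebuild again.
  enableKube = false;
in
{
  # With enableKube = false the whole block is ignored and the
  # module defaults apply, so no Kubernetes services are configured.
  services.kubernetes = lib.mkIf enableKube {
    roles = [ "node" ];
    masterAddress = "api.kube";
    easyCerts = true;
  };
}
```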

Miscellaneous

Rook Ceph storage cluster

Chances are you want to set up a storage cluster using Rook.

To do so, I found it necessary to change a few things (tested with rook v1.2):

you need the ceph kernel module: boot.kernelModules = [ "ceph" ];
change the root dir of the kubelet: kubelet.extraOpts = "--root-dir=/var/lib/kubelet";
reboot all your nodes
continue with the official quickstart guide
in operator.yaml, set CSI_FORCE_CEPHFS_KERNEL_CLIENT to false
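
The two NixOS-side changes from the list above could be combined as in the following sketch. It assumes the kubelet options live under services.kubernetes as in the Master and Node examples, and it puts both kubelet flags into a single string, so it would replace the existing kubelet.extraOpts line rather than sit next to it.

```nix
{
  # Load the ceph kernel module at boot.
  boot.kernelModules = [ "ceph" ];

  # Keep the swap flag from the examples above and add the root dir.
  services.kubernetes.kubelet.extraOpts =
    "--fail-swap-on=false --root-dir=/var/lib/kubelet";
}
```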

NVIDIA

You can use NVIDIA's k8s-device-plugin.

Make nvidia-docker your default docker runtime:

virtualisation.docker = {
    enable = true;

    # use nvidia as the default runtime
    enableNvidia = true;
    extraOptions = "--default-runtime=nvidia";
};

Apply NVIDIA's DaemonSet:

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml

`/dev/shm`

Some applications need enough shared memory to work properly. Create a new volumeMount for your Deployment:

volumeMounts:
- mountPath: /dev/shm
  name: dshm

and mark its medium as Memory:

volumes:
- name: dshm
  emptyDir:
    medium: Memory

Tooling

There are various community projects aimed at facilitating working with Kubernetes combined with Nix:

kubernix: simple setup of development clusters using Nix
kube-nix

References

Issue #39327: kubernetes support is missing some documentation
NixOS Discourse: Using multiple nodes on unstable
Kubernetes docs
NixOS e2e kubernetes tests: Node Joining etc.
IRC (2018-09): issues related to DNS
IRC (2019-09): discussion about easyCerts and general setup

@@ Line 1: / Line 1: @@
-If you are new to kubernetes you might want to check out [[k3s]] first as it is easier to set up (less moving parts)
+== [[wikipedia:en:KISS principle|KISS]] ==
+If you are new to [[Kubernetes]] you might want to check out [[K3s]] first as it is easier to set up (less moving parts).
 == 1 Master and 1 Node ==
@@ Line 21: / Line 23: @@
 Add to your <code>configuration.nix</code>:
-<syntaxhighlight lang="nix">
+<syntaxhighlight lang=nix>
 { config, pkgs, ... }:
 let
@@ Line 62: / Line 64: @@
 Link your <code>kubeconfig</code> to your home directory:
-<syntaxhighlight lang="bash">
+<syntaxhighlight lang=bash>
 ln -s /etc/kubernetes/cluster-admin.kubeconfig ~/.kube/config
 </syntaxhighlight>
@@ Line 68: / Line 70: @@
 Now, executing <code>kubectl cluster-info</code> should yield something like this:
-<syntaxhighlight>
+<syntaxhighlight lang=shell>
 Kubernetes master is running at https://10.1.1.2
 CoreDNS is running at https://10.1.1.2/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
@@ Line 77: / Line 79: @@
 You should also see that the master is also a node using <code>kubectl get nodes</code>:
-<syntaxhighlight>
+<syntaxhighlight lang=shell>
 NAME       STATUS   ROLES    AGE   VERSION
 direwolf   Ready    <none>   41m   v1.16.6-beta.0
@@ Line 86: / Line 88: @@
 Add to your <code>configuration.nix</code>:
-<syntaxhighlight lang="nix">
+<syntaxhighlight lang=nix>
 { config, pkgs, ... }:
 let
@@ Line 129: / Line 131: @@
 According to the [https://github.com/NixOS/nixpkgs/blob/18ff53d7656636aa440b2f73d2da788b785e6a9c/nixos/tests/kubernetes/rbac.nix#L118 NixOS tests], make your Node join the cluster:
-<syntaxhighlight lang="bash">
+on the master, grab the apitoken
-# on the master, grab the apitoken
+<syntaxhighlight lang=bash>
 cat /var/lib/kubernetes/secrets/apitoken.secret
+</syntaxhighlight>
-# on the node, join the node with
+on the node, join the node with
+<syntaxhighlight lang=bash>
 echo TOKEN | nixos-kubernetes-node-join
 </syntaxhighlight>
@@ Line 139: / Line 143: @@
 After that, you should see your new node using <code>kubectl get nodes</code>:
-<syntaxhighlight>
+<syntaxhighlight lang=shell>
 NAME       STATUS   ROLES    AGE    VERSION
 direwolf   Ready    <none>   62m    v1.16.6-beta.0
@@ Line 152: / Line 156: @@
 == Troubleshooting ==
-<syntaxhighlight>
+<syntaxhighlight lang=bash>
 systemctl status kubelet
+</syntaxhighlight>
+<syntaxhighlight lanh="bash">
 systemctl status kube-apiserver
+</syntaxhighlight>
+<syntaxhighlight lanh="bash">
 kubectl get nodes
 </syntaxhighlight>
@@ Line 162: / Line 170: @@
 If you face issues while running the <code>nixos-kubernetes-node-join</code> script:
-<syntaxhighlight>
+<syntaxhighlight lang=shell>
 Restarting certmgr...
 Job for certmgr.service failed because a timeout was exceeded.
@@ Line 170: / Line 178: @@
 Go investigate with <code>journalctl -u certmgr</code>:
-<syntaxhighlight>
+<syntaxhighlight lang=shell>
 ... certmgr: loading from config file /nix/store/gj7qr7lp6wakhiwcxdpxwbpamvmsifhk-certmgr.yaml
 ... manager: loading certificates from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d
@@ Line 187: / Line 195: @@
 Check if coredns is running via <code>kubectl get pods -n kube-system</code>:
-<syntaxhighlight>
+<syntaxhighlight lang=shell>
 NAME                       READY   STATUS    RESTARTS   AGE
 coredns-577478d784-bmt5s   1/1     Running   2          163m
@@ Line 195: / Line 203: @@
 Run a pod to check with <code>kubectl run curl --restart=Never --image=radial/busyboxplus:curl -i --tty</code>:
-<syntaxhighlight>
+<syntaxhighlight lang=shell>
 If you don't see a command prompt, try pressing enter.
 [ root@curl:/ ]$ nslookup google.com
@@ Line 208: / Line 216: @@
 In case DNS is still not working I found that sometimes, restarting services helps:
-<syntaxhighlight>
+<syntaxhighlight lang=bash>
 systemctl restart kube-proxy flannel kubelet
 </syntaxhighlight>
@@ Line 215: / Line 223: @@
 Sometimes it helps to have a clean state on all instances:
 * comment kubernetes-related code in <code>configuration.nix</code>
 * <code>nixos-rebuild switch</code>
@@ Line 231: / Line 238: @@
 To do so, I found it necessary to change a few things (tested with <code>rook v1.2</code>):
 * you need the <code>ceph</code> kernel module: <code>boot.kernelModules = [ "ceph" ];</code>
 * change the root dir of the kubelet: <code>kubelet.extraOpts = "--root-dir=/var/lib/kubelet";</code>
 * reboot all your nodes
 * continue with [https://rook.io/docs/rook/v1.2/ceph-quickstart.html the official quickstart guide]
 * in <code>operator.yaml</code>, set <code>CSI_FORCE_CEPHFS_KERNEL_CLIENT</code> to <code>false</code>
@@ Line 247: / Line 249: @@
 Make <code>nvidia-docker</code> your default docker runtime:
-<syntaxhighlight>
+<syntaxhighlight lang=nix>
 virtualisation.docker = {
      enable = true;
@@ Line 259: / Line 261: @@
 Apply their Daemonset:
-<syntaxhighlight>
+<syntaxhighlight lang=bash>
 kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml
 </syntaxhighlight>
@@ Line 267: / Line 269: @@
 Some applications need enough shared memory to work properly.
 Create a new volumeMount for your Deployment:
-<syntaxhighlight>
+<syntaxhighlight lang=bash></syntaxhighlight>
-...
+<syntaxhighlight lang=bash>
 volumeMounts:
 - mountPath: /dev/shm
    name: dshm
-...
+<syntaxhighlight lang=bash></syntaxhighlight>
-</syntaxhighlight>
 and mark its <code>medium</code> as <code>Memory</code>:
-<syntaxhighlight>
+<syntaxhighlight lang=bash></syntaxhighlight>
-...
+<syntaxhighlight lang=bash>
 volumes:
 - name: dshm
    emptyDir:
    medium: Memory
-...
 </syntaxhighlight>
+<syntaxhighlight lang=bash></syntaxhighlight>
 == Tooling ==
@@ Line 302: / Line 303: @@
 [[Category:Applications]]
 [[Category:Servers]]
+[[Category:orchestration]]