== [[wikipedia:en:KISS principle|KISS]] ==
 
If you are new to [[Kubernetes]] you might want to check out [[K3s]] first as it is easier to set up (less moving parts).


== 1 Master and 1 Node ==
Add to your <code>configuration.nix</code>:

<syntaxhighlight lang="nix">
{ config, pkgs, ... }:
let
  # ...
</syntaxhighlight>
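The listing above is truncated in this revision. A complete single-machine master-plus-node setup might look roughly like this — a sketch, not the article's exact listing: the hostname <code>api.kube</code> and port <code>6443</code> are placeholder assumptions, the option names follow the NixOS <code>services.kubernetes</code> module, and <code>10.1.1.2</code> matches the master address shown in the <code>cluster-info</code> output below:

<syntaxhighlight lang="nix">
{ config, pkgs, ... }:
let
  # placeholder values; adapt to your network
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
in
{
  # make the master hostname resolvable
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  # administration tools
  environment.systemPackages = with pkgs; [ kubectl kubernetes ];

  services.kubernetes = {
    roles = [ "master" "node" ];
    masterAddress = kubeMasterHostname;
    apiserverAddress = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
    # generate certificates automatically
    easyCerts = true;
    apiserver = {
      securePort = kubeMasterAPIServerPort;
      advertiseAddress = kubeMasterIP;
    };
    # use coredns
    addons.dns.enable = true;
  };
}
</syntaxhighlight>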
Link your <code>kubeconfig</code> to your home directory:

<syntaxhighlight lang="bash">
ln -s /etc/kubernetes/cluster-admin.kubeconfig ~/.kube/config
</syntaxhighlight>
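Note that <code>ln</code> fails if <code>~/.kube</code> does not exist yet; a sketch of the full sequence (the kubeconfig path is the one used above; <code>-f</code> is added so re-running is harmless):

```shell
# create the target directory if missing, then (re)link the kubeconfig;
# -f replaces a stale link, and the symlink may dangle until the master
# has actually generated /etc/kubernetes/cluster-admin.kubeconfig
mkdir -p "$HOME/.kube"
ln -sf /etc/kubernetes/cluster-admin.kubeconfig "$HOME/.kube/config"
readlink "$HOME/.kube/config"
```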
Now, executing <code>kubectl cluster-info</code> should yield something like this:

<syntaxhighlight>
Kubernetes master is running at https://10.1.1.2
CoreDNS is running at https://10.1.1.2/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
</syntaxhighlight>
You should also see that the master is also a node using <code>kubectl get nodes</code>:

<syntaxhighlight>
NAME      STATUS  ROLES   AGE  VERSION
direwolf  Ready   <none>  41m  v1.16.6-beta.0
</syntaxhighlight>
Add to your <code>configuration.nix</code>:

<syntaxhighlight lang="nix">
{ config, pkgs, ... }:
let
  # ...
</syntaxhighlight>
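This listing is also truncated. A node-only configuration might look roughly like this — a sketch: the hostname <code>api.kube</code> and port <code>6443</code> are placeholder assumptions that must match the master's settings, and the option names follow the NixOS <code>services.kubernetes</code> module:

<syntaxhighlight lang="nix">
{ config, pkgs, ... }:
let
  # placeholder values; must match the master's configuration
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
  api = "https://${kubeMasterHostname}:${toString kubeMasterAPIServerPort}";
in
{
  # make the master hostname resolvable from the node
  networking.extraHosts = "${kubeMasterIP} ${kubeMasterHostname}";

  services.kubernetes = {
    roles = [ "node" ];
    masterAddress = kubeMasterHostname;
    easyCerts = true;
    apiserverAddress = api;
    # point the kubelet at the master's API server
    kubelet.kubeconfig.server = api;
  };
}
</syntaxhighlight>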
According to the [https://github.com/NixOS/nixpkgs/blob/18ff53d7656636aa440b2f73d2da788b785e6a9c/nixos/tests/kubernetes/rbac.nix#L118 NixOS tests], make your Node join the cluster:

On the master, grab the API token:
<syntaxhighlight lang="bash">
cat /var/lib/kubernetes/secrets/apitoken.secret
</syntaxhighlight>

On the node, join the cluster with it:
<syntaxhighlight lang="bash">
echo TOKEN | nixos-kubernetes-node-join
</syntaxhighlight>
After that, you should see your new node using <code>kubectl get nodes</code>:

<syntaxhighlight>
NAME      STATUS  ROLES   AGE  VERSION
direwolf  Ready   <none>  62m  v1.16.6-beta.0
</syntaxhighlight>
== Troubleshooting ==

<syntaxhighlight lang="bash">
systemctl status kubelet
</syntaxhighlight>
<syntaxhighlight lang="bash">
systemctl status kube-apiserver
</syntaxhighlight>
<syntaxhighlight lang="bash">
kubectl get nodes
</syntaxhighlight>
If you face issues while running the <code>nixos-kubernetes-node-join</code> script:

<syntaxhighlight>
Restarting certmgr...
Job for certmgr.service failed because a timeout was exceeded.
</syntaxhighlight>
Go investigate with <code>journalctl -u certmgr</code>:

<syntaxhighlight>
... certmgr: loading from config file /nix/store/gj7qr7lp6wakhiwcxdpxwbpamvmsifhk-certmgr.yaml
... manager: loading certificates from /nix/store/4n41ikm7322jxg7bh0afjpxsd4b2idpv-certmgr.d
</syntaxhighlight>
Check if coredns is running via <code>kubectl get pods -n kube-system</code>:

<syntaxhighlight>
NAME                      READY  STATUS   RESTARTS  AGE
coredns-577478d784-bmt5s  1/1    Running  2         163m
</syntaxhighlight>
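To spot unhealthy pods quickly, the STATUS column can be filtered with <code>awk</code>; a sketch using canned sample data so the filter can be seen working without a live cluster (the second pod line is invented for illustration):

```shell
# sample "kubectl get pods -n kube-system" output;
# the Pending pod is invented for this demonstration
sample='NAME                      READY  STATUS   RESTARTS  AGE
coredns-577478d784-bmt5s  1/1    Running  2         163m
coredns-577478d784-dummy  0/1    Pending  0         1m'

# print name and status of every pod that is not Running
# (on a live cluster: kubectl get pods -n kube-system | awk ...)
echo "$sample" | awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
# -> coredns-577478d784-dummy Pending
```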
Run a pod to check with <code>kubectl run curl --restart=Never --image=radial/busyboxplus:curl -i --tty</code>:

<syntaxhighlight>
If you don't see a command prompt, try pressing enter.
[ root@curl:/ ]$ nslookup google.com
</syntaxhighlight>
In case DNS is still not working, I found that restarting services sometimes helps:

<syntaxhighlight lang="bash">
systemctl restart kube-proxy flannel kubelet
</syntaxhighlight>


Sometimes it helps to have a clean state on all instances:
* comment kubernetes-related code in <code>configuration.nix</code>
* <code>nixos-rebuild switch</code>


To do so, I found it necessary to change a few things (tested with <code>rook v1.2</code>):
* you need the <code>ceph</code> kernel module: <code>boot.kernelModules = [ "ceph" ];</code>
* change the root dir of the kubelet: <code>kubelet.extraOpts = "--root-dir=/var/lib/kubelet";</code>
* reboot all your nodes
* continue with [https://rook.io/docs/rook/v1.2/ceph-quickstart.html the official quickstart guide]
* in <code>operator.yaml</code>, set <code>CSI_FORCE_CEPHFS_KERNEL_CLIENT</code> to <code>false</code>

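The two NixOS-side changes from the list above could live in <code>configuration.nix</code> roughly like this (a sketch; the full <code>services.kubernetes.kubelet.extraOpts</code> path is assumed from the bullet's shorthand):

<syntaxhighlight lang="nix">
{
  # the ceph kernel module is needed on every node
  boot.kernelModules = [ "ceph" ];

  # rook expects the kubelet state under /var/lib/kubelet
  services.kubernetes.kubelet.extraOpts = "--root-dir=/var/lib/kubelet";
}
</syntaxhighlight>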


Make <code>nvidia-docker</code> your default docker runtime:

<syntaxhighlight lang="nix">
virtualisation.docker = {
     enable = true;
     # ...
</syntaxhighlight>
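The block above is cut off mid-option in this revision; the remainder presumably switches on the NVIDIA runtime. A sketch, assuming the <code>virtualisation.docker.enableNvidia</code> NixOS option (an assumption — check the option set of your NixOS release):

<syntaxhighlight lang="nix">
virtualisation.docker = {
     enable = true;
     # assumption: wires up nvidia-docker as the runtime
     enableNvidia = true;
};
</syntaxhighlight>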
Apply their Daemonset:

<syntaxhighlight lang="bash">
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml
</syntaxhighlight>
Some applications need enough shared memory to work properly.
Create a new volumeMount for your Deployment:

<syntaxhighlight lang="yaml">
...
volumeMounts:
- mountPath: /dev/shm
  name: dshm
...
</syntaxhighlight>

and mark its <code>medium</code> as <code>Memory</code>:

<syntaxhighlight lang="yaml">
...
volumes:
- name: dshm
  emptyDir:
    medium: Memory
...
</syntaxhighlight>
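Put together, a minimal self-contained Pod spec using such a volume might look like this (a sketch; the pod name, container name, and image are placeholders):

<syntaxhighlight lang="yaml">
apiVersion: v1
kind: Pod
metadata:
  name: shm-demo            # placeholder name
spec:
  containers:
  - name: app               # placeholder container
    image: busybox          # placeholder image
    command: ["sleep", "3600"]
    volumeMounts:
    - mountPath: /dev/shm
      name: dshm
  volumes:
  - name: dshm
    emptyDir:
      medium: Memory        # back /dev/shm with RAM
</syntaxhighlight>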


== Tooling ==
[[Category:Applications]]
[[Category:Servers]]
[[Category:orchestration]]