[https://kubernetes.io/ Kubernetes] is an open-source container orchestration system for automating software deployment, scaling, and management.
This wiki article extends the documentation in the [https://nixos.org/manual/nixos/stable/#sec-kubernetes NixOS manual].
== [[wikipedia:en:KISS principle|KISS]] ==

Caveats:


* This was only tested on <code>20.09pre215024.e97dfe73bba (Nightingale)</code> (<code>unstable</code>).
* This is probably not best practice.
** For a production-grade cluster you shouldn't use <code>easyCerts</code>.
* If pods cannot reach the service CIDR, disable the firewall via <code>networking.firewall.enable = false;</code> or otherwise make sure that it doesn't interfere with packet forwarding.
* Make sure to put <code>docker0</code> into promiscuous mode: <code>ip link set docker0 promisc on</code>
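The last two workarounds can also be expressed declaratively; a sketch, in which the oneshot unit name <code>docker0-promisc</code> is made up for illustration:

<syntaxhighlight lang="nix">
{ pkgs, ... }:
{
  # Caveat from above: pods may not reach the service CIDR with the firewall on.
  networking.firewall.enable = false;

  # Hypothetical oneshot that puts docker0 into promiscuous mode once docker is up.
  systemd.services.docker0-promisc = {
    after = [ "docker.service" ];
    wantedBy = [ "multi-user.target" ];
    serviceConfig.Type = "oneshot";
    script = "${pkgs.iproute2}/bin/ip link set docker0 promisc on";
  };
}
</syntaxhighlight>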
Add to your <code>configuration.nix</code>:


<syntaxhighlight lang="nix">
{ config, pkgs, ... }:
let
  # When using 'easyCerts = true;', the IP address must resolve to the master at the time the certificates are created.
  # In that case, set 'kubeMasterIP = "127.0.0.1";'. Otherwise, you may encounter the following issue: https://github.com/NixOS/nixpkgs/issues/59364
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
</syntaxhighlight>

The node's <code>configuration.nix</code> uses the same values:

<syntaxhighlight lang="nix">
{ config, pkgs, ... }:
let
  kubeMasterIP = "10.1.1.2";
  kubeMasterHostname = "api.kube";
  kubeMasterAPIServerPort = 6443;
in
{
According to the [https://github.com/NixOS/nixpkgs/blob/18ff53d7656636aa440b2f73d2da788b785e6a9c/nixos/tests/kubernetes/rbac.nix#L118 NixOS tests], make your node join the cluster:

On the master, grab the apitoken:
<syntaxhighlight lang="bash">
cat /var/lib/kubernetes/secrets/apitoken.secret
</syntaxhighlight>


On the node, join the cluster with:
<syntaxhighlight lang="bash">
echo TOKEN | nixos-kubernetes-node-join
</syntaxhighlight>

Verify with <code>kubectl get nodes</code>; the node should show up as <code>Ready</code>:

<syntaxhighlight lang="bash">
drake      Ready    <none>  102m  v1.16.6-beta.0
</syntaxhighlight>
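The grab-and-join steps can also be run from an admin machine in one pipeline; a sketch, assuming root SSH access and the hostnames used on this page (<code>api.kube</code>, <code>drake</code>):

<syntaxhighlight lang="bash">
# Read the token on the master and feed it straight to the join script on the node.
ssh root@api.kube 'cat /var/lib/kubernetes/secrets/apitoken.secret' \
  | ssh root@drake 'nixos-kubernetes-node-join'
</syntaxhighlight>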


== N Masters (HA) ==
<syntaxhighlight lang="bash">
systemctl status kubelet
</syntaxhighlight>
<syntaxhighlight lang="bash">
systemctl status kube-apiserver
</syntaxhighlight>
<syntaxhighlight lang="bash">
kubectl get nodes
</syntaxhighlight>
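If one of these units is failing, the systemd journal usually points at the cause:

<syntaxhighlight lang="bash">
# Recent kubelet logs, then follow the apiserver live
journalctl -u kubelet --since '10 minutes ago'
journalctl -u kube-apiserver -f
</syntaxhighlight>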
Run a pod to check with <code>kubectl run curl --restart=Never --image=radial/busyboxplus:curl -i --tty</code>:

<syntaxhighlight lang="shell">
If you don't see a command prompt, try pressing enter.
[ root@curl:/ ]$ nslookup google.com
Server:    10.0.0.254
Address 1: 10.0.0.254 kube-dns.kube-system.svc.cluster.local
</syntaxhighlight>
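Cluster DNS itself can be verified from the same pod; assuming the default <code>cluster.local</code> domain, the API server service is always published under a well-known name:

<syntaxhighlight lang="shell">
[ root@curl:/ ]$ nslookup kubernetes.default.svc.cluster.local
</syntaxhighlight>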


=== Reset to a clean state ===

Sometimes it helps to have a clean state on all instances:


To do so, I found it necessary to change a few things (tested with <code>rook v1.2</code>):
* You need the <code>ceph</code> kernel module: <code>boot.kernelModules = [ "ceph" ];</code>
* Change the root dir of the kubelet: <code>kubelet.extraOpts = "--root-dir=/var/lib/kubelet";</code>
* Reboot all your nodes
* Continue with [https://rook.io/docs/rook/v1.2/ceph-quickstart.html the official quickstart guide]
* In <code>operator.yaml</code>, help the CSI plugins find the hosts' ceph kernel modules by adding (or uncommenting; they're in the example config) these entries:
<syntaxhighlight lang="yaml">
CSI_CEPHFS_PLUGIN_VOLUME: |
  - name: lib-modules
    hostPath:
      path: /run/current-system/kernel-modules/lib/modules/
CSI_RBD_PLUGIN_VOLUME: |
  - name: lib-modules
    hostPath:
      path: /run/current-system/kernel-modules/lib/modules/
</syntaxhighlight>


=== NVIDIA ===
Some applications need enough shared memory to work properly.
Create a new volumeMount for your Deployment:
<syntaxhighlight lang="yaml">
volumeMounts:
- mountPath: /dev/shm
  name: dshm
</syntaxhighlight>

and mark its <code>medium</code> as <code>Memory</code>:
<syntaxhighlight lang="yaml">
volumes:
- name: dshm
  emptyDir:
    medium: Memory
</syntaxhighlight>
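For context, the two fragments above belong to the same pod template; a minimal sketch of a Deployment combining them (the name and image are placeholders):

<syntaxhighlight lang="yaml">
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shm-demo            # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: shm-demo
  template:
    metadata:
      labels:
        app: shm-demo
    spec:
      containers:
      - name: app
        image: busybox      # placeholder image
        command: ["sleep", "3600"]
        volumeMounts:
        - mountPath: /dev/shm
          name: dshm
      volumes:
      - name: dshm
        emptyDir:
          medium: Memory
</syntaxhighlight>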
=== Arm64 ===
Nix might pull in <code>coredns</code> and <code>etcd</code> images that are incompatible with ARM. To resolve this, add the following to your master node's configuration:
==== etcd ====
<syntaxhighlight lang="nix">
  ...
  services.kubernetes = { ... };
  systemd.services.etcd = {
    environment = {
      ETCD_UNSUPPORTED_ARCH = "arm64";
    };
  };
  ...
</syntaxhighlight>
==== coredns ====
<syntaxhighlight lang="nix">
  services.kubernetes = {
    ...
    # use coredns
    addons.dns = {
      enable = true;
      coredns = {
        finalImageTag = "1.10.1";
        imageDigest = "sha256:a0ead06651cf580044aeb0a0feba63591858fb2e43ade8c9dea45a6a89ae7e5e";
        imageName = "coredns/coredns";
        sha256 = "0c4vdbklgjrzi6qc5020dvi8x3mayq4li09rrq2w0hcjdljj0yf9";
      };
    };
    ...
  };
</syntaxhighlight>


== Tooling ==
There are various community projects aimed at facilitating working with Kubernetes combined with Nix:
* [https://github.com/saschagrunert/kubernix kubernix]: simple setup of development clusters using Nix
* [https://github.com/cmollekopf/kube-nix kube-nix]
* [https://kubenix.org/ kubenix]: [https://github.com/hall/kubenix GitHub (updated 2023)]
* [https://github.com/justinas/nixos-ha-kubernetes nixos-ha-kubernetes]
* [https://github.com/nix-community/nixhelm nixhelm]: generates Nix expressions from a selection of Helm charts
* [https://github.com/reMarkable/helmfile-nix helmfile-nix]: wrapper around [[Helm and Helmfile|Helmfile]] to allow writing helmfiles in the Nix language


== References ==

* [https://github.com/NixOS/nixpkgs/issues/39327 Issue #39327]: Kubernetes support is missing some documentation
* [https://discourse.nixos.org/t/kubernetes-using-multiple-nodes-with-latest-unstable/3936 NixOS Discourse]: Using multiple nodes on unstable
* [https://kubernetes.io/docs/home/ Kubernetes docs]


[[Category:Applications]]
[[Category:Server]]
[[Category:Container]]
[[Category:NixOS Manual]]