The Ceph package in Nixpkgs has struggled to keep up with upstream Ceph, as filesystem tooling is a larger challenge on NixOS than on other Linux distributions. This is a problem still seeking a solution; read ahead to see some of the remaining issues this guide should address. Please make a Wiki account and add your experiences if you have made progress running a modern Ceph version.


Here is a quick collection of commands I used on a 3-node Ceph mesh.

Describe your ceph user, alongside your normal login user:

  users.users = {
    mesh = { isNormalUser = true; extraGroups = [ "wheel" "docker" ]; };
    ceph = { isNormalUser = true; extraGroups = [ "wheel" "ceph" ]; };
  };
  users.groups.ceph = {};

Be sure you rebuild so the ceph user and group exist before you assign paths to them.
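For example, a quick check that the rebuild created the account (usernames as in the configuration above):

sudo nixos-rebuild switch
id ceph   # should report the new ceph user and its ceph group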

Run uuidgen and use its output as your fsid; then describe your Ceph nodes:

  services.ceph = {
    global.fsid = "4b687c5c-5a20-4a77-8774-487989fd0bc7";
    osd = {
      enable = true;
      daemons = ["0"];
    };
    mon = {
      enable = false;
      extraConfig = {
        "mon initial members" = "mesh-a,mesh-b,mesh-c";
        "mon host" = "10.0.0.11,10.0.0.12,10.0.0.13";
      };
    };
  };
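To generate the fsid and confirm it landed in the rendered config, something like the following should work; this assumes the NixOS Ceph module writes its settings to /etc/ceph/ceph.conf:

uuidgen                        # paste the output into global.fsid above
grep fsid /etc/ceph/ceph.conf  # after a rebuild, the same UUID should appear here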

Make your OSD volume by running these commands on each node (based on https://docs.ceph.com/en/quincy/install/manual-deployment/):

export IP=<your-node-IP-on-local-LAN>
export FSID=4b687c5c-5a20-4a77-8774-487989fd0bc7

sudo -u ceph mkdir -p /etc/ceph
sudo -u ceph mkdir -p /var/lib/ceph/bootstrap-osd
# note: do not pre-create /tmp/monmap; monmaptool below writes the monmap file itself
sudo -u ceph mkdir -p /var/lib/ceph/mon/ceph-$(hostname)
sudo -u ceph mkdir /var/lib/ceph/mon/ceph-mon-$(hostname)

sudo ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
sudo ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring --gen-key -n client.admin --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *' --cap mgr 'allow *'
sudo ceph-authtool --create-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring --gen-key -n client.bootstrap-osd --cap mon 'profile bootstrap-osd' --cap mgr 'allow r'
sudo ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
sudo ceph-authtool /tmp/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
sudo chown ceph:ceph /tmp/ceph.mon.keyring
sudo monmaptool --create --add $(hostname) $IP --fsid $FSID /tmp/monmap
sudo -u ceph ceph-mon --mkfs -i mon-$(hostname) --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring
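To sanity-check the bootstrap before going further, you can inspect the generated monmap and keyring with the standard Ceph tooling (paths as above):

sudo monmaptool --print /tmp/monmap              # should show this node's monitor and the fsid
sudo ceph-authtool --list /tmp/ceph.mon.keyring  # should list mon., client.admin, and client.bootstrap-osd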

Activate all Ceph OSD volumes at boot using a systemd unit (based on u/imspacekitteh's example):

  systemd.services.ceph-mesh = {
    enable = true;
    description = "Ceph OSD Bindings";
    unitConfig = {
      After = "local-fs.target";
      Wants = "local-fs.target";
    };
    serviceConfig = {
      Type = "oneshot";
      KillMode = "none";
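      # generous timeout for ceph-volume; PATH is extended so the shell can find `timeout`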
      Environment = "CEPH_VOLUME_TIMEOUT=10000 PATH=$PATH:/run/current-system/sw/bin/";
      ExecStart = "/bin/sh -c 'timeout $CEPH_VOLUME_TIMEOUT /run/current-system/sw/bin/ceph-volume lvm activate --all --no-systemd'";
      TimeoutSec = 0;
    };
    wantedBy = ["multi-user.target"];
  };
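After a rebuild, the unit behaves like any other systemd service, so you can start and inspect it directly:

sudo systemctl start ceph-mesh.service
systemctl status ceph-mesh.service
journalctl -u ceph-mesh.service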

Though these commands seem reliable enough, there are some issues...

mesh@mesh-c:~/.build/ > sudo ceph -s
Error initializing cluster client: ObjectNotFound('RADOS object not found (error calling conf_read_file)')

Nothing has appeared inside lsblk, so as an inexperienced Ceph user I can only assume additional actions are necessary.
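If you hit the same wall, two places to start digging: the conf_read_file error generally means the ceph client cannot read /etc/ceph/ceph.conf, and ceph-volume can list any LVM-backed OSDs it has prepared:

ls -l /etc/ceph/ceph.conf             # a missing or unreadable file triggers conf_read_file errors
sudo ceph -s -c /etc/ceph/ceph.conf   # point the client at an explicit config path
sudo ceph-volume lvm list             # shows OSDs prepared via ceph-volume, if any exist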

Here is a summary of records produced inside /var/lib/ceph:

mesh@mesh-c:~/.build/ > nix-shell -p tree

[nix-shell:~/.build]$ sudo -u ceph tree /var/lib/ceph
/var/lib/ceph
├── bootstrap-osd
│   └── ceph.keyring
└── mon
    ├── ceph-mesh-c
    └── ceph-mon-mesh-c
        ├── keyring
        ├── kv_backend
        └── store.db
            ├── 000004.log
            ├── CURRENT
            ├── IDENTITY
            ├── LOCK
            ├── MANIFEST-000005
            └── OPTIONS-000007

6 directories, 9 files



Many users aspire to run Ceph on NixOS, and recommend varying approaches in different forums online. Here is a collection of links that can lead you along, though please consider: these experiences come from older versions of Ceph, such as v10, while (as of 2023-12) Ceph is on v19.

* https://d.moonfire.us/blog/2022/12/10/ceph-and-nixos/
* https://github.com/NixOS/nixpkgs/issues/147801
* https://www.reddit.com/r/ceph/comments/14otjyo/ceph_on_nixos/