Rook with Ceph doesn't provision my Persistent Volume Claims!
Sep 9, 2019
- Categories
- DevOps & SRE
- Tags
- PVC
- Linux
- Rook
- Ubuntu
- Ceph
- Cluster
- Internship
- Kubernetes
Ceph installation inside Kubernetes can be provisioned using Rook. Currently doing an internship at Adaltas, I was in charge of participating in the setup of a Kubernetes (k8s) cluster. To avoid breaking anything on our production cluster, we decided to experiment with the installation of a k8s cluster on 3 virtual machines (one master node n1, two worker nodes n2 and n3) using Vagrant with VirtualBox as the backend and Ubuntu 18.10 as the OS.
During the installation of the test cluster, we encountered a problem with Rook using Ceph that prevented it from provisioning any Persistent Volume Claims (PVC). This article will detail how to make a basic installation of Rook with Ceph on virtual machines, the problem we experienced and how to solve it. But first…
…a quick reminder about the role of PVCs!
When a pod needs to store data (logs or metrics, for example) in a persistent fashion, it describes the kind of storage it needs (size, performance, …) in a PVC. The cluster then provisions a Persistent Volume (PV) if one matches the requirements of the PVC. The PV can be provisioned either statically, if an administrator manually created a matching PV, or dynamically. Manually creating PVs can be time-consuming when pods require a lot of them, which is why it is interesting for the cluster to be able to provision them dynamically. For the cluster to dynamically provision a PV, the PVC must indicate the Storage Class it wants to use. If such a Storage Class is available on the cluster, a PV is dynamically provisioned and bound to the PVC.
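As an illustration, here is a minimal sketch of a PVC manifest requesting a 5 GiB volume from a Storage Class named rook-ceph-block (the claim name and size are arbitrary; the Storage Class is the one we will create with Rook later in this article):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc                     # arbitrary name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rook-ceph-block     # Storage Class used for dynamic provisioning
  resources:
    requests:
      storage: 5Gi                      # requested size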
Here are some links you can follow if you want to learn more about PVCs, PVs and Storage Classes:
Installing Rook on your k8s cluster
The installation of a k8s cluster being out of the scope of this article, I will assume you already have a working k8s cluster up and running. If that is not the case, you can easily find documentation on the internet on how to quickly bootstrap one.
The process of installing Rook isn’t hard: it is just a matter of applying a few manifests. The first step is to clone the Rook Git repository:
git clone https://github.com/rook/rook
Then switch to the latest release tag (which is v1.0.1 at the time of this writing) using:
git checkout v1.0.1
The files of interest (listed below) are located inside the folder cluster/examples/kubernetes/ceph.
common.yaml
operator.yaml
cluster.yaml
storageclass.yaml
Apply each of them in the order listed above using:
kubectl apply -f <file.yaml>
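For instance, from the cluster/examples/kubernetes/ceph folder, applying them in order means running:
kubectl apply -f common.yaml
kubectl apply -f operator.yaml
kubectl apply -f cluster.yaml
kubectl apply -f storageclass.yaml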
One last step is to set the storageclass resource, defined inside the storageclass.yaml file we just applied, as the default storageclass of our cluster. This is achieved with the command:
kubectl patch storageclass rook-ceph-block \
-p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
The problem
The Rook cluster will take some time to deploy, while it pulls the Rook images and deploys the pods. After a few minutes the output of kubectl get pods -n rook-ceph should look like this:
NAME                                      READY   STATUS      RESTARTS   AGE
rook-ceph-agent-8zv7p                     1/1     Running     0          4m8s
rook-ceph-agent-ghwgl                     1/1     Running     0          4m8s
rook-ceph-mgr-a-6d8cf6d5d7-txnrj          1/1     Running     0          102s
rook-ceph-mon-a-588475cbdb-htt4h          1/1     Running     0          2m55s
rook-ceph-mon-b-5b7cdc894f-q6wwr          1/1     Running     0          2m47s
rook-ceph-mon-c-846fc479cb-96sjq          1/1     Running     0          119s
rook-ceph-operator-765ff54667-q5qk4       1/1     Running     0          4m43s
rook-ceph-osd-prepare-n2.k8s.test-d4p9w   0/2     Completed   0          80s
rook-ceph-osd-prepare-n3.k8s.test-lrkbc   0/2     Completed   0          80s
rook-discover-hxxtl                       1/1     Running     0          4m8s
rook-discover-mmdl5                       1/1     Running     0          4m8s
As we can see here, there are two pods called rook-ceph-osd-prepare... whose status is “Completed”. We expected some Object Storage Device (OSD) pods to appear once the rook-ceph-osd-prepare... pods had completed, but it is not the case here. Since the OSD pods never appear, any PVC we create won’t be provisioned by Rook and will stay pending. We can see an example of this happening when trying to deploy a GitLab instance with Helm. Here is the result of kubectl get pvc -n gitlab:
NAME                        STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS      AGE
gitlab-minio                Pending                                      rook-ceph-block   6m7s
gitlab-postgresql           Pending                                      rook-ceph-block   6m7s
gitlab-prometheus-server    Pending                                      rook-ceph-block   6m7s
gitlab-redis                Pending                                      rook-ceph-block   6m7s
repo-data-gitlab-gitaly-0   Pending                                      rook-ceph-block   6m6s
No PVC is being provisioned even though they are all assigned the correct Storage Class.
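To get more detail on why a claim stays pending, you can describe it and look at the events at the bottom of the output; for example, for the gitlab-minio claim above:
kubectl describe pvc gitlab-minio -n gitlab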
The solution
After some research, we found that in order for Rook to work, it needs a dedicated storage device that it can use to store the PVs. To fix this, in our case, we needed to add a new virtual disk to our VMs through the Vagrantfile.
To create and attach a new virtual disk to a VirtualBox VM, we can use the vboxmanage command, or more conveniently define it directly in the Vagrantfile, as in this extract:
#[...]
config.vm.define :n2 do |node|
  node.vm.box = box
  node.vm.hostname = "n2"
  node.vm.network :private_network, ip: "10.10.10.53"
  node.vm.provider "virtualbox" do |d|
    d.customize ["modifyvm", :id, "--memory", 4096]
    d.customize ["modifyvm", :id, "--cpus", 2]
    d.customize ["modifyvm", :id, "--ioapic", "on"]
    # Create a virtual disk called "disk_osd-n2" with a size of 125GB
    d.customize ["createhd", "--filename", "disk_osd-n2", "--size", 125 * 1024]
    # Attach the newly created virtual disk to our node
    d.customize ["storageattach", :id, "--storagectl", "SCSI", "--port", 3, "--device", 0, "--type", "hdd", "--medium", "disk_osd-n2.vdi"]
  end
end
#[...]
The field following "--storagectl" in the last line needs to match the exact name of one of your VM’s storage controllers. Those names can be obtained from the command below, where the VM name comes from VBoxManage list vms:
VBoxManage showvminfo <vm-name> | grep "Storage Controller"
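As an illustration, the lines of interest in the filtered output are the Storage Controller Name entries; assuming a box that exposes one IDE and one SCSI controller, they would look something like:
Storage Controller Name (0):            IDE
Storage Controller Name (1):            SCSI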
Select the name of a Storage Controller with free ports from the output, and replace SCSI in the above config with this name.
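Note that these provider customizations only run when Vagrant boots the VM, and createhd fails if the disk file already exists. So, if the VMs are already provisioned, the simplest (if destructive) way to get the new disks created and attached is to recreate the machines, for example:
vagrant destroy -f n2 n3
vagrant up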
If we run the whole installation process again, we can see that the OSD pods appear:
NAME                                      READY   STATUS      RESTARTS   AGE
rook-ceph-agent-gs4sn                     1/1     Running     0          3m55s
rook-ceph-agent-hwrrf                     1/1     Running     0          3m55s
rook-ceph-mgr-a-dbdffd588-v2x2b           1/1     Running     0          75s
rook-ceph-mon-a-f5d5d4654-nmk6j           1/1     Running     0          2m28s
rook-ceph-mon-b-6c98476587-jq2s5          1/1     Running     0          104s
rook-ceph-mon-c-6f9f7f5bd6-8r8qw          1/1     Running     0          91s
rook-ceph-operator-765ff54667-vqj4p       1/1     Running     0          4m29s
rook-ceph-osd-0-5cf569ddf5-rw827          1/1     Running     0          28s   <== Here!
rook-ceph-osd-1-7577f777f9-vjxml          1/1     Running     0          22s   <== Also here
rook-ceph-osd-prepare-n2.k8s.test-bdw2g   0/2     Completed   0          51s
rook-ceph-osd-prepare-n3.k8s.test-26d86   0/2     Completed   0          51s
rook-discover-mblm6                       1/1     Running     0          3m55s
rook-discover-wsk2z                       1/1     Running     0          3m55s
And if we check the PVCs created by the Helm installation of GitLab:
NAME                        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
gitlab-minio                Bound    pvc-5f32d9f5-7d76-11e9-b3fe-02897c39bcfa   10Gi       RWO            rook-ceph-block   19s
gitlab-postgresql           Bound    pvc-5f342615-7d76-11e9-b3fe-02897c39bcfa   8Gi        RWO            rook-ceph-block   19s
gitlab-prometheus-server    Bound    pvc-5f34feb5-7d76-11e9-b3fe-02897c39bcfa   8Gi        RWO            rook-ceph-block   19s
gitlab-redis                Bound    pvc-5f3a0d3d-7d76-11e9-b3fe-02897c39bcfa   5Gi        RWO            rook-ceph-block   19s
repo-data-gitlab-gitaly-0   Bound    pvc-5fe63ad2-7d76-11e9-b3fe-02897c39bcfa   50Gi       RWO            rook-ceph-block   17s
The PVCs are finally provisioned!
A step further: customizing cluster.yaml
You may have noticed that we didn’t give Rook any information on how to find an appropriate device to use for storage; it autonomously detected the one we attached to the VM and used it. For obvious reasons, this usually isn’t the desired behavior: in a real-life context we could have numerous devices attached, serving other purposes than simply providing storage to Rook. It is of course possible to customize the way Rook finds and uses storage. This is defined in the storage section of the cluster.yaml manifest. Below is the default configuration:
storage:
  useAllNodes: true
  useAllDevices: true # <==
  deviceFilter:
  location:
  config:
The useAllDevices field is set to true. From the official Rook documentation, it indicates “whether all devices found on nodes in the cluster should be automatically consumed by OSDs”. The solution is to tell Rook where to look instead of letting it automatically select any available device. If we set useAllDevices to false, we can use the following fields (an example configuration follows the list):
- deviceFilter to set a regex filter, for example ^sd[a-d] to find a device whose name starts with “sd” followed by a, b, c or d,
- devices to define a list of individual devices that will be used,
- directories to set a list of directories which will be used as the cluster storage.
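As a minimal sketch, assuming the devices dedicated to Rook are named sda to sdd on the nodes (adapt the regex to your own device names), the storage section could become:
storage:
  useAllNodes: true
  useAllDevices: false     # do not consume every device automatically
  deviceFilter: ^sd[a-d]   # only consider devices matching this regex
  location:
  config: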
It is also possible to define per-node configurations by setting useAllNodes to false, but this is out of the scope of this article. If you want to learn more about storage configuration for Rook, please take a look at the documentation.
The end
Thank you for reading this article. I hope it has shed some light on the issue if you were facing the same problem!