Kubernetes Storage – Part 2 – GlusterFS complete tutorial
If you are interested in Kubernetes storage, this series of articles is for you. In this part, I explain how to set up and run GlusterFS and use it in Kubernetes.
Follow our social media:
https://www.linkedin.com/in/ssbostan
https://www.linkedin.com/company/kubedemy
https://www.youtube.com/@kubedemy
Resource Requirements:
- A running Kubernetes cluster. 1.18+ is suggested.
- Three storage nodes to run GlusterFS storage with replication.
I will run the steps on Ubuntu-based systems, and I suggest you do the same.
Tip: k3s does not support GlusterFS volumes.
A three-node Kubernetes cluster:
Node1
OS: Ubuntu 20.04
Kubernetes: 1.20.7 (installed via kubespray)
FQDN: node001.b9tcluster.local
IP Address: 192.168.12.4
Node2
OS: Ubuntu 20.04
Kubernetes: 1.20.7 (installed via kubespray)
FQDN: node002.b9tcluster.local
IP Address: 192.168.12.5
Node3
OS: Ubuntu 20.04
Kubernetes: 1.20.7 (installed via kubespray)
FQDN: node003.b9tcluster.local
IP Address: 192.168.12.6
Three nodes for GlusterFS:
Storage1
OS: Ubuntu 20.04
FQDN: node004.b9tcluster.local
IP Address: 192.168.12.7
Storage: /dev/sdb1 mounted on /gluster/volume
Storage: /dev/sdb2 mounted on /gluster/heketi (Heketi)
Role: Replica
Storage2
OS: Ubuntu 20.04
FQDN: node005.b9tcluster.local
IP Address: 192.168.12.8
Storage: /dev/sdb1 mounted on /gluster/volume
Storage: /dev/sdb2 mounted on /gluster/heketi (Heketi)
Role: Replica
Storage3
OS: Ubuntu 20.04
FQDN: node006.b9tcluster.local
IP Address: 192.168.12.9
Storage: /dev/sdb1 mounted on /gluster/volume
Storage: /dev/sdb2 mounted on /gluster/heketi (Heketi)
Role: Arbiter
In the above configuration, the arbiter node does not replicate file data; it stores only file metadata. It is used to prevent storage split-brain when only two data replicas are available.
1- Up and Run GlusterFS cluster:
To install and configure GlusterFS, follow these steps:
# Install GlusterFS server on all STORAGE nodes.
apt install -y glusterfs-server
systemctl enable --now glusterd.service
Set up a Gluster volume with two replicas and one arbiter:
# Only on Node1 of STORAGE nodes.
gluster peer probe node005.b9tcluster.local
gluster peer probe node006.b9tcluster.local
gluster volume create k8s-volume replica 2 arbiter 1 transport tcp \
node004:/gluster/volume \
node005:/gluster/volume \
node006:/gluster/volume
gluster volume start k8s-volume
To view the volume information, run the following command:
gluster volume info k8s-volume
Volume Name: k8s-volume
Type: Replicate
Volume ID: dd5aac80-b160-4281-9b22-00ae95f4bc0c
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: node004:/gluster/volume
Brick2: node005:/gluster/volume
Brick3: node006:/gluster/volume (arbiter)
Options Reconfigured:
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
performance.client-io-threads: off
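Before moving on, you can optionally sanity-check the volume by mounting it manually on one of the storage nodes. This is a hedged sketch; the mount point /mnt/k8s-volume is just an example:
# Run on any storage node (requires the GlusterFS FUSE client).
mkdir -p /mnt/k8s-volume
mount -t glusterfs node004.b9tcluster.local:/k8s-volume /mnt/k8s-volume
echo "probe" > /mnt/k8s-volume/probe.txt   # the file should be replicated to the data bricks
ls -l /gluster/volume                      # check the brick path on node004/node005
umount /mnt/k8s-volume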
2- Prepare Kubernetes worker nodes:
To enable Kubernetes workers to connect to and use the GlusterFS volume, you need to install glusterfs-client on all WORKER nodes.
apt install -y glusterfs-client
Important tip! Each storage solution may need client packages to connect to the storage server. You should install them on all Kubernetes worker nodes.
For GlusterFS, the glusterfs-client package is required.
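A quick, hedged way to confirm a worker node is ready (the exact checks are an assumption; adjust them for your distribution):
# Verify the client package is installed and the FUSE module is available.
dpkg -s glusterfs-client | grep Status
lsmod | grep fuse || modprobe fuse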
3- Discovering GlusterFS in Kubernetes:
The GlusterFS cluster must be discoverable from the Kubernetes cluster. To do that, you need to add an Endpoints object that points to the servers of the GlusterFS cluster.
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster
  labels:
    storage.k8s.io/name: glusterfs
    storage.k8s.io/part-of: kubernetes-complete-reference
    storage.k8s.io/created-by: ssbostan
subsets:
  - addresses:
      - ip: 192.168.12.7
        hostname: node004
      - ip: 192.168.12.8
        hostname: node005
      - ip: 192.168.12.9
        hostname: node006
    ports:
      - port: 1
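Save the manifest and apply it with kubectl apply -f. The upstream Kubernetes GlusterFS example also creates a Service with the same name so the manually defined Endpoints persist; a minimal optional sketch:
apiVersion: v1
kind: Service
metadata:
  name: glusterfs-cluster
  labels:
    storage.k8s.io/name: glusterfs
spec:
  ports:
    - port: 1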
4- Using GlusterFS in Kubernetes:
Method 1 – Connecting to GlusterFS directly with Pod manifest:
To connect to the GlusterFS volume directly with the Pod manifest, use the GlusterfsVolumeSource in the PodSpec. Here is an example:
apiVersion: v1
kind: Pod
metadata:
  name: test
  labels:
    app.kubernetes.io/name: alpine
    app.kubernetes.io/part-of: kubernetes-complete-reference
    app.kubernetes.io/created-by: ssbostan
spec:
  containers:
    - name: alpine
      image: alpine:latest
      command:
        - touch
        - /data/test
      volumeMounts:
        - name: glusterfs-volume
          mountPath: /data
  volumes:
    - name: glusterfs-volume
      glusterfs:
        endpoints: glusterfs-cluster
        path: k8s-volume
        readOnly: false
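Apply the manifest and then check a storage node to see that the file landed on the data bricks. A hedged example (the file name test-pod.yaml is an assumption; the container only runs touch, so the Pod does not stay running):
kubectl apply -f test-pod.yaml
kubectl get pod test                 # the container exits as soon as touch finishes
ls /gluster/volume                   # run on node004 or node005: the "test" file should exist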
Method 2 – Connecting using the PersistentVolume resource:
Use the following manifest to create the PersistentVolume object for the GlusterFS volume. The storage size has no effect here because GlusterFS does not enforce it; it is only used to match claims against the volume.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: glusterfs-volume
  labels:
    storage.k8s.io/name: glusterfs
    storage.k8s.io/part-of: kubernetes-complete-reference
    storage.k8s.io/created-by: ssbostan
spec:
  accessModes:
    - ReadWriteOnce
    - ReadOnlyMany
    - ReadWriteMany
  capacity:
    storage: 10Gi
  storageClassName: ""
  persistentVolumeReclaimPolicy: Retain # Recycle is only supported for NFS and HostPath volumes.
  volumeMode: Filesystem
  glusterfs:
    endpoints: glusterfs-cluster
    path: k8s-volume
    readOnly: false
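To use this PersistentVolume from a workload, you still need a PersistentVolumeClaim that binds to it. A minimal hedged sketch (the claim name glusterfs-claim is an assumption; storageClassName is left empty to match the PV above):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: glusterfs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: glusterfs-volume   # bind explicitly to the PV defined above
  resources:
    requests:
      storage: 10Gi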
Method 3 – Dynamic provisioning using StorageClass:
It’s time for the most challenging part: using GlusterFS with a Kubernetes StorageClass to achieve dynamic storage provisioning. In addition to the GlusterFS cluster, this method needs Heketi. Heketi provides RESTful volume management for GlusterFS, and Kubernetes uses it to create volumes dynamically.
Requirements:
- A running GlusterFS to store the Heketi database.
- Available raw storage devices on GlusterFS cluster nodes.
- SSH key to connect Heketi to GlusterFS nodes.
Architecture and Scenario:
To create volumes, the Heketi instance deployed in Kubernetes must be able to reach the GlusterFS nodes. This communication happens over SSH, so we need a private key that is authorized on the GlusterFS nodes. Heketi also needs to know which raw devices are available for creating partitions and GlusterFS bricks: a topology file lists the cluster nodes and their available raw storage. Heketi keeps its own database of everything it creates; to protect that data, I store it on the existing GlusterFS cluster we deployed in the previous sections.
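For example, a dedicated key pair could be prepared like this (a hedged sketch; the file name heketi-ssh-key matches the assumption listed below, and root SSH access to the storage nodes is assumed):
# Generate a key pair without a passphrase for Heketi.
ssh-keygen -t rsa -b 4096 -N "" -f heketi-ssh-key
# Authorize the public key on every GlusterFS node.
ssh-copy-id -i heketi-ssh-key.pub root@node004.b9tcluster.local
ssh-copy-id -i heketi-ssh-key.pub root@node005.b9tcluster.local
ssh-copy-id -i heketi-ssh-key.pub root@node006.b9tcluster.local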
Let’s assume some existing materials:
- The private key for connecting to GlusterFS nodes via SSH: heketi-ssh-key
- Available raw devices on all GlusterFS nodes: /dev/sdc (100GB)
3.1: Create a GlusterFS volume to store the Heketi database:
gluster volume create heketi-db-volume replica 3 transport tcp \
node004:/gluster/heketi \
node005:/gluster/heketi \
node006:/gluster/heketi
gluster volume start heketi-db-volume
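Before continuing, you can confirm the new volume and its bricks are online (a quick, hedged check on a storage node):
gluster volume status heketi-db-volume   # all bricks should be listed as online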
3.2: Create a Kubernetes Secret to store the SSH private key:
kubectl create secret generic heketi-ssh-key-file \
--from-file=heketi-ssh-key
3.3: Create the Heketi “heketi.json” config file:
{
  "_port_comment": "Heketi Server Port Number",
  "port": "8080",
  "_use_auth": "Enable JWT authorization.",
  "use_auth": true,
  "_jwt": "Private keys for access",
  "jwt": {
    "_admin": "Admin has access to all APIs",
    "admin": {
      "key": "ADMIN-HARD-SECRET"
    }
  },
  "_glusterfs_comment": "GlusterFS Configuration",
  "glusterfs": {
    "executor": "ssh",
    "_sshexec_comment": "SSH username and private key file",
    "sshexec": {
      "keyfile": "/heketi/heketi-ssh-key",
      "user": "root",
      "port": "22"
    },
    "_db_comment": "Database file name",
    "db": "/var/lib/heketi/heketi.db",
    "loglevel": "debug"
  }
}
3.4: Create the cluster “topology.json” file:
More than one raw device can be listed per node; Heketi knows how to manage them. Heketi can also manage several clusters simultaneously.
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": [
                "node004.b9tcluster.local"
              ],
              "storage": [
                "192.168.12.7"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdc"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "node005.b9tcluster.local"
              ],
              "storage": [
                "192.168.12.8"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdc"
          ]
        },
        {
          "node": {
            "hostnames": {
              "manage": [
                "node006.b9tcluster.local"
              ],
              "storage": [
                "192.168.12.9"
              ]
            },
            "zone": 1
          },
          "devices": [
            "/dev/sdc"
          ]
        }
      ]
    }
  ]
}
3.5: Create a Kubernetes ConfigMap for Heketi config and topology:
kubectl create configmap heketi-config \
--from-file=heketi.json \
--from-file=topology.json
3.6: Up and run Heketi in Kubernetes:
apiVersion: v1
kind: Service
metadata:
  name: heketi
  labels:
    app.kubernetes.io/name: heketi
    app.kubernetes.io/part-of: glusterfs
    app.kubernetes.io/origin: kubernetes-complete-reference
    app.kubernetes.io/created-by: ssbostan
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: heketi
    app.kubernetes.io/part-of: glusterfs
    app.kubernetes.io/origin: kubernetes-complete-reference
    app.kubernetes.io/created-by: ssbostan
  ports:
    - port: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: heketi
  labels:
    app.kubernetes.io/name: heketi
    app.kubernetes.io/part-of: glusterfs
    app.kubernetes.io/origin: kubernetes-complete-reference
    app.kubernetes.io/created-by: ssbostan
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: heketi
      app.kubernetes.io/part-of: glusterfs
      app.kubernetes.io/origin: kubernetes-complete-reference
      app.kubernetes.io/created-by: ssbostan
  template:
    metadata:
      labels:
        app.kubernetes.io/name: heketi
        app.kubernetes.io/part-of: glusterfs
        app.kubernetes.io/origin: kubernetes-complete-reference
        app.kubernetes.io/created-by: ssbostan
    spec:
      containers:
        - name: heketi
          image: heketi/heketi:10
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: ssh-key-file
              mountPath: /heketi
            - name: config
              mountPath: /etc/heketi
            - name: data
              mountPath: /var/lib/heketi
      volumes:
        - name: ssh-key-file
          secret:
            secretName: heketi-ssh-key-file
        - name: config
          configMap:
            name: heketi-config
        - name: data
          glusterfs:
            endpoints: glusterfs-cluster
            path: heketi-db-volume
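Apply the Service and Deployment and wait for the Pod to become ready. A hedged example (the file name heketi.yaml is an assumption):
kubectl apply -f heketi.yaml
kubectl rollout status deployment/heketi
kubectl get pods -l app.kubernetes.io/name=heketi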
3.7: Load the cluster topology into Heketi:
kubectl exec POD-NAME -- heketi-cli \
--user admin \
--secret ADMIN-HARD-SECRET \
topology load --json /etc/heketi/topology.json
Replace POD-NAME with the name of your Heketi pod.
If everything goes well, you should be able to get the cluster id from Heketi.
kubectl exec POD-NAME -- heketi-cli \
--user admin \
--secret ADMIN-HARD-SECRET \
cluster list
Clusters:
Id:c63d60ee0ddf415097f4eb82d69f4e48 [file][block]
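You can also inspect the full topology Heketi registered, including nodes, devices, and bricks (hedged; replace POD-NAME as before):
kubectl exec POD-NAME -- heketi-cli \
--user admin \
--secret ADMIN-HARD-SECRET \
topology info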
3.8: Get Heketi NodePort info:
kubectl get svc
heketi NodePort 10.233.29.206 <none> 8080:31310/TCP 41d
3.9: Create a Secret for the Heketi “admin” user:
kubectl create secret generic heketi-admin-secret \
--type=kubernetes.io/glusterfs \
--from-literal=key=ADMIN-HARD-SECRET
3.10: Create StorageClass for GlusterFS dynamic provisioning:
Replace resturl (the Heketi endpoint, here the NodePort from the previous step), clusterid, and the Secret details with your own values.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: glusterfs
  labels:
    storage.k8s.io/name: glusterfs
    storage.k8s.io/provisioner: heketi
    storage.k8s.io/origin: kubernetes-complete-reference
    storage.k8s.io/created-by: ssbostan
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: http://127.0.0.1:31310
  clusterid: c63d60ee0ddf415097f4eb82d69f4e48
  restauthenabled: !!str true
  restuser: admin
  secretNamespace: default
  secretName: heketi-admin-secret
  volumetype: replicate:3
3.11: Create PersistentVolumeClaim to test dynamic provisioning:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: testvol
  labels:
    storage.k8s.io/name: glusterfs
    storage.k8s.io/provisioner: heketi
    storage.k8s.io/origin: kubernetes-complete-reference
    storage.k8s.io/created-by: ssbostan
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: glusterfs
  resources:
    requests:
      storage: 10Gi # The storage size takes effect here.
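After applying the claim, verify that a volume is provisioned dynamically and the claim becomes Bound. A hedged check (the file name pvc.yaml is an assumption):
kubectl apply -f pvc.yaml
kubectl get pvc testvol      # STATUS should change to Bound
kubectl get pv               # a dynamically provisioned PV backed by GlusterFS should appear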
5- Kubernetes and GlusterFS storage specification:
Before using GlusterFS, please consider the following tips:
- GlusterFS supports the ReadWriteOnce, ReadOnlyMany, and ReadWriteMany access modes.
- GlusterFS volumes can be isolated with partitions and bricks.
- GlusterFS can also be deployed on Kubernetes like native storage.
If you like this series of articles, please share it and leave your thoughts in the comments. Your feedback encourages me to complete this extensively planned program.
Follow my LinkedIn https://www.linkedin.com/in/ssbostan
Follow Kubedemy LinkedIn https://www.linkedin.com/company/kubedemy
Follow Kubedemy Telegram https://telegram.me/kubedemy