This is the fifth article in our series on running the ClickHouse® database on Kubernetes with the Altinity® Kubernetes Operator. We have installed the operator and run a first cluster. That cluster used whatever default storage the operator picked, which is fine for a demo and dangerous for real data. This article makes storage explicit and durable.
Why storage is the part you must get right
A ClickHouse pod can be deleted and recreated at any time. If its data lived on the pod, it would vanish on every restart. As we saw earlier, Kubernetes solves this by keeping data on a PersistentVolume that outlives the pod, requested through a PersistentVolumeClaim. The operator gives you a clean way to describe these claims inside the CHI, so you never write a StatefulSet by hand.
StorageClasses: where volumes come from
A StorageClass describes a kind of storage your cluster can create on demand. When a claim asks for storage and names a StorageClass, the cluster provisions a matching volume automatically. Most clusters have a default StorageClass, so a claim that names none still gets a disk. List what your cluster offers:
kubectl get storageclass
On minikube you will see a class named standard marked (default). On a cloud cluster you will see classes backed by the provider's disks. You can let ClickHouse use the default, or name a specific class for faster or encrypted disks.
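If you want the default class name in a script rather than reading it by eye, the listing can be filtered. This is a minimal sketch, not part of the original walkthrough: the function name default_storage_class is hypothetical, and it assumes kubectl renders a default class as its name followed by (default), as described above.

```shell
# Print the name of every StorageClass that kubectl marks as "(default)".
# Assumption: default classes appear as "<name> (default)" in the first two columns.
default_storage_class() {
  kubectl get storageclass --no-headers | awk '$2 == "(default)" { print $1 }'
}
```

Calling default_storage_class on minikube should print standard. If it prints nothing, the cluster has no default class and every claim must name one explicitly.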
Adding a data volume to a CHI
The operator uses two pieces that work together: a volumeClaimTemplate under templates that describes the disk, and a reference to it under defaults.templates.dataVolumeClaimTemplate. Here is a single node with an explicit 10 gigabyte data volume. Save it as ch-storage.yaml:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "storage-demo"
spec:
  defaults:
    templates:
      # Reference the pod template below so that its pinned image is actually used
      podTemplate: clickhouse-pod
      dataVolumeClaimTemplate: data-volume
  configuration:
    clusters:
      - name: "main"
        layout:
          shardsCount: 1
          replicasCount: 1
  templates:
    podTemplates:
      - name: clickhouse-pod
        spec:
          containers:
            - name: clickhouse
              image: clickhouse/clickhouse-server:26.3
    volumeClaimTemplates:
      - name: data-volume
        spec:
          # Omit storageClassName to use the cluster default,
          # or set it explicitly, for example: storageClassName: standard
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 10Gi
The data-volume template requests a 10 GiB ReadWriteOnce volume. ReadWriteOnce means the volume can be mounted read-write by a single node, which is exactly what a ClickHouse server needs. Because we did not set storageClassName, the cluster's default class provisions it. Apply it:
kubectl create namespace ch
kubectl apply -n ch -f ch-storage.yaml
Then confirm the claim was bound to a real volume:
kubectl get pvc -n ch
NAME                              STATUS   VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS
data-volume-chi-storage-demo...   Bound    pvc-...   10Gi       RWO            standard
The operator created the claim, the cluster provisioned the disk, and ClickHouse now stores its data on storage that survives pod restarts.
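The same check can be scripted so that a claim stuck in Pending is easy to spot. The helper below is a sketch under stated assumptions: unbound_pvcs is a hypothetical name, and it relies on STATUS being the second column of the kubectl get pvc table.

```shell
# List claims in a namespace that are not Bound (prints nothing when all are bound).
# $1 = namespace. Assumption: STATUS is the second column of `kubectl get pvc`.
unbound_pvcs() {
  kubectl get pvc -n "$1" --no-headers | awk '$2 != "Bound" { print $1 }'
}
```

Run it as unbound_pvcs ch. For any claim it reports, kubectl describe pvc <name> -n ch shows the events that explain why provisioning has not finished.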
Separating data and log volumes
ClickHouse writes its table data to /var/lib/clickhouse and its logs to /var/log/clickhouse-server. On a busy server it is good practice to keep logs on their own smaller volume so log growth can never fill the data disk. The operator supports this with a second template and a second reference:
spec:
  defaults:
    templates:
      dataVolumeClaimTemplate: data-volume
      logVolumeClaimTemplate: log-volume
  templates:
    volumeClaimTemplates:
      - name: data-volume
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi
      - name: log-volume
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 1Gi
Now data and logs live on independent disks, each sized for its job.
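One way to confirm the split is to ask the running pod which filesystems back the two directories. This is a hedged sketch: show_clickhouse_mounts is a hypothetical helper, the pod name must be looked up first with kubectl get pods, and it assumes the df utility is available in the server image.

```shell
# Show the filesystems backing the ClickHouse data and log directories in a pod.
# $1 = namespace, $2 = pod name (look it up with: kubectl get pods -n <namespace>)
show_clickhouse_mounts() {
  kubectl exec -n "$1" "$2" -- df -h /var/lib/clickhouse /var/log/clickhouse-server
}
```

With separate volumes in place, the two paths should be reported as different filesystems with sizes close to the 10Gi and 1Gi requests.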
Reclaim policy: what happens when you delete the cluster
Every StorageClass has a reclaim policy that decides the fate of a volume when its claim is removed. The common default is Delete, which destroys the underlying disk once the claim is gone, for example when you delete the CHI. That is convenient for learning and risky for production, where you may prefer Retain so the data sticks around even if the cluster object is deleted by mistake. Check your class with kubectl get storageclass <name> -o yaml and look at reclaimPolicy. Choose deliberately: this setting decides whether a mistaken delete is recoverable.
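The policy can be read for all classes at once, and changed on a volume that already exists. The sketch below uses hypothetical function names. It relies on two standard Kubernetes behaviors: a class's reclaimPolicy is copied to a volume when the volume is provisioned, and an existing volume carries its own persistentVolumeReclaimPolicy field that can be patched.

```shell
# Print each StorageClass with its reclaim policy and expansion setting.
storage_class_policies() {
  kubectl get storageclass \
    -o custom-columns='NAME:.metadata.name,RECLAIM:.reclaimPolicy,EXPANSION:.allowVolumeExpansion'
}

# Switch one existing PersistentVolume to Retain.
# $1 = PersistentVolume name, as listed by `kubectl get pv`.
retain_volume() {
  kubectl patch pv "$1" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
}
```

Patching matters because changing the class affects only volumes created afterwards. A disk provisioned under Delete keeps that policy until its PersistentVolume is patched.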
Growing a volume without downtime
Data grows, and eventually 10 gigabytes is not enough. If your StorageClass allows it, you can expand a volume in place. This takes two things. First, the StorageClass must permit expansion, which is set with allowVolumeExpansion: true on the class. Second, you must increase the requested size in your CHI and reapply it.
Here is a StorageClass that allows expansion, as an example of the property to look for:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable
provisioner: kubernetes.io/your-provisioner
allowVolumeExpansion: true
reclaimPolicy: Retain
With an expandable class in use, raise the storage request, for example from 10Gi to 50Gi, in your volumeClaimTemplates, and reapply the CHI. The operator updates the claim and the volume grows without recreating the pod, so the database keeps serving queries throughout. Always confirm your provider supports online expansion before relying on this.
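To see whether a resize has finished, compare what each claim requests with what its volume currently reports. This is a sketch with a hypothetical function name. The two field paths are standard PersistentVolumeClaim fields, and how quickly they converge depends on your storage provider.

```shell
# Compare the size each claim requests with the size its volume currently reports.
# $1 = namespace. The two columns match once the expansion has completed.
pvc_sizes() {
  kubectl get pvc -n "$1" \
    -o custom-columns='NAME:.metadata.name,REQUESTED:.spec.resources.requests.storage,ACTUAL:.status.capacity.storage'
}
```

Run pvc_sizes ch after reapplying the CHI. While REQUESTED is larger than ACTUAL, the expansion is still in progress or has been refused, and kubectl describe pvc on that claim shows the reason.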
A note on choosing sizes and classes
For learning on minikube, the default standard class and small sizes are perfect. For production, pick a StorageClass backed by fast SSD or NVMe disks, size the data volume for your expected dataset plus headroom for background merges, give logs a modest separate volume, and set the reclaim policy to protect your data. We return to production storage choices, including tiered storage to object stores like S3, later in the series.
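The sizing advice comes down to simple arithmetic: expected dataset plus a percentage of headroom, rounded up. The helper below is an illustrative sketch only. The name data_volume_gib is hypothetical, and the headroom percentage is a planning input you choose yourself, not a figure recommended by this series.

```shell
# Suggest a data volume size in GiB: dataset size plus headroom, rounded up.
# $1 = expected dataset size in GiB (integer)
# $2 = headroom in percent (integer; an assumed planning figure, not a recommendation)
data_volume_gib() {
  echo $(( ($1 * (100 + $2) + 99) / 100 ))
}
```

For example, data_volume_gib 40 30 prints 52, which you would then write as storage: 52Gi in the data volume template.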
Clean up
kubectl delete namespace ch
What is next
Your ClickHouse data is now durable. In the next article we configure the database itself through the operator: users, profiles, quotas, server settings, and storing passwords safely in Kubernetes Secrets instead of plain text.



