This is the third article in our series on running the ClickHouse® database on Kubernetes with the Altinity® Kubernetes Operator. So far we have learned the concepts and built a local cluster. Now we deploy a single ClickHouse node onto it, writing every piece of Kubernetes YAML ourselves.
We do this the manual way on purpose. It is the best way to understand what a ClickHouse deployment is actually made of, and by the end you will feel the friction that the operator removes.
What we are building
A single ClickHouse server running as a Kubernetes StatefulSet, with its data on a PersistentVolumeClaim so it survives pod restarts, reachable through a Service. We will connect, create a table, run an analytical query, and then talk honestly about why this approach does not scale to a real cluster.
You should have a running local cluster from the previous article. Confirm it with kubectl get nodes.
Step 1: Create a namespace
Keeping our work in its own namespace keeps the cluster tidy:
kubectl create namespace clickhouse-manual
Step 2: Write the manifest
Recall from the first article that a database needs three things from Kubernetes: a stable identity (StatefulSet), storage that survives restarts (a PersistentVolumeClaim), and a stable address (a Service). Our manifest provides all three. Save this as clickhouse-single.yaml:
# A headless Service gives the StatefulSet pod a stable DNS name.
apiVersion: v1
kind: Service
metadata:
  name: clickhouse
  namespace: clickhouse-manual
spec:
  clusterIP: None
  selector:
    app: clickhouse
  ports:
    - name: http
      port: 8123
    - name: native
      port: 9000
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: clickhouse
  namespace: clickhouse-manual
spec:
  serviceName: clickhouse
  replicas: 1
  selector:
    matchLabels:
      app: clickhouse
  template:
    metadata:
      labels:
        app: clickhouse
    spec:
      containers:
        - name: clickhouse
          image: clickhouse/clickhouse-server:26.3
          ports:
            - name: http
              containerPort: 8123
            - name: native
              containerPort: 9000
          env:
            - name: CLICKHOUSE_USER
              value: demo
            - name: CLICKHOUSE_PASSWORD
              value: demo_password
            - name: CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT
              value: "1"
          volumeMounts:
            - name: data
              mountPath: /var/lib/clickhouse
          resources:
            requests:
              cpu: "1"
              memory: 2Gi
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 5Gi
A few things to notice. We pin the image to clickhouse/clickhouse-server:26.3, the current Long Term Support release, rather than latest, so the deployment is reproducible. The volumeClaimTemplates block is the StatefulSet feature that gives the pod its own persistent disk, mounted at ClickHouse's data directory. The official image reads the CLICKHOUSE_USER and CLICKHOUSE_PASSWORD environment variables to create a user on first start, which saves us writing a configuration file for now.
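Indentation mistakes are easy to make in a manifest of this size, so it can be worth checking the file before applying it. The following is a minimal sketch, assuming kubectl can reach the cluster from the previous article. The name validate_manifest is a hypothetical helper, not something kubectl provides.

```shell
# Hypothetical helper: check the manifest without creating anything.
# --dry-run=client builds the objects locally and prints what would be
# applied, so YAML and many schema mistakes surface before Step 3.
validate_manifest() {
  kubectl apply --dry-run=client -f "${1:-clickhouse-single.yaml}"
}
```

Calling validate_manifest should report both the Service and the StatefulSet with a "(dry run)" suffix and leave the cluster unchanged.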
Step 3: Apply it and watch it start
kubectl apply -f clickhouse-single.yaml
kubectl get pods -n clickhouse-manual -w
The -w flag watches the pod. It moves from Pending to ContainerCreating to Running over a minute or so, as Kubernetes provisions the disk and pulls the image. Press Ctrl+C once it is Running. Notice the pod is named clickhouse-0: the StatefulSet gave it that stable, predictable identity, and if it ever restarts it will come back as clickhouse-0 with the same disk.
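If you would rather not watch interactively, for example in a script, kubectl can block until the StatefulSet has rolled out. This is a minimal sketch; the helper name wait_for_clickhouse and the 180-second default timeout are assumptions chosen for illustration.

```shell
# Hypothetical helper: block until the clickhouse StatefulSet reports its
# pod ready, or fail after the timeout (default 180s, an arbitrary choice).
wait_for_clickhouse() {
  kubectl rollout status statefulset/clickhouse \
    -n clickhouse-manual --timeout="${1:-180s}"
}
```

Our manifest defines no readiness probe, so "ready" here only means the container has started, not that ClickHouse is already accepting queries. The first connection may still need a short retry.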
Confirm the storage was provisioned automatically:
kubectl get pvc -n clickhouse-manual
You will see a bound claim named data-clickhouse-0, the disk tied to this pod.
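To see the link between the claim and the underlying volume more directly, you can ask kubectl for just those two fields. This is a small sketch; pvc_status is a hypothetical helper name.

```shell
# Hypothetical helper: print the claim's phase and the name of the
# PersistentVolume bound to it, separated by a space.
pvc_status() {
  kubectl get pvc data-clickhouse-0 -n clickhouse-manual \
    -o jsonpath='{.status.phase}{" "}{.spec.volumeName}{"\n"}'
}
```

The phase should read Bound. The volume name is generated by whichever storage provisioner your local cluster uses, so it will differ from one machine to the next.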
Step 4: Connect and run a query
Open a ClickHouse client inside the pod:
kubectl exec -it clickhouse-0 -n clickhouse-manual -- \
  clickhouse-client -u demo --password demo_password
You are now at a ClickHouse prompt. Create a small table and load demo data generated on the fly, so this works even with no internet access from the pod:
CREATE TABLE trips
(
    id UInt64,
    city String,
    fare Float64
)
ENGINE = MergeTree
ORDER BY id;
INSERT INTO trips
SELECT number,
       ['Chennai', 'Paris', 'Tokyo', 'New York'][number % 4 + 1],
       round(10 + rand() % 90 + rand() / 4294967295, 2)
FROM numbers(100000);
Now run the kind of analytical query ClickHouse is built for:
SELECT city, count() AS trips, round(avg(fare), 2) AS avg_fare
FROM trips
GROUP BY city
ORDER BY trips DESC;
You scanned a hundred thousand rows and aggregated them in a blink. Type exit to leave the client. Your single-node ClickHouse is working.
Step 5: Reach it from your machine
To query from your laptop rather than from inside the pod, forward the HTTP port:
kubectl port-forward -n clickhouse-manual svc/clickhouse 8123:8123
Then in another terminal:
curl 'http://localhost:8123/?user=demo&password=demo_password' \
  --data-binary 'SELECT count() FROM trips'
It returns the row count. You now have a real, queryable ClickHouse server on Kubernetes.
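Putting the password in the URL is convenient for a demo, but URLs tend to end up in logs. The ClickHouse HTTP interface also accepts credentials in the X-ClickHouse-User and X-ClickHouse-Key headers. Here is a minimal sketch, assuming the port-forward is still running in the other terminal; query_clickhouse is a hypothetical helper name.

```shell
# Hypothetical helper: send one SQL statement to the forwarded HTTP port,
# authenticating with headers instead of URL parameters.
query_clickhouse() {
  curl -sS 'http://localhost:8123/' \
    -H 'X-ClickHouse-User: demo' \
    -H 'X-ClickHouse-Key: demo_password' \
    --data-binary "$1"
}
```

For example, query_clickhouse 'SELECT count() FROM trips' should return the same row count as the command above.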
Step 6: Prove the data survives a restart
Delete the pod and watch Kubernetes rebuild it:
kubectl delete pod clickhouse-0 -n clickhouse-manual
kubectl get pods -n clickhouse-manual -w
A new clickhouse-0 appears. Connect again and run SELECT count() FROM trips: your rows are still there, because the StatefulSet reattached the same PersistentVolumeClaim. This is the payoff of doing storage properly.
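For a quick before-and-after comparison you do not need an interactive prompt: clickhouse-client can run a single statement passed with --query. This is a minimal sketch; count_trips is a hypothetical helper name.

```shell
# Hypothetical helper: run one query in the pod without opening a prompt.
count_trips() {
  kubectl exec clickhouse-0 -n clickhouse-manual -- \
    clickhouse-client -u demo --password demo_password \
    --query 'SELECT count() FROM trips'
}
```

Calling count_trips before deleting the pod and again once the new pod is Running should print the same number, 100000 if you ran the INSERT from Step 4 once.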
Why this does not scale: the case for an operator
Everything above is fine for a single node. Now imagine what a real production deployment needs, and how much of it you would have to do by hand.
For replication, ClickHouse needs a coordination service called ClickHouse Keeper, plus carefully written configuration that tells each server about its peers, plus ReplicatedMergeTree tables wired with the correct cluster macros. None of that is in our manifest.
For sharding, you would hand-write a remote_servers configuration describing every shard and replica, and keep it in sync every time the topology changes.
For users, settings, profiles, and quotas, you would maintain a pile of XML configuration files mounted into the pods.
For scaling, you would edit the StatefulSet and the cluster configuration together and hope you kept them consistent.
For upgrades, you would manage a careful rolling restart yourself to avoid downtime.
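The scaling point is easy to see for yourself. The sketch below adds a second pod to the StatefulSet and then asks the new pod whether the trips table exists. It assumes your local cluster has room for another pod requesting 1 CPU and 2Gi of memory; scale_and_check is a hypothetical helper name and the 180-second timeout is an arbitrary choice.

```shell
# Hypothetical helper: scale to two pods, wait for the rollout, then look
# for the trips table on the new pod (clickhouse-1).
scale_and_check() {
  kubectl scale statefulset/clickhouse --replicas=2 -n clickhouse-manual &&
  kubectl rollout status statefulset/clickhouse \
    -n clickhouse-manual --timeout=180s &&
  kubectl exec clickhouse-1 -n clickhouse-manual -- \
    clickhouse-client -u demo --password demo_password \
    --query 'EXISTS TABLE trips'
}
```

It should print 0: clickhouse-1 starts with its own empty volume and knows nothing about clickhouse-0. Kubernetes gave us a second independent server, not a two-node cluster. If the client cannot connect yet, wait a few seconds and rerun the last command.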
Doing all of this correctly, and keeping it correct as the cluster grows and changes, is a real job. This is precisely the problem the Altinity Kubernetes Operator solves. You describe the cluster you want in one concise resource, and the operator generates and maintains all of the StatefulSets, Services, volumes, configuration, and coordination for you.
Clean up
kubectl delete namespace clickhouse-manual
This removes the pod, Service, and claim in one step.
What is next
You have run ClickHouse on Kubernetes the hard way, and you have seen its limits. In the next article we meet the Altinity Kubernetes Operator, learn why it exists and where it came from, install it, and deploy our first operator-managed ClickHouse cluster with a fraction of the YAML.



