Tiered Storage for ClickHouse® on Kubernetes: Hot Disks and S3 Cold Storage

June 16, 2026 | 5 min read | Gayathri

This is the fifteenth article in our series on running the ClickHouse® database on Kubernetes with the Altinity® Kubernetes Operator. By this point we have a production cluster running. This article tackles cost: as data grows, keeping everything on fast block storage gets expensive. Tiered storage keeps recent, frequently queried data on fast disks and moves older, rarely touched data to cheap object storage such as Amazon S3.

The idea behind tiered storage

Most analytical queries touch recent data far more than old data. So you do not need every row on your fastest, most expensive disk. Tiered storage splits storage into a hot tier of fast local volumes and a cold tier backed by object storage, then automatically moves data from hot to cold as it ages. You keep fast queries on recent data and pay object-storage prices for the long tail. ClickHouse supports this natively through storage policies, and the operator lets you configure them.

How ClickHouse models storage

Three concepts combine here. A disk is a place to store data, such as a local volume or an S3 bucket. A volume groups one or more disks. A storage policy defines an ordered set of volumes, for example a hot volume followed by a cold volume, and each table is assigned one policy. When a table's policy has both hot and cold volumes, ClickHouse can move data parts between them based on rules you set.
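To see this model on a running server, you can query the system.storage_policies table, which returns one row per volume of each policy. This is a minimal sketch for inspection only. A server with no custom storage configuration should list just the built-in default policy.

```sql
-- One row per (policy, volume); `disks` lists the disks grouped in that volume.
-- volume_priority reflects the order of volumes inside the policy.
SELECT
    policy_name,
    volume_name,
    volume_priority,
    disks,
    move_factor
FROM system.storage_policies
ORDER BY policy_name, volume_priority;
```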

Configuring an S3 disk and a tiered policy

You supply this configuration through the operator's files block, which mounts it into the ClickHouse pods. The configuration defines an S3 disk, an optional local cache in front of it to speed up repeated reads, and a policy with a hot local volume and a cold S3 volume. Here is the storage configuration:

spec:
  configuration:
    files:
      config.d/storage.xml: |
        <clickhouse>
          <storage_configuration>
            <disks>
              <s3_disk>
                <type>s3</type>
                <endpoint>https://your-bucket.s3.us-east-1.amazonaws.com/clickhouse/</endpoint>
                <!-- Prefer instance/role credentials over inline keys. -->
                <use_environment_credentials>true</use_environment_credentials>
                <metadata_path>/var/lib/clickhouse/disks/s3_disk/</metadata_path>
              </s3_disk>
              <s3_cache>
                <type>cache</type>
                <disk>s3_disk</disk>
                <path>/var/lib/clickhouse/disks/s3_cache/</path>
                <max_size>50Gi</max_size>
              </s3_cache>
            </disks>
            <policies>
              <hot_cold>
                <volumes>
                  <hot>
                    <disk>default</disk>
                  </hot>
                  <cold>
                    <disk>s3_cache</disk>
                  </cold>
                </volumes>
              </hot_cold>
            </policies>
          </storage_configuration>
        </clickhouse>

The default disk is the local volume your pod already has. The s3_disk points at your bucket, and s3_cache wraps it with a local read cache so repeated reads of cold data do not always hit S3. The hot_cold policy lists the hot local volume first, so new data is written there, and the cold volume second. The cold volume references s3_cache rather than s3_disk so that reads of cold data go through the cache.
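Once the pods have restarted with this configuration, a quick check is to confirm that ClickHouse registered the disks. The sketch below assumes the disk names used in the configuration above. The exact value in the type column can differ between ClickHouse versions, so treat it as informational.

```sql
-- Expect three rows if the storage configuration was loaded:
-- the local default disk, the S3 disk, and the cache that wraps it.
SELECT
    name,
    type,
    path
FROM system.disks
WHERE name IN ('default', 's3_disk', 's3_cache')
ORDER BY name;
```

If s3_disk or s3_cache is missing from the result, the server most likely did not load the file, and the ClickHouse server log is the place to look next.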

A note on credentials

Never put S3 access keys in plain text in your manifest. The cleanest approach on AWS is to give the pods an IAM role, for example with IAM Roles for Service Accounts on EKS, and set use_environment_credentials so ClickHouse picks up the role automatically, as shown above. If you must use static keys, store them in a Kubernetes Secret, expose them to the container as environment variables through the pod template, and let use_environment_credentials read them, rather than writing the keys into the configuration file. This keeps your storage credentials out of version control, consistent with the security practices from earlier in the series.

Using the policy on a table

With the policy defined, you assign it to a table and use a TTL rule to tell ClickHouse when to move data to the cold tier. Create a table that uses the hot_cold policy and add a move rule:

CREATE TABLE events_tiered
(
    event_time DateTime,
    user_id    UInt64,
    action     String
)
ENGINE = MergeTree
ORDER BY (event_time, user_id)
TTL event_time + INTERVAL 30 DAY TO VOLUME 'cold'
SETTINGS storage_policy = 'hot_cold';

New data for this table lands on the hot local volume. TTL moves work on whole data parts: once every row in a part has an event_time older than 30 days, ClickHouse moves that part to the cold S3 volume in the background, so a part that still mixes old and recent rows stays hot until its newest row ages out. Queries on recent data stay fast on local disk, old data lives cheaply in object storage, and queries that reach back in time still work transparently, just a little slower for the cold parts.
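Because a TTL move applies to a whole data part, and only after all rows in that part have passed the TTL expression, it can help to partition by time so that parts never span long date ranges. The following is a sketch of that variant, not a required setup. The table name events_tiered_by_month and the monthly partition key are assumptions made for illustration.

```sql
-- Hypothetical variant of events_tiered, partitioned by month.
-- Merges happen only within a partition, so no part spans more than one month.
CREATE TABLE events_tiered_by_month
(
    event_time DateTime,
    user_id    UInt64,
    action     String
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_time)
ORDER BY (event_time, user_id)
TTL event_time + INTERVAL 30 DAY TO VOLUME 'cold'
SETTINGS storage_policy = 'hot_cold';

-- Optional: move one month to the cold volume by hand instead of waiting
-- for the background TTL move. 202601 is an example partition ID.
ALTER TABLE events_tiered_by_month MOVE PARTITION 202601 TO VOLUME 'cold';
```

Monthly partitions are a common choice for data of this kind, but the right granularity depends on your data volume and query patterns.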

Watching data move between tiers

You can see which disk each data part lives on through the system.parts table:

SELECT table, partition, name, disk_name, rows
FROM system.parts
WHERE table = 'events_tiered' AND active
ORDER BY modification_time;

The disk_name column shows default for hot parts and the S3-backed disk from the cold volume (s3_cache in the configuration above) for parts that have aged into the cold tier. Watching this is the easiest way to confirm your TTL move rule is working.
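For a table with many parts, a per-disk summary is often easier to read than a part-by-part listing. This sketch aggregates the same system.parts data. The filter on currentDatabase() assumes you run it from the database that holds events_tiered.

```sql
-- Summarize active parts of events_tiered by the disk they live on.
SELECT
    disk_name,
    count()                                AS part_count,
    sum(rows)                              AS total_rows,
    formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'events_tiered'
  AND active
GROUP BY disk_name
ORDER BY disk_name;
```

Running it periodically should show the totals for the S3-backed disk growing as parts age out of the hot tier.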

When tiered storage is worth it

Tiered storage shines when you have a large history that is queried rarely but must remain available, for example logs, events, or metrics kept for compliance. It is less useful when nearly all queries touch all the data, since then everything stays hot anyway. As a rule of thumb, if most of your storage is old data that is seldom read, moving it to S3 can cut storage cost dramatically while keeping recent queries fast.
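One rough way to apply this rule of thumb is to measure how much of a table is older than the move threshold. The sketch below uses the events_tiered table and the 30-day threshold from earlier. It counts rows rather than bytes, so it only approximates the share of storage that would move to the cold tier.

```sql
-- Share of rows older than 30 days. A high share suggests that most of the
-- table would sit in the cold tier under the TTL rule shown earlier.
SELECT
    count()                                         AS total_rows,
    countIf(event_time < now() - INTERVAL 30 DAY)   AS rows_older_than_30d,
    round(rows_older_than_30d / total_rows, 3)      AS old_share
FROM events_tiered;
```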

What is next

You can now manage storage cost as your data grows. Even a well-built cluster eventually misbehaves, so in the next article we build a practical troubleshooting toolkit: reading cluster status, finding the right logs, and fixing the most common failures on Kubernetes.
