All posts
ClickHouse® Query Profiling: Identifying Bottlenecks

ClickHouse® Query Profiling: Identifying Bottlenecks

July 4, 20267 min readSanjeeev Kumar G
Share:

ClickHouse® Query Profiling is the process of analyzing query execution to identify performance bottlenecks before applying optimizations. Rather than relying on assumptions, profiling helps determine whether a query is limited by CPU, memory, disk I/O, network communication, or an inefficient execution plan. ClickHouse® provides built-in profiling tools that expose detailed execution statistics, enabling engineers to make evidence-based performance improvements.

This article explains how to profile queries in ClickHouse®, interpret the collected metrics, and identify common performance bottlenecks.


Why Query Profiling Matters

A slow query does not always indicate a slow database. The actual bottleneck may be:

  • Reading more data than necessary
  • Poor primary key filtering
  • Excessive sorting or aggregation
  • Memory-intensive joins
  • Network overhead in distributed queries
  • CPU-heavy expressions
  • Large data scans caused by ineffective pruning

Without profiling, these issues are difficult to distinguish because they often produce similar symptoms—high latency.

Query profiling provides visibility into:

  • Query execution time
  • Rows and bytes processed
  • Memory consumption
  • CPU utilization
  • Disk reads
  • Thread-level execution
  • Execution plan details
  • Low-level performance counters

The objective is to determine why a query is slow before attempting to optimize it.


Understanding the Query Profiling Workflow

A practical profiling workflow in ClickHouse® typically follows these steps:

  1. Identify slow queries using system.query_log.
  2. Examine the execution plan with EXPLAIN.
  3. Inspect resource usage and execution statistics.
  4. Analyze thread-level behavior if required.
  5. Review ProfileEvents to identify resource bottlenecks.
  6. Modify the query or schema.
  7. Measure the results again.

ClickHouse® engineering documentation recommends measuring performance before and after every optimization instead of applying multiple changes simultaneously.


Step 1: Finding Slow Queries with system.query_log

The system.query_log table records metadata for executed queries, including execution duration, rows read, bytes processed, memory usage, exceptions, and ProfileEvents. It stores execution statistics rather than query results. Query logging can be configured through server settings, and the logs can be flushed immediately using SYSTEM FLUSH LOGS when necessary.

A common query for identifying expensive queries is:

SELECT
    event_time,
    query_duration_ms,
    read_rows,
    read_bytes,
    memory_usage,
    query
FROM system.query_log
WHERE type = 'QueryFinish'
ORDER BY query_duration_ms DESC
LIMIT 10;

The most useful columns include:

ColumnWhat it Indicates
query_duration_msTotal execution time
read_rowsNumber of rows scanned
read_bytesAmount of data read
memory_usagePeak memory consumption
queryExecuted SQL statement
normalized_query_hashGroups structurally identical queries

The normalized_query_hash column is particularly useful because it groups similar queries together, helping identify expensive query patterns that execute repeatedly rather than isolated slow queries.


Step 2: Analyze the Execution Plan with EXPLAIN

Execution time alone does not explain why a query is slow.

The EXPLAIN statement reveals how ClickHouse® plans to execute a query.

Example:

EXPLAIN indexes = 1
SELECT *
FROM sales
WHERE order_date >= '2025-01-01';

The output shows information such as:

  • Index usage
  • Parts selected
  • Granules scanned
  • Predicate pushdown
  • Read operations

One of the most important indicators is the number of granules read.

If the query scans nearly every granule in a table, the primary key is providing little benefit, resulting in an expensive table scan. Conversely, reading only a small fraction of granules indicates effective data pruning.


Step 3: Inspect Query Resource Usage

Execution time alone rarely tells the complete story.

Consider the following metrics together.

Rows Read

read_rows

A query returning only a few hundred rows but scanning hundreds of millions usually indicates poor filtering or an ineffective primary key.


Bytes Read

read_bytes

Large values indicate heavy disk or storage activity.

This often suggests:

  • Full table scans
  • Reading unnecessary columns
  • Inefficient filtering

Peak Memory Usage

memory_usage

High memory consumption commonly occurs during:

  • Large joins
  • GROUP BY operations
  • ORDER BY
  • DISTINCT

Memory-intensive queries may eventually spill to disk depending on server settings.


Step 4: Use system.query_thread_log

Some slow queries are not uniformly slow.

Different execution threads may spend time performing different tasks.

The system.query_thread_log table records execution statistics for individual query threads, making it possible to determine:

  • CPU time per thread
  • Memory usage
  • Read statistics
  • Execution duration

This level of detail is especially valuable when diagnosing parallel execution or identifying thread imbalance.


Step 5: Understand ProfileEvents

One of the most valuable profiling features in ClickHouse® is the ProfileEvents map available in system.query_log.

ProfileEvents contains low-level execution counters collected while a query runs. These metrics can reveal whether a query is constrained by CPU, storage, or network activity.

Examples include:

  • CPU time
  • Disk reads
  • Network bytes sent
  • Selected rows
  • Selected bytes
  • Cache hits
  • File operations

Example:

SELECT
    ProfileEvents
FROM system.query_log
WHERE query_id = 'your-query-id';

Some commonly observed events include:

Profile EventInterpretation
SelectedRowsRows processed
SelectedBytesData processed
NetworkSendBytesNetwork traffic generated
ReadCompressedBytesCompressed data read from storage
UserTimeMicrosecondsCPU time spent in user space
SystemTimeMicrosecondsCPU time spent in kernel space

Looking at ProfileEvents often makes it clear whether the dominant cost is computation, I/O, or communication rather than simply observing a long execution time.


Identifying Common Bottlenecks

1. Excessive Data Scanning

Symptoms:

  • High read_rows
  • High read_bytes
  • Long execution time

Typical causes:

  • Missing filters
  • Poor primary key design
  • Ineffective partition pruning

2. CPU Bottlenecks

Symptoms:

  • High CPU ProfileEvents
  • Moderate data reads
  • Complex expressions

Common causes:

  • Expensive functions
  • Large aggregations
  • Complex JOIN conditions

3. Memory Bottlenecks

Symptoms:

  • High memory_usage
  • Slow aggregations
  • Slow joins

Often caused by:

  • Large hash tables
  • Large GROUP BY operations
  • Wide intermediate datasets

4. Disk I/O Bottlenecks

Symptoms:

  • High compressed bytes read
  • Large storage reads
  • Long execution time despite moderate CPU usage

Possible reasons include:

  • Reading excessive columns
  • Large table scans
  • Poor data locality

5. Network Bottlenecks

Distributed queries introduce additional overhead.

Indicators include:

  • High NetworkSendBytes
  • Large data exchange
  • Slow distributed aggregations

Reducing intermediate data movement can significantly improve distributed query performance.


Practical Example

Suppose a query requires 20 seconds to complete.

Profiling reveals:

MetricValue
Duration20 s
Rows Read450 million
Result Rows150
Memory180 MB
CPUModerate
Read BytesVery High

This profile suggests that CPU is not the limiting factor.

Instead, the query spends most of its time scanning data.

The appropriate optimization is to improve data pruning through better filtering, schema design, or primary key selection—not to increase CPU resources.


Best Practices for Query Profiling

  • Enable query logging in production environments.
  • Analyze execution plans before modifying queries.
  • Review ProfileEvents alongside execution time.
  • Track recurring slow queries using normalized_query_hash.
  • Measure the impact of every optimization.
  • Test changes with realistic datasets rather than synthetic examples.
  • Avoid optimizing based on a single metric.

ClickHouse® also recommends testing optimizations iteratively and, when evaluating storage-related changes, measuring both cold-cache and warm-cache performance to understand real-world behavior.


Conclusion

Query optimization is fundamentally an exercise in identifying bottlenecks rather than making isolated performance tweaks.

ClickHouse® provides comprehensive profiling capabilities through system.query_log, system.query_thread_log, EXPLAIN, and ProfileEvents, enabling engineers to understand how a query is executed and which resources it consumes.

By combining execution plans with runtime metrics, it becomes possible to determine whether a query is constrained by CPU, memory, storage, or network activity. This evidence-driven approach leads to targeted optimizations that improve performance while avoiding unnecessary changes to infrastructure or configuration.

References

clickhouse-docs

Work with Quantrail

Expert ClickHouse services

We design, migrate, tune, and run ClickHouse for teams that own their data, from first architecture through day-two operations. Tell us what you are building and we will help.

Talk to an expert

Manage ClickHouse with CHOps

CHOps is our free, open-source ClickHouse admin tool: monitoring, query profiling, backups, visual access control, and alerting in one self-hosted interface, with zero agents on your servers.

Explore CHOps
Share: