ClickHouse Functions: A Practical Overview

ClickHouse provides a powerful collection of built-in functions that simplify working with large datasets, from date arithmetic to unique ID generation. Here's a concise breakdown of the most essential categories.

Mafiree

Array Functions include arrayMap(), which applies a lambda expression to every element of an array and returns the transformed array. groupArray(), strictly speaking an aggregate function, collects column values into one array per group, with an optional size cap written as groupArray(max_size)(column). argMax() and argMin(), also aggregate functions, return the value of one column taken from the row where another column reaches its highest or lowest value, which is useful for identifying top or bottom performers.
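As a minimal sketch of these functions, the queries below run against an inline dataset built with the VALUES table function. The column names (team, player, score) and the rows are invented for illustration and do not come from the original post.

```sql
-- arrayMap applies the lambda to each element of the array.
SELECT arrayMap(x -> x * 2, [1, 2, 3]) AS doubled;  -- [2, 4, 6]

-- groupArray(2) keeps at most two players per team;
-- argMax/argMin return the player on the row with the highest/lowest score.
SELECT
    team,
    groupArray(2)(player) AS up_to_two_players,
    argMax(player, score) AS top_scorer,
    argMin(player, score) AS bottom_scorer
FROM VALUES('team String, player String, score UInt32',
    ('red', 'ann', 10), ('red', 'bob', 30), ('red', 'cy', 20), ('blue', 'dee', 5))
GROUP BY team
ORDER BY team;
```

Which two players groupArray(2) keeps depends on the order in which rows are read, so it should not be relied on for a stable sample.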

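Window Functions cover row_number(), which numbers rows sequentially within a partition according to a specified order. runningDifference() computes the difference between consecutive row values, which helps when tracking changes in stock prices or sales over time. Note that runningDifference() is a regular function evaluated within a single data block rather than a true window function, so its result depends on the order in which rows reach it.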
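The sketch below shows both functions on a small invented price series (the symbol, day, and price columns are assumptions for illustration). The inner ORDER BY matters for runningDifference(), because it simply compares each row with the one before it in the block. Depending on the ClickHouse release, runningDifference() may be treated as deprecated and require an explicit setting to enable, so treat the second query as version-dependent.

```sql
-- row_number() restarts at 1 for every symbol and follows the day order.
SELECT
    symbol,
    day,
    price,
    row_number() OVER (PARTITION BY symbol ORDER BY day) AS day_rank
FROM VALUES('symbol String, day Date, price Float64',
    ('ACME', toDate('2024-01-01'), 100.0),
    ('ACME', toDate('2024-01-02'), 103.5),
    ('INIT', toDate('2024-01-01'), 50.0),
    ('INIT', toDate('2024-01-02'), 48.0))
ORDER BY symbol, day;

-- runningDifference() compares each price with the previous row,
-- so the rows are sorted in a subquery first.
SELECT day, price, runningDifference(price) AS change
FROM
(
    SELECT day, price
    FROM VALUES('day Date, price Float64',
        (toDate('2024-01-01'), 100.0),
        (toDate('2024-01-02'), 103.5),
        (toDate('2024-01-03'), 101.0))
    ORDER BY day
);
```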

Date and Time Functions offer toStartOfYear() to normalize dates to January 1st for year-based aggregations, addDays(date, n) to shift dates forward by a set number of days, and the SQL-standard INTERVAL syntax for clean, static date offsets in filters. timeDiff() calculates the gap between two DateTime values in seconds, useful for measuring event durations.
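A short sketch of the date functions named above, using literal dates chosen only for illustration:

```sql
SELECT
    toStartOfYear(toDate('2024-08-15'))         AS year_start,     -- 2024-01-01
    addDays(toDate('2024-08-15'), 10)           AS plus_ten_days,  -- 2024-08-25
    toDate('2024-08-15') + INTERVAL 1 MONTH     AS plus_one_month, -- 2024-09-15
    timeDiff(toDateTime('2024-08-15 10:00:00'),
             toDateTime('2024-08-15 10:05:30')) AS seconds_elapsed; -- 330
```

The same INTERVAL form works in filters, for example WHERE event_time >= now() - INTERVAL 7 DAY, where event_time is a hypothetical column.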

Aggregate Functions include quantile() for percentile calculations such as the median (p50), p90, and p99, which is particularly valuable in server performance monitoring. stddevPop() and stddevSamp() measure how far values spread from the mean, for a full population and for a sample respectively. Aggregate combinators such as -If (e.g., sumIf) and -Array (e.g., sumArray) extend standard functions to handle conditional rows or array elements respectively. The two can be combined, with -Array always preceding -If (e.g., sumArrayIf).
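The following sketch puts these aggregates side by side on an invented request log. The latency_ms, status, and retries columns are assumptions made for the example, not part of the original post.

```sql
SELECT
    quantile(0.5)(latency_ms)          AS p50,
    quantiles(0.9, 0.99)(latency_ms)   AS p90_and_p99,       -- array with both percentiles
    stddevPop(latency_ms)              AS sd_population,
    stddevSamp(latency_ms)             AS sd_sample,
    sumIf(latency_ms, status = 500)    AS latency_on_errors, -- only rows where status = 500
    sumArray(retries)                  AS total_retries,     -- sums every array element
    sumArrayIf(retries, status = 500)  AS retries_on_errors  -- -Array first, then -If
FROM VALUES('latency_ms UInt32, status UInt16, retries Array(UInt8)',
    (120, 200, [0]), (340, 500, [1, 2]), (95, 200, [0]), (800, 500, [3]));
```

quantile() computes an approximate value on large inputs. When an exact percentile is needed, quantileExact() is the usual alternative.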

Full-Text Search is handled by match(), which tests string columns against a regular expression (RE2 syntax) and returns 1 or 0, making it suitable for filtering or validation tasks.
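As an illustration, the query below uses match() for a rough email sanity check. The pattern is deliberately simplistic and is an assumption for this sketch, not a complete email validator.

```sql
-- Backslashes are doubled because the pattern sits inside a SQL string literal.
SELECT
    email,
    match(email, '^[a-z0-9._-]+@[a-z0-9-]+\\.[a-z]{2,}$') AS looks_valid
FROM VALUES('email String', 'ann@example.com', 'not-an-email');
```

With this pattern the first row yields 1 and the second yields 0.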

UUID Functions include generateUUIDv4() for random unique identifiers and generateUUIDv7() (v24.1+) for time-sortable UUIDs suited to time-series primary keys.
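A brief sketch of both generators follows. The table name events_sketch and its columns are hypothetical, and generateUUIDv7() is only available on releases that ship it.

```sql
SELECT generateUUIDv4() AS random_id;
SELECT generateUUIDv7() AS time_ordered_id;

-- Hypothetical table that fills its key with a time-sortable UUID by default.
CREATE TABLE events_sketch
(
    id      UUID DEFAULT generateUUIDv7(),
    ts      DateTime,
    payload String
)
ENGINE = MergeTree
ORDER BY id;
```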

Visual Representation is supported by bar(), which renders ASCII bar charts directly in query output without any external tools.
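For example, the sketch below draws one bar per row for the squares of 0 through 10. The arguments to bar() are the value, the minimum and maximum of the scale, and the width of the chart in characters.

```sql
SELECT
    number AS n,
    bar(n * n, 0, 100, 20) AS chart  -- a full-width bar corresponds to 100
FROM numbers(11);
```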

User Defined Functions (UDFs) allow custom logic defined with CREATE FUNCTION and a SQL lambda expression, while Executable UDFs extend this by invoking external scripts, such as a Python script that masks sensitive email or phone data.
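A minimal sketch of a SQL UDF for the masking use case mentioned above. The function name mask_email and the masking rule are invented for illustration, and creating functions requires the corresponding privilege on the server.

```sql
-- Keep the first character and the domain, hide the rest of the local part.
CREATE FUNCTION mask_email AS (email) ->
    concat(substring(email, 1, 1), '***', substring(email, position(email, '@')));

SELECT mask_email('alice@example.com') AS masked;  -- a***@example.com

-- Remove the function once it is no longer needed.
DROP FUNCTION mask_email;
```

Executable UDFs are configured on the server side rather than through a single SQL statement, so they are not shown here.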

Readable Formatting Functions — formatReadableSize(), formatReadableQuantity(), and formatReadableTimeDelta() — convert raw bytes, large numbers, and seconds into human-friendly representations.
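A quick sketch of the three formatters on literal inputs chosen for illustration:

```sql
SELECT
    formatReadableSize(1073741824)  AS size,      -- 1.00 GiB
    formatReadableQuantity(1234567) AS quantity,  -- 1.23 million
    formatReadableTimeDelta(12345)  AS duration;  -- 3 hours, 25 minutes and 45 seconds
```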

Recent updates have added generateUUIDv7(), compound INTERVAL support, arrayFold(), Variant & Dynamic Types, and extended Date32 range support across date functions.
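Of these additions, arrayFold() is the easiest to show in isolation. The sketch below sums an array by carrying an accumulator through it, and it assumes a release recent enough to include the function.

```sql
-- acc starts at toInt64(0) and is updated once per element.
SELECT arrayFold(acc, x -> acc + x, [1, 2, 3, 4], toInt64(0)) AS total;  -- 10
```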

For a detailed understanding of each function, with examples and query outputs, refer to our blog post ClickHouse Functions.
