Expand Up @@ -4,7 +4,11 @@ linkTitle: "-State & -Merge combinators"
description: >
-State & -Merge combinators
---
The ClickHouse® -State combinator doesn't actually store information about the -If combinator, so aggregate functions with and without -If have the same serialized data.

The -State combinator in ClickHouse® does not store additional information about the -If combinator, which means that aggregate functions with and without -If have the same serialized data structure. This can be verified through various examples, as demonstrated below.

**Example 1**: maxIfState and maxState
In this example, we use the maxIfState and maxState functions on a dataset of numbers, serialize the result, and merge it using the maxMerge function.

```sql
$ clickhouse-local --query "SELECT maxIfState(number,number % 2) as x, maxState(number) as y FROM numbers(10) FORMAT RowBinary" | clickhouse-local --input-format RowBinary --structure="x AggregateFunction(max,UInt64), y AggregateFunction(max,UInt64)" --query "SELECT maxMerge(x), maxMerge(y) FROM table"
Expand All @@ -13,7 +17,11 @@ $ clickhouse-local --query "SELECT maxIfState(number,number % 2) as x, maxState(
9 10
```

The -State combinator has the same serialized data footprint regardless of the parameters used in the definition of the aggregate function. That's true for quantile\* and sequenceMatch/sequenceCount functions.
In both cases, the -State combinator results in identical serialized data footprints, regardless of the conditions in the -If variant. The maxMerge function merges the state without concern for the original -If condition.
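
To make the point concrete, here is a minimal Python sketch of why a condition leaves no trace in the state. It is a toy model, not ClickHouse's implementation: the `MaxState` class, the `max_state` and `max_if_state` helpers, and the byte layout in `serialize` are all hypothetical names and assumptions chosen for illustration.

```python
import struct
from typing import Callable, Iterable, Optional


class MaxState:
    """Toy stand-in for a max aggregate state: it holds only the running maximum."""

    def __init__(self) -> None:
        self.value: Optional[int] = None

    def add(self, x: int) -> None:
        if self.value is None or x > self.value:
            self.value = x

    def merge(self, other: "MaxState") -> None:
        if other.value is not None:
            self.add(other.value)

    def serialize(self) -> bytes:
        # Hypothetical layout (an assumption, not ClickHouse's real format):
        # a 1-byte "has value" flag followed by an 8-byte little-endian integer.
        if self.value is None:
            return b"\x00"
        return b"\x01" + struct.pack("<Q", self.value)


def max_state(values: Iterable[int]) -> MaxState:
    state = MaxState()
    for v in values:
        state.add(v)
    return state


def max_if_state(values: Iterable[int], predicate: Callable[[int], bool]) -> MaxState:
    # The condition only decides which rows reach add(); it is never stored.
    state = MaxState()
    for v in values:
        if predicate(v):
            state.add(v)
    return state


with_if = max_if_state(range(10), lambda n: n % 2 == 1)
without_if = max_state(range(10))

# Both states have the same shape, so one merge routine handles either.
merged = MaxState()
merged.merge(with_if)
merged.merge(without_if)
print(len(with_if.serialize()) == len(without_if.serialize()), merged.value)
```

In this model the predicate acts purely as a row filter in front of `add`, which is why a single `merge` can consume states built with or without a condition.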

**Example 2**: quantilesTDigestIfState
Here, we use the quantilesTDigestIfState function to demonstrate that quantile functions, like the sequence matching functions shown later, follow the same principle regarding serialized data consistency.


```sql
$ clickhouse-local --query "SELECT quantilesTDigestIfState(0.1,0.9)(number,number % 2) FROM numbers(1000000) FORMAT RowBinary" | clickhouse-local --input-format RowBinary --structure="x AggregateFunction(quantileTDigestWeighted(0.5),UInt64,UInt8)" --query "SELECT quantileTDigestWeightedMerge(0.4)(x) FROM table"
Expand All @@ -22,6 +30,12 @@ $ clickhouse-local --query "SELECT quantilesTDigestIfState(0.1,0.9)(number,numbe
$ clickhouse-local --query "SELECT quantilesTDigestIfState(0.1,0.9)(number,number % 2) FROM numbers(1000000) FORMAT RowBinary" | clickhouse-local --input-format RowBinary --structure="x AggregateFunction(quantilesTDigestWeighted(0.5),UInt64,UInt8)" --query "SELECT quantilesTDigestWeightedMerge(0.4,0.8)(x) FROM table"
[400000,800000]

```
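
One way to picture why the levels given when building the state (0.1, 0.9) need not match the levels requested at merge time (0.4, 0.8) is the toy model below. It is only a sketch under stated assumptions: `QuantileState` and `quantile_if_state` are hypothetical names, and the state keeps raw values with a simple nearest-rank lookup, whereas a real t-digest is a compact approximate sketch.

```python
from typing import Callable, Iterable, List


class QuantileState:
    """Toy quantile state: it stores observed values, never the requested level."""

    def __init__(self) -> None:
        self.values: List[int] = []

    def add(self, x: int) -> None:
        self.values.append(x)

    def merge(self, other: "QuantileState") -> None:
        self.values.extend(other.values)

    def quantile(self, level: float) -> int:
        # Simple nearest-rank lookup; the level is supplied only at this point.
        ordered = sorted(self.values)
        index = min(len(ordered) - 1, int(level * len(ordered)))
        return ordered[index]


def quantile_if_state(
    values: Iterable[int],
    predicate: Callable[[int], bool],
    levels: Iterable[float],
) -> QuantileState:
    # `levels` is accepted to mirror the SQL call shape, but it is deliberately
    # unused: in this model neither the levels nor the condition are stored.
    state = QuantileState()
    for v in values:
        if predicate(v):
            state.add(v)
    return state


# Build the state with one set of levels, then query it with another.
state = quantile_if_state(range(1000), lambda n: n % 2 == 1, levels=(0.1, 0.9))
merged = QuantileState()
merged.merge(state)
print([merged.quantile(0.4), merged.quantile(0.8)])  # prints [401, 801]
```

Because the level is consumed only in `quantile`, any level can be requested after merging, which mirrors how the commands above read a state built with one parameter list and query it with another.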

**Example 3**: Quantile Functions with -Merge
This example shows how the quantileState and quantileMerge functions work together to calculate a specific quantile.

```sql
SELECT quantileMerge(0.9)(x)
FROM
(
Expand All @@ -34,6 +48,9 @@ FROM
└───────────────────────┘
```

**Example 4**: sequenceMatch and sequenceCount Functions with -Merge
Finally, we demonstrate the behavior of sequenceMatchState and sequenceMatchMerge, as well as sequenceCountState and sequenceCountMerge, in ClickHouse.

```sql
SELECT
sequenceMatchMerge('(?2)(?3)')(x) AS `2_3`,
Expand All @@ -48,6 +65,11 @@ FROM
┌─2_3─┬─1_4─┬─1_2_3─┐
│ 1 │ 1 │ 0 │
└─────┴─────┴───────┘
```

Similarly, sequenceCountState and sequenceCountMerge functions behave consistently when merging states:

```sql

SELECT
sequenceCountMerge('(?1)(?2)')(x) AS `2_3`,
Expand All @@ -64,3 +86,4 @@ FROM
│ 3 │ 0 │ 2 │
└─────┴─────┴───────┘
```
ClickHouse's -State combinator serializes data the same way regardless of any -If condition or of the parameters passed to the aggregate function. This holds for a wide range of functions, including the quantile and sequence functions shown above, so maxMerge, quantileMerge, sequenceMatchMerge, and sequenceCountMerge can merge states produced by differently parameterized calls.