Fix CapnProto empty Data field crash for UInt256/UInt128 during schema evolution #1619

Open
BorisTyshkevich wants to merge 1 commit into antalya-26.1 from fix-capnproto-empty-data-default
@BorisTyshkevich
Collaborator

Summary

  • After a new UInt256/UInt128/Int256/Int128/Decimal128/Decimal256/IPv6 field is added to a CapnProto schema, reading old messages (produced before the field existed) makes ClickHouse fail with "Unexpected size of UInt256 value: 0"
  • Root cause: CapnProtoFixedSizeRawDataSerializer::insertData() does a strict size check. An absent Data pointer field is returned as a zero-length blob, which fails that check
  • Fix: insert the column default (zero) when the Data blob is empty, and keep the size check for non-zero wrong sizes (real corruption)
  • This is a 6-line change in one function

Related issues

  • Closes ClickHouse#86864

Root cause

In src/Formats/CapnProtoSerializer.cpp, CapnProtoFixedSizeRawDataSerializer::insertData() unconditionally requires data.size() == expected_value_size (32 for UInt256, 16 for UInt128). When a Data pointer field is absent in an old CapnProto message, the capnp library returns a zero-length blob (size() == 0), which fails the check. There is no setting (input_format_defaults_for_omitted_fields or otherwise) that affects this code path.

This makes backward-compatible schema evolution impossible for Data-backed types, especially inside Tuple columns.
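
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the ClickHouse code: `BlobView`, `FixedColumn`, and `insertDataStrict` are hypothetical stand-ins for `capnp::Data::Reader`, the destination column, and the pre-fix `insertData()`. Only the error text is taken from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for capnp::Data::Reader: a non-owning view of bytes.
// An absent Data field is modelled as {nullptr, 0}.
struct BlobView
{
    const std::uint8_t * ptr;
    std::size_t len;
    std::size_t size() const { return len; }
};

// Hypothetical stand-in for a column of fixed-width values (32 bytes for UInt256).
struct FixedColumn
{
    std::size_t value_size;
    std::vector<std::uint8_t> bytes;

    std::size_t rows() const
    {
        assert(value_size > 0);
        return bytes.size() / value_size;
    }
};

// Strict check as described in the root cause: every size mismatch throws,
// so a zero-length blob from an old message is rejected like corrupt data.
void insertDataStrict(FixedColumn & column, BlobView data)
{
    if (data.size() != column.value_size)
        throw std::runtime_error("Unexpected size of UInt256 value: " + std::to_string(data.size()));

    column.bytes.insert(column.bytes.end(), data.ptr, data.ptr + data.size());
}
```

Calling `insertDataStrict` with an empty `BlobView` on a 32-byte column throws and leaves the column without a new row, which is the behaviour old messages run into.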

Fix

void insertData(IColumn & column, capnp::Data::Reader data)
{
    if (data.size() == 0)        // absent field in old message
    {
        column.insertDefault();  // zero for UInt256/UInt128/etc.
        return;
    }

    if (data.size() != expected_value_size)  // real corruption
        throw Exception(...);

    column.insertData(...);
}
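
The three branches above can be exercised without ClickHouse or the capnp library. This is a sketch under stated assumptions: `FixedWidthColumn` and `insertDataWithDefault` are hypothetical names, a `std::vector` stands in for `capnp::Data::Reader`, and the exception type and message are simplified placeholders rather than the ones used in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a column that stores fixed-width values back to back.
class FixedWidthColumn
{
public:
    explicit FixedWidthColumn(std::size_t value_size_) : value_size(value_size_) {}

    // Appends one all-zero value, the default for the integer types in question.
    void insertDefault() { bytes.insert(bytes.end(), value_size, std::uint8_t{0}); }

    // Appends raw bytes; the caller is responsible for passing exactly one value.
    void insertData(const std::uint8_t * pos, std::size_t length) { bytes.insert(bytes.end(), pos, pos + length); }

    std::size_t valueSize() const { return value_size; }
    std::size_t rows() const { return bytes.size() / value_size; }

    std::uint8_t byteAt(std::size_t row, std::size_t offset) const
    {
        assert(row < rows() && offset < value_size);
        return bytes[row * value_size + offset];
    }

private:
    std::size_t value_size;
    std::vector<std::uint8_t> bytes;
};

// Same branch order as the patched function: empty blob -> default,
// wrong non-zero size -> error, exact size -> copy the bytes.
void insertDataWithDefault(FixedWidthColumn & column, const std::vector<std::uint8_t> & data)
{
    if (data.empty())
    {
        column.insertDefault();
        return;
    }

    if (data.size() != column.valueSize())
        throw std::length_error("Unexpected size of value: " + std::to_string(data.size()));

    column.insertData(data.data(), data.size());
}
```

Checking for the empty blob before the size comparison is what keeps the corruption check intact: only size 0 is treated as "field absent", and any other mismatch still raises an error.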

Test plan

  • Test 1: Flat struct — old message missing UInt256/UInt128/Int256/Decimal128 fields → defaults to 0
  • Test 2: Nested struct (Tuple) — old message missing UInt256 inside Tuple → defaults to 0
  • Test 3: All fields populated → no regression, values preserved
  • Test 4: Wrong non-zero Data size still errors
  • Manually verified with sample_msg_01.bin from the reproduction repo on ClickHouse 25.12

🤖 Generated with Claude Code

@BorisTyshkevich BorisTyshkevich force-pushed the fix-capnproto-empty-data-default branch from 623a757 to 766db87 Compare April 7, 2026 14:09
@github-actions
github-actions bot commented Apr 7, 2026

Workflow [PR], commit [7f29d3a]

…a evolution

When reading a CapnProto message with a schema that has more Data fields
than the message was produced with, absent Data pointer fields return a
zero-length blob. The strict size check in CapnProtoFixedSizeRawDataSerializer
throws "Unexpected size of UInt256 value: 0" instead of inserting a default.

This makes backward-compatible schema evolution impossible for types backed
by CapnProto Data fields (UInt256, UInt128, Int256, Int128, Decimal128,
Decimal256, IPv6) — especially inside Tuple columns.

Fix: insert a column default (zero) when the Data blob is empty, while
preserving the size check for non-zero wrong sizes (real corruption).

Closes ClickHouse#86864

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Boris Tyshkevich <68195949+bvt123@users.noreply.github.com>
@BorisTyshkevich BorisTyshkevich force-pushed the fix-capnproto-empty-data-default branch from 766db87 to 7f29d3a Compare April 7, 2026 14:11