Fix CapnProto empty Data field crash for UInt256/UInt128 during schema evolution #1619
Open
BorisTyshkevich wants to merge 1 commit into antalya-26.1 from
623a757 to
766db87
…a evolution

When reading a CapnProto message with a schema that has more Data fields than the message was produced with, absent Data pointer fields return a zero-length blob. The strict size check in CapnProtoFixedSizeRawDataSerializer throws "Unexpected size of UInt256 value: 0" instead of inserting a default.

This makes backward-compatible schema evolution impossible for types backed by CapnProto Data fields (UInt256, UInt128, Int256, Int128, Decimal128, Decimal256, IPv6), especially inside Tuple columns.

Fix: insert a column default (zero) when the Data blob is empty, while preserving the size check for non-zero wrong sizes (real corruption).

Closes ClickHouse#86864

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Boris Tyshkevich <68195949+bvt123@users.noreply.github.com>
766db87 to
7f29d3a
Summary
- When adding a UInt256/UInt128/Int256/Int128/Decimal128/Decimal256/IPv6 field to a CapnProto schema, old messages (produced before the field existed) cause ClickHouse to crash with "Unexpected size of UInt256 value: 0"
- CapnProtoFixedSizeRawDataSerializer::insertData() does a strict size check; absent Data pointer fields return a zero-length blob, which fails the check
- Fix: insert a column default (zero) when the Data blob is empty, preserving the size check for non-zero wrong sizes (real corruption)

Related issues
Root cause
In src/Formats/CapnProtoSerializer.cpp, CapnProtoFixedSizeRawDataSerializer::insertData() unconditionally requires data.size() == expected_value_size (32 for UInt256, 16 for UInt128). When a Data pointer field is absent in an old CapnProto message, the capnp library returns a zero-length blob (size() == 0), which fails the check. There is no setting (input_format_defaults_for_omitted_fields or otherwise) that affects this code path.

This makes backward-compatible schema evolution impossible for Data-backed types, especially inside Tuple columns.
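The failure mode can be modelled outside ClickHouse. The following is a minimal standalone sketch, not the code in src/Formats/CapnProtoSerializer.cpp: insertDataStrict and the ColumnBytes alias are hypothetical stand-ins, and the absent field is represented by an empty byte vector, on the assumption (stated above) that the capnp library hands back a zero-length blob for it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a fixed-size column: values are stored back to back.
using ColumnBytes = std::vector<std::uint8_t>;

// Models the strict check described above: any blob whose size differs from
// expected_value_size is rejected, including the zero-length blob that an
// absent Data pointer field produces.
void insertDataStrict(ColumnBytes & column,
                      const std::vector<std::uint8_t> & data,
                      std::size_t expected_value_size,
                      const std::string & type_name)
{
    assert(expected_value_size > 0);
    if (data.size() != expected_value_size)
        throw std::runtime_error(
            "Unexpected size of " + type_name + " value: " + std::to_string(data.size()));
    column.insert(column.end(), data.begin(), data.end());
}
```

Calling insertDataStrict(column, {}, 32, "UInt256") in this model throws with the same message text reported in the summary, and nothing is appended to the column.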
Fix
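As a rough illustration of the behaviour described in the summary and commit message (default on empty blob, size check kept for every other mismatch), here is a minimal standalone sketch. It is not the actual patch: insertDataOrDefault and ColumnBytes are hypothetical names, and the column default is assumed to be all zero bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a fixed-size column: values are stored back to back.
using ColumnBytes = std::vector<std::uint8_t>;

// Sketch of the relaxed check: an empty blob is treated as "field absent" and
// replaced by a zero default; any other wrong size is still reported as an error.
void insertDataOrDefault(ColumnBytes & column,
                         const std::vector<std::uint8_t> & data,
                         std::size_t expected_value_size,
                         const std::string & type_name)
{
    assert(expected_value_size > 0);
    if (data.empty())
    {
        // Absent Data pointer field: append expected_value_size zero bytes.
        column.insert(column.end(), expected_value_size, std::uint8_t{0});
        return;
    }
    if (data.size() != expected_value_size)
        throw std::runtime_error(
            "Unexpected size of " + type_name + " value: " + std::to_string(data.size()));
    column.insert(column.end(), data.begin(), data.end());
}
```

Keeping the second branch means a non-empty blob of the wrong length is still rejected rather than silently padded or truncated.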
Test plan
sample_msg_01.bin from the reproduction repo on ClickHouse 25.12

🤖 Generated with Claude Code