feat: support null value messages (tombstones) for compacted topics #304

Merged: BewareMyPower merged 1 commit into apache:main from grishaf:feat/null-value-messages on Jun 17, 2026

@grishaf (Contributor) commented May 18, 2026

Motivation

Currently the Python client cannot send null value messages, which are needed as tombstones on compacted topics to delete entries for specific keys. Attempting to call producer.send(None, ...) raises a TypeError because BytesSchema.encode() rejects None, and _build_msg has a second _check_type(bytes, data, 'data') guard.

The C++ client added MessageBuilder::setNullValue() and Message::hasNullValue() in apache/pulsar-client-cpp#563 (merged April 3, 2026, milestone 4.2.0). The Java client has supported value(null) tombstones since v2.8+ via TypedMessageBuilderImpl.beforeSend() which sets msgMetadata.setNullValue(true) when the value is null.

This PR wraps the new C++ API for Python, using the same pattern as the Java client.

Modifications

pybind11 bindings (src/message.cc)

  • Added set_null_value binding on MessageBuilder, wrapping MessageBuilder::setNullValue()
  • Added has_null_value binding on Message, wrapping Message::hasNullValue()

Python wrapper (pulsar/__init__.py)

  • Message.has_null_value() — check if a received message is a tombstone
  • Producer._build_msg() — when content is None, skip schema encoding and call mb.set_null_value() instead of mb.content(data) (same pattern as Java's beforeSend())
  • Updated send() and send_async() docstrings to document None content
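
As a rough illustration of the wrapper API above, the sketch below shows how calling code might publish a tombstone and tell a null value apart from an empty payload. It assumes a client build that includes this PR. `delete_key` and `value_or_none` are hypothetical helper names, not part of the client, and `producer` and `msg` stand for a `pulsar.Producer` and a received `pulsar.Message`.

```python
from typing import Optional


def delete_key(producer, key: str) -> None:
    """Publish a tombstone for `key` (hypothetical helper).

    Assumes `producer` is a pulsar.Producer from a client build that
    accepts None as content, as described in this PR.
    """
    producer.send(None, partition_key=key)


def value_or_none(msg) -> Optional[bytes]:
    """Return the payload, or None when the message is a tombstone.

    Assumes `msg` exposes has_null_value() and data(), as a received
    pulsar.Message would with this PR applied.
    """
    if msg.has_null_value():
        return None
    return msg.data()
```

Keeping `None` and `b""` distinct on the consumer side matches what test_null_value_vs_empty_bytes is described as checking.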

Dependencies (dependencies.yaml)

  • Bumped pulsar-cpp from 4.1.0 to 4.2.0

Tests (tests/pulsar_test.py)

  • test_null_value_message — send/receive null and non-null messages, verify has_null_value()
  • test_null_value_vs_empty_bytes — verify b"" and None are distinct
  • test_null_value_compaction — tombstoned keys disappear after topic compaction
  • test_null_value_table_view — tombstoned keys removed from TableView
  • test_null_value_with_properties — properties survive on null-value messages
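
To make the expectation behind test_null_value_compaction concrete, here is a toy in-memory model of key-based compaction: the latest value per key wins, and a key whose latest entry is a tombstone disappears. This is not the broker's algorithm. `compact` is a hypothetical name, the log is modeled as (key, value) pairs with `None` standing for a null value, and how a real broker treats zero-length payloads is deliberately out of scope.

```python
from typing import Dict, Iterable, Optional, Tuple


def compact(log: Iterable[Tuple[str, Optional[bytes]]]) -> Dict[str, bytes]:
    """Toy model: keep the latest value per key, drop tombstoned keys."""
    latest: Dict[str, bytes] = {}
    for key, value in log:
        if value is None:
            # Tombstone: forget whatever was stored for this key.
            latest.pop(key, None)
        else:
            latest[key] = value
    return latest


log = [("a", b"1"), ("b", b"2"), ("a", None), ("b", b"3")]
print(compact(log))  # {'b': b'3'}
```

A key written again after its tombstone reappears in the result, since only the most recent entry per key matters in this model.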

Blocked on

This PR requires pulsar-client-cpp >= 4.2.0, which has not been released yet (v4.1.0 is the latest as of May 2026). The setNullValue()/hasNullValue() APIs are merged into C++ main but not yet in a release. CI will not pass until 4.2.0 ships.

Made with Cursor

@grishaf grishaf force-pushed the feat/null-value-messages branch 3 times, most recently from dbe4502 to d293511 Compare May 19, 2026 08:21
@grishaf (Contributor, Author) commented May 19, 2026

Blocked on two upstream dependencies:

  1. Waiting for pulsar-client-cpp >= 4.2.0 release — the setNullValue() / hasNullValue() C++ APIs were merged into main (apache/pulsar-client-cpp#563) but not yet released. dependencies.yaml is set to 4.2.0 anticipating the release.

  2. Waiting for broker fix apache/pulsar#25817 — non-batched null-value messages are not removed during topic compaction due to a bug in extractKeyAndSize(). The Python compaction test (test_null_value_compaction) depends on this broker fix to pass.

@grishaf grishaf marked this pull request as ready for review June 16, 2026 07:45
@grishaf grishaf closed this Jun 16, 2026
@grishaf grishaf reopened this Jun 16, 2026
@grishaf (Contributor, Author) commented Jun 16, 2026

Update: both blockers from my earlier comment are now resolved on this branch.

  1. pulsar-client-cpp dependency: reverted dependencies.yaml back to the released 4.1.0 (commit 57e0a9f). The new set_null_value/has_null_value bindings compile and link fine against 4.1.0, and the CI run on that commit confirmed the build succeeds with 92/93 tests passing.
  2. Broker compaction fix: bumped the CI test broker to apachepulsar/pulsar:4.2.2 (commit 94bdd68), the latest 4.x release, which contains the null-value compaction fix from apache/pulsar#25817, "[fix][broker] Fix non-batched null-value messages not removed during topic compaction" (shipped in release/4.2.2; also backported to 4.0.11 and 4.1.4). The previously failing test_null_value_compaction depended on this broker-side fix.

With these two commits the PR should be fully green. CI for the latest commit is currently in action_required (waiting on a maintainer to approve the workflow run on this fork PR) — could a committer kick it off? Thanks!

Add support for sending and detecting null value messages, which are
used as tombstones on compacted topics to delete entries for specific
keys. This wraps the C++ client's MessageBuilder::setNullValue() and
Message::hasNullValue() APIs added in pulsar-client-cpp#563.

Changes:
- Bump pulsar-cpp dependency to 4.2.0
- Add pybind11 bindings for set_null_value and has_null_value
- Allow Producer.send(None) to produce a null value message
- Add Message.has_null_value() to detect tombstone messages
- Skip schema encoding when content is None (mirrors Java client)
- Add integration tests for null values, compaction, and table view

Requires pulsar-client-cpp >= 4.2.0 (not yet released).

Co-authored-by: Cursor <cursoragent@cursor.com>
@grishaf grishaf force-pushed the feat/null-value-messages branch from 94bdd68 to 4321a5a Compare June 17, 2026 05:14
@BewareMyPower BewareMyPower added this to the 3.13.0 milestone Jun 17, 2026
@BewareMyPower BewareMyPower merged commit 5483ca9 into apache:main Jun 17, 2026
11 checks passed