Skip to content

[config] Add validation for numeric config fields#449

Open
Prajwal-banakar wants to merge 2 commits intoapache:mainfrom
Prajwal-banakar:Config-validation
Open

[config] Add validation for numeric config fields#449
Prajwal-banakar wants to merge 2 commits intoapache:mainfrom
Prajwal-banakar:Config-validation

Conversation

@Prajwal-banakar
Copy link
Copy Markdown
Contributor

Purpose

Linked issue: close #385

Add a validate_numeric_fields() method to Config that validates all numeric fields have sensible values (e.g. sizes > 0, timeouts >= 0). Validation is called during FlussConnection::new() alongside the existing validate_security() and validate_scanner_fetch() checks.

Brief change log

  • Added validate_numeric_fields() in config.rs covering 10 numeric fields
  • Called validate_numeric_fields() in FlussConnection::new() in connection.rs
  • Added 11 unit tests covering default validity and each invalid case

Tests

cargo fmt --all
cargo clippy -p fluss-rs --all-features
cargo test -p fluss-rs

API and Format

API: No breaking changes
Storage format: No

Documentation

No doc changes needed.

@Prajwal-banakar
Copy link
Copy Markdown
Contributor Author

Prajwal-banakar commented Apr 1, 2026

Hi @fresh-borzoni PTAL when you get a chance, Thanks!

Copy link
Copy Markdown
Contributor

@fresh-borzoni fresh-borzoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Prajwal-banakar Ty for the PR, left some comments, PTAL

if self.scanner_remote_log_prefetch_num == 0 {
return Err("scanner_remote_log_prefetch_num must be > 0".to_string());
}
if self.scanner_remote_log_read_concurrency == 0 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have .max(1) in remote_log.rs, let's remove it from there

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the .max(1) fallback from remote_log.rs

if self.connect_timeout_ms == 0 {
return Err("connect_timeout_ms must be > 0".to_string());
}
Ok(())
Copy link
Copy Markdown
Contributor

@fresh-borzoni fresh-borzoni Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to have these checks as well:

  • writer_batch_size <= writer_request_max_size - otherwise batches never drain as they exceed max size defined for request
  • writer_batch_size <= writer_buffer_memory_size - or If a single full batch exceeds max_size, it doesn't fit buffer and just keeps piling, we have runtime check for it, but better to validate at config level

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the requested checks in config validation

.map_err(|msg| Error::IllegalArgument { message: msg })?;
arg.validate_scanner_fetch()
.map_err(|msg| Error::IllegalArgument { message: msg })?;
arg.validate_numeric_fields()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's structure this better:

  • validate_scanner - to include all scanner related configs
  • validate_writer - to inclide writer configs and idempotence checks
  • validate_security - sasl checks

so that it's more scoped and it's harder to miss smth

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored validation into scoped methods

if self.scanner_remote_log_read_concurrency == 0 {
return Err("scanner_remote_log_read_concurrency must be > 0".to_string());
}
if self.scanner_log_max_poll_records == 0 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in Java it's allowed, though arguably makes no sense - we need to match, let's drop this for now and file an issue in main repo.
The same for connect_timeout_ms

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can open a follow-up issue in the main Fluss repo

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to get it merged in the main repo first, and then we can add additional stuff here, since we plan to ship more features.
Just remove it and leave some comment for now

Copy link
Copy Markdown
Contributor

@fresh-borzoni fresh-borzoni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Prajwal-banakar Ty for the update, 2 small comments

  • Java parity
  • remove validate_idempotency call from writer_client and just leave validate_writer during connection initialization

Mb later we wish to have validation during config creation, but this is out of scope here

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we remove this method then?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, removed

@Prajwal-banakar
Copy link
Copy Markdown
Contributor Author

Hi @fresh-borzoni thank you for the review, i'll open a issue in main repo, Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Config validation for numeric fields

2 participants