Skip to content

test: improve arrow-row fuzz tests#9222

Merged
Jefffrey merged 6 commits intoapache:mainfrom
rluvaton:improve-tests-for-row-convertor-fuzz-test
Jan 30, 2026
Merged

test: improve arrow-row fuzz tests#9222
Jefffrey merged 6 commits intoapache:mainfrom
rluvaton:improve-tests-for-row-convertor-fuzz-test

Conversation

@rluvaton
Copy link
Copy Markdown
Member

@rluvaton rluvaton commented Jan 19, 2026

Which issue does this PR close?

N/A

Rationale for this change

Make the arrow-row more resilient to bugs

What changes are included in this PR?

  1. Make the fuzz tests deterministic
  2. Make the fuzz tests produce more unique values (by reusing the same rng which will modify it's internal state
  3. Test Boolean array as well in the fuzz tests
  4. Test different null behaviors in the fuzz tests
  5. test that different underlying nulls values doesn't change the output rows in the fuzz tests

Are these changes tested?

N/A

Are there any user-facing changes?

Nope


I suggest that when reviewing this code to hide whitespace changes to see actual code changes

1. Make them deterministic
2. Make them produce more unique values
3. Test Boolean array as well
4. Test different null behaviors
5. test that different underlying nulls values doesn't change the output rows
@github-actions github-actions Bot added the arrow Changes to the arrow crate label Jan 19, 2026
Comment thread arrow-row/src/lib.rs Outdated
))
}

fn change_underline_null_values_for_primitive<T: ArrowPrimitiveType>(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn change_underline_null_values_for_primitive<T: ArrowPrimitiveType>(
fn change_underlying_null_values_for_primitive<T: ArrowPrimitiveType>(

Comment thread arrow-row/src/lib.rs
#[test]
#[cfg_attr(miri, ignore)]
fn fuzz_test() {
let mut rng = StdRng::seed_from_u64(42);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're fuzzing is it better to have differing seeds so we can test more random states or its better to be deterministic instead?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally I would say more test cases, BUT the impact if it starts to fail and someone just rerun it, it will remove trust in the test and will be ignored or just rerun to make the test pass rather than fixing the issue.

for example, think about some PR that updates a comment and this fails, so the author will have less trust in this test and when that author will touch arrow-row later in the future and that test fail again, they will just rerun the CI rather than fix the bug.

Ideally you would bound the seed to the current PR so:

  1. rerunning it will not help except creating a new PR which is less likely
  2. you can still reproduce

but it will be less intuitive how to reproduce the test

…fuzz-test' into improve-tests-for-row-convertor-fuzz-test
@Jefffrey Jefffrey merged commit 922a0c4 into apache:main Jan 30, 2026
14 checks passed
@Jefffrey
Copy link
Copy Markdown
Contributor

Thanks @rluvaton

@rluvaton rluvaton deleted the improve-tests-for-row-convertor-fuzz-test branch February 8, 2026 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants