Skip to content

Add tests for WaveActiveAllEqual#930

Open
bob80905 wants to merge 11 commits intollvm:mainfrom
bob80905:add_waveactiveallequal_tests
Open

Add tests for WaveActiveAllEqual#930
bob80905 wants to merge 11 commits intollvm:mainfrom
bob80905:add_waveactiveallequal_tests

Conversation

@bob80905
Copy link
Contributor

@bob80905 bob80905 commented Mar 4, 2026

This PR adds tests for WaveActiveAllEqual.
It does not test matrices.
It tests ints and int2s for wavesizes 32, 64, and 128.
Fixes #915


# RUN: split-file %s %t
# RUN: %dxc_target -T cs_6_6 -Fo %t.o %t/source.hlsl
# RUN: %offloader %t/pipeline.yaml %t.o No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the file endings look weird

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add tests for non-int types? Is an aggregate type interesting here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought the wave tests would be sufficient to test things work appropriately, but yes, the issue does say we should test all overload types as well.

Copy link
Contributor

@tex3d tex3d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test design seems odd, and verbose for the scenarios it tests. It seems like you could test more interesting scenarios even outside the explicit WaveSize based tests.

{
unsigned int index = 0;
bool Result = WaveActiveAllEqual(In[TID.x]);
InterlockedAdd(Out[index++], (unsigned int)Result);
Copy link
Contributor

@inbelic inbelic Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we shouldn't depend on InterlockedAdd in a unit test for WaveActiveAllEqual. And we do need to test that that the false case runs correctly.

Couldn't we do something like:

bool Result0 = WaveActiveAllEqual(In0[TID.x]);
if (TID.x == 0) {
  Out[0] = Result ? 10 : 5;
}
bool Result1 = WaveActiveAllEqual(In1[TID.x]);
if (TID.x == 1) {
  Out[1] = Result ? 10 : 5;
}

So then we should expect 10 if true and 5 if false. Then fill the output buffer with 0 at the start to ensure that the values are being overriden as expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear you. My reasoning is:

  1. The .Wave tests test the builtin when it returns false. Its purpose is to, rather than test a variety of data types, test different return values and more of a variety of flow control conditions.
  2. Your alternative looks like it's only testing one lane per input resource. I would like to verify that all lanes return true in these tests. And with this structure, if I were to test all the lanes, I would need to repeat that structure 42 times, and it would be a long shader and maybe kind of ugly.
  3. Though depending on InterlockedAdd is not ideal, I don't see a way to avoid a data race in checking the output of all 4 threads, without blowing up the outputs to check by a factor of 4. My hunch is that a driver implementation of InterlockedAdd is not that complicated, and we should be able to have separate tests for it that verify it works for all targets. Assuming our InterlockedAdd tests will be complete in coverage, and considering the simple use case, I don't think it will get in the way of testing the builtin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Ah sorry, missed the .wave when reviewing on just the most recent commits. Makes sense.
    2 & 3. I guess I would be of the opinion that checking an individual lane in each invocation would be sufficient to avoid them, but I won't block on this.

{
unsigned int index = 0;
bool Result = WaveActiveAllEqual(In[TID.x]);
InterlockedAdd(Out[index++], (unsigned int)Result);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Ah sorry, missed the .wave when reviewing on just the most recent commits. Makes sense.
    2 & 3. I guess I would be of the opinion that checking an individual lane in each invocation would be sufficient to avoid them, but I won't block on this.

Copy link
Contributor

@tex3d tex3d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan on the use of InterlockedAdd. I'm even less of a fan considering: it's not implemented in Clang yet, you're blocked on Intel and Vulkan due to some bug there.

Alternatives include:

  • Using another Wave intrinsic, like WaveActiveCountBits, to gather values across the lanes, then do a normal write to the buffer on the first lane.
  • Writing out from each lane.

You could also reduce the output by summing the components of vector results. This could keep output for the write-each-lane option small. It also makes it easier to see the separate vector sizes in the output, since the expected value will be scaled by vector size.

Comment on lines +15 to +20
// Expect all trues, all elements will be the same.
// Output buffers start off as true, and each
// lane writes its result value anded with the existing
// result in the output buffer.
// Since we expect all results to be true, output buffers
// should remain all true at the end.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is no longer correct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add test for WaveActiveAllEqual

3 participants