@FBumann FBumann commented Jan 24, 2026

Disclaimer

I'm not sure how much value this adds, as I haven't used masking much before. I'm happy to discuss the upsides and downsides of such a feature. I would have raised an issue, but I thought adding a benchmark and a proof of concept would be more valuable.

Summary

Adds an auto_mask parameter to Model that automatically masks variables and constraints wherever the input data contains NaN values.

# Before: manual masking required
m = Model()
mask = gen_capacity.notnull()
x = m.add_variables(lower=0, upper=gen_capacity, mask=mask, name="x")

# After: automatic
m = Model(auto_mask=True)
x = m.add_variables(lower=0, upper=gen_capacity, name="x")  # NaN auto-masked

Benefits

  • Convenience: No manual mask creation/tracking
  • Safety: Can't forget to mask NaN values
  • Performance: Uses optimized numpy internals (np.where instead of xarray's where, roughly 38x faster for applying the mask)

Auto-mask conditions

Variables masked out where:

  • lower is NaN, OR
  • upper is NaN

Constraints masked out where:

  • All variable references are invalid (null expression), OR
  • rhs is NaN
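The conditions above can be sketched in plain NumPy. This is only an illustration of the rules, not the code in this PR: the helper names `variable_mask` and `constraint_mask` are hypothetical, and the layout of `var_labels` (one row per constraint, one column per term, -1 marking an invalid variable reference) is an assumption based on the `== -1` null check mentioned in the commit notes.

```python
import numpy as np


def variable_mask(lower, upper):
    """Return True where a variable is kept: neither bound is NaN."""
    lower, upper = np.broadcast_arrays(
        np.asarray(lower, dtype=float), np.asarray(upper, dtype=float)
    )
    return ~(np.isnan(lower) | np.isnan(upper))


def constraint_mask(var_labels, rhs):
    """Return True where a constraint is kept.

    var_labels: int array of shape (n_constraints, n_terms); -1 is assumed
    to mark an invalid variable reference.
    """
    var_labels = np.asarray(var_labels)
    rhs = np.asarray(rhs, dtype=float)
    # A null expression is a row in which every variable reference is invalid.
    null_expr = (var_labels == -1).all(axis=-1)
    return ~(null_expr | np.isnan(rhs))


gen_capacity = np.array([10.0, np.nan, 5.0])
var_keep = variable_mask(0.0, gen_capacity)

labels = np.array([[0, 1], [-1, -1], [2, -1]])
rhs = np.array([1.0, 2.0, np.nan])
con_keep = constraint_mask(labels, rhs)
```

Here the second variable is dropped because its upper bound is NaN. The second constraint is dropped because it references no valid variable, and the third because its rhs is NaN.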

Performance (30% NaN, includes external mask creation)

Variables

| Potential | Active | No Mask | Manual | Auto |
|---|---|---|---|---|
| 10K | 7K | 14ms | 2ms | 2ms |
| 100K | 70K | 2ms | 2ms | 2ms |
| 500K | 350K | 3ms | 3ms | 3ms |
| 2M | 1.4M | 6ms | 6ms | 5ms |

Constraints

| Count | Terms | No Mask | Manual | Auto |
|---|---|---|---|---|
| 5K | 500 | 17ms | 16ms | 14ms |
| 10K | 1000 | 44ms | 45ms | 30ms |
| 20K | 1000 | 109ms | 61ms | 46ms |
| 50K | 500 | 171ms | 101ms | 61ms |

Auto-mask is as fast as or faster than manual masking in these benchmarks, thanks to the optimized numpy operations.

Checklist

  • Code documented
  • Tests added (62 pass)
  • Release notes added
  • MIT license consent

@FBumann FBumann changed the title from "Feature/auto masking1" to "Add auto_mask parameter to Model class" Jan 24, 2026
Performance improvements:
- Use np.where() instead of xarray where() for mask application (~38x faster)
- Use max() == -1 instead of all() == -1 for null expression check (~30% faster)

These optimizations make auto_mask have minimal overhead compared to manual masking.
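To illustrate the null-expression check, here is a small sketch of why the two formulations agree. It assumes labels are integers with -1 as the only negative value (the sentinel for "no variable"). The array below is made-up test data, not taken from the benchmark.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical label array: one row per constraint, one column per term.
# Values are >= -1, and -1 means "no variable at this position".
labels = rng.integers(-1, 5, size=(1000, 4))
labels[::10] = -1  # force every tenth row to be a fully null expression

# Straightforward check: every entry in the row equals -1.
null_all = (labels == -1).all(axis=-1)

# Reduced check: if the largest label in the row is -1, all of them are.
null_max = labels.max(axis=-1) == -1

same = bool(np.array_equal(null_all, null_max))
```

The `max()` form reduces the integer array directly, without first building a full-size boolean array, which is a plausible reason for the speedup. The ~30% figure is the measurement reported above and is not reproduced by this sketch.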
The switch from xarray's where() to numpy's where() broke dimension-aware
broadcasting. A 1D mask with shape (10,) was being broadcast to (1, 10)
instead of (10, 1), applying to the wrong dimension.

Fix: Explicitly broadcast mask to match data.labels shape before using np.where.
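The broadcasting problem can be reproduced with NumPy alone. This is a minimal sketch under assumed shapes and dimension names ("gen", "time"). It uses a square 3x3 array because that is the case where the mismatch goes unnoticed instead of raising an error.

```python
import numpy as np

labels = np.arange(9).reshape(3, 3)   # dims assumed to be ("gen", "time")
mask = np.array([True, False, True])  # mask over "gen", the first dimension

# NumPy aligns trailing axes, so the (3,) mask is treated as (1, 3)
# and is applied along "time" instead of "gen".
wrong = np.where(mask, labels, -1)

# Fix: expand the mask to the full labels shape along the intended axis
# before handing it to np.where.
mask_full = np.broadcast_to(mask[:, np.newaxis], labels.shape)
right = np.where(mask_full, labels, -1)
```

With the explicit broadcast, the second row (the masked "gen" entry) is set to -1 as a whole. Without it, the second column is masked instead. xarray's `where` avoids this by matching dimensions by name, and that name matching is lost when dropping down to raw arrays.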