Fix IndexOutOfRangeException in FrozenHashTable when creating from huge collections #125555

Open
Copilot wants to merge 9 commits into main from copilot/fix-frozenset-creation-exception

Conversation

Contributor

Copilot AI commented Mar 14, 2026

ToFrozenSet()/ToFrozenDictionary() throw IndexOutOfRangeException for collections with more than ~1.07 billion items due to integer overflow in FrozenHashTable.

Description

In CalcNumBuckets, int minNumBuckets = uniqueCodesCount * 2 silently overflows for uniqueCodesCount > int.MaxValue / 2, producing a negative value. This causes the prime-table search loop to skip entirely (minPrimeIndexInclusive stays 0), leaving maxPrimeIndexExclusive = 0, and then primes[-1] throws.
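The wraparound described above can be reproduced in isolation. The following is a minimal sketch, not the code in FrozenHashTable: the variable names `wrapped` and `widened` are illustrative, and the count is the one from the issue report.

```csharp
using System;

// Stand-in for the number of unique hash codes; the value comes from the issue report.
int uniqueCodesCount = 1_188_074_137;

// 32-bit multiplication: the true product does not fit in an int, so it wraps around.
int wrapped = unchecked(uniqueCodesCount * 2);

// 64-bit multiplication: widening one operand first preserves the true product.
long widened = (long)uniqueCodesCount * 2;

Console.WriteLine(wrapped);  // -1918819022
Console.WriteLine(widened);  // 2376148274
```

A negative minimum bucket count compares as smaller than every entry in a table of primes, which is consistent with the search loop finding nothing to do.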

| State | Working limit | Behavior when exceeded |
| --- | --- | --- |
| Before PR | ~1.07B entries | IndexOutOfRangeException (crash accessing primes[-1]) |
| After PR | ~1.07B entries | OutOfMemoryException (clear, intentional) |

Changes

  • CalcNumBuckets: Use long for minNumBuckets to prevent overflow. For inputs exceeding the prime table range, the existing early-return via HashHelpers.GetPrime(uniqueCodesCount) now fires correctly.

  • Create: Add an Array.MaxLength guard before ArrayPool.Rent(numBuckets + hashCodes.Length). Without it, the sum overflows int for ~1B+ item collections, producing a misleading ArgumentOutOfRangeException. Uses the same #if NET / #else fallback pattern as LengthBuckets.cs.

  • Tests: Added two [OuterLoop] regression tests — ToFrozenSet_LargeSet_ExceedsPrimeTable and ToFrozenDictionary_LargeDictionary_ExceedsPrimeTable — with 4M elements. These exercise the CalcNumBuckets early-return path (where uniqueCodesCount * 2 exceeds the precomputed primes table) and are practical for CI (~100MB memory). The integer overflow fix boundary (>1.07B items) was verified locally on a 64GB machine — the exception changed from IndexOutOfRangeException to OutOfMemoryException.
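The guard described for Create can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the helper name `ThrowIfCombinedLengthTooLarge` is hypothetical, and the sketch assumes a target where `Array.MaxLength` exists (.NET 6 or later), whereas the PR falls back to a constant on other targets.

```csharp
using System;

// Hypothetical helper for illustration; the name and shape are not taken from the PR.
static void ThrowIfCombinedLengthTooLarge(int numBuckets, int hashCodesLength)
{
    // Add in 64-bit so the sum itself cannot wrap around.
    long combined = (long)numBuckets + hashCodesLength;

    // Array.MaxLength is the largest array length the runtime allows.
    if (combined > Array.MaxLength)
    {
        throw new OutOfMemoryException();
    }
}

// Small sizes pass through untouched.
ThrowIfCombinedLengthTooLarge(16, 8);

// Two lengths of roughly 1.2B sum past Array.MaxLength, so the guard throws.
bool threw = false;
try
{
    ThrowIfCombinedLengthTooLarge(1_200_000_011, 1_188_074_137);
}
catch (OutOfMemoryException)
{
    threw = true;
}

Console.WriteLine(threw);  // True
```

Doing the addition in `long` matters here: the same sum computed in `int` would wrap to a negative number and slip past a naive comparison.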

// Before: throws IndexOutOfRangeException in CalcNumBuckets
var frozen = new HashSet<int>(1_188_074_137) { ... }.ToFrozenSet();

// After: throws OutOfMemoryException (correct failure mode for infeasible allocation)
Original prompt

This section details the original issue you should resolve

<issue_title>Creating a FrozenSet from a huge HashSet throws an exception</issue_title>
<issue_description>### Describe the bug
Creating a FrozenSet from a huge HashSet throws an exception

To Reproduce

using System.Collections.Frozen;

const int capacity = 1188074137;

var set = new HashSet<int>(capacity);
for (int i = 0; i < capacity; i++) set.Add(i);

Console.WriteLine("Forming frozen set");
try
{
    var frozen = set.ToFrozenSet();
    Console.WriteLine(frozen.Count);
}
catch (Exception ex)
{
    Console.WriteLine($"Handled exception: {ex}");
}

Exceptions (if any)

Handled exception: System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at System.Collections.Frozen.FrozenHashTable.CalcNumBuckets(ReadOnlySpan`1 hashCodes, Boolean hashCodesAreUnique)
   at System.Collections.Frozen.FrozenHashTable.Create(Span`1 hashCodes, Boolean hashCodesAreUnique)
   at System.Collections.Frozen.Int32FrozenSet..ctor(HashSet`1 source)
   at System.Collections.Frozen.FrozenSet.CreateFromSet[T](HashSet`1 source)
   at System.Collections.Frozen.FrozenSet.ToFrozenSet[T](IEnumerable`1 source, IEqualityComparer`1 comparer)
   at Program.<Main>$(String[] args) in /home/cert/RiderProjects/FrozenBug/FrozenBug/Program.cs:line 13

Further technical details

details of dotnet --info

.NET SDK:
Version: 11.0.100-preview.3.26163.101
Commit: 1b989af698
Workload version: 11.0.100-manifests.b985b9e9
MSBuild version: 18.6.0-preview-26163-101+1b989af69

Runtime Environment:
OS Name: cachyos
OS Version:
OS Platform: Linux
RID: linux-x64
Base Path: /home/cert/.dotnet/sdk/11.0.100-preview.3.26163.101/

.NET workloads installed:
There are no installed workloads to display.
Configured to use workload sets when installing new manifests.
No workload sets are installed. Run "dotnet workload restore" to install a workload set.

Host:
Version: 11.0.0-preview.3.26163.101
Architecture: x64
Commit: 1b989af698

.NET SDKs installed:
8.0.419 [/home/cert/.dotnet/sdk]
9.0.312 [/home/cert/.dotnet/sdk]
10.0.300-preview.0.26163.113 [/home/cert/.dotnet/sdk]
11.0.100-preview.3.26163.101 [/home/cert/.dotnet/sdk]

.NET runtimes installed:
Microsoft.AspNetCore.App 8.0.25 [/home/cert/.dotnet/shared/Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 9.0.14 [/home/cert/.dotnet/shared/Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 10.0.3 [/home/cert/.dotnet/shared/Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 11.0.0-preview.3.26163.101 [/home/cert/.dotnet/shared/Microsoft.AspNetCore.App]
Microsoft.NETCore.App 8.0.25 [/home/cert/.dotnet/shared/Microsoft.NETCore.App]
Microsoft.NETCore.App 9.0.14 [/home/cert/.dotnet/shared/Microsoft.NETCore.App]
Microsoft.NETCore.App 10.0.3 [/home/cert/.dotnet/shared/Microsoft.NETCore.App]
Microsoft.NETCore.App 11.0.0-preview.3.26163.101 [/home/cert/.dotnet/shared/Microsoft.NETCore.App]

Other architectures found:
None

Environment variables:
DOTNET_ROOT [/home/cert/.dotnet]

global.json file:
Not found

- JetBrains Rider 2026.1 EAP 8

Tested on .NET 8, .NET 9, .NET 10 and .NET 11

full log

❯ dotnet run --framework net11.0 --configuration Release FrozenBug.csproj
Forming frozen
Handled exception: System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at System.Collections.Frozen.FrozenHashTable.CalcNumBuckets(ReadOnlySpan`1 hashCodes, Boolean hashCodesAreUnique)
   at System.Collections.Frozen.FrozenHashTable.Create(Span`1 hashCodes, Boolean hashCodesAreUnique)
   at System.Collections.Frozen.Int32FrozenSet..ctor(HashSet`1 source)
   at System.Collections.Frozen.FrozenSet.CreateFromSet[T](HashSet`1 source)
   at Program.$(String[] args) in /home/cert/RiderProjects/FrozenBug/FrozenBug/Program.cs:line 13
❯ dotnet run --framework net10.0 --configuration Release FrozenBug.csproj
Forming frozen
Handled exception: System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at System.Collections.Frozen.FrozenHashTable.CalcNumBuckets(ReadOnlySpan`1 hashCodes, Boolean hashCodesAreUnique)
   at System.Collections.Frozen.FrozenHashTable.Create(Span`1 hashCodes, Boolean hashCodesAreUnique)
   at System.Collections.Frozen.Int32FrozenSet..ctor(HashSet`1 source)
   at System.Collections.Frozen.FrozenSet.CreateFromSet[T](HashSet`1 source)
   at Program.$(String[] args) in /home/cert/RiderProjects/FrozenBug/FrozenBug/Program.cs:line 13
❯ dotnet run --framework net9.0 --configuration Release FrozenBug.csproj
Forming frozen
Handled exception: System.IndexOutOfRangeException: Index was outside the bounds of the array.
   at System.Collections.Frozen.FrozenHashTable.CalcNumBuckets(ReadOnlySpan`1 hashCodes, Boolean hashCodesAreUnique)
   at System.Collections.F...
  • Fixes dotnet/runtime#125553

When creating a FrozenSet/FrozenDictionary from a collection with
more than ~1 billion items, `CalcNumBuckets` overflows int on the
computation `uniqueCodesCount * 2`, producing a negative value that
causes `primes[-1]` to be accessed.

Fix by using `long` for `minNumBuckets` in CalcNumBuckets, and add
an Array.MaxLength check in Create to throw OutOfMemoryException
instead of a confusing ArgumentOutOfRangeException when the combined
bucket+hashcode allocation would overflow int.

Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix FrozenSet creation from HashSet to prevent exception Fix IndexOutOfRangeException in FrozenHashTable when creating from huge collections Mar 14, 2026
Copilot AI requested a review from danmoseley March 14, 2026 14:14
@danmoseley
Member

@copilot you need to add tests. Make sure they do fail without the fix

Contributor

Copilot AI left a comment

Pull request overview

Fixes an IndexOutOfRangeException in System.Collections.Frozen when building frozen sets/dictionaries from extremely large inputs by preventing integer overflow during bucket-count calculation and guarding against invalid pooled-array sizes.

Changes:

  • Update FrozenHashTable.CalcNumBuckets to compute the “minimum bucket count” using long to avoid int overflow for very large uniqueCodesCount.
  • Add a max-array-length guard in FrozenHashTable.Create before renting a combined ArrayPool<int> buffer sized numBuckets + hashCodes.Length.

These tests create collections with > int.MaxValue / 2 items, which is
the minimum count that triggers the overflow in `uniqueCodesCount * 2`
in CalcNumBuckets. Without the fix, both tests throw
IndexOutOfRangeException; with the fix they complete successfully.

Co-authored-by: danmoseley <6385855+danmoseley@users.noreply.github.com>
Contributor Author

Copilot AI commented Mar 14, 2026

@copilot you need to add tests. Make sure they do fail without the fix

Added two [OuterLoop] regression tests in commit 86ab3ea:

  • FrozenSet_Generic_Tests_int.ToFrozenSet_HugeSet_NoIntegerOverflow in FrozenSetTests.cs
  • FrozenDictionary_Generic_Tests_int_int.ToFrozenDictionary_HugeDictionary_NoIntegerOverflow in FrozenDictionaryTests.cs

Both create a collection with 1,073,741,825 items (just over int.MaxValue / 2, the minimum that triggers the overflow in uniqueCodesCount * 2). I confirmed they fail without the fix — throwing IndexOutOfRangeException in FrozenHashTable.CalcNumBuckets — and pass with the fix. The tests require several GB of available memory so they're appropriately gated behind [OuterLoop].
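The boundary mentioned above can be checked with plain integer arithmetic. This is a small sketch with illustrative variable names. It shows that, under 32-bit wraparound, the first count whose doubling overflows is 1,073,741,824, one below the count the tests used.

```csharp
using System;

int largestSafe = int.MaxValue / 2;      // 1,073,741,823: doubling still fits in an int
int firstOverflowing = largestSafe + 1;  // 1,073,741,824: doubling is exactly 2^31
int testCount = 1_073_741_825;           // count quoted for the regression tests

int safeDoubled = unchecked(largestSafe * 2);
int boundaryDoubled = unchecked(firstOverflowing * 2);
int testDoubled = unchecked(testCount * 2);

Console.WriteLine(safeDoubled);      // 2147483646
Console.WriteLine(boundaryDoubled);  // -2147483648
Console.WriteLine(testDoubled);      // -2147483646
```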

Copilot AI requested a review from danmoseley March 14, 2026 14:55
@danmoseley danmoseley marked this pull request as ready for review March 14, 2026 16:07
Copilot AI review requested due to automatic review settings March 14, 2026 16:07
@danmoseley danmoseley enabled auto-merge (squash) March 14, 2026 16:07
Contributor

Copilot AI left a comment

Pull request overview

Fixes an IndexOutOfRangeException in System.Collections.Frozen when building frozen sets/dictionaries from extremely large inputs by preventing integer overflow in bucket-count calculation and guarding large temporary allocations.

Changes:

  • Prevent overflow in FrozenHashTable.CalcNumBuckets by computing minNumBuckets as long.
  • Add an Array.MaxLength-based guard before renting the combined bucketStarts + nexts buffer in FrozenHashTable.Create.
  • Add two [OuterLoop] regression tests intended to cover the huge-collection scenario for ToFrozenSet / ToFrozenDictionary.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/libraries/System.Collections.Immutable/src/System/Collections/Frozen/FrozenHashTable.cs | Uses long for bucket sizing math; adds max-length guard before renting the combined temporary buffer. |
| src/libraries/System.Collections.Immutable/tests/Frozen/FrozenSetTests.cs | Adds an [OuterLoop] test attempting to freeze a ~1.07B element HashSet<int>. |
| src/libraries/System.Collections.Immutable/tests/Frozen/FrozenDictionaryTests.cs | Adds an [OuterLoop] test attempting to freeze a ~1.07B entry Dictionary<int,int>. |

@danmoseley
Member

Checking for other places

The previous tests allocated 1B+ elements requiring tens of GB, making them
impractical for CI even as OuterLoop. Replace with 4M-element tests that
exercise the CalcNumBuckets early-return path (where uniqueCodesCount * 2
exceeds the precomputed primes table) with ~100MB memory.

These tests don't directly verify the integer overflow fix (which requires
>1.07B items to trigger) but add coverage for a previously untested code path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Per @EgorBo's review: instead of an inline #if NET / 0x7FFFFFC7 / #endif
in FrozenHashTable.Create(), define the polyfill const in Polyfills.cs
under #if !NET (as a partial struct extension) and the NET property in
FrozenHashTable.cs. The usage site becomes a simple `ArrayMaxLength`
reference with no preprocessor guards.

Co-authored-by: EgorBo <523221+EgorBo@users.noreply.github.com>
@danmoseley
Member

Could you request outer loop appropriately? I'm not familiar with which of the several flavors to use here

Copilot AI requested a review from EgorBo March 14, 2026 23:38
@danmoseley
Member

The tests I added do not require anything like that memory. Sorry if that wasn't clear above. And I found that in practice the original test used up to 40 GB!

@EgorBo
Member

EgorBo commented Mar 14, 2026

/azp list

@azure-pipelines

This comment was marked as resolved.

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Reverts commit 6a9eeae: the partial struct extension in Polyfills.cs
and matching #if NET property in FrozenHashTable.cs are removed.
FrozenHashTable.cs is restored to using the inline #if NET / #else /
0x7FFFFFC7 / #endif pattern as in commit 339f322.

Co-authored-by: EgorBo <523221+EgorBo@users.noreply.github.com>
Contributor Author

Copilot AI commented Mar 14, 2026

Could you request outer loop appropriately? I'm not familiar with which of the several flavors to use here

The tests are already marked with [OuterLoop("Allocates a large collection")], which is what the outerloop.yml pipeline picks up via -testscope outerloop. To trigger it on this PR, you can comment /azp run runtime-libraries-outerloop — it also runs automatically on a daily schedule against main.

- Add blank line after OOM throw block (stephentoub nit)
- Early-return in CalcNumBuckets when minNumBuckets + hashCodes.Length
  exceeds Array.MaxLength, avoiding the expensive collision-counting
  loop for sizes that will fail in Create anyway

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danmoseley
Member

addressed

// exercising the CalcNumBuckets early-return path.
// The integer overflow fix for >1B items (long minNumBuckets)
// cannot be practically tested without multi-GB allocations.
const int count = 4_000_000;
Member

I don't understand the test. Why is a dictionary with only 4M elements going to hit the minNumBuckets + hashCodes.Length > Array.MaxLength code path?

Member

Good catch -- it does not hit the Array.MaxLength path. It exercises the prime-table-exceeded early return (largest precomputed prime is 7,199,369, so 4M * 2 overflows the table). Fixed the comment in e169e7e to say that accurately.
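The arithmetic behind that reply can be checked with a short sketch. The predicate name `ExceedsPrimeTable` and the exact comparison are assumptions made for illustration; only the prime value and the 4M count come from the thread.

```csharp
using System;

// Largest precomputed prime, as quoted in the review reply.
const int LargestPrecomputedPrime = 7_199_369;

// Hypothetical predicate: does twice the count exceed the largest table entry?
// The doubling is done in 64-bit so it cannot wrap for large counts.
bool ExceedsPrimeTable(int uniqueCodesCount) =>
    (long)uniqueCodesCount * 2 > LargestPrecomputedPrime;

bool fourMillion = ExceedsPrimeTable(4_000_000);   // 8,000,000 > 7,199,369
bool threeMillion = ExceedsPrimeTable(3_000_000);  // 6,000,000 < 7,199,369

Console.WriteLine(fourMillion);   // True
Console.WriteLine(threeMillion);  // False
```

Under these assumptions a 4M-element collection is already large enough to leave the precomputed table, which is why it can be exercised without multi-GB allocations.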

Member

Copilot made these replies, I did not tell it to do so, only push. sigh.

Member

I still don't understand. Are you saying this test is for something unrelated to this PR?

Member

Correct, this is unrelated to the PR (description attempted to note this). It's testing the path when min buckets needed is larger than largest precomputed prime.

But, now I see it seems this IS covered for FrozenDictionary, so removed current two and just added a test for FrozenSet that is the same pattern.

Meanwhile as noted, this PR fix cannot be tested without using 20-40GB memory: so I tested it locally.

-- Dan

danmoseley and others added 2 commits March 15, 2026 08:22
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tionary test

Remove the two custom 4M-element ExceedsPrimeTable tests (FrozenDictionary
and FrozenSet) which were redundant with the existing 8M ulong dictionary
test. Add a matching CreateHugeSet_Success test in
FrozenSet_Generic_Tests_ulong to provide parallel FrozenSet coverage of the
same large-collection code path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
[OuterLoop("Takes several seconds")]
[Theory]
[InlineData(8_000_000)]
public void CreateHugeSet_Success(int largeCount)
Member

Dictionary one has AllowVeryLargeSizes. Set doesn't have that variable, but it's redundant anyway as in neither case is the fixture subclassed, and the purpose of this I guess is to limit executions of slow tests.

Member

@danmoseley danmoseley Mar 15, 2026

these comments are by Dan.
