[Draft] Use hybrid C# reference map for post-processing #10976

Draft: live1206 wants to merge 46 commits into microsoft:main from live1206:mtg-hybrid-reference-map
live1206 commented Jun 12, 2026

Summary

Adds a hybrid reference-map replacement for C# generated-code post-processing.

The hybrid path replaces broad Roslyn reference-map construction with:

  • provider metadata for generated-code references
  • explicit provider dependencies for known generated body-only references
  • a small Roslyn scan only for custom/shared code roots
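As a rough illustration of the three inputs listed above, the sketch below merges them into a single reference map. It is written in Python for brevity (the generator itself is C#), and every name in it (`build_reference_map`, `provider_refs`, `body_deps`, `custom_scan_refs`, and the sample type names) is hypothetical rather than taken from this PR.

```python
from collections import defaultdict


def build_reference_map(provider_refs, body_deps, custom_scan_refs):
    """Merge three edge sources into one map: owner name -> set of referenced names."""
    merged = defaultdict(set)
    for source in (provider_refs, body_deps, custom_scan_refs):
        for owner, targets in source.items():
            merged[owner].update(targets)
    return dict(merged)


# Hypothetical inputs, one per bullet above.
provider_refs = {"WidgetClient": {"Widget"}, "Widget": {"WidgetKind"}}  # provider metadata
body_deps = {"WidgetClient": {"WidgetCollectionResult"}}                # explicit body-only deps
custom_scan_refs = {"CustomWidgetExtensions": {"Widget"}}               # small scan of custom roots

reference_map = build_reference_map(provider_refs, body_deps, custom_scan_refs)
print(sorted(reference_map["WidgetClient"]))  # ['Widget', 'WidgetCollectionResult']
```

The point of the shape is that only the third input requires a compiler-level scan; the first two are plain data the providers already know.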

No benchmark measurement/profiling code is included in the production generator path in this PR.

Why

The earlier experimental PR measured full generation and identified Roslyn reference-map construction inside GeneratedCodeWorkspace.PostProcessAsync() as the largest hotspot.

This PR keeps generated output parity with the Roslyn cleanup path while moving generated-code reachability to provider metadata.

Latest Benchmark Data

Latest data from benchmark PR #10885 after porting the current #10976 hybrid analyzer changes (including generated body invocation edges and base-preserved reachability).

Benchmark output root: /tmp/typespec-final-hybrid-bench-20260624-0943.

Full-generation BenchmarkDotNet, averaged across 3 local runs:

| Mode | Avg Mean | Avg Allocated |
| --- | --- | --- |
| Roslyn reference maps | 895.8 ms | 67.78 MB |
| Provider reference map | 859.9 ms | 57.75 MB |

Approximate full-generation improvement:

Time:       ~4.0% faster
Allocation: ~14.8% less
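
For readers checking the figures: the percentages here appear to be relative reductions against the Roslyn baseline, i.e. (baseline - candidate) / baseline, not a speedup ratio. A minimal Python check under that assumption, using the averages from the table above (the helper name `percent_reduction` is illustrative):

```python
def percent_reduction(baseline, candidate):
    """Relative reduction of candidate versus baseline, in percent."""
    return (baseline - candidate) / baseline * 100.0


time_gain = percent_reduction(895.8, 859.9)    # average mean, ms
alloc_gain = percent_reduction(67.78, 57.75)   # average allocated, MB

print(f"Time:       ~{time_gain:.1f}% faster")   # Time:       ~4.0% faster
print(f"Allocation: ~{alloc_gain:.1f}% less")    # Allocation: ~14.8% less
```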

Focused profile data, using 72 Roslyn-mode and 72 provider-mode invocations from the same runs (median):

| Path | Median Time | Median Allocated |
| --- | --- | --- |
| Roslyn reference-map construction | 357.2 ms | 23.70 MB |
| Provider map analysis + candidate consumption | 263.2 ms | 17.83 MB |
| Provider candidate consumption only | 0.79 ms | 0.06 MB |

Approximate focused reference-map improvement:

Time:       ~26.3% faster
Allocation: ~24.8% less

Profile notes:

  • Data comes from POSTPROCESSING_BENCHMARK_PROFILE_STEPS=true on the full-generation benchmark so both paths run against real TypeProvider output.
  • Roslyn rows are PostProcessor.Internalize.BuildPublicReferenceMapAsync and PostProcessor.Remove.BuildAllReferenceMapAsync.
  • Provider rows are Generation.ProviderReferenceMapShadowAnalysis, PostProcessor.Internalize.UseShadowCandidates, PostProcessor.Internalize.UseShadowPublicizeCandidates, PostProcessor.Remove.UseShadowCandidates, and PostProcessor.Remove.BuildShadowReferencedSet.
  • Runtime: .NET 10.0.9, Ubuntu 26.04, AMD EPYC 7763.

Correctness Notes

The hybrid implementation preserves Roslyn cleanup behavior for generated output parity:

  • model factory signatures and bodies do not keep otherwise-unused models alive
  • MRW context/buildable attributes do not keep buildable-only models alive
  • serialization providers are removable together with their owning model
  • retained serialization providers report explicit helper dependencies such as ChangeTrackingDictionary and Optional
  • collection-result providers report explicit body dependencies instead of relying on Roslyn body scanning
  • client providers report explicit body dependencies for collection results, service method types, operation parameters, and operation response body/header types
  • rest-client providers report explicit helper dependencies for generated collection parameter null checks, including ChangeTrackingList and ChangeTrackingDictionary
  • generated body-only references are still handled for static helpers
  • public discriminator subtypes stay public by matching Roslyn public-reference-map derived-class behavior
  • current union/variant roots participate in internalization, matching Roslyn _typesToKeep behavior
  • remove reachability includes current discriminator derived-model edges after root discovery
  • previous generated files under src/Generated are not scanned; reachability is based on current providers/generated workspace plus custom/API roots, matching Roslyn's workspace inputs
  • internal/non-public discriminator constructor/property references do not make discriminator enum types public

Azure SDK DPG Regen Note

Azure/azure-sdk-for-net#60128 completed a full DPG regen with no sdk/**/api/** changes.

The regen has a few extra generated internal model/serialization files. This is expected: the hybrid provider map keeps conservative client body dependencies from service metadata to avoid broad Roslyn body-reference scans and preserve correctness for body-only generated references. These files are internal implementation details and do not affect public API.

Validation

Local validation performed while stabilizing the PR included:

  • dotnet build packages/http-client-csharp/generator/Microsoft.TypeSpec.Generator/src/Microsoft.TypeSpec.Generator.csproj -c Release
  • focused generator tests covering post-processing/workspace/customization scenarios
  • full generator test assembly: 1530/1530 passed
  • regenerated and built previously failing generated projects/scenarios, including:
    • Sample-TypeSpec
    • Spector/http/authentication/api-key
    • Spector/http/parameters/collection-format
    • Spector/http/documentation
    • Spector/http/special-headers/repeatability
    • discriminator inheritance scenarios used by Spector tests

Implemented Generated Dependency Handling

This PR now avoids Roslyn body scanning for several generated cases:

  • collection-result body dependencies are reported by CollectionResultDefinition
  • client body dependencies are reported by ClientProvider
  • rest-client helper dependencies are reported by RestClientProvider
  • serialization provider helper dependencies are reported by serialization providers
  • model factory is treated specially so unreachable model factory methods do not root models
  • non-root MRW context/buildable attributes are excluded from model reachability

Custom/shared code references still use Roslyn because arbitrary user C# can reference generated types in ways providers cannot reliably describe.

Follow-Up Performance Opportunities

Potential next improvements, ordered by reward/risk:

| Rank | Improvement | Reward | Risk | Notes |
| --- | --- | --- | --- | --- |
| 1 | Precompute name lookup maps for AddMatchingName | Medium | Low | Avoid repeated full-node scans for helper/root matching. |
| 2 | Cache flattened provider lists and provider names | Medium | Low | Avoid repeated lazy provider/name materialization during analysis. |
| 3 | Conservative custom-code syntax prefilter | Medium | Medium/High | Can reduce custom Roslyn semantic work but must not miss arbitrary custom references. |
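
The first item can be pictured as trading repeated linear scans for a one-time index. The Python sketch below assumes nodes are simple (namespace, name) pairs. It does not reflect how `AddMatchingName` is actually written, and `build_name_index` and the sample data are hypothetical.

```python
from collections import defaultdict


def build_name_index(nodes):
    """Group nodes by simple name once, so later lookups avoid a full scan."""
    index = defaultdict(list)
    for namespace, name in nodes:
        index[name].append((namespace, name))
    return index


nodes = [
    ("Sample", "Optional"),
    ("Sample", "ChangeTrackingList"),
    ("Sample.Internal", "Optional"),
]

name_index = build_name_index(nodes)        # built once, O(n)
matches = name_index.get("Optional", [])    # each lookup is a dictionary access
print(matches)  # [('Sample', 'Optional'), ('Sample.Internal', 'Optional')]
```

With k lookups over n nodes this replaces roughly k * n comparisons with one pass plus k dictionary accesses, which is where the "Medium" reward would come from if lookups are frequent.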

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

Latest Safe Provider-Map Optimization (2026-06-24)

Pushed safe optimization commits:

  • Production PR: 88bea5758
  • Benchmark PR: 7b3c0f1e5

Changes:

  • Removed the global Roslyn SymbolFinder.FindReferencesAsync helper lookup from generated-body dependency discovery.
  • Preserved correctness by scanning the current generated body syntax for both type references and the containing types of invocations, even when provider body dependencies are available.
  • Reused the flattened generated-provider list during graph construction/base-preserved dependency traversal.
  • Skipped XML doc parsing unless the doc contains a type cref marker.
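
The last change is a cheap-check-before-expensive-parse pattern. A minimal Python sketch of the idea follows. The marker string `cref="T:` is an assumption based on the `T:` prefix that documentation comment IDs use for types; the exact marker the PR checks is not stated here, and `referenced_types` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

TYPE_CREF_MARKER = 'cref="T:'  # assumed marker; the PR's actual check may differ


def referenced_types(doc_xml):
    """Return type crefs in a doc comment, parsing XML only when the marker is present."""
    if TYPE_CREF_MARKER not in doc_xml:
        return []  # substring check failed: skip the XML parse entirely
    root = ET.fromstring(f"<doc>{doc_xml}</doc>")  # wrap so fragments have a single root
    return [
        element.attrib["cref"]
        for element in root.iter()
        if element.attrib.get("cref", "").startswith("T:")
    ]


plain = "<summary>Gets the widget name.</summary>"
with_cref = '<summary>See <see cref="T:Sample.Widget"/> for details.</summary>'

print(referenced_types(plain))      # []
print(referenced_types(with_cref))  # ['T:Sample.Widget']
```

The prefilter is safe only if every doc that contains a type reference also contains the marker, so the marker has to be chosen conservatively.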

Safe two-pass focused profile aggregate (/tmp/typespec-provider-map-safe-final-20260624-1320, /tmp/typespec-provider-map-safe-final2-20260624-1320) compared with the previous hotspot baseline (/tmp/typespec-provider-map-hotspots-20260624-1320):

| Path | Before Median | After Median | Delta |
| --- | --- | --- | --- |
| Provider map analysis + candidate consumption | 231.5 ms, 18.03 MB | 188.8 ms, 7.89 MB | ~18.4% faster, ~56.2% less allocation |
| Provider analysis only | 231.3 ms, 17.98 MB | 188.0 ms, 7.83 MB | ~18.7% faster, ~56.4% less allocation |

Validation after the safe correction:

  • npm run build:generator
  • pwsh ./eng/scripts/Generate.ps1 -filter Sample-TypeSpec (produced no diffs in the checked-in generated code)

Latest 3-Run Benchmark Data (2026-06-24 14:03 UTC)

Benchmark branch: mtg-manual-name-reduction-experiment at 7b3c0f1e5. Artifacts: /tmp/typespec-latest-safe-bench-20260624-1403/run-{1,2,3}.

Full-generation BenchmarkDotNet

| Mode | Run means | Avg mean | Avg allocated |
| --- | --- | --- | --- |
| Roslyn reference maps | 1295.0, 1029.8, 1633.4 ms | 1319.4 ms | 67.52 MB |
| Provider reference map | 1008.0, 785.1, 795.0 ms | 862.7 ms | 47.95 MB |

Approximate full-generation improvement:

Time:       ~34.6% faster
Allocation: ~29.0% less
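
Because this figure is an average over three local runs whose means differ noticeably, it may help to look at the spread alongside the average. The Python sketch below recomputes the average and the time improvement from the run means in the table above, and adds a simple (max - min) / mean spread per mode. The spread metric is an illustrative choice, not something this PR reports.

```python
from statistics import mean

roslyn_runs = [1295.0, 1029.8, 1633.4]   # run means, ms
provider_runs = [1008.0, 785.1, 795.0]   # run means, ms


def relative_spread(runs):
    """Range of the run means as a percentage of their average."""
    return (max(runs) - min(runs)) / mean(runs) * 100.0


roslyn_avg = mean(roslyn_runs)
provider_avg = mean(provider_runs)
improvement = (roslyn_avg - provider_avg) / roslyn_avg * 100.0

print(f"Roslyn avg:   {roslyn_avg:.1f} ms (spread ~{relative_spread(roslyn_runs):.0f}%)")
print(f"Provider avg: {provider_avg:.1f} ms (spread ~{relative_spread(provider_runs):.0f}%)")
print(f"Time:         ~{improvement:.1f}% faster")
```

A wide spread on the baseline side suggests reading the full-generation percentage as approximate. The focused profile medians are likely the steadier comparison.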

Focused reference-map profile

Profile data uses 71 Roslyn-mode and 71 provider-mode invocations from the same three runs (median):

| Path | Median Time | Median Allocated |
| --- | --- | --- |
| Roslyn reference-map construction | 429.7 ms | 23.65 MB |
| Provider map analysis + candidate consumption | 198.7 ms | 7.87 MB |
| Provider analysis only | 197.0 ms | 7.81 MB |
| Provider candidate consumption only | 0.98 ms | 0.06 MB |

Approximate focused reference-map replacement improvement:

Time:       ~53.8% faster
Allocation: ~66.7% less

Latest Helper-Fix Benchmark Data (2026-06-24 14:36 UTC)

Benchmark branch: mtg-manual-name-reduction-experiment at b14b1f282. Artifacts: /tmp/typespec-helper-fix-bench-20260624-1436/run-{1,2,3}. This includes the CI helper-root fix that explicitly preserves generated ClientUriBuilder, ClientPipelineExtensions, and CancellationTokenExtensions without restoring the broad Roslyn SymbolFinder helper lookup.

Full-generation BenchmarkDotNet

| Mode | Run means | Avg mean | Avg allocated |
| --- | --- | --- | --- |
| Roslyn reference maps | 1025.7, 881.6, 1007.9 ms | 971.7 ms | 67.64 MB |
| Provider reference map | 809.8, 693.8, 842.7 ms | 782.1 ms | 48.12 MB |

Approximate full-generation improvement:

Time:       ~19.5% faster
Allocation: ~28.9% less

Focused reference-map profile

Profile data uses 72 Roslyn-mode and 72 provider-mode invocations from the same three runs (median):

| Path | Median Time | Median Allocated |
| --- | --- | --- |
| Roslyn reference-map construction | 349.0 ms | 23.57 MB |
| Provider map analysis + candidate consumption | 172.6 ms | 7.88 MB |
| Provider analysis only | 171.3 ms | 7.82 MB |
| Provider candidate consumption only | 0.80 ms | 0.06 MB |

Approximate focused reference-map replacement improvement:

Time:       ~50.6% faster
Allocation: ~66.6% less

microsoft-github-policy-service Bot added the emitter:client:csharp label (Issue for the C# client emitter: @typespec/http-client-csharp) Jun 12, 2026
pkg-pr-new Bot commented Jun 12, 2026

npm i https://pkg.pr.new/@typespec/http-client-csharp@10976

commit: 18b8131

@github-actions

No changes needing a change description found.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
azure-sdk-automation Bot commented Jun 19, 2026

You can try these changes here

🛝 Playground 🌐 Website 🛝 VSCode Extension

live1206 and others added 17 commits June 19, 2026 09:38
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>