Test: ResolverGather_TimeoutFromPrimaryRepositoryThrows is flaky under CPU contention #14833

@dtivel

Issue

Test ResolverGather_TimeoutFromPrimaryRepositoryThrows intermittently fails with:

 Assert.Throws() Failure: Exception type was not an exact match
 Expected: typeof(System.InvalidOperationException)
 Actual:   typeof(System.OperationCanceledException)

Root cause

GatherPackageAsync(...) catches OperationCanceledException and wraps it in InvalidOperationException for source timeouts. However, three token.ThrowIfCancellationRequested() calls outside that try...catch race with the worker task's exception conversion:

  • GatherAsync(...) while loop (line 132)
  • StartTasksAndProcessWork(...) do loop (line 260)
  • GatherPackageAsync(...) entry (line 393)

Under CPU contention, one of these checks can fire before the worker task completes, which means ProcessResultsAsync(...) never gets the chance to surface the converted InvalidOperationException. The caller then sees a raw OperationCanceledException instead.
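
The two outcomes can be shown in isolation. This is a minimal sketch, not the NuGet code: TimeoutRaceSketch, GatherAsync, and ObserveAsync are hypothetical names, the 50 ms timeout is arbitrary, and the race is replaced by an explicit flag so the result is deterministic. It assumes C# 8 or later.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutRaceSketch
{
    // Hypothetical stand-in for a worker such as GatherPackageAsync(...).
    public static async Task<string> GatherAsync(CancellationToken token, bool checkAtEntry)
    {
        if (checkAtEntry)
        {
            // Unprotected check: if the token is already canceled here, the raw
            // OperationCanceledException escapes without being converted.
            token.ThrowIfCancellationRequested();
        }

        try
        {
            // Simulated slow source. Task.Delay throws TaskCanceledException
            // (a subclass of OperationCanceledException) on cancellation.
            await Task.Delay(Timeout.Infinite, token);
            return "done";
        }
        catch (OperationCanceledException ex)
        {
            // The conversion the test expects to observe.
            throw new InvalidOperationException("Source timed out.", ex);
        }
    }

    // Returns the name of the exception type the caller would see.
    public static async Task<string> ObserveAsync(bool cancelBeforeStart, bool checkAtEntry)
    {
        using var cts = new CancellationTokenSource();
        if (cancelBeforeStart)
        {
            cts.Cancel(); // models the token expiring before the worker runs
        }
        else
        {
            cts.CancelAfter(TimeSpan.FromMilliseconds(50));
        }

        try
        {
            await GatherAsync(cts.Token, checkAtEntry);
            return "none";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    public static async Task Main()
    {
        // Token expires while the worker is inside its try block.
        Console.WriteLine(await ObserveAsync(cancelBeforeStart: false, checkAtEntry: true));
        // Token has already expired when the entry check runs.
        Console.WriteLine(await ObserveAsync(cancelBeforeStart: true, checkAtEntry: true));
    }
}
```

In this sketch the first call reports InvalidOperationException and the second reports OperationCanceledException. With the entry check disabled (checkAtEntry: false), an already-canceled token still reaches the try block and is converted, which is one possible direction for a fix, though the issue itself does not prescribe one.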

Evidence

CI failure on dev branch (March 20, 2026):
https://dev.azure.com/dnceng-public/public/_build/results?buildId=1345784&view=results

Stack trace from that build:

 Failed NuGet.Test.ResolverGatherTests.ResolverGather_TimeoutFromPrimaryRepositoryThrows [1 s]
 Error Message:
  Assert.Throws() Failure: Exception type was not an exact match
 Expected: typeof(System.InvalidOperationException)
 Actual:   typeof(System.OperationCanceledException)
 ---- System.OperationCanceledException : The operation was canceled.
 Stack Trace:
    at System.Threading.CancellationToken.ThrowOperationCanceledException()
    at System.Threading.CancellationToken.ThrowIfCancellationRequested()
    at NuGet.PackageManagement.ResolverGather.GatherPackageAsync(GatherRequest request,
       CancellationToken token) in ...ResolverGather.cs:line 393
    at NuGet.PackageManagement.ResolverGather.<>c__DisplayClass27_1.<<StartWorkerTasks>b__0>d.MoveNext()
       in ...ResolverGather.cs:line 381

PR NuGet/NuGet.Client#7241 also hit this same failure.

Local reproduction using single-core CPU affinity (ProcessorAffinity = 1), 10 iterations per branch:

Branch    Failure rate
dev       4/10
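
A reproduction loop of this kind might look like the sketch below. It is an assumption about how such a run could be set up, not the harness used for the numbers above: AffinityReproSketch, CountFailuresAsync, and PinToFirstCore are hypothetical names, and the test body is passed in by the caller. Process.ProcessorAffinity is supported on Windows and Linux only.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class AffinityReproSketch
{
    // Restricts the current process to logical processor 0 (bitmask 1),
    // which forces all threads to contend for a single core.
    public static void PinToFirstCore()
    {
        Process.GetCurrentProcess().ProcessorAffinity = (IntPtr)1;
    }

    // Runs the supplied test body repeatedly and counts how many runs throw.
    public static async Task<int> CountFailuresAsync(Func<Task> testBody, int iterations)
    {
        int failures = 0;
        for (int i = 0; i < iterations; i++)
        {
            try
            {
                await testBody();
            }
            catch (Exception)
            {
                failures++;
            }
        }
        return failures;
    }
}
```

A caller would invoke PinToFirstCore() once, then pass the test method as the body with iterations set to 10 and report the returned count over 10.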
