-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Add 'Eliminate Duplicate SDK Files' one-pager #51886
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds comprehensive documentation for the "Eliminate Duplicate SDK Files" initiative, which aims to reduce .NET SDK installation size by removing duplicate assemblies. The proposal addresses a significant bloat issue where duplicates account for 35% of the SDK size on Linux x64 (53 MB compressed, 140 MB on disk), impacting high-volume download scenarios like container images and CI/CD pipelines.
Key changes:
- Comprehensive analysis of duplicate files across .NET SDK versions with detailed metrics and categorization
- Proposed technical approach for consolidating assemblies into a shared location and updating component loading mechanisms
- Performance impact analysis showing 23% faster downloads and extraction times
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
Do we need to consider AOT strategy? Which parts of the SDK are we going to AOT? E.g. if we AOT dotnet-watch we can't load Roslyn binaries from the shared location since they would be part of the dotnet-watch native binary. |
@JeremyKuhne and @baronfel are working on the AOT strategy. They're aware of this effort. |
JeremyKuhne
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good from my perspective.
|
|
||
| ## Customer Impact: Why SDK Size Matters | ||
|
|
||
| While we often envision the .NET SDK as something installed once on a developer's machine, the reality is that most SDK installations occur in ephemeral, high-volume scenarios where the SDK is repeatedly downloaded and extracted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@baronfel do we have any rough stats that we can add here to give this statement more weight? For example, the ratio of when the .NET SDK is downloaded in a CI / CD pipeline vs. everything else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can reach out to the Actions folks to get usage rates for the setup-dotnet action? Or the .NET releases team for CDN download rates of the install scripts themselves.
We do have rates of dotnet CLI usage in CI pipelines vs in 'interactive' use that could be used as a proxy.
|
|
||
| Containers represent one of the largest and most measurable areas of impact. Official .NET SDK container images are pulled approximately [750,000 times per week](https://msit.powerbi.com/groups/6b5ffb99-5fd3-492b-bd02-724f09fe9eff/reports/7e5d7fef-a86c-4f94-8aa3-d356c3125ee0?ctid=72f988bf-86f1-41af-91ab-2d7cd011db47&pbi_source=linkShare&bookmarkGuid=f44da1fd-c619-4158-aa51-f050b379a2b3). | ||
| When developers build within containers—whether for local development or CI/CD pipelines—they're pulling that full SDK image. | ||
| A 50 MB reduction in compressed size translates to 37.5 TB per week in bandwidth saved from container pulls alone. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great use of data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's clearly a win. However, I'd suggest that the data at rest in the MCR registry is a much higher value win. Can you guestimate that win for a 3-year LTS release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, I'd suggest that the data at rest in the MCR registry is a much higher value win
@richlander, I'm not following what you mean by this. Can you try expanding or rewording for me? TIA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
translates to 37.5 TB per week in bandwidth
That's "bandwidth". What about registry "storage"? I agree that bandwidth is important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, yes I can take a stab at guestimating that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@baronfel - do you have any input here? ? Would having two archive types on Windows cause confusion? I feel like we would have a lot of users continue to use the zip archives. Would we ever get to a state where we would eliminate them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a really interesting question - Windows 11 natively supports tarballs (previous versions do not), and older versions of Windows are no longer in standard support by Microsoft. Presumably this work would be oriented for .NET 11, so we could say that since .NET 11 is only supported on Windows 11 and above that we would only produce tarballs from that point on. With a breaking change notice I think we could swing it - especially if we updated the install-scripts to account for this since I'm bet they are doing zip-based detection and updated the 'manual installation' documentation for Windows to account for the tarballs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How many people actually actively download the zip? My assumption is the 99% use case of the tar / zip is through dotnet-install.ps1. If we changed that script to handle tar would zip really matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is customary for Windows Dockerfiles installing dotnet to use zips versus tooling like the install-scripts. At least for scenarios in which the resulting images are public. Using direct zips/tarballs provides more transparency of where content is coming from over install-scripts. Because we utilize this pattern in the official .NET docker images, they get copied frequently by customers needing various customizations.
These customer would have to make respond to the change when they upgrade to 11.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
amazingly, it looks like your example @MichaelSimons uses the tar utility so if $dotnet_file was changed there to the tarball it would still work! but yes, this is one of many long-tail scenarios we'd had to have examples/updates ready for. This seems like the kind of change we could do by preview1/preview2 and evaluate the impact of.
| The overall direction of this effort is to eliminate the vast majority of duplicate assemblies within the .NET SDK so that each shared dependency is carried only once. | ||
| There may be a few special cases where different versions, etc. need to be retained. | ||
| Achieving this requires solving two distinct but related problems. | ||
| First, from a runtime and execution perspective, SDK components must be able to reliably load a single shared copy of each assembly from a common location. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Part of the problem with a shared location is that it makes it easy to take dependencies the component wasn't intending. Essentially once a DLL is loaded from a folder there are a number of ways where the runtime can start resolving other DLLs from the same folder. Think part of the statement here should be that it loads declared dependencies from a single common location.
Another way of stating this: if my component is currently failing to load X.dll, it shouldn't start succeeding to load X.dll because it was put in the common location.
|
|
||
| **Directory Structure Considerations:** | ||
|
|
||
| While version folders aren't needed, we must account for framework versus core components. Some framework-specific assemblies are shared and would need to be placed in a subdirectory to distinguish them from core assemblies (e.g., `shared/net472/` for framework components, with core assemblies directly in `shared/`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Putting all core assemblies into one folder means that we cannot build the same component with two different TFMs. Is that happening today in any places (seems like your tool could detect it). If this is the intended design do we call this out anywhere as an explicit requirement?
I can't think of any cases off hand where we necessarily need two different core TFM of the same DLL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some cases - IIRC the HotReload utils libraries do build for non-current TFMs so they can be reused in a broader array of VS environments. cc @tmat for thoughts there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, dotnet watch has a couple of multi-targeted binaries. But those would not be shared anyways.
AOT is orthogonal. It has no relationship to this and doesn't need to be considered. This is inherently a CoreCLR targeted project. |
Why is that? If we AOT dotnet-watch then dotnet-watch can't/won't be loading Roslyn assemblies from the shared location. |
Let's unpack this. I think you are saying that if we did the AOT project now, it would be status-quo w/rt "duplication". If we do this project, then AOT re-creates duplication. In reality, it still has nothing to do with this project. It makes AOT more (relatively) expensive than it would have been. I don't see a single design choice or tradeoff here that is influenced by the choice to AOT some fo the SDK, in this release or the future. Can you outline any design considerations? Another way to look at this is that de-duplication is buying us some budget to spend on AOT. This is in fact how I've always seen this project. |
|
I see what you mean. What I had on mind is that if we do deduplication now we will need to undo it later for projects that get AOT'd. Let's say there are only two components that share Roslyn: dotnet-watch and dotnet-format. We implement loading of Roslyn assemblies from shared location for both. Then we decide to AOT both. At that point we will remove loading the assemblies and also remove the shared assemblies from the shared location because they won't be loaded by any components anymore. In the limit - if we decide to AOT everything than we don't need sharing at all. |
|
Ah. That context is useful. I didn't see that one as a design consideration. That leaves two options:
Another question is why do we have some of these tools in the SDK. IMO, it would be great to remove all tools. My ideal experience is that there are tool packs, exactly like the VS Code extension packs. We could have a gesture to acquire common curated tools as a single gesture. And then much of this discussion goes away. Seems like a missing feature resolves a bunch of problems. |
| | Microsoft.CodeAnalysis.CSharp.dll | neutral | net9.0 | 3 | 24.7 | | ||
| | Microsoft.CodeAnalysis.dll | neutral | net9.0 | 3 | 10.8 | | ||
| | Microsoft.CodeAnalysis.Features.dll | neutral | net9.0 | 2 | 5.3 | | ||
| | Newtonsoft.Json.dll | neutral | net6.0 | 7 | 5.2 | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should solve this not by deduplicating, but removing this reference entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's convenient to say but potentially impractical if it's coming from a dependency we don't control. Have you verified which is the case here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, I checked before I wrote this and I believe most sources are addressable. The SDK itself is the biggest unknown.
The full list is
/Users/andy/Library/Application Support/dnvm/dn/sdk/9.0.306/Sdks/Microsoft.NET.Sdk.Razor/tools/Newtonsoft.Json.dll
/Users/andy/Library/Application Support/dnvm/dn/sdk/9.0.306/Sdks/Microsoft.NET.Sdk/tools/net472/Newtonsoft.Json.dll
/Users/andy/Library/Application Support/dnvm/dn/sdk/9.0.306/TestHostNetFramework/Newtonsoft.Json.dll
/Users/andy/Library/Application Support/dnvm/dn/sdk/9.0.306/Newtonsoft.Json.dll
/Users/andy/Library/Application Support/dnvm/dn/sdk/9.0.306/DotnetTools/dotnet-user-secrets/9.0.10-servicing.25475.17/tools/net9.0/any/Newtonsoft.Json.dll
/Users/andy/Library/Application Support/dnvm/dn/sdk/9.0.306/DotnetTools/dotnet-format/BuildHost-net472/Newtonsoft.Json.dll
/Users/andy/Library/Application Support/dnvm/dn/sdk/9.0.306/DotnetTools/dotnet-format/Newtonsoft.Json.dll
/Users/andy/Library/Application Support/dnvm/dn/sdk/9.0.306/DotnetTools/dotnet-format/BuildHost-netcore/Newtonsoft.Json.dll
/Users/andy/Library/Application Support/dnvm/dn/sdk/9.0.306/DotnetTools/dotnet-watch/9.0.306-servicing.25476.4/tools/net9.0/any/BuildHost-net472/Newtonsoft.Json.dll
/Users/andy/Library/Application Support/dnvm/dn/sdk/9.0.306/DotnetTools/dotnet-watch/9.0.306-servicing.25476.4/tools/net9.0/any/Newtonsoft.Json.dll
/Users/andy/Library/Application Support/dnvm/dn/sdk/9.0.306/DotnetTools/dotnet-watch/9.0.306-servicing.25476.4/tools/net9.0/any/BuildHost-netcore/Newtonsoft.Json.dll
Razor is here and appears to use it directly:
| <PackageReference Include="Newtonsoft.Json" /> |
dotnet-format is from Workspaces.MSBuild,
| <PackageReference Include="Microsoft.CodeAnalysis.Workspaces.MSBuild" /> |
dotnet-watch is from a transitive reference through the SDK (dotnet.dll) itself, as well as MSBuild Workspaces BuildHost. As mentioned in this document above, dotnet.dll is a target for Native AOT which means Newtonsoft must be removed as a dependency since it is not trim-compatible.
dotnet-user-secrets is here: https://github.com/dotnet/aspnetcore/blob/46b20bb24a8f8a39dea4ec93ae0c51c21f7b4a03/src/Tools/dotnet-user-secrets/src/dotnet-user-secrets.csproj#L31
The only thing I didn't verify was what potential SDK dependencies are hiding. For example, I know NuGet uses Newtonsoft.Json, but that's part of the SDK, so it'll have to be solved as part of the AOT effort. I've also created a fork where I've added an STJ backend for Newtonsoft.
There could definitely be other transitive pieces of the SDK that depend on Newtonsoft, but that's the kind of stuff that is unlikely to be found without going through the effort of deleting it and seeing what breaks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked before I wrote this and I believe most sources are addressable.
Any reason you didn't include that in the original comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn’t yet figured out dotnet-user-secrets, took me a bit to trace it to asp.net. I planned to post this when I had found everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw here's the commit that introduces STJ support for NuGet.Core, NuGet.Protocol, and NuGet.Client agocke/dn@9c518c3#diff-7ca0f37e3e72071eeba18978d973fab7eb199499cae19b7ec50f8232ffe7e3ea
Those are the main packages used in the SDK. For AOT you can likely adapt to just use the STJ overloads, but if you actually wanted to eliminate the Newtonsoft dependency entirely, including in the non-AOT bits of the SDK, I think NuGet would have to take a breaking change in their API surface.
We might (unfortunately) have to live with the SDK Newtonsoft dependency. On the plus side, it would be only one copy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While doing this research I have discovered other items that should be cleaned up in other ways. A couple examples are:
I am sure others will be identified as well. Point being, I am on the lookout for these types of things. This was something on my radar but thanks for calling it out and doing the legwork. Unless you want to, I will log separate issues for migrate these components off Newtonsoft.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed that all dotnet CLI and .NET SDK Task usage of Newtonsoft would migrate to S.T.J as part of the AOT work, leaving only our dependencies holding the bag on this dependency. Nothing we do in the CLI requires anything that S.T.J can't do at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw I’ve done a lot of this conversion work for other projects for aot and am happy to help here. If you need a proof of concept conversion, point me to the location and I can make the changes.
| The plan is to eliminate these differences by standardizing on CPU-specific versions. | ||
|
|
||
| Initial analysis indicates these differences stem from AnyCPU builds coexisting with CPU-specific builds of the same assembly. | ||
| The general approach will be to prefer the CPU-specific version over the AnyCPU version when eliminating these duplicates, as CPU-specific builds can offer better performance characteristics for the target platform. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They do not. .NET Core only supports IL-only binaries.
If you are referring to R2R binaries, that's a completely different matter.
| - `dependencies` / `deps` — descriptive and clear. | ||
| - `libs` — short and familiar in many build systems. | ||
|
|
||
| #### Side Effects of a Common Assembly Location |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly, I think this will end up in a very large amount of engineering pain. Attempting to share dependencies among independently built components turns a technical problem into a coordination problem, which is usually far more difficult and expensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only a coordination problem if you force sharing. I do not believe that is what this document is going for. Rather I see this as a two part problem:
- When two components share the same dependency let's only put it in the .NET SDK once. This is an engineering problem that requires no coordination (outside the initial setup to get components to load from a shared location).
- When two components share the same dependency at different versions have a policy mechanism that pushes them to unify or attest that the difference is needed.
As long as these two parts are kept separate the situation seems quite workable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as these two parts are kept separate the situation seems quite workable.
I agree with this. If this is the approach, I think it's workable. What I'm worried about is a situation where one component can't move forward with a new version until every user is individually moved forward. Perhaps this could also be addressed by the VMR -- but I still worry about the actual cost of doing dependency updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still worry about the actual cost of doing dependency updates.
Why? If the parts are separate what is the cost you're worried about? The system allows for each component to update independently of each other (same as today). The only cost is that repos migrate to the desired version before GA. That is just a matter of policy, not a forcing function. Such migrations could be delayed if there were pressing reasons why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, agreed. I was explicitly referring to the vmr in the sense of trying to update the vmr in one commit.
| | Microsoft.CodeAnalysis.Workspaces.dll | sdk/DotnetTools/dotnet-watch | New Copy | 4.0 | | ||
| | **Total** | | | **35.4** | | ||
|
|
||
| ## Proposed Approach |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to use hard links and keep the current layout instead of loading from shared location?
|
As a second or third phase, it should seek to remove all uses of Newtonsoft.Json from the SDK. We were talking about this at lunch. Some of the mechanical reasons why we have it were explained to me, but none of them make sense as a long-term plan. My understanding is that NuGet will be the primary concern. This is more than a size issue. For example, once the Memory Safety v2 project is fully-enabled, we should mandate that every single library and tool in the SDK is compiled with the new mode. This will put more pressure on legacy dependencies. Similarly, we should strongly encourage trim safety and discourage reflection. Some of this will likely require breaking changes. |
Related to #41128