stdenv: PURL fetcher introduction #454333

Open
h0nIg wants to merge 2 commits into NixOS:master from h0nIg:purl-featureflag

@h0nIg
Contributor

@h0nIg h0nIg commented Oct 21, 2025

#421125 was merged and later reverted because of regressions.

The background is described here: #421125 (comment)

@wolfgangwalther outlined the conditions and would like to enhance CI over time. This is an incremental approach, in line with how packages that turn out to be defunct get fixed as they are found. There may be more packages with problems, and we would like to prevent further fallout with a feature flag (it prevents access to, and inheritance from, drv.src / drv.srcs).

packages list: #453322 (comment)
list of real broken packages: #453322 (comment)
broken packages fix (deferrable PR: #457769)

With the old PR + the broken&platform check fix from #453291 + the feature flag, we enable maintainers to gather experience with PURL and set appropriate information (e.g. the jq example, where fetchurl is used instead of fetchFromGitHub).

nix-repl> xx = (import /my/nixpkgs {config={derivationPURLInheritance = true;};})
nix-repl> xx.python3Packages.boto3.meta.identifiers
{
  cpeParts = { ... };
  possibleCPEs = [ ... ];
  purl = "pkg:github/boto/boto3@1.40.18";
  purlParts = { ... };
  purls = [ ... ];
  v1 = { ... };
}

nix-repl> xx = (import /my/nixpkgs {})
nix-repl> xx.python3Packages.boto3.meta.identifiers
{
  cpeParts = { ... };
  possibleCPEs = [ ... ];
  purlParts = { ... };
  purls = [ ... ];
  v1 = { ... };
}
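
For downstream tooling, the difference between the two sessions above is only whether meta.identifiers.purl exists. A minimal sketch of reading it defensively follows; the two attrsets are hypothetical stand-ins for the evaluations above, not real packages, and purlOf is an illustrative helper that is not part of this PR.

```nix
let
  # Hypothetical stand-ins for the two evaluations above (plain attrsets).
  withFlag = {
    meta.identifiers = {
      purl = "pkg:github/boto/boto3@1.40.18";
    };
  };
  withoutFlag = {
    meta.identifiers = { };
  };

  # Illustrative helper (an assumption, not part of this PR):
  # `or` supplies a default when any attribute along the path is missing.
  purlOf = drv: drv.meta.identifiers.purl or null;
in
{
  flagOn = purlOf withFlag;
  flagOff = purlOf withoutFlag;
}
```

Evaluated with `nix eval --file`, flagOn should be the purl string and flagOff should be null, so a consumer can treat a missing purl as "unknown" instead of failing evaluation.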

Things done

  • Built on platform:
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • Tested, as applicable:
  • Ran nixpkgs-review on this PR. See nixpkgs-review usage.
  • Tested basic functionality of all binary files, usually in ./result/bin/.
  • Nixpkgs Release Notes
    • Package update: when the change is major or breaking.
  • NixOS Release Notes
    • Module addition: when adding a new NixOS module.
    • Module update: when the change is significant.
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other READMEs.

Add a 👍 reaction to pull requests you find important.

@h0nIg h0nIg mentioned this pull request Oct 21, 2025
13 tasks
@nixpkgs-ci nixpkgs-ci bot added the following labels Oct 21, 2025:

  • 10.rebuild-linux: 1-10 - This PR causes between 1 and 10 packages to rebuild on Linux.
  • 10.rebuild-darwin: 1-10 - This PR causes between 1 and 10 packages to rebuild on Darwin.
  • 10.rebuild-darwin: 1 - This PR causes 1 package to rebuild on Darwin.
  • 8.has: changelog - This PR adds or changes release notes
  • 6.topic: ruby - A dynamic, open source programming language with a focus on simplicity and productivity.
  • 6.topic: fetch - Fetchers (e.g. fetchgit, fetchsvn, ...)
  • 6.topic: stdenv - Standard environment
  • 8.has: documentation - This PR adds or changes documentation
@h0nIg
Contributor Author

h0nIg commented Oct 21, 2025

We need to squash all commits later. For transparency, let's keep the commits separate for now so it is clear what has changed.

@h0nIg h0nIg marked this pull request as ready for review October 21, 2025 21:04
@h0nIg h0nIg marked this pull request as draft October 21, 2025 21:04
@h0nIg h0nIg marked this pull request as ready for review October 21, 2025 21:14
@wolfgangwalther
Contributor

@h0nIg You should take a step back and stop trying to push everyone to work according to your timeline. You are being incredibly pushy. If you want others to give input to your stuff, answer their questions and be patient. This is open source, so major changes need time. I know this is sometimes frustrating, we have all been there - but it's reality.

@mweinelt
Member

mweinelt commented Nov 5, 2025

> @mweinelt do you want to reduce the work for the security team, by enabling others to track vulnerabilities? CVE schema with PURL information has been released: CVEProject/cve-schema@v5.2.0 (release)

And please don't try to make this about me. Yes, I want those things, but we need to deal with the concerns other committers have.

@leona-ya
Member

leona-ya commented Nov 8, 2025

This is way too complicated a PR to rush before branch-off. We (@jopejoe1 and I) won't accept this for 25.11; we don't want more breakage. Please wait for branch-off at least.

@leona-ya leona-ya added the 2.status: wait for branch‐off Waiting for the next Nixpkgs branch‐off label Nov 8, 2025
@nixpkgs-ci nixpkgs-ci bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Dec 9, 2025
@SuperSandro2000
Member

The 25.11 release is out of the door now. How do we continue here?

@h0nIg
Contributor Author

h0nIg commented Dec 20, 2025

@SuperSandro2000 I don't know. European companies will have to stop using nixpkgs for their commercial products because of the Cyber Resilience Act. I tried to compile a problem statement here: #472828

@samueldr
Member

> European companies will have to stop using nixpkgs for their commercial products because of the Cyber Resilience Act

For the record, this is FUD and inaccurate.

@arianvp
Member

arianvp commented Dec 21, 2025

I have very little interest in collaborating and moving this PR forward if you keep behaving the way you are.

Emotional blackmail is not welcome, and you've been warned twice now about this.

@arianvp
Member

arianvp commented Dec 21, 2025

Just to let you know how this looks from the outside:

  1. SuperSandro clearly wants to move this forward, resolve a stalemate, and restart work on this.
  2. Your first reaction to this is again an emotional appeal that immediately digs your heels in and draws a red line.

Instead of doing that, it would be helpful to suggest collaborating with Sandro to get this into a mergeable state again. He clearly is interested in this landing and seems eager to help. So use that to your advantage.

@raboof
Member

raboof commented Dec 21, 2025

With tools like https://github.com/tiiuae/sbomnix, https://github.com/nikstur/bombon, https://github.com/tweag/genealogos and others, while we might be 'behind the curve' on SBOMs in some respects, blanket statements such as "European companies will have to stop using nixpkgs for their commercial products because of the Cyber Resilience Act" are false and unhelpful.

I don't think anyone disagrees that those existing tools have their limitations, and that it would be great to improve in that respect - I do agree there is a lot of potential here, and if we get this right we can leapfrog other systems and have something that's actually much better. What is less clear (as mentioned before) is whether this PR is what's missing to make meaningful improvements. Back then I was in favor of merging the PR, to allow downstream experimentation and learn what changes would be needed.

I'm not so sure anymore, perhaps it would be better to keep this on a branch until we have a clear motivating PoC SBOM tool that actually does produce better output with these changes?

@raboof
Member

raboof commented Dec 24, 2025

I noticed we've been doing a lot of talking 'in the abstract' and felt the need to summarize the topic and get some concrete examples of what things look like today. I wrote something up at https://arnout.engelen.eu/blog/nix-state-of-the-sbom/ . I tried to introduce the topic so it'd be helpful for someone new to the topic to get spun up on it, so it might be a bit verbose for y'all already participating in this thread.

Nonetheless, the 6 example SBOMs might be helpful to pore over. The post is still rather draft-y, feedback welcome - I do plan to keep it updated as my understanding/opinions and the tools improve.

Based on that, I get the impression that we may not initially need the 'inheritance' part of this PR, which seems to be the controversial part: bombon can show the 'inferred' information just fine (though it currently puts it into externalReferences rather than the purl), and while sbomnix doesn't yet, AFAICT it seems like relatively low-hanging fruit there (famous last words...).

That said, having the fields to manually include PURL metadata in nixpkgs packages for cases where SBOM tools cannot accurately/completely infer it would still be very valuable. Perhaps it would make sense to extract that part of this PR into a separate one, which might be noncontroversial? We can keep the 'inheritance' aspect on a branch to experiment with without committing to it by actually merging it into nixpkgs already.

@fricklerhandwerk
Contributor

Mostly-automatic annotation of sources with appropriate pURLs is very desirable. The strongest argument is that otherwise we don't have that structured data to work with downstream. I agree with @raboof that a smaller scope of just enabling (and surely also checking) the annotation would already get us a step forward. Whoever really needs to read those annotations will then be able to do so.

What I don't fully understand is how meta depending on src.meta is a big problem, except that there are packages where it's just broken. Those can be fixed, no? And having more checks during CI is always good.

I understand the evaluation time concern, but in a sense that is a deployment issue. There's nothing in principle speaking against distributing Nixpkgs sources just for derivations, with all the metadata expressions stripped and all the attributes packed into one file, and shipping the metadata with a database such as nix-index.

@h0nIg
Contributor Author

h0nIg commented Jan 29, 2026

> Mostly-automatic annotation of sources with appropriate pURLs is very desirable. The strongest argument is that otherwise we don't have that structured data to work with downstream. I agree with @raboof that a smaller scope of just enabling (and surely also checking) the annotation would already get us a step forward. Whoever really needs to read those annotations will then be able to do so.
>
> What I don't fully understand is how meta depending on src.meta is a big problem, except that there are packages where it's just broken. Those can be fixed, no? And having more checks during CI is always good.
>
> I understand the evaluation time concern, but in a sense that is a deployment issue. There's nothing in principle speaking against distributing Nixpkgs sources just for derivations, with all the metadata expressions stripped and all the attributes packed into one file, and shipping the metadata with a database such as nix-index.

@pombredanne asked me to compile a demo. I demonstrated some time ago which data can be extracted with this patch: #421125 (comment)

https://github.com/sap-contributions/nixpkgs-purl-demo/

Out of 10238 Python packages (12485 first- and n-level derivations), 10574 are identifiable out of the box. Nearly 17% have a homepage different from the source location.

Focusing on the Python derivations only, you can achieve a purl match rate of 97% out of the box (302 out of 9648 are not matched).

a rough list of packages and their purl: https://github.com/sap-contributions/nixpkgs-purl-demo/blob/main/data-name.txt

@raboof
Member

raboof commented Mar 3, 2026

> i demonstrated which data can get extracted with this patch some time ago: #421125 (comment)

Thank you, it's very helpful to have a concrete example to talk about.

As you know (but repeating for new readers), the main current use case for purls is SBOMs, and there are two general techniques for extracting the dependency relationships from nix trees: evaluating the nix sources 'in nix', and parsing the drvs from the nix store. Both have their downsides: evaluating 'in nix' currently makes some subtrees unreachable due to NixOS/nix#4677, while parsing drv files misses useful information from the meta fields until something like #420575 or #466932 would happen.
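
To make the 'in nix' technique concrete, here is a minimal sketch of walking a dependency tree and collecting purls during evaluation. Everything in it is an assumption for illustration: the three packages are mock attrsets rather than real derivations, only buildInputs is followed, and collect is a hypothetical helper, not something any of the tools mentioned in this thread provide.

```nix
let
  # Mock packages: plain attrsets standing in for real derivations (hypothetical data).
  libA = {
    name = "liba";
    meta.identifiers.purl = "pkg:github/example/liba@1.0";
    buildInputs = [ ];
  };
  libB = {
    name = "libb";
    meta = { }; # no purl annotated
    buildInputs = [ libA ];
  };
  app = {
    name = "app";
    meta.identifiers.purl = "pkg:pypi/app@2.0";
    buildInputs = [ libB ];
  };

  # Walk buildInputs recursively and record each package's purl (null if absent).
  collect =
    drv:
    [
      {
        inherit (drv) name;
        purl = drv.meta.identifiers.purl or null;
      }
    ]
    ++ builtins.concatMap collect (drv.buildInputs or [ ]);
in
collect app
```

The sketch has direct access to meta, which is exactly what drv parsing lacks. A real tool would also have to deduplicate shared dependencies and cope with the subtrees that are unreachable from within evaluation.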

AFAICT "the jury is still out" on which approach we will land on. Until that time, I think we should not yet merge this PR as-is, as its complex/controversial bits (the propagation between meta and src.meta) only seem useful/necessary for the 'in nix' approach. If we'd land on the drv parsing approach, this propagation is not necessary: I made a small PoC showing that at https://codeberg.org/raboof/nix-build-sbom/src/branch/purl-experiment . Of course that PoC is horrible for several reasons, I'm definitely not suggesting anyone should use that directly, but it does show there are approaches where it's possible to derive purls without having the meta propagation - so it might be premature to introduce it.

The introduction of meta.identifiers.purl/meta.identifiers.purls/drv.src.meta.identifiers.v1.purl(s) in this PR seems more directly useful: for example, for python312Packages.pixelmatch we currently infer something like pkg:generic/pixelmatch-py?vcs_url=https://github.com/whtsky/pixelmatch-py.git@v0.3.0 or pkg:github/whtsky/pixelmatch-py@v0.3.0 as purl, while in fact pkg:pypi/pixelmatch@0.3.0 would be more accurate. I don't think that information can currently be reasonably inferred, and IMHO it would be great if we could encode that knowledge in meta. This would benefit us immediately for nix-eval-based construction of SBOMs, and also for drv-based construction once we land on a way to expose meta to such tools.
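
The pixelmatch case can be sketched as "infer a purl from the fetcher arguments, but let a manual annotation win". This is only an illustration of the idea, not the implementation in this PR: githubPurl and pkgMeta are hypothetical names, and the PyPI purl is written in standard purl syntax (pkg:type/name@version).

```nix
let
  # Hypothetical helper: build a GitHub purl from fetcher-style arguments.
  githubPurl =
    {
      owner,
      repo,
      rev,
    }:
    "pkg:github/${owner}/${repo}@${rev}";

  # Values taken from the pixelmatch example above.
  inferred = githubPurl {
    owner = "whtsky";
    repo = "pixelmatch-py";
    rev = "v0.3.0";
  };

  # Hypothetical package-level annotation carrying the more accurate PyPI identity.
  pkgMeta = {
    identifiers.purl = "pkg:pypi/pixelmatch@0.3.0";
  };

  # The manual annotation wins when present; otherwise fall back to the inferred purl.
  effective = pkgMeta.identifiers.purl or inferred;
in
{
  inherit inferred effective;
}
```

With the annotation present, effective is the PyPI purl. Removing identifiers.purl from pkgMeta makes it fall back to the GitHub purl, which is roughly what tools can infer today.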

(I have not digested the design decisions of that part of this PR, but it seems to me it might help to discuss those changes in isolation in their own PR, which we might be able to merge more quickly?)

@nixpkgs-ci nixpkgs-ci bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Mar 8, 2026
@h0nIg h0nIg force-pushed the purl-featureflag branch 4 times, most recently from 5528902 to 097b483 Compare March 8, 2026 21:28
@nixpkgs-ci nixpkgs-ci bot requested review from a team and infinisil March 8, 2026 21:34
@h0nIg h0nIg force-pushed the purl-featureflag branch from 097b483 to ca4dc6c Compare March 8, 2026 21:42
@h0nIg h0nIg force-pushed the purl-featureflag branch from ca4dc6c to fed1a8d Compare March 8, 2026 21:47
@h0nIg
Contributor Author

h0nIg commented Mar 8, 2026

> (I have not digested the design decisions of that part of this PR, but it seems to me it might help to discuss those changes in isolation in their own PR, which we might be able to merge more quickly?)

As agreed in our call with @raboof, I removed the inheritance and its feature flag.

@h0nIg h0nIg changed the title from "stdenv: PURL fetcher introduction & feature flag" to "stdenv: PURL fetcher introduction" Mar 8, 2026

Labels

  • 2.status: wait for branch‐off - Waiting for the next Nixpkgs branch‐off
  • 6.topic: fetch - Fetchers (e.g. fetchgit, fetchsvn, ...)
  • 6.topic: ruby - A dynamic, open source programming language with a focus on simplicity and productivity.
  • 6.topic: stdenv - Standard environment
  • 8.has: changelog - This PR adds or changes release notes
  • 8.has: documentation - This PR adds or changes documentation
  • 10.rebuild-darwin: 1-10 - This PR causes between 1 and 10 packages to rebuild on Darwin.
  • 10.rebuild-darwin: 1 - This PR causes 1 package to rebuild on Darwin.
  • 10.rebuild-linux: 1-10 - This PR causes between 1 and 10 packages to rebuild on Linux.
  • 12.approvals: 3+ - This PR was reviewed and approved by three or more persons.

Projects

Status: In Progress
