CFEP-12: Removing packages that violate the terms of the source package#23
CFEP-12: Removing packages that violate the terms of the source package#23isuruf wants to merge 11 commits intoconda-forge:mainfrom
Conversation
Co-Authored-By: Christopher J. Wright <cjwright4242gh@gmail.com>
beckermr
left a comment
There was a problem hiding this comment.
It may be useful to provide guidance on the kind of message to leave, whether or not to tag core in it, etc. to be honest, unless we have an automated way to do this I am not sure if we will be consistent in our application of the policy which is another thing to consider.
|
here is a suggestion. Let’s expand the functionality of the cf-mark-broken repo to mark all builds for an entire feedstock as broken optionally and let’s enable it to automerge certain PRs after a fixed time period, say 30 days. Then when we find a license issue, we can make the pr in the cf-mark-broken repo right away and leave an issue on the feedstock instructing the maintainers to fix the license and bump us on GitHub to close the mark broken pr. |
|
Any thoughts on automation here @isuruf? |
|
Or did we hit this again: https://xkcd.com/1205/ |
|
I don't think we need to talk about automation here, but we can iterate on a draft message here if you'd like to. |
Co-Authored-By: Matthew R. Becker <beckermr@users.noreply.github.com>
|
@conda-forge/core This PR falls under the CFEP Approval policy, please vote and/or comment on this PR. |
ericdill
left a comment
There was a problem hiding this comment.
Thanks for codifying this @isuruf
tl;dr. I'd like to sort out this main issue that I see: moving packages to broken doesn't fix redistribution violations. If we're concerned about these violations, we should delete the package from anaconda.org/conda-forge until it's fixed.
| If the issue is not addressed after the specified time period, the package will be moved | ||
| to the broken label. |
There was a problem hiding this comment.
If I'm understanding correctly, then I don't think that moving the package to the broken label truly addresses the redistribution violation. We need to remove all binaries from the conda-forge channel until the redistribution violation is addressed. Otherwise we, conda-forge, are still redistributing the conda package that is in violation of the source packages terms and conditions. The "broken" label doesn't actually protect conda-forge from a redistribution violation.
If I'm understanding correctly, then 41 needs to be updated to say that we're going to remove the package from conda-forge.
There was a problem hiding this comment.
Yep. broken label is still a channel and the package is still available. This is probably the only case were a removal is the right procedure.
There was a problem hiding this comment.
From a legal stand point I agree with @ericdill, deleting the package is the only way to stop re-distribution.
From a user's stand point moving to broken is more attractive as it still allows access to the package when re-creating a historic environment.
Perhaps a compromise is to move the package to broken by default but remove the package if the upstream developers request this action. My though is that many developers would be happy to have the license violation corrected even if packages exist which have a violation and are marked as such.
Do other packaging communities (Debian, CentOS, Arch) remove packages from their repositories when a license violation is found or do they release new builds to address the issue?
There was a problem hiding this comment.
Moving things "only to broken" could also get us into more trouble as we show that we know that we are aware of an issue but didn't take appropriate action.
There was a problem hiding this comment.
I suggested we consult a lawyer on gitter and still think we should do that.
There was a problem hiding this comment.
I recall that the decision to wait on a take down notice was taken after we spoke with NumFOCUS people and Anaconda. We can still ask NumFOCUS about. I don't remember why we ended up not asking them to forward that to the lawyer.
PS: who is out primary NumFOCUS contact these days?
There was a problem hiding this comment.
Debian has some documentation on their removal process, https://www.debian.org/doc/manuals/developers-reference/pkgs.en.html#removing-pkgs.
From my reading package are deleted from un-released distributions, ( unstable, experimental and stable) but not from released distributions (jessie, stretch, buster).
Not sure how or if this should translate to conda-forge which has little idea of a "released distribution". Perhaps if a package is included in miniforge it should not be removed?
There was a problem hiding this comment.
@jjhelmus that does not say anything about removal due license violations. (Or I missed while reading it.)
If I recall correctly Debian has a very pick way of including packages and license issues are solved before they make into a distribution.
|
What about PR's from the bot? It seems like any feedstock deemed out of compliance shouldn't be allowed to have new packages until they have proven that they have addressed their license issue. As it stands, their current packages are moved to broken, but what about any updates built afterwards? |
|
I thought the bot already had something in place to add the license file to recipes that don't have it. It seems to me if we are really concerned about this we could just have the bot add license files to every feedstock missing them. |
|
Deleting all builds and then archiving the feedstock should address both of these concerns. I’d want a longer grace period for more extreme measures like this maybe. |
|
A list of feedstocks sans license files is located in the bot data. I forget where. |
|
Right we have a piggyback migrator for |
|
@jakirkham, I don't see how that is relevant. Some feedstocks might have inadequate |
|
I've addressed all of the concerns except for the following.
That's unrelated to this CFEP. |
jakirkham
left a comment
There was a problem hiding this comment.
This still seems to have issues with environment reproducibility. I'm not sure how we fix that without continuing to have the packages available to external people at some level. So for now am voting against this proposal. Open to hearing solutions to this problem though
@jakirkham the first step is to ask NumFOCUS for legal advice here. Once we know what we can and cannot do then it will be easier to lay a path forward. I'm looking into that. |
This feels like saying, hey I did something illegal, but please let me keep doing it because I did it in the past. |
Yep. Like we discussed in the last meeting. If we are going to do this right we'll have to make peace with breaking envs. That is the whole reason why I want to touch base with a lawyer first. |
|
FYI, my proposal here doesn't break envs unless there's a serious issue or if the maintainers are unwilling to fix it. If there's a serious issue about packaging, we can't do anything about it. (There's the check sum issue, but that's minor and people can update those.) If the maintainers are unwilling to fix it, then people who are using it will notice that their environments broke and they might help with fixing it. |
|
Changing the contents of packages is a bad idea in my opinion, I would much prefer a new build. Changing the contents of a package requires the original package to be invalidated in a number of places including anaconda.org, the CDN and on user's local package cache. Each of these can be tricky and error prone. |
|
I checked anaconda.org and we can actually do that and the CDN uses anaconda.org I believe. I don't know what goes behind anaconda.org, so I don't know how that's going to affect anaconda.org, though. |
|
It would be creating a lot of work for the distribution team. It also has been disruptive to users behind caching proxies in the past. These people rarely even know that they're behind a caching proxy - things just start giving them weird checksum mismatch errors. I don't think there's any way to have both licenses added to old packages and reproducibility. One possible workaround would be to allow sidecar files that could augment the package metadata. It could have hotfix stuff as well as license additions/changes. Edit: historical context: conda/conda#4956 |
This wouldn't fix the comments made above that we are still distributing the conda package without these files in anaconda.org. |
|
Yes, you can't have your cake and eat it, too. |
|
Conda will give a |
Thanks for finding that issue @msarahan. I was searching for it but could not find it. |
|
We really need good legal advice here. Let's hold off on further discussion until we hear back. Then I propose we dedicate time on the phone to go through the advice from the lawyers on this. |
|
What is the impact of this policy in terms of number of feedstocks and packages that need fixing, number of child packages that couldn't be built and number of downloads that's would be impacted? |
|
We can probably only get an approximate number (like what things need |
|
|
|
I see the whole holding on further discussion thing is not going to fly. ;) |
|
It has been a month now. Did we hear back from NumFOCUS? |
|
Another 1.5 months have passed now. |
I bet it will be longer. These things are slow and we must be patient. Last week I got a feedback that our second communication to NumFOCUS about this was forwarded to the legal department. They are evaluating now and will probably come back with questions, not answers. |
| the license file required for us to be able to distribute this code. Nearly every | ||
| open-source license has such a requirement. It is extremely important that we | ||
| at `conda-forge` respect license restrictions and requirements such as this one | ||
| so that we can be good stewards of the open-source community. |
There was a problem hiding this comment.
| so that we can be good stewards of the open-source community. | |
| so that we can be good stewards of the open-source community and follow the law. |
No description provided.