Skip to content

CFEP-12: Removing packages that violate the terms of the source package#23

Open
isuruf wants to merge 11 commits intoconda-forge:mainfrom
isuruf:cfep-12
Open

CFEP-12: Removing packages that violate the terms of the source package#23
isuruf wants to merge 11 commits intoconda-forge:mainfrom
isuruf:cfep-12

Conversation

@isuruf
Copy link
Copy Markdown
Member

@isuruf isuruf commented Feb 26, 2020

No description provided.

@isuruf isuruf requested a review from a team February 26, 2020 06:56
Comment thread cfep-12.md Outdated
Comment thread cfep-12.md Outdated
isuruf and others added 3 commits February 26, 2020 09:53
Co-Authored-By: Christopher J. Wright <cjwright4242gh@gmail.com>
Copy link
Copy Markdown
Member

@beckermr beckermr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be useful to provide guidance on the kind of message to leave, whether or not to tag core in it, etc. to be honest, unless we have an automated way to do this I am not sure if we will be consistent in our application of the policy which is another thing to consider.

@beckermr
Copy link
Copy Markdown
Member

here is a suggestion. Let’s expand the functionality of the cf-mark-broken repo to mark all builds for an entire feedstock as broken optionally and let’s enable it to automerge certain PRs after a fixed time period, say 30 days.

Then when we find a license issue, we can make the pr in the cf-mark-broken repo right away and leave an issue on the feedstock instructing the maintainers to fix the license and bump us on GitHub to close the mark broken pr.

Comment thread cfep-12.md Outdated
@beckermr
Copy link
Copy Markdown
Member

Any thoughts on automation here @isuruf?

@beckermr
Copy link
Copy Markdown
Member

Or did we hit this again: https://xkcd.com/1205/

@isuruf
Copy link
Copy Markdown
Member Author

isuruf commented Feb 26, 2020

I don't think we need to talk about automation here, but we can iterate on a draft message here if you'd like to.

Comment thread cfep-12.md
Co-Authored-By: Matthew R. Becker <beckermr@users.noreply.github.com>
@isuruf
Copy link
Copy Markdown
Member Author

isuruf commented Mar 3, 2020

@conda-forge/core

This PR falls under the CFEP Approval policy, please vote and/or comment on this PR.
This PR needs 60% of core to vote yea to pass.
To vote please leave Approve (yea) or Request Changes (nay) reviews.
If you would like changes to the current language please leave a comment or push to this branch.
This vote will end on 2020-03-11.

sodre
sodre previously approved these changes Mar 4, 2020
Copy link
Copy Markdown
Member

@ericdill ericdill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for codifying this @isuruf

tl;dr. I'd like to sort out this main issue that I see: moving packages to broken doesn't fix redistribution violations. If we're concerned about these violations, we should delete the package from anaconda.org/conda-forge until it's fixed.

Comment thread cfep-12.md Outdated
Comment on lines +28 to +29
If the issue is not addressed after the specified time period, the package will be moved
to the broken label.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding correctly, then I don't think that moving the package to the broken label truly addresses the redistribution violation. We need to remove all binaries from the conda-forge channel until the redistribution violation is addressed. Otherwise we, conda-forge, are still redistributing the conda package that is in violation of the source packages terms and conditions. The "broken" label doesn't actually protect conda-forge from a redistribution violation.

If I'm understanding correctly, then 41 needs to be updated to say that we're going to remove the package from conda-forge.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. broken label is still a channel and the package is still available. This is probably the only case were a removal is the right procedure.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a legal stand point I agree with @ericdill, deleting the package is the only way to stop re-distribution.

From a user's stand point moving to broken is more attractive as it still allows access to the package when re-creating a historic environment.

Perhaps a compromise is to move the package to broken by default but remove the package if the upstream developers request this action. My though is that many developers would be happy to have the license violation corrected even if packages exist which have a violation and are marked as such.

Do other packaging communities (Debian, CentOS, Arch) remove packages from their repositories when a license violation is found or do they release new builds to address the issue?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving things "only to broken" could also get us into more trouble as we show that we know that we are aware of an issue but didn't take appropriate action.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both @jjhelmus and @xhochy points goes back to what I asked above. Shouldn't we take action if, and only if, we get a take down notice? We can ask NumFOCUS about it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested we consult a lawyer on gitter and still think we should do that.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall that the decision to wait on a take down notice was taken after we spoke with NumFOCUS people and Anaconda. We can still ask NumFOCUS about. I don't remember why we ended up not asking them to forward that to the lawyer.

PS: who is out primary NumFOCUS contact these days?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debian has some documentation on their removal process, https://www.debian.org/doc/manuals/developers-reference/pkgs.en.html#removing-pkgs.

From my reading package are deleted from un-released distributions, ( unstable, experimental and stable) but not from released distributions (jessie, stretch, buster).

Not sure how or if this should translate to conda-forge which has little idea of a "released distribution". Perhaps if a package is included in miniforge it should not be removed?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jjhelmus that does not say anything about removal due license violations. (Or I missed while reading it.)

If I recall correctly Debian has a very pick way of including packages and license issues are solved before they make into a distribution.

@msarahan
Copy link
Copy Markdown
Member

msarahan commented Mar 4, 2020

What about PR's from the bot? It seems like any feedstock deemed out of compliance shouldn't be allowed to have new packages until they have proven that they have addressed their license issue. As it stands, their current packages are moved to broken, but what about any updates built afterwards?

@jakirkham
Copy link
Copy Markdown
Member

I thought the bot already had something in place to add the license file to recipes that don't have it.

It seems to me if we are really concerned about this we could just have the bot add license files to every feedstock missing them.

@beckermr
Copy link
Copy Markdown
Member

beckermr commented Mar 4, 2020

Deleting all builds and then archiving the feedstock should address both of these concerns. I’d want a longer grace period for more extreme measures like this maybe.

@beckermr
Copy link
Copy Markdown
Member

beckermr commented Mar 4, 2020

A list of feedstocks sans license files is located in the bot data. I forget where.

@jakirkham
Copy link
Copy Markdown
Member

Right we have a piggyback migrator for license_files already though. PR ( regro/cf-scripts#637 ) introduced if I'm not mistaken. Would it make sense to just make it a normal migrator? Are there fail cases in it that we can identify?

@isuruf
Copy link
Copy Markdown
Member Author

isuruf commented Mar 4, 2020

@jakirkham, I don't see how that is relevant. Some feedstocks might have inadequate license_file entries. (For eg: a package's license is in 2 files and we only package 1.) Some packages might not be redistributable at all. @msarahan was suggesting we stop the bot from making version update PRs, etc if I understood him correctly.

Copy link
Copy Markdown
Contributor

@jjhelmus jjhelmus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see a resolution to the question of moving to broken vs deleting the package and/or a proposed that recommends a workflow based on @mbargull and @isuruf comments before I approve this CFEP.

@isuruf
Copy link
Copy Markdown
Member Author

isuruf commented Mar 10, 2020

I've addressed all of the concerns except for the following.

have clarification on what the license field is supposed to mean

That's unrelated to this CFEP.

Copy link
Copy Markdown
Member

@jakirkham jakirkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still seems to have issues with environment reproducibility. I'm not sure how we fix that without continuing to have the packages available to external people at some level. So for now am voting against this proposal. Open to hearing solutions to this problem though

@ocefpaf
Copy link
Copy Markdown
Member

ocefpaf commented Mar 10, 2020

This still seems to have issues with environment reproducibility. I'm not sure how we fix that without continuing to have the packages available to external people at some level. So for now am voting against this proposal. Open to hearing solutions to this problem though

@jakirkham the first step is to ask NumFOCUS for legal advice here. Once we know what we can and cannot do then it will be easier to lay a path forward. I'm looking into that.

@isuruf
Copy link
Copy Markdown
Member Author

isuruf commented Mar 10, 2020

I'm not sure how we fix that without continuing to have the packages available to external people at some level.

This feels like saying, hey I did something illegal, but please let me keep doing it because I did it in the past.

@ocefpaf
Copy link
Copy Markdown
Member

ocefpaf commented Mar 10, 2020

This feels like saying, hey I did something illegal, but please let me keep doing it because I did it in the past.

Yep. Like we discussed in the last meeting. If we are going to do this right we'll have to make peace with breaking envs. That is the whole reason why I want to touch base with a lawyer first.

@isuruf
Copy link
Copy Markdown
Member Author

isuruf commented Mar 10, 2020

FYI, my proposal here doesn't break envs unless there's a serious issue or if the maintainers are unwilling to fix it. If there's a serious issue about packaging, we can't do anything about it. (There's the check sum issue, but that's minor and people can update those.)

If the maintainers are unwilling to fix it, then people who are using it will notice that their environments broke and they might help with fixing it.

@jjhelmus
Copy link
Copy Markdown
Contributor

jjhelmus commented Mar 10, 2020

Changing the contents of packages is a bad idea in my opinion, I would much prefer a new build. Changing the contents of a package requires the original package to be invalidated in a number of places including anaconda.org, the CDN and on user's local package cache. Each of these can be tricky and error prone.

@isuruf
Copy link
Copy Markdown
Member Author

isuruf commented Mar 10, 2020

I checked anaconda.org and we can actually do that and the CDN uses anaconda.org I believe. I don't know what goes behind anaconda.org, so I don't know how that's going to affect anaconda.org, though.
User local package cache is fine as we are not redistributing it.

@msarahan
Copy link
Copy Markdown
Member

msarahan commented Mar 10, 2020

It would be creating a lot of work for the distribution team. It also has been disruptive to users behind caching proxies in the past. These people rarely even know that they're behind a caching proxy - things just start giving them weird checksum mismatch errors.

I don't think there's any way to have both licenses added to old packages and reproducibility. One possible workaround would be to allow sidecar files that could augment the package metadata. It could have hotfix stuff as well as license additions/changes.

Edit: historical context: conda/conda#4956

@isuruf
Copy link
Copy Markdown
Member Author

isuruf commented Mar 10, 2020

One possible workaround would be to allow sidecar files that could augment the package metadata. It could have hotfix stuff as well as license additions/changes.

This wouldn't fix the comments made above that we are still distributing the conda package without these files in anaconda.org.

@msarahan
Copy link
Copy Markdown
Member

Yes, you can't have your cake and eat it, too.

@jjhelmus
Copy link
Copy Markdown
Contributor

jjhelmus commented Mar 10, 2020

Conda will give a MD5MismatchError when the contents of the local cache do not match those in repodata.json. Every time packages contents were changed in defaults we had users report errors along these lines. It is possible to remove the index cache using conda clean -iy but requiring that is a poor user experience.

@jjhelmus
Copy link
Copy Markdown
Contributor

Edit: historical context: conda/conda#4956

Thanks for finding that issue @msarahan. I was searching for it but could not find it.

@beckermr
Copy link
Copy Markdown
Member

We really need good legal advice here. Let's hold off on further discussion until we hear back. Then I propose we dedicate time on the phone to go through the advice from the lawyers on this.

@CJ-Wright
Copy link
Copy Markdown
Member

What is the impact of this policy in terms of number of feedstocks and packages that need fixing, number of child packages that couldn't be built and number of downloads that's would be impacted?

@jakirkham
Copy link
Copy Markdown
Member

We can probably only get an approximate number (like what things need license_file). Though it still doesn't cover some edge cases that @isuruf mentioned. It also would include things that don't technically need a license file (like Public Domain).

@isuruf
Copy link
Copy Markdown
Member Author

isuruf commented Mar 10, 2020

license_file is not related to this CFEP. packages that don't have a license_file doesn't necessarily mean that the conda package violates the terms. Some packages distribute the license in $PREFIX/share. Some packages are source only and have the copyright and license as a header of those source files.
So, let's not drag license_file into this CFEP.

@beckermr
Copy link
Copy Markdown
Member

I see the whole holding on further discussion thing is not going to fly. ;)

@isuruf
Copy link
Copy Markdown
Member Author

isuruf commented Apr 9, 2020

It has been a month now. Did we hear back from NumFOCUS?

@isuruf
Copy link
Copy Markdown
Member Author

isuruf commented May 26, 2020

Another 1.5 months have passed now.

@ocefpaf
Copy link
Copy Markdown
Member

ocefpaf commented May 26, 2020

Another 1.5 months have passed now.

I bet it will be longer. These things are slow and we must be patient. Last week I got a feedback that our second communication to NumFOCUS about this was forwarded to the legal department. They are evaluating now and will probably come back with questions, not answers.

Comment thread cfep-12.md
the license file required for us to be able to distribute this code. Nearly every
open-source license has such a requirement. It is extremely important that we
at `conda-forge` respect license restrictions and requirements such as this one
so that we can be good stewards of the open-source community.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
so that we can be good stewards of the open-source community.
so that we can be good stewards of the open-source community and follow the law.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.