
improvements #2124

Draft · wants to merge 2 commits into main

Conversation

@Byron (Member) commented Aug 19, 2025

Tasks

  • Figure out how this is possible, and add a test for it. (wait for response)

@Byron (Member, Author) commented Aug 19, 2025

@cruessler @EliahKagan I invite you to collaborate on handling submissions that appear to have been created by AI. My issue is that these are good at being convincing, but bad at actually being good, so they need extra scrutiny and time to see through the veneer of excellence.

As I don't think there is any use in disallowing these, all that's left is finding a way to deal with them better.

This PR contains a first attempt at steering this, but I think we might want to re-introduce PR templates to make it hard to forget a disclosure.
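
To make this concrete, here is a minimal sketch of what such a disclosure section in a `.github/PULL_REQUEST_TEMPLATE.md` could look like. The wording and checkbox granularity are just assumptions to illustrate the idea, nothing that's decided:

```markdown
## AI disclosure

<!-- This helps reviewers calibrate the review; please keep this section. -->

- [ ] No AI assistance was used.
- [ ] AI was used for (multi-line) completions only.
- [ ] AI generated substantial parts of this change (agent/chat mode).

If AI was involved, please name the tool and model, and add a matching
trailer to the affected commits.
```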

Once disclosed, I'd also like to mark commits with trailers to identify them as (mostly) generated. There was a post on the Git mailing list about this, suggesting new trailers, but I couldn't find it. Ideally, we don't invent our own trailers for this either.

What are your thoughts on this, and how would you respond?

@cruessler (Contributor) commented

This matches my experience. I also find LLM-generated code to be harder to review since changes often can’t easily be traced back to specific human intentions, so the “why” becomes harder to spot. Also, LLM-generated code is optimized for being as plausible as possible, as you said. At this point, I’m willing to put in the extra time needed to deal with such PRs, though, and see where this leads us.

I’m on board with your suggestions!

@Byron (Member, Author) commented Aug 20, 2025

Thanks for sharing! It's interesting that the PR which triggered this is supposedly based on multi-line IDE completions. That's interesting because it evades the agent-mode disclosure, but it blurs the lines quite a bit as well. I might think of it as "at least a human decided where to put the code", but maybe there is more to it.

For me to go forward with any PR template, I'd want to find the disclosure trailers that were mentioned on, probably, the Git mailing list. Thus far I couldn't find that post, though. Maybe it was the Linux kernel mailing list.

Ah, here it is (thanks ChatGPT :D): https://www.phoronix.com/news/Linux-Kernel-AI-Docs-Rules

It's really about adding a `Co-developed-by: Claude claude-opus-4-20250514` trailer. I think the exact model doesn't matter.
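
For reference, here is a sketch of how such a trailer could be added and later found again with stock Git. The `--trailer` flag needs Git 2.32 or newer, and the model name is only an example:

```sh
# Add a disclosure trailer when committing (Git 2.32+).
git commit --trailer "Co-developed-by: Claude claude-opus-4-20250514" \
    -m "fix: handle empty input"

# Append the same trailer to the previous commit without editing its message.
git commit --amend --no-edit \
    --trailer "Co-developed-by: Claude claude-opus-4-20250514"

# Audit: print each commit's Co-developed-by trailer value (empty if none).
git log --format='%h %(trailers:key=Co-developed-by,valueonly)'
```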

@Byron added the blocked label (Issue can't progress due to external dependencies) on Aug 20, 2025
@Byron (Member, Author) commented Aug 20, 2025

This conversation pushed me over the edge to say that multi-line completions are the same as full agent use. But maybe it's not even about that, entirely:

  • AI generates plausible-looking but not necessarily correct code.
    • Even if the code is correct, it may still be too local a fix, and thus needs a particularly thorough review.
  • The original author won't know the PR well (or at all), nor what motivates these changes. It's likely that commit messages are a very elaborate version of "bugfixes and performance improvements".

Thus, the moment AI is used to generate code, that deserves to be disclosed, at least in the PR, but better yet in a commit trailer for documentation.

To me it's really a courtesy towards the reviewer, and I also can't help but feel it's some sort of scam, as people effectively pass off someone/something else's work as their own.

Usually it's easy to detect which PRs are generated, as humans just don't write that way, yet I find it difficult to make claims based on speculation and experience. I'd love to trust people, and I feel silly asking them how a PR was created.
Yet I feel it's important to know, as it changes the way the review has to happen. And it really feels scammy somehow, like I am being tricked: somebody comes in and wants to waste my time while claiming full authorship when all they did was say "can you please implement this feature for me, see #issue5".

Anyway, for now I give it the benefit of the doubt, hoping there is some value to salvage. I also started full-agent Copilot PRs, which are easy to do now, and would even think that a way to deal with obviously generated PRs is to tell Copilot to review them. AI for AI, if you will. Alternatively, one could let Copilot "improve" on the generated PR in a new PR and see what happens. Maybe that will be better while making the use of AI obvious?

There is no putting it back into the box, just learning how to use it.

Rant end.

@ali90h commented Aug 20, 2025

> This conversation pushed me over the edge to say that multi-line completions are the same as full agent use. […]
>
> Rant end.

I understand the concern about provenance and reviewer time. For my PRs I’ve disclosed AI assistance in the PR body and added commit trailers documenting it. I’m the accountable author: I reviewed/edited the changes and verified them with clippy + tests, and I’ll address any issues raised.
If the project adopts a specific template/trailer set, I’ll follow it. In the meantime, I’ll keep my PRs tightly scoped, with a small repro script and fast responses to review feedback.

@Byron (Member, Author) commented Aug 20, 2025

Please note that I have started testing Copilot in full-auto mode. Please ignore any PRs that show up, as I will take it upon myself to evaluate them. Unless, of course, you want to do that as well.

I want to learn to deal with AI by using AI more, for now.
