Skip to content

Fix per-file incremental analysis#24

Merged
Szelethus merged 7 commits intomainfrom
00021-transitive-inputs
Sep 15, 2025
Merged

Fix per-file incremental analysis#24
Szelethus merged 7 commits intomainfrom
00021-transitive-inputs

Conversation

@nettle
Copy link
Copy Markdown
Collaborator

@nettle nettle commented Jul 24, 2025

Why: Incremental analysis doesn't work with the per-file rule

What: Make inputs dependent on headers only
Note: depending on headers only may mean that CTU wont work!

Test:

bazel clean --expunge
bazel build //test:code_checker_fail -s
vim test/src/pass.cc
bazel build //test:code_checker_fail -s

Addresses: #21

@nettle nettle marked this pull request as draft July 24, 2025 17:17
@nettle nettle changed the title Fix per-file to incremental analysis Fix per-file incremental analysis Jul 24, 2025
@nettle nettle marked this pull request as ready for review July 24, 2025 19:24
@nettle nettle added the bug Something isn't working label Jul 24, 2025
@nettle nettle requested review from Szelethus and dkrupp July 24, 2025 22:08
@Szelethus Szelethus linked an issue Jul 25, 2025 that may be closed by this pull request
@Szelethus
Copy link
Copy Markdown
Contributor

Do I understand correctly that in this case if a header changes, still all TUs are reanalized, considering that each .cpp file is dependent on all headers?

Or the idea is that the patch demonstrates that we are making bazel think of the analysis rules on a per-TU basis, where we can still refine the header set later down the line?

Comment thread src/code_checker.bzl Outdated
Comment thread src/code_checker.bzl Outdated
@nettle nettle force-pushed the 00021-transitive-inputs branch from e682e70 to 43ee354 Compare July 29, 2025 18:29
@nettle
Copy link
Copy Markdown
Collaborator Author

nettle commented Jul 29, 2025

Do I understand correctly that in this case if a header changes, still all TUs are reanalized, considering that each .cpp file is dependent on all headers?

Well, yes, at least per target. We can say for one target each src depends on all headers within this target.
That's the idea.

Or the idea is that the patch demonstrates that we are making bazel think of the analysis rules on a per-TU basis, where we can still refine the header set later down the line?

Sure, it is per-TU basis, that's how we run CodeChecker analyze --file
The dependency on headers might be optimized and CTU is still the problem.

Copy link
Copy Markdown
Contributor

@Szelethus Szelethus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of the high level stuff, this looks like a great step forward. Nice!

I see a few obstacles/future steps:

  • We need some tests where we demonstate that changing a cpp file only triggers its own reanalysis, and one where changing a header (as of now) triggers a whole reanalysis
  • I'd be happy to see the CI patches land (#37 or #28 at the very least) before this,
  • We discussed in the related issue (#21) that in time, we should retrieve the true header dependency set. It sure sounds plausible (it really must be the case) that bazel knows the true header dependencies, even if the hdrs and srcs fields aren't the way to get them. Tibor mentioned some .d files that bazel generates but seem hard to access -- in loose, informal terms, how would you approach this problem? Make bazel list the header dependencies, should we have a compiler call, or ask the CodeChecker folks (which we will need to do for CTU anyways later on) to create a dry-run option?

@nettle
Copy link
Copy Markdown
Collaborator Author

nettle commented Jul 30, 2025

  • We need some tests where we demonstate that changing a cpp file only triggers its own reanalysis, and one where changing a header (as of now) triggers a whole reanalysis
  • I'd be happy to see the CI patches land (Run tests in CI #37 or CI for unit tests #28 at the very least) before this,

Totally agree with those.

  • We discussed in the related issue (Incremental analysis doesn't work with the per-file rule #21) that in time, we should retrieve the true header dependency set. It sure sounds plausible (it really must be the case) that bazel knows the true header dependencies, even if the hdrs and srcs fields aren't the way to get them.

Well, these discussions are valid and we should continue experimenting with different approaches and ideas.
However, meanwhile I think we should submit this simple improvement even it is suboptimal but at least it is more Bazel-compatible.

Tibor mentioned some .d files that bazel generates but seem hard to access -- in loose, informal terms, how would you approach this problem? Make bazel list the header dependencies, should we have a compiler call, or ask the CodeChecker folks (which we will need to do for CTU anyways later on) to create a dry-run option?

Bazel does not generate anything except artifacts described as Action output (clang -MMD/-MD/-MM/-M does).
So, we can describe an Action(s) to get whatever we need to know.
BTW, remember also that CodeChecker is just a wrapper around Clang.

@Szelethus
Copy link
Copy Markdown
Contributor

Szelethus commented Jul 31, 2025

This patch is indeed an improvement! I'm perfectly okay refining the later steps in subsequent patches, I don't see anything inherent that would make it more difficult.

So, we can describe an Action(s) to get whatever we need to know.

Do you have plans to work on that yourself or would you prefer if @furtib and I took a crack at it?

BTW, remember also that CodeChecker is just a wrapper around Clang.

And what a wrapper it is! :) But yes, you are correct, we should probably patch clang (likely clang-extdef-mapping) directly instead of CodeChecker. We can consider later on whether it'd be good design to consult with clang through CodeChecker regardless, but thats a discussion for another day.

@furtib furtib mentioned this pull request Aug 13, 2025
@furtib furtib force-pushed the 00021-transitive-inputs branch from a7664db to 56d4070 Compare August 25, 2025 07:33
Szelethus pushed a commit that referenced this pull request Aug 29, 2025
Why: We should have a unit test for caching! (for #24  and #23)

What:
Created a test case that:
- Analyzes a two-source file target
- Edits one file from the two
- Reruns the analysis, counts the number of files analyzed, with the
subcommands option

Addresses: #60
nettle and others added 6 commits August 29, 2025 13:38
Why: Incremental analysis doesn't work
     with the per-file rule

What: Make inputs dependenies as transitive

Addresses: #21
Why: Incremental analysis doesn't work
     with the per-file rule

What: Make inputs dependenies as transitive per target

Addresses: #21
@furtib furtib force-pushed the 00021-transitive-inputs branch from e2cf129 to 91b6e68 Compare August 29, 2025 11:41
@furtib furtib requested a review from Szelethus August 29, 2025 11:41
Copy link
Copy Markdown
Contributor

@Szelethus Szelethus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! @nettle, I'd like to have your approval on this as well.

@Szelethus
Copy link
Copy Markdown
Contributor

Szelethus commented Sep 15, 2025

As there are no objections so far, lets land this. If the you-know-what hits the fan, we can still change course after the fact.

@Szelethus Szelethus merged commit f496416 into main Sep 15, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incremental analysis doesn't work with the per-file rule

3 participants