MathiasVP
Contributor

This PR switches C/C++ away from our own guards library and over to the shared guards library which was introduced in #19573.

The main motivation for this is two-fold. First, it reduces the amount of C/C++-specific QL code that needs to be maintained [1]. Second, it enables C/C++ to instantiate the newly shared "nullness" library, which was introduced in #20367. This should hopefully open the door to better null dereference / use-after-free / double free queries.

On top of that, the improvements to the internal guards logic also mean much better results on existing queries. In particular, the cpp/missing-check-scanf query has lost tons of FPs 🎉.
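For context, the kind of code that cpp/missing-check-scanf reasons about can be sketched as follows. This is an illustration only: `parse_port` is a hypothetical helper, `sscanf` stands in for `scanf` so the snippet does not depend on stdin, and the thread does not say which specific guard shapes stopped producing FPs.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: parses a port number, returning -1 on failure.
int parse_port(const char* text) {
  int port = 0;
  // sscanf returns the number of successfully assigned items,
  // or EOF if the input ends before the first conversion.
  int assigned = std::sscanf(text, "%d", &port);
  if (assigned != 1) {
    return -1;
  }
  // This read of `port` is guarded by the return-value check above.
  return port;
}
```

The read of `port` is only reached when the return-value check succeeds, which is the relationship a guards library has to establish.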

Commit-by-commit review extremely recommended. It's a large PR, but I actually think I managed to make it fairly reviewable 🤞.

This PR does bring a syntactic breaking change, of the kind that was decided a couple of weeks ago to be okay for CodeQL going forward. The breaking change is that the AST wrapper around the IR-based guards library now extends Element instead of Expr. This is because the shared library also infers that Parameters (which are not Exprs in the AST) can be guards if the parameter determines a condition.

In fact, whereas the old library only made the condition (and certain subexpressions inside the condition) Guards, the new library makes every expression a Guard, but only some of those Guards actually control conditions. This is a fundamental difference that has always existed between the C/C++ guards library and the Java/C# guards library, and we're finally getting rid of it, which brings some more language consistency 🎉.

Footnotes

  1. Although there is still lots of maintenance required, because C/C++ still relies on our own implementation of ensuresEq and ensuresLt. Getting rid of those requires switching C/C++ fully over to the shared range analysis library.

@MathiasVP MathiasVP requested a review from a team as a code owner September 18, 2025 09:23
@MathiasVP MathiasVP requested review from Copilot and removed request for a team September 18, 2025 09:23
@github-actions github-actions bot added the C++ label Sep 18, 2025
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR switches the C++ CodeQL library from its own guards library to the shared guards library that was introduced previously. The main purpose is to reduce C++-specific QL code maintenance and enable C++ to use the shared "nullness" library. This change also improves query results on existing queries, particularly reducing false positives in the cpp/missing-check-scanf query.

Key changes include:

  • Replacing the old C++-specific guards implementation with the shared guards library
  • Updating guard value types from AbstractValue to GuardValue
  • Making guards extend Element instead of Expr to support parameters as guards
  • Improving the guards logic to provide better query results

Reviewed Changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| GuardsEnsure.expected | Updated test expectations showing improved guard inference with more comprehensive results |
| GuardsControl.ql | Changed parameter type from AbstractValue to GuardValue |
| GuardsControl.expected | Expanded test results with additional guard control relationships |
| GuardsCompare.ql | Updated parameter type from AbstractValue to GuardValue |
| GuardsCompare.expected | Extended comparison results with more comprehensive guard comparisons |
| Guards.ql | Removed simple guard selection query |
| Guards.expected | Removed corresponding test expectations |
| tests.ql | Updated parameter types to use GuardValue |
| TOCTOUFilesystemRace.ql | Fixed guard child access by casting to Expr |
| SSLResultConflation.ql | Fixed guard child access by casting to Expr |
| SemanticExprSpecific.qll | Updated method call from controlsEdge to controlsBranchEdge |
| Instruction.qll | Added getAnInput() method to BinaryInstruction |
| EdgeKind.qll | Enhanced switch edge handling with better value range support |
| SsaImpl.qll | Major updates to SSA implementation with improved guard integration |
| IRGuards.qll | Complete rewrite using shared guards library with expanded functionality |
| RangeAnalysis.qll | Updated method call to use controlsBranchEdge |

}

private import semmle.code.cpp.dataflow.new.DataFlow::DataFlow as DataFlow
private import semmle.code.cpp.ir.dataflow.internal.DataFlowPrivate as Private

Check warning

Code scanning / CodeQL

Names only differing by case Warning

Private is only different by casing from private that is used elsewhere for modules.
@MathiasVP MathiasVP marked this pull request as draft September 18, 2025 09:52
@MathiasVP MathiasVP force-pushed the use-shared-guards-library branch from c99ab52 to 6fe3e83 Compare September 18, 2025 10:21
@aschackmull
Contributor

> This is because the shared library also infers that Parameters (which are not Exprs in the AST) can be guards if the parameter determines a condition

No. Parameters can't be Guards. But case statements can - and that's what brings Guards beyond mere Exprs.

@MathiasVP
Contributor Author

> No. Parameters can't be Guards. But case statements can - and that's what brings Guards beyond mere Exprs.

Oh. I guess there's a subtle thing for C/C++ here then: In the IR world a parameter is an Instruction (specifically, it's an InitializeParameter instruction). And since Instructions are Exprs in the new guards library this means the new guards library can infer that InitializeParameter can be a guard ... and the InitializeParameter maps back to the AST-world as a Parameter.

Is that ... bad?
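
To make the case under discussion concrete, here is a minimal C++ sketch of a parameter that directly determines a condition. The function name `scale_if` and its parameters are hypothetical, and whether the library treats this exact shape as a guard is an assumption based on the description above.

```cpp
#include <cassert>

// Hypothetical example: the parameter `doubled` is used directly as the
// branch condition, with no comparison or other expression wrapped around it.
int scale_if(bool doubled, int value) {
  if (doubled) {
    return value * 2;  // reached only when `doubled` is true
  }
  return value;  // reached only when `doubled` is false
}
```

In IR terms, the value of `doubled` originates from its InitializeParameter instruction, which is what would map back to the Parameter in the AST.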

@MathiasVP
Contributor Author

MathiasVP commented Sep 29, 2025

@aschackmull thanks for all the review comments so far! I think I've fixed all of them, and DCA still looks fine. Do you have any other comments before I hand it over to the C team?

Comment on lines 66 to 67
* Holds if this expression is a C/C++ specific constant value such as
* a GCC case range.
Contributor

This qldoc needs a tweak, as we've purged case ranges from constants.

Contributor Author

Thanks! Fixed in ca53a8e

@aschackmull
Contributor

> @aschackmull thanks for all the review comments so far! I think I've fixed all of them, and DCA still looks fine. Do you have any other comments before I hand it over to the C team?

No, I don't think so (apart from the two minor qldoc comments), so please pass it on. I've only reviewed the parts that integrate directly with the shared library, so the C-specific bits and the tests still need a proper review.

Contributor

@jketema jketema left a comment

I had a look, skipping over 840097f. This looks reasonable to me.

I was not really able to follow e22d665. I'm not sure if @aschackmull looked at that commit?

Regarding DCA:

  • Did you look at all the alert changes? In particular, why do we gain an alert in one place?
  • neovim seems quite a bit slower, do we understand why?

One small question below.

| test.cpp:177:10:177:10 | Load: i | test.cpp:175:23:175:23 | ValueNumberBound | 1 | false | CompareLT: ... < ... | test.cpp:176:7:176:11 | test.cpp:176:7:176:11 |
| test.cpp:179:10:179:10 | Load: i | test.cpp:175:23:175:23 | ValueNumberBound | 0 | true | CompareLT: ... < ... | test.cpp:176:7:176:11 | test.cpp:176:7:176:11 |
| test.cpp:183:10:183:10 | Load: i | test.cpp:175:23:175:23 | ValueNumberBound | -1 | true | CompareLT: ... < ... | test.cpp:182:9:182:13 | test.cpp:182:9:182:13 |
| test.cpp:185:10:185:10 | Load: i | test.cpp:175:23:175:23 | ValueNumberBound | 0 | true | CompareLT: ... < ... | test.cpp:176:7:176:11 | test.cpp:176:7:176:11 |
Contributor

This seems to be the only place where we lose a result, why's that?

Contributor Author

@MathiasVP MathiasVP Sep 30, 2025

Well spotted. Since this tests irreducible control-flow there's probably something going wrong with back-edge detection inside the shared library. I'll take a look

Contributor Author

Turns out this wasn't related to irreducible control-flow at all. There's a slight difference in which basic block is picked as directly controlling another block when there are empty blocks (or blocks containing only a goto), just like we have in this test:

// (0)
if (x < i) {
  // (1)
} else {
  // (2)
  goto inLoop;
}
// (3)

The old library concluded that the edge from (1) to (3) was controlled by x < i, whereas the new library only concludes that the edge from (0) to (1) is controlled by x < i.

The new range analysis library handles this perfectly well (as seen in the test I added in 353ee8b), but the old experimental one apparently depends on this behavior.

I don't think this is worth digging more into if that's okay with you.
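
For readers who want to experiment with this shape, a compilable sketch could look like the following. The function `count_iterations` and the position of the `inLoop` label inside a later loop are assumptions; the actual test is only shown in the abbreviated form above.

```cpp
#include <cassert>

// Hypothetical function with the same block structure as the snippet above.
// Returns how many times the loop body was executed.
int count_iterations(int x, int i) {
  int iterations = 0;
  // (0)
  if (x < i) {
    // (1): empty block that falls through to (3)
  } else {
    // (2): contains only a goto, which jumps into the loop body
    goto inLoop;
  }
  // (3)
  while (i > 0) {
  inLoop:
    ++iterations;
    --i;
  }
  return iterations;
}
```

Because block (1) is empty, a CFG construction may or may not keep it as a separate basic block, and that choice affects which edge ends up being reported as controlled by `x < i`.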

Contributor

> I don't think this is worth digging more into if that's okay with you.

That's fine!

@aschackmull
Contributor

> I was not really able to follow e22d665. I'm not sure if @aschackmull looked at that commit?

It looks reasonable to me. The purpose of that commit is to connect the Guards library to BarrierGuards, so that data flow sanitizers hidden in a wrapping function are now recognized. I believe this is verified with a newly added qltest.
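
As an illustration of what "hidden in a wrapping function" could look like on the C/C++ side, consider the sketch below. The names `is_in_bounds` and `get_or_default` are hypothetical, and whether this exact shape is what the new qltest covers is not stated in the thread.

```cpp
#include <cassert>

// Hypothetical wrapper: the actual bounds check lives here.
bool is_in_bounds(int index, int size) {
  return index >= 0 && index < size;
}

// The caller only sees the wrapper's boolean result.
int get_or_default(const int* data, int size, int index, int fallback) {
  if (is_in_bounds(index, size)) {
    // The access is protected by the comparison inside is_in_bounds,
    // not by a comparison that appears in this function.
    return data[index];
  }
  return fallback;
}
```

Recognizing `is_in_bounds(index, size)` as a barrier requires relating the call's return value to the comparison in the callee's body.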

MathiasVP and others added 2 commits September 30, 2025 14:14
Co-authored-by: Anders Schack-Mulligen <aschackmull@users.noreply.github.com>
@MathiasVP
Contributor Author

MathiasVP commented Sep 30, 2025

> Did you look at all the alert changes? In particular, why do we gain an alert in one place?

Yep, we gain a new TP on cpp/incorrect-allocation-error-handling. And this happens because this conjunct now holds.

> neovim seems quite a bit slower, do we understand why?

Thanks for pointing that out. This seems to be from these two antijoins (which ultimately come from the same piece of QL):

43.5s |       |              | _IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a090__#antijoin_rhs@96a3e33k
40.9s |       |              | _IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a090__#antijoin_rhs@f86552sv

The relevant part of the evaluator log is here:

[2025-09-30 15:15:46] Evaluated non-recursive predicate _IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a090__#antijoin_rhs@96a3e33k in 43482ms (size: 30615593).
Evaluated relational algebra for predicate _IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a090__#antijoin_rhs@96a3e33k with tuple counts:
        81636037  ~155%    {4} r1 = SCAN `_IRGuards::Guards_v1::possibleValue/4#fb05e9b0__IRGuards::Guards_v1::possibleValue/4#fb05e9b0_IRGuar__#shared` OUTPUT In.2, In.4, In.0, In.1
            3182    ~0%    {4}    | JOIN WITH `IRGuards::GuardValue.getDualValue/0#dispred#a0901aea` ON FIRST 2 OUTPUT Lhs.2, Lhs.3, Lhs.0, Lhs.1
                       
        81636037  ~156%    {5} r2 = SCAN `_IRGuards::Guards_v1::possibleValue/4#fb05e9b0__IRGuards::Guards_v1::possibleValue/4#fb05e9b0_IRGuar__#shared` OUTPUT _, In.2, In.0, In.1, In.4
        81636037  ~163%    {5}    | REWRITE WITH Out.0 := false
               0    ~0%    {6}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_120#join_rhs ON FIRST 2 OUTPUT _, Lhs.4, Lhs.2, Lhs.3, Lhs.1, Rhs.2
               0    ~0%    {6}    | REWRITE WITH Out.0 := true
               0    ~0%    {6}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_120#join_rhs ON FIRST 2 OUTPUT Lhs.2, Lhs.3, Lhs.4, Lhs.1, Lhs.5, Rhs.2
                           {6}    | REWRITE WITH TEST InOut.5 < InOut.4
               0    ~0%    {4}    | SCAN OUTPUT In.0, In.1, In.2, In.3
                       
        81636037  ~165%    {5} r3 = SCAN `_IRGuards::Guards_v1::possibleValue/4#fb05e9b0__IRGuards::Guards_v1::possibleValue/4#fb05e9b0_IRGuar__#shared` OUTPUT _, In.4, In.0, In.1, In.2
        81636037  ~162%    {5}    | REWRITE WITH Out.0 := false
               0    ~0%    {6}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_120#join_rhs ON FIRST 2 OUTPUT _, Lhs.4, Lhs.2, Lhs.3, Lhs.1, Rhs.2
               0    ~0%    {6}    | REWRITE WITH Out.0 := true
               0    ~0%    {6}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_120#join_rhs ON FIRST 2 OUTPUT Lhs.2, Lhs.3, Lhs.1, Lhs.4, Lhs.5, Rhs.2
                           {6}    | REWRITE WITH TEST InOut.5 < InOut.4
               0    ~0%    {4}    | SCAN OUTPUT In.0, In.1, In.2, In.3
                       
        81636037  ~165%    {5} r4 = SCAN `_IRGuards::Guards_v1::possibleValue/4#fb05e9b0__IRGuards::Guards_v1::possibleValue/4#fb05e9b0_IRGuar__#shared` OUTPUT _, In.4, In.0, In.1, In.2
        81636037  ~166%    {5}    | REWRITE WITH Out.0 := true
        37936852   ~24%    {6}    | JOIN WITH num#IRGuards::GuardsImpl::TValue#55c23f13_120#join_rhs ON FIRST 2 OUTPUT _, Lhs.4, Lhs.2, Lhs.3, Lhs.1, Rhs.2
        37936852   ~27%    {6}    | REWRITE WITH Out.0 := true
        35873669   ~21%    {6}    | JOIN WITH num#IRGuards::GuardsImpl::TValue#55c23f13_120#join_rhs ON FIRST 2 OUTPUT Lhs.2, Lhs.3, Lhs.1, Lhs.4, Lhs.5, Rhs.2
                           {6}    | REWRITE WITH TEST InOut.4 != InOut.5
        35845012   ~17%    {4}    | SCAN OUTPUT In.0, In.1, In.2, In.3
                       
        81636037  ~172%    {4} r5 = SCAN `_IRGuards::Guards_v1::possibleValue/4#fb05e9b0__IRGuards::Guards_v1::possibleValue/4#fb05e9b0_IRGuar__#shared` OUTPUT In.2, In.0, In.1, In.4
               0    ~0%    {6}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_201#join_rhs ON FIRST 1 OUTPUT Lhs.3, Lhs.1, Lhs.2, Lhs.0, Rhs.1, Rhs.2
               0    ~0%    {8}    | JOIN WITH `IRGuards::GuardValue.asIntValue/0#dispred#336adfc2` ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.3, Lhs.0, Lhs.4, Lhs.5, Rhs.1, _
                           {7}    | REWRITE WITH NOT [NOT [Tmp.7 := true, TEST InOut.5 = Tmp.7, TEST InOut.4 < InOut.6], NOT [Tmp.7 := false, TEST InOut.5 = Tmp.7, TEST InOut.4 > InOut.6]] KEEPING 7
               0    ~0%    {4}    | SCAN OUTPUT In.0, In.1, In.2, In.3
                       
        81636037  ~164%    {4} r6 = SCAN `_IRGuards::Guards_v1::possibleValue/4#fb05e9b0__IRGuards::Guards_v1::possibleValue/4#fb05e9b0_IRGuar__#shared` OUTPUT In.4, In.0, In.1, In.2
               0    ~0%    {6}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_201#join_rhs ON FIRST 1 OUTPUT Lhs.3, Lhs.1, Lhs.2, Lhs.0, Rhs.1, Rhs.2
               0    ~0%    {8}    | JOIN WITH `IRGuards::GuardValue.asIntValue/0#dispred#336adfc2` ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.0, Lhs.3, Lhs.4, Lhs.5, Rhs.1, _
                           {7}    | REWRITE WITH NOT [NOT [Tmp.7 := true, TEST InOut.5 = Tmp.7, TEST InOut.4 < InOut.6], NOT [Tmp.7 := false, TEST InOut.5 = Tmp.7, TEST InOut.4 > InOut.6]] KEEPING 7
               0    ~0%    {4}    | SCAN OUTPUT In.0, In.1, In.2, In.3
                       
        35848194   ~17%    {4} r7 = r1 UNION r2 UNION r3 UNION r4 UNION r5 UNION r6
                           return r7

[2025-09-30 15:15:54] Evaluated non-recursive predicate __IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a09__#antijoin_rhs@ba744cmu in 8081ms (size: 10756).
Evaluated relational algebra for predicate __IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a09__#antijoin_rhs@ba744cmu with tuple counts:
        81636037  ~165%    {4} r1 = SCAN `_IRGuards::Guards_v1::possibleValue/4#fb05e9b0__IRGuards::Guards_v1::possibleValue/4#fb05e9b0_IRGuar__#shared` OUTPUT In.0, In.1, In.2, In.4
                           {4}    | AND NOT `_IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a090__#antijoin_rhs`(FIRST 4)
           44131  ~306%    {3}    | SCAN OUTPUT In.0, In.1, In.2
                           return r1

[2025-09-30 15:16:04] Evaluated non-recursive predicate IRGuards::Guards_v1::uniqueValue/3#833ab3f8@9663d744 in 9641ms (size: 17098).
Evaluated relational algebra for predicate IRGuards::Guards_v1::uniqueValue/3#833ab3f8@9663d744 with tuple counts:
           86654       ~2%    {3} r1 = `_IRGuards::Guards_v1::possibleValue/4#fb05e9b0_IRGuards::Guards_v1::possibleValue/4#fb05e9b0_1023#jo__#shared` AND NOT `__IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a09__#antijoin_rhs`(FIRST 3)
        23707587       ~0%    {5}    | JOIN WITH `IRGuards::Guards_v1::possibleValue/4#fb05e9b0` ON FIRST 1 OUTPUT Lhs.0, Lhs.1, Lhs.2, Rhs.2, Rhs.3
        23620933       ~0%    {5}    | REWRITE WITH TEST InOut.1 != InOut.3

So it seems like the forex here is generating quite bad RA. Or actually, maybe the RA is as expected. I'll see if I can figure it out (cc @aschackmull if you have any ideas).

@MathiasVP
Contributor Author

MathiasVP commented Oct 1, 2025

I can't seem to find a join order that avoids joining the set (possibleValue(v, false, e, k) and not possibleValue(v, true, e, k)) with possibleValue(v, _, other, otherval) on v. That join blows up, because this histogram:

private predicate foo(SsaDefinition v, int n) {
  n =
    strictcount(Expr e, GuardValue k, Expr other, GuardValue otherval |
      possibleValue(v, false, e, k) and
      not possibleValue(v, true, e, k) and
      possibleValue(v, _, other, otherval)
    ) and
  n > 1
}

has a scary top row:
[image: results of the histogram predicate above]

(that's the stupid lexer in neovim showing up again 😂)

@aschackmull do you have any ideas? Otherwise, I think we just have to live with it for now.

(If you want to repro it you can run cpp/missing-check-scanf on neovim from DCA)
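
The counting done by the histogram predicate above can be modelled outside of QL. The following C++ sketch is a simplified model under stated assumptions: `PossibleValue` and `joinFanOut` are hypothetical names, and variables, expressions, and guard values are all represented as plain integers.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <tuple>
#include <utility>
#include <vector>

// Simplified stand-in for one possibleValue(v, flag, e, k) tuple.
struct PossibleValue {
  int v;      // SSA definition
  bool flag;  // the boolean second column
  int e;      // expression
  int k;      // guard value
};

// For each v, counts the (e, k, other, otherval) combinations produced by
// joining the "false but not true" rows with all rows for the same v.
// Only counts greater than 1 are kept, mirroring `n > 1`.
std::map<int, long long> joinFanOut(const std::vector<PossibleValue>& rows) {
  std::set<std::tuple<int, bool, int, int>> all;
  for (const PossibleValue& r : rows) {
    all.insert(std::make_tuple(r.v, r.flag, r.e, r.k));
  }
  std::map<int, std::set<std::pair<int, int>>> left;   // distinct (e, k)
  std::map<int, std::set<std::pair<int, int>>> right;  // distinct (other, otherval)
  for (const PossibleValue& r : rows) {
    right[r.v].insert(std::make_pair(r.e, r.k));
    if (!r.flag && all.count(std::make_tuple(r.v, true, r.e, r.k)) == 0) {
      left[r.v].insert(std::make_pair(r.e, r.k));
    }
  }
  std::map<int, long long> result;
  for (const auto& entry : left) {
    // The two sides are independent given v, so the count is a product.
    long long n = static_cast<long long>(entry.second.size()) *
                  static_cast<long long>(right[entry.first].size());
    if (n > 1) {
      result[entry.first] = n;
    }
  }
  return result;
}
```

Since the count for each `v` is a product of two set sizes, a single definition with many possible values is enough to dominate the join.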

@aschackmull
Contributor

aschackmull commented Oct 1, 2025

So I think I have a partial fix - it doesn't exactly solve the problem, but it does cut down on one aspect of the cartesian join. Since we're doing a forall over all other expressions to check whether they have disjoint values, then we can split this into a case for those other expressions with the same value and those with different values. The former can be expressed as a strictcount instead of a forall, which is much more efficient, and the latter is then reduced in size, since the expression column can be projected away.
Try this:

diff --git a/shared/controlflow/codeql/controlflow/Guards.qll b/shared/controlflow/codeql/controlflow/Guards.qll
index 0bbfb29e4e6..161cdd22711 100644
--- a/shared/controlflow/codeql/controlflow/Guards.qll
+++ b/shared/controlflow/codeql/controlflow/Guards.qll
@@ -708,7 +708,8 @@ module Make<
     private predicate uniqueValue(SsaDefinition v, Expr e, GuardValue k) {
       possibleValue(v, false, e, k) and
       not possibleValue(v, true, e, k) and
-      forex(Expr other, GuardValue otherval | possibleValue(v, _, other, otherval) and other != e |
+      1 = strictcount(Expr e0 | possibleValue(v, _, e0, k)) and
+      forex(GuardValue otherval | possibleValue(v, _, _, otherval) and otherval != k |
         disjointValues(otherval, k)
       )
     }

@aschackmull
Contributor

Hold on! This predicate is built on top of constantHasValue, so we're only considering constants with singleton values - such values are trivially disjoint, so we can replace the entire forex. PR incoming.

@aschackmull
Contributor

#20569

@MathiasVP
Contributor Author

Even the first fix improves things quite a lot:

[2025-10-01 12:42:45] Evaluated non-recursive predicate _IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a090__#antijoin_rhs@068e88vg in 1591ms (size: 9202903).
Evaluated relational algebra for predicate _IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a090__#antijoin_rhs@068e88vg with tuple counts:
        9202903   ~8%    {3} r1 = SCAN `___IRGuards::Guards_v1::possibleValue/4#fb05e9b0_032#count_range#join_rhs_const_1#shared_project#IRG__#shared` OUTPUT In.1, In.2, In.0
           3055   ~0%    {3}    | JOIN WITH `IRGuards::GuardValue.getDualValue/0#dispred#a0901aea` ON FIRST 2 OUTPUT Lhs.2, Lhs.0, Lhs.1
                     
        9202903   ~0%    {4} r2 = SCAN `___IRGuards::Guards_v1::possibleValue/4#fb05e9b0_032#count_range#join_rhs_const_1#shared_project#IRG__#shared` OUTPUT _, In.1, In.0, In.2
        9202903   ~3%    {4}    | REWRITE WITH Out.0 := false
              0   ~0%    {5}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_120#join_rhs ON FIRST 2 OUTPUT _, Lhs.3, Lhs.2, Lhs.1, Rhs.2
              0   ~0%    {5}    | REWRITE WITH Out.0 := true
              0   ~0%    {5}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_120#join_rhs ON FIRST 2 OUTPUT Lhs.2, Lhs.3, Lhs.1, Lhs.4, Rhs.2
                         {5}    | REWRITE WITH TEST InOut.4 < InOut.3
              0   ~0%    {3}    | SCAN OUTPUT In.0, In.1, In.2
                     
        9202903   ~0%    {4} r3 = SCAN `___IRGuards::Guards_v1::possibleValue/4#fb05e9b0_032#count_range#join_rhs_const_1#shared_project#IRG__#shared` OUTPUT _, In.2, In.0, In.1
        9202903   ~0%    {4}    | REWRITE WITH Out.0 := false
              0   ~0%    {5}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_120#join_rhs ON FIRST 2 OUTPUT _, Lhs.3, Lhs.2, Lhs.1, Rhs.2
              0   ~0%    {5}    | REWRITE WITH Out.0 := true
              0   ~0%    {5}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_120#join_rhs ON FIRST 2 OUTPUT Lhs.2, Lhs.1, Lhs.3, Lhs.4, Rhs.2
                         {5}    | REWRITE WITH TEST InOut.4 < InOut.3
              0   ~0%    {3}    | SCAN OUTPUT In.0, In.1, In.2
                     
        9202903   ~0%    {4} r4 = SCAN `___IRGuards::Guards_v1::possibleValue/4#fb05e9b0_032#count_range#join_rhs_const_1#shared_project#IRG__#shared` OUTPUT _, In.2, In.0, In.1
        9202903   ~0%    {4}    | REWRITE WITH Out.0 := true
        9201475   ~1%    {5}    | JOIN WITH num#IRGuards::GuardsImpl::TValue#55c23f13_120#join_rhs ON FIRST 2 OUTPUT _, Lhs.3, Lhs.2, Lhs.1, Rhs.2
        9201475   ~3%    {5}    | REWRITE WITH Out.0 := true
        9199848   ~0%    {5}    | JOIN WITH num#IRGuards::GuardsImpl::TValue#55c23f13_120#join_rhs ON FIRST 2 OUTPUT Lhs.2, Lhs.1, Lhs.3, Lhs.4, Rhs.2
                         {5}    | REWRITE WITH TEST InOut.3 != InOut.4
        9199848   ~0%    {3}    | SCAN OUTPUT In.0, In.1, In.2
                     
        9202903   ~1%    {3} r5 = SCAN `___IRGuards::Guards_v1::possibleValue/4#fb05e9b0_032#count_range#join_rhs_const_1#shared_project#IRG__#shared` OUTPUT In.1, In.0, In.2
              0   ~0%    {5}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_201#join_rhs ON FIRST 1 OUTPUT Lhs.2, Lhs.1, Lhs.0, Rhs.1, Rhs.2
              0   ~0%    {7}    | JOIN WITH `IRGuards::GuardValue.asIntValue/0#dispred#336adfc2` ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Lhs.0, Lhs.3, Lhs.4, Rhs.1, _
                         {6}    | REWRITE WITH NOT [NOT [Tmp.6 := true, TEST InOut.4 = Tmp.6, TEST InOut.3 < InOut.5], NOT [Tmp.6 := false, TEST InOut.4 = Tmp.6, TEST InOut.3 > InOut.5]] KEEPING 6
              0   ~0%    {3}    | SCAN OUTPUT In.0, In.1, In.2
                     
        9202903   ~0%    {3} r6 = SCAN `___IRGuards::Guards_v1::possibleValue/4#fb05e9b0_032#count_range#join_rhs_const_1#shared_project#IRG__#shared` OUTPUT In.2, In.0, In.1
              0   ~0%    {5}    | JOIN WITH num#IRGuards::GuardsImpl::TIntRange#30dda716_201#join_rhs ON FIRST 1 OUTPUT Lhs.2, Lhs.1, Lhs.0, Rhs.1, Rhs.2
              0   ~0%    {7}    | JOIN WITH `IRGuards::GuardValue.asIntValue/0#dispred#336adfc2` ON FIRST 1 OUTPUT Lhs.1, Lhs.0, Lhs.2, Lhs.3, Lhs.4, Rhs.1, _
                         {6}    | REWRITE WITH NOT [NOT [Tmp.6 := true, TEST InOut.4 = Tmp.6, TEST InOut.3 < InOut.5], NOT [Tmp.6 := false, TEST InOut.4 = Tmp.6, TEST InOut.3 > InOut.5]] KEEPING 6
              0   ~0%    {3}    | SCAN OUTPUT In.0, In.1, In.2
                     
        9202903   ~0%    {3} r7 = r1 UNION r2 UNION r3 UNION r4 UNION r5 UNION r6
                         return r7

[2025-10-01 12:42:45] Evaluated non-recursive predicate __IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a09__#antijoin_rhs@9e4ff5b8 in 141ms (size: 0).
Evaluated relational algebra for predicate __IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a09__#antijoin_rhs@9e4ff5b8 with tuple counts:
                    {3} r1 = `___IRGuards::Guards_v1::possibleValue/4#fb05e9b0_032#count_range#join_rhs_const_1#shared_project#IRG__#shared` AND NOT `_IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a090__#antijoin_rhs`(FIRST 3)
         0   ~0%    {2}    | SCAN OUTPUT In.0, In.1
         0   ~0%    {2}    | STREAM DEDUP
                    return r1

[2025-10-01 12:42:47] Evaluated non-recursive predicate IRGuards::Guards_v1::uniqueValue/3#833ab3f8@9118d5t9 in 2414ms (size: 17098).
Evaluated relational algebra for predicate IRGuards::Guards_v1::uniqueValue/3#833ab3f8@9118d5t9 with tuple counts:
                            {2} r1 = `__IRGuards::Guards_v1::possibleValue/4#fb05e9b0_032#count_range#join_rhs_const_1#shared` AND NOT `__IRGuards::GuardValue.asIntValue/0#dispred#336adfc2_IRGuards::GuardValue.getDualValue/0#dispred#a09__#antijoin_rhs`(FIRST 2)
          88571      ~0%    {3}    | SCAN OUTPUT In.0, _, In.1
          88571      ~2%    {3}    | REWRITE WITH Out.1 := false
          88072      ~0%    {4}    | JOIN WITH `IRGuards::Guards_v1::possibleValue/4#fb05e9b0_0132#join_rhs` ON FIRST 3 OUTPUT Lhs.0, _, Rhs.3, Lhs.2
          88072      ~4%    {4}    | REWRITE WITH Out.1 := true
                            {4}    | AND NOT `IRGuards::Guards_v1::possibleValue/4#fb05e9b0`(FIRST 4)
          86654      ~0%    {3}    | SCAN OUTPUT In.0, In.3, In.2
        9287557      ~0%    {4}    | JOIN WITH `project#IRGuards::Guards_v1::possibleValue/4#fb05e9b0` ON FIRST 1 OUTPUT Lhs.0, Lhs.1, Lhs.2, Rhs.1
        9200903      ~0%    {4}    | REWRITE WITH TEST InOut.1 != InOut.3

I'll create a local branch containing this PR and #20569 and do another DCA run to see if this fixes things 🤞

@MathiasVP
Contributor Author

@jketema the slowdown on neovim has been fixed by #20569 (see the backlinked DCA run). I think that resolves all your comments!

Contributor

@jketema jketema left a comment

LGTM

I'm fine with this getting merged. I do observe that this is another case where we see longer analysis times, and they've been creeping up over the last few months for various reasons. Here zeek-spicy is close to 10%. I wonder if there's something we can do to regain at least some performance.

@MathiasVP
Contributor Author

Totally agreed. I'll take a look at zeek-spicy as well once this is merged 👍

@jketema jketema merged commit a34d6d4 into github:main Oct 2, 2025
16 of 17 checks passed
MathiasVP added a commit to MathiasVP/codeql-coding-standards that referenced this pull request Oct 2, 2025
MathiasVP added a commit to MathiasVP/codeql-coding-standards that referenced this pull request Oct 2, 2025
… Notice that this leaves a missing result because the guard condition logic is now better.
MathiasVP added a commit to MathiasVP/codeql-coding-standards that referenced this pull request Oct 2, 2025
… Notice that this leaves a missing result because the guard condition logic is now interprocedural.
jketema added a commit to github/codeql-coding-standards that referenced this pull request Oct 2, 2025