[DRAFT] Add debug information to litmus format by hernanponcedeleon · Pull Request #1006 · hernanponcedeleon/Dat3M

hernanponcedeleon · 2026-03-11T19:09:21Z

We currently provide no useful debug information to the user for litmus programs . For example

> java -jar dartagnan/target/dartagnan.jar cat/vmm.cat --property=cat_spec litmus/C11/auto/cyc_na.litmus
Test: litmus/C11/auto/cyc_na.litmus
Result: FAIL
Reason: CAT specification violation found
Details:
        Flag r-data-race
        E3 / E3 @unknown / @unknown
Time: 0.582 secs

The same is also true for witnesses.

This PR adds line number debug information to litmus code. So far it only does so for a few instructions (thus the draft status), but once we agree on code and the format we want for the output/witness, I will move the code in addChildWithMetadata to addChild and pass the LOC in all litmus visitors.

Output format: once we have proper lines of code for every format, which we already have for spirv/llvm (unless they fail to provide debug info, but I have rarely seen this happen), I think the information about the events id becomes completely useless for the user (it is probably already useless unless they know how to use the printer). However, it might still be useful for developers, so I will keep it, but at the end of the output.

The output for the example above looks like this now

> cat -n litmus/C11/auto/cyc_na.litmus 
     1  C cyc_na
     2  { [x] = 0; [y] = 0; }
     3
     4  P0 (volatile int* x, volatile int* y) {
     5    int r0 = *x;
     6    if (r0) {
     7      *y = 1;
     8    }
     9  }
    10
    11  P1 (volatile int* x, volatile int* y) {
    12    int r1 = *y;
    13    if (r1) {
    14      *x = 1;
    15    }
    16  }
    17
    18  exists (0:r0=1 /\ 1:r1=1)

> java -jar dartagnan/target/dartagnan.jar cat/vmm.cat --property=cat_spec litmus/C11/auto/cyc_na.litmus
Test: litmus/C11/auto/cyc_na.litmus
Result: FAIL
Reason: CAT specification violation found
Details:
        Flag r-data-race
        P1#12 / P1#12   (E14 / E14)
Time: 0.489 secs

This is even more useful for litmus syntax where each thread is a column

> cat -n /home/drc/git/Dat3M/litmus/VULKAN/Data-Race/mpnotinscope2-filter.litmus
     1  Vulkan mpnotinscope2-filter
     2  {
     3  x=0;
     4  y=0;
     5  }
     6  P0@sg 0, wg 0, qf 0            | P1@sg 0, wg 1, qf 0             ;
     7  st.atom.rel.wg.sc0.semsc0 x, 1 | ld.atom.acq.dv.sc0.semsc0 r0, y ;
     8  st.atom.rel.dv.sc0.semsc0 y, 1 | ld.atom.acq.wg.sc0.semsc0 r1, x ;
     9
    10  filter
    11  (P1:r0 == 1)

> java -jar dartagnan/target/dartagnan.jar cat/vulkan.cat --target=vulkan --property=cat_spec /home/drc/git/Dat3M/litmus/VULKAN/Data-Race/mpnotinscope2-filter.litmus
Test: /home/drc/git/Dat3M/litmus/VULKAN/Data-Race/mpnotinscope2-filter.litmus
Filter: bv64 r0 == bv64(1)
Result: FAIL
Reason: CAT specification violation found
Details:
        Flag racy
        P0#7 / P1#8     (E1 / E7)
Time: 0.661 secs

For the witness, since the nodes already contain the thread, I only use the LOC.

ThomasHaas · 2026-03-11T19:19:11Z

dartagnan/src/main/java/com/dat3m/dartagnan/parsers/program/utils/ProgramBuilder.java

+    // TODO: make this the default addChild implementation
+    public Event addChildWithMetadata(int fid, Event child, int ln) {
+        Event e = addChild(fid, child);
+        e.setMetadata(new LineNumber(ln));
+        return e;
+    }


It should probably be named addChildWithSourceInfo/DebugInfo or even addChildWithLineNumber.
Of course, if you eventually decide that this is the default addChild implementation, then the name can be simplified again.

Yes, as the PR description mentions, I will eventually merge both methods into a default addChild

ThomasHaas · 2026-03-11T19:39:22Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/analysis/SyntacticContextAnalysis.java

+    public static String getLitmusLocationString(Event ev, boolean showThread) {
+        final String prefix = showThread ? ev.getThread().getName() : "";
+        return ev.hasMetadata(LineNumber.class) ?
+            String.format("%s#%s", prefix, ev.getMetadata(LineNumber.class).lineNumber()) :
+            "@unknown";
+    }
+


I'm not conviced that the use-site should know what shape the source location comes in.
Maybe you should actually have SourceLocation as the root type, and then subtypes like LineNumber.
Maybe it is even better to have language-specific subtypes like LitmusSourceLoc and LLVMSourceLoc (or GenericSourceLoc for high-level languages?). You can define those even as nested types SourceLocation.Litmus and SourceLocation.LLVM/SourceLocation.Generic within SourceLocation.

The getSourceLocationString method can do the case distinction based on the type of the source location.
The caller might still need to do a case-distinction based on whether he wants the thread to be printed or not though (unless we simply ignore that parameter for LLVM source locations).
However, you should fix MetadataMap.contains to look for subclasses instead of strict class equality (oddly enough, MetadataMap.get is based on subclassing?!). Then just use MetadataMap.get(SourceLocation.class).
In fact, if you overwrite toString for both types of source location appropriately, you don't even need a case distinction at all: just use String.format("@%s%s", prefix, srcLoc) (I would still put "@" for litmus code!)

ThomasHaas · 2026-03-12T10:17:05Z

Btw. we should stay with the name SourceLocation. I think DebugInfo is more suitable to store information like source-level types and variable names.

ThomasHaas · 2026-03-12T20:51:56Z

I pushed a variant of what I have suggested. Feel free to revert or modify the changes if you don't like them.
I have also not tested if the output has changed or not.
After the refactor, it looks like the two variants of SourceLocation can be merged again by making the filePath optional. Then we could add static factory methods like SourceLocation.withFilePath() or SourceLocation.onlyLineNumber().

Signed-off-by: Hernan Ponce de Leon <hernanl.leon@huawei.com>

…tions

hernanponcedeleon · 2026-03-13T10:33:18Z

The output is still the same and the code LGTM (thanks!). Maybe I would just change public static String getSourceLocationString(Event ev, boolean showThread) to public static String getSourceLocationString(Event ev, boolean showThreadForLitmus) to make it clear in the method signature what the implementation actually do.

I'll change all the litmus visitors to make use of the new metadata.

ThomasHaas · 2026-03-13T10:45:45Z

The output is still the same and the code LGTM (thanks!). Maybe I would just change public static String getSourceLocationString(Event ev, boolean showThread) to public static String getSourceLocationString(Event ev, boolean showThreadForLitmus) to make it clear in the method signature what the implementation actually do.

We could also have SourceLocation.Litmus contain a String threadName? I think this would make sense. For example, it would allow us to rename our internal threads (for whatever reason) without breaking source location information. After all, the source location should be independent of whatever we do internally.
But then I would unconditionally print it. Is the only case you want to avoid having the thread printed twice in the execution witness?

hernanponcedeleon · 2026-03-13T10:49:50Z

I thought about it, but then we could not simply use toString() ... we still need a way to enable/disable showing the thread.

There are only two places where we use SourceLocation: command output and witness (we might want to use this in the UI, but AFAIR we do not do it now). We want to show it in the first one (this is particularly useful for the litmus variants where threads are in columns), but not for the later (to avoid repetition)

ThomasHaas · 2026-03-13T10:58:44Z

I thought about it, but then we could not simply use toString() ... we still need a way to enable/disable showing the thread.

That's why I'm asking if we should unconditionally print the thread. Our events always have an IR-level location and optionally a source-level location, and currently we just print both of them (which is IMO fine).
I'm not sure if it is worth working around cases where both types of locations happen to have something in common.
And we could always have an option that generates execution witnesses only with source location information and omits our internal structure. Then the duplication would automatically be gone.

There are only two places where we use SourceLocation: command output and witness (we might want to use this in the UI, but AFAIR we do not do it now). We want to show it in the first one (this is particularly useful for the litmus variants where threads are in columns), but not for the later (to avoid repetition)

Is it really worth it to avoid the repetition though?

ThomasHaas reviewed Mar 11, 2026

View reviewed changes

hernan-poncedeleon and others added 2 commits March 13, 2026 11:09

Add debug information to litmus format

723a549

Signed-off-by: Hernan Ponce de Leon <hernanl.leon@huawei.com>

Refactor SourceLocation -> it is now an interface with two implementa…

59e2775

…tions

hernanponcedeleon force-pushed the debug branch from 9488289 to 59e2775 Compare March 13, 2026 10:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DRAFT] Add debug information to litmus format#1006

[DRAFT] Add debug information to litmus format#1006
hernanponcedeleon wants to merge 2 commits intodevelopmentfrom
debug

hernanponcedeleon commented Mar 11, 2026

Uh oh!

ThomasHaas Mar 11, 2026

Uh oh!

hernanponcedeleon Mar 11, 2026

Uh oh!

ThomasHaas Mar 11, 2026

Uh oh!

ThomasHaas commented Mar 12, 2026

Uh oh!

ThomasHaas commented Mar 12, 2026

Uh oh!

hernanponcedeleon commented Mar 13, 2026

Uh oh!

ThomasHaas commented Mar 13, 2026

Uh oh!

hernanponcedeleon commented Mar 13, 2026

Uh oh!

ThomasHaas commented Mar 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

hernanponcedeleon commented Mar 11, 2026

Uh oh!

ThomasHaas Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

hernanponcedeleon Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

ThomasHaas Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

ThomasHaas commented Mar 12, 2026

Uh oh!

ThomasHaas commented Mar 12, 2026

Uh oh!

hernanponcedeleon commented Mar 13, 2026

Uh oh!

ThomasHaas commented Mar 13, 2026

Uh oh!

hernanponcedeleon commented Mar 13, 2026

Uh oh!

ThomasHaas commented Mar 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants