Skip to content

[DRAFT] Add debug information to litmus format#1006

Open
hernanponcedeleon wants to merge 2 commits intodevelopmentfrom
debug
Open

[DRAFT] Add debug information to litmus format#1006
hernanponcedeleon wants to merge 2 commits intodevelopmentfrom
debug

Conversation

@hernanponcedeleon
Copy link
Owner

We currently provide no useful debug information to the user for litmus programs . For example

> java -jar dartagnan/target/dartagnan.jar cat/vmm.cat --property=cat_spec litmus/C11/auto/cyc_na.litmus
Test: litmus/C11/auto/cyc_na.litmus
Result: FAIL
Reason: CAT specification violation found
Details:
        Flag r-data-race
        E3 / E3 @unknown / @unknown
Time: 0.582 secs

The same is also true for witnesses.

This PR adds line number debug information to litmus code. So far it only does so for a few instructions (thus the draft status), but once we agree on code and the format we want for the output/witness, I will move the code in addChildWithMetadata to addChild and pass the LOC in all litmus visitors.

Output format: once we have proper lines of code for every format, which we already have for spirv/llvm (unless they fail to provide debug info, but I have rarely seen this happen), I think the information about the events id becomes completely useless for the user (it is probably already useless unless they know how to use the printer). However, it might still be useful for developers, so I will keep it, but at the end of the output.

The output for the example above looks like this now

> cat -n litmus/C11/auto/cyc_na.litmus 
     1  C cyc_na
     2  { [x] = 0; [y] = 0; }
     3
     4  P0 (volatile int* x, volatile int* y) {
     5    int r0 = *x;
     6    if (r0) {
     7      *y = 1;
     8    }
     9  }
    10
    11  P1 (volatile int* x, volatile int* y) {
    12    int r1 = *y;
    13    if (r1) {
    14      *x = 1;
    15    }
    16  }
    17
    18  exists (0:r0=1 /\ 1:r1=1)

> java -jar dartagnan/target/dartagnan.jar cat/vmm.cat --property=cat_spec litmus/C11/auto/cyc_na.litmus
Test: litmus/C11/auto/cyc_na.litmus
Result: FAIL
Reason: CAT specification violation found
Details:
        Flag r-data-race
        P1#12 / P1#12   (E14 / E14)
Time: 0.489 secs

This is even more useful for litmus syntax where each thread is a column

> cat -n /home/drc/git/Dat3M/litmus/VULKAN/Data-Race/mpnotinscope2-filter.litmus
     1  Vulkan mpnotinscope2-filter
     2  {
     3  x=0;
     4  y=0;
     5  }
     6  P0@sg 0, wg 0, qf 0            | P1@sg 0, wg 1, qf 0             ;
     7  st.atom.rel.wg.sc0.semsc0 x, 1 | ld.atom.acq.dv.sc0.semsc0 r0, y ;
     8  st.atom.rel.dv.sc0.semsc0 y, 1 | ld.atom.acq.wg.sc0.semsc0 r1, x ;
     9
    10  filter
    11  (P1:r0 == 1)

> java -jar dartagnan/target/dartagnan.jar cat/vulkan.cat --target=vulkan --property=cat_spec /home/drc/git/Dat3M/litmus/VULKAN/Data-Race/mpnotinscope2-filter.litmus
Test: /home/drc/git/Dat3M/litmus/VULKAN/Data-Race/mpnotinscope2-filter.litmus
Filter: bv64 r0 == bv64(1)
Result: FAIL
Reason: CAT specification violation found
Details:
        Flag racy
        P0#7 / P1#8     (E1 / E7)
Time: 0.661 secs

For the witness, since the nodes already contain the thread, I only use the LOC.

Comment on lines +197 to +202
// TODO: make this the default addChild implementation
public Event addChildWithMetadata(int fid, Event child, int ln) {
Event e = addChild(fid, child);
e.setMetadata(new LineNumber(ln));
return e;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should probably be named addChildWithSourceInfo/DebugInfo or even addChildWithLineNumber.
Of course, if you eventually decide that this is the default addChild implementation, then the name can be simplified again.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as the PR description mentions, I will eventually merge both methods into a default addChild

Comment on lines +224 to +230
public static String getLitmusLocationString(Event ev, boolean showThread) {
final String prefix = showThread ? ev.getThread().getName() : "";
return ev.hasMetadata(LineNumber.class) ?
String.format("%s#%s", prefix, ev.getMetadata(LineNumber.class).lineNumber()) :
"@unknown";
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not conviced that the use-site should know what shape the source location comes in.
Maybe you should actually have SourceLocation as the root type, and then subtypes like LineNumber.
Maybe it is even better to have language-specific subtypes like LitmusSourceLoc and LLVMSourceLoc (or GenericSourceLoc for high-level languages?). You can define those even as nested types SourceLocation.Litmus and SourceLocation.LLVM/SourceLocation.Generic within SourceLocation.

The getSourceLocationString method can do the case distinction based on the type of the source location.
The caller might still need to do a case-distinction based on whether he wants the thread to be printed or not though (unless we simply ignore that parameter for LLVM source locations).
However, you should fix MetadataMap.contains to look for subclasses instead of strict class equality (oddly enough, MetadataMap.get is based on subclassing?!). Then just use MetadataMap.get(SourceLocation.class).
In fact, if you overwrite toString for both types of source location appropriately, you don't even need a case distinction at all: just use String.format("@%s%s", prefix, srcLoc) (I would still put "@" for litmus code!)

@ThomasHaas
Copy link
Collaborator

Btw. we should stay with the name SourceLocation. I think DebugInfo is more suitable to store information like source-level types and variable names.

@ThomasHaas
Copy link
Collaborator

I pushed a variant of what I have suggested. Feel free to revert or modify the changes if you don't like them.
I have also not tested if the output has changed or not.
After the refactor, it looks like the two variants of SourceLocation can be merged again by making the filePath optional. Then we could add static factory methods like SourceLocation.withFilePath() or SourceLocation.onlyLineNumber().

@hernanponcedeleon
Copy link
Owner Author

The output is still the same and the code LGTM (thanks!). Maybe I would just change public static String getSourceLocationString(Event ev, boolean showThread) to public static String getSourceLocationString(Event ev, boolean showThreadForLitmus) to make it clear in the method signature what the implementation actually do.

I'll change all the litmus visitors to make use of the new metadata.

@ThomasHaas
Copy link
Collaborator

The output is still the same and the code LGTM (thanks!). Maybe I would just change public static String getSourceLocationString(Event ev, boolean showThread) to public static String getSourceLocationString(Event ev, boolean showThreadForLitmus) to make it clear in the method signature what the implementation actually do.

We could also have SourceLocation.Litmus contain a String threadName? I think this would make sense. For example, it would allow us to rename our internal threads (for whatever reason) without breaking source location information. After all, the source location should be independent of whatever we do internally.
But then I would unconditionally print it. Is the only case you want to avoid having the thread printed twice in the execution witness?

@hernanponcedeleon
Copy link
Owner Author

I thought about it, but then we could not simply use toString() ... we still need a way to enable/disable showing the thread.

There are only two places where we use SourceLocation: command output and witness (we might want to use this in the UI, but AFAIR we do not do it now). We want to show it in the first one (this is particularly useful for the litmus variants where threads are in columns), but not for the later (to avoid repetition)

@ThomasHaas
Copy link
Collaborator

I thought about it, but then we could not simply use toString() ... we still need a way to enable/disable showing the thread.

That's why I'm asking if we should unconditionally print the thread. Our events always have an IR-level location and optionally a source-level location, and currently we just print both of them (which is IMO fine).
I'm not sure if it is worth working around cases where both types of locations happen to have something in common.
And we could always have an option that generates execution witnesses only with source location information and omits our internal structure. Then the duplication would automatically be gone.

There are only two places where we use SourceLocation: command output and witness (we might want to use this in the UI, but AFAIR we do not do it now). We want to show it in the first one (this is particularly useful for the litmus variants where threads are in columns), but not for the later (to avoid repetition)

Is it really worth it to avoid the repetition though?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants