-
Notifications
You must be signed in to change notification settings - Fork 65
print the diff between right result and wrong result #141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Weizhen Wang <wangweizhen@pingcap.com>
YangKeao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
|
||
| func (e *WrongResultError) Error() string { | ||
| diff := diffmatchpatch.New() | ||
| diffText := diff.DiffPrettyText(diff.DiffMain(e.actual, e.expected, false)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the effect of checklines = false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I gave it a try, but I couldn't see any difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot:
In sergi/go-diff, the checklines parameter controls whether the diff algorithm first does a fast, line‑level diff before falling back to a more precise, character‑level diff.
Specifically:
DiffMain(text1, text2, checklines bool)passeschecklinesdown todiffCompute:func (dmp *DiffMatchPatch) diffCompute(text1, text2 []rune, checklines bool, deadline time.Time) []Diff { … } else if checklines && len(text1) > 100 && len(text2) > 100 { return dmp.diffLineMode(text1, text2, deadline) } return dmp.diffBisect(text1, text2, deadline) }
- When
checklines == trueand both texts are longer than 100 runes, it usesdiffLineMode, which:- Does a quick diff at line granularity (treating each line as a token), then
- “Rediffs” the changed line blocks at character level for better accuracy.
Effects in practice
-
checklines = false- Always uses
diffBisect(a Myers-style character-level diff). - Slower on large inputs, but tends to give minimal/optimal character-level diffs.
- Always uses
-
checklines = true- For long texts: runs the faster line-mode pre-pass, then refines.
- This usually gives a significant speedup on big inputs, at the cost of possibly non-minimal diffs (more or differently grouped hunks than the theoretical minimum).
The tests (TestDiffMainWithCheckLines) confirm that:
- For most cases, results with and without
checklinesmatch exactly. - It’s explicitly documented that the speedup “can produce non-minimal diffs” (comment on
diffLineMode), and there’s a TODO about a failing test case, highlighting that behavior can differ.
bb7133
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
When your result set changes significantly, it's not easy to directly discern the differences from the logs. I've modified the log format so that it can print the diff.
before
after the improvement