-
Notifications
You must be signed in to change notification settings - Fork 1.1k
SIP-72: WIP dedented triple-quoted string literals #24185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
That was fast! |
Not ready to review yet! Still need a bit more vibing haha |
Ah right, I'll convert it to draft then |
|
||
val hasTabs = closingIndent.contains('\t') | ||
val hasSpaces = closingIndent.contains(' ') | ||
if (hasTabs && hasSpaces) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be able to detect this in one loop
I think this is in a pretty good state right now. There's some funniness around the test infrastructure I haven't figured out, and obviously it would have to wait for scala/improvement-proposals#112, but all the tests pass when run manually. Despite attempting to vibe code the whole thing, claude could only get about 80% of the way there. The last 20% of feature-work/debugging/cleanup I ended up doing by hand. Could certainly use more cleanup, but this is probably the natural stopping point until the SIP goes into the next phase |
else | ||
literal(inTypeOrSingleton = true) | ||
|
||
/** Dedent a string literal by removing common leading whitespace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For new code in the compiler we use indentation syntax and new conditional if / then / else
syntax. The old Java conditional syntax is already disabled under -language.future
.
val isDedented = | ||
in.charOffset + 2 < in.buf.length && | ||
in.buf(in.charOffset - 1) == '\'' && | ||
in.buf(in.charOffset) == '\'' && | ||
in.buf(in.charOffset + 1) == '\'' | ||
in.nextToken() | ||
def nextSegment(literalOffset: Offset) = | ||
segmentBuf += Thicket( | ||
literal(literalOffset, inPattern = inPattern, inStringInterpolation = true), | ||
atSpan(in.offset) { | ||
if (in.token == IDENTIFIER) | ||
termIdent() | ||
else if (in.token == USCORE && inPattern) { | ||
in.nextToken() | ||
Ident(nme.WILDCARD) | ||
} | ||
else if (in.token == THIS) { | ||
in.nextToken() | ||
This(EmptyTypeIdent) | ||
} | ||
else if (in.token == LBRACE) | ||
if (inPattern) Block(Nil, inBraces(pattern())) | ||
else expr() | ||
else { | ||
report.error(InterpolatedStringError(), source.atSpan(Span(in.offset))) | ||
EmptyTree | ||
} | ||
}) | ||
|
||
var offsetCorrection = if isTripleQuoted then 3 else 1 | ||
while (in.token == STRINGPART) | ||
nextSegment(in.offset + offsetCorrection) | ||
// Collect all string parts and their offsets | ||
val stringParts = new ListBuffer[(String, Offset)] | ||
val interpolatedExprs = new ListBuffer[Tree] | ||
|
||
var offsetCorrection = if (isDedented) 3 else if (isTripleQuoted) 3 else 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bit is super sketchy, I'm sure there's a better way
Implements scala/improvement-proposals#112
The initial Lexing/Scanning is lenient, only looking for the opening
''''*
and equivalent closing delimiter. This matches how we can expect this to be implemented in other tools that have more restricted lexing frameworks (IntelliJ w/ JFlex, VSCode w/ TextMate Grammars, NeoVim w/ TreeSitter)All other validation (opening delimiter must be followed by newline, closing delimiter must be preceded by whitespace only) and de-denting is left to the parsing phase, which is the only time we have a complete string "literal" when an interpolator is present, and thus are able to look at the trailing delimiter's preceding indent whitespace and trim it from all earlier
STRINGLIT
/STRINGPART
tokensdef interpolatedString
needed to be refactored to support dedenting: rather than constructing the trees immediately, we first assemble all the strings parts, then use the last string part to compute the dedent that we apply to all other parts, and only then do we construct the treesCovered by
neg/
tests andrun/
tests for all the major features and edge cases I could think of:s
andf
@compileTimeOnly
I haven't managed to reliably run tests for some reason, I think I'm bumping into https://contributors.scala-lang.org/t/current-testcompilation/7256. But I tested it manually by copy-pasting the run/neg test files into the
bin/scala
REPL and compared the output manually with the.check
files on disk, and the output is identical