Skip to content

Conversation

@happy-barney
Copy link

By extracting various declarations of cursor variable(s) into dedicated non-terminal
for statement now contains only 3 branches:

  • for (;;)
  • for ()
  • for cursor ()

with LALR(1) parser cursor cannot be empty -> it will produce conflict with for (;;) form.

  • This set of changes does not require a perldelta entry.

@happy-barney happy-barney force-pushed the hpb/cleanup branch 2 times, most recently from 30c084c to 4d18cbb Compare October 23, 2025 07:16
@happy-barney happy-barney changed the title perly - simplity non-terminal bare_statement_for perly - simplify non-terminal bare_statement_for Oct 23, 2025
@happy-barney happy-barney force-pushed the hpb/cleanup branch 3 times, most recently from 43552bd to 7f8a73f Compare October 31, 2025 16:56
Comment on lines -349 to +359
PERLY_PAREN_OPEN
remember
PERLY_PAREN_OPEN

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit [parser] bare_statement_for - move remember right after keyword forming sequence KW_FOR remember.

This commit message is still a bit too cryptic for me.

Non-terminal remember is empty non-terminal with action starting
new full lexical scope. Such empty non-terminals, when used in
multiple branches, will generate shift/reduce conflict when not
appearing after distinguishable sequence of terminals.

I simply can't understand most of that..
Reading the Such empty non-terminals, when used in multiple branches, will generate shift/reduce conflict makes me wonder if there is/was an issue with how it was written before and/or if there is any change to how the construct is now treated.

Change formalizes KW_FOR as a keyword starting new lexical scope.

The commit message should also what the consequence of this is. (Does it change anything? If yes then what? If no then explicitly mention it in the commit message).

Also missing a bit of the "why"; What was the real problem (if any) with the old code?

%type <opval> bare_statement_when
%type <opval> bare_statement_while
%type <opval> bare_statement_yadayada
%type <opval> clause_mexpr

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit [parser] clause_mexpr - deduplicate shared pattern into non-terminal PERLY_PAREN_OPEN mexpr PERLY_PAREN_CLOSE

For this commit I would use something like:

Add `clause_mexpr` to remove duplicated code

The pattern `PERLY_PAREN_OPEN mexpr `PERLY_PAREN_CLOSE` is repeated in several places
add `clause_mexpr` to avoid this repetition.

This change should not affect any of the behaviour. 

In my opinion having: [parser] clause_mexpr and non-terminal and PERLY_PAREN_OPEN mexpr PERLY_PAREN_CLOSE in the commit title does not add any real value. (The title is too long and reading it in git log --oneline I would really need to think hard what it really means and if it might alter behavior or not)

perly.y Outdated
remember
KW_MY
my_scalar
my_scalar[cursor]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit [parser] bare_statement_for - rename variable(s) product to cursor to visualize possible deduplication

To me (but I'm unfamiliar with the parser) variable(s) product is meaningless/confusing.

I also find the word cursor confusing (but again could be because I'm unfamiliar with the code and/or non-native English). I don't know what to expect when I see the word cursor..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It made me think of a cursor in a database sense, which lets you iterate over the results of a database query, just as foreach style loops iterate over a list.

perly.y Outdated
%type <opval> bare_statement_when
%type <opval> bare_statement_while
%type <opval> bare_statement_yadayada
%type <opval> clause_for_cursor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit [parser] clause_for_cursor - encapsulate FOR loop cursor variables

This is already an improvement;
What I would still add in the commit message is what branches are now merged.

I.e. something like:

Merge the for-branches of:
- `for $scalar (...)`
- `for my $scalar (...)`
- `for my ($foo, $bar) (...)`
- `for \ $scalar (...)`
- `for my \ $scalar (..)`

into a single branch with the parsing of the variables handled  by `clause_for_cusror`

@happy-barney
Copy link
Author

@bram-perl thanks for your comments on commit messages. I will use it to improve my commit messages.

Based on those I also realized that with end status after this PR it will is reasonable to write dedicated tests focusing of modified part, using commit structure: "first introduce new test(s); second changes".

@happy-barney happy-barney force-pushed the hpb/cleanup branch 2 times, most recently from 241c44d to 2b2eeb9 Compare November 9, 2025 16:49
@happy-barney
Copy link
Author

@bram-perl I've updated PR and commit messages, can you please look whether it is better described ?

print STDERR @_;
}

sub assume {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit [tests] New assert function assume with test message first

This needs more documentation but not sure where/what the proper place is.
At the very least I expect to see comments to indicate what can keys are supported in %args.

A quick look at the places where it is used: personally I'm not a big fan of it :/

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments added.

I already extend this function in my next branch with keys:

  • fresh_perl => {} - when provided, uses fresh_perl instead of eval (I'm trying to fix segfault)
  • todo => q (...) - when provided, given assert is marked as todo
  • experiment => q (name) - when set, enables given experiment for evaluated code

Regarding usage:

Function's name comes for BDD approach.

Main reason I'm using (even have module like this: Test::YAFT) it is getting rid of boring boilerplate "lives_ok / dies_ok / value assert" polluting tests with unnecessary complexity, when single expect / throws is enough.

For example, following pattern:

TODO: {
    local $::TODO = $todo_message;
    todo_skip $skip_reason if $skip_condition;
    fresh_perl ($code, $args);
   is (0, $?, q (executed ok));
   like ($results, qr (regex));
}

can be written like:

assume q (...)
    => todo => $todo_message
    => skip  => sub { $skip_reason if $skip_condition }
    => fresh_perl => $args
    => expect => qr (regex)
    => eval => $code
    ;

and once todo functionality is implemented, you can simply remove just todo and maybe fresh_perl lines.

Comment on lines +6354 to +6356
t/op/for-control-scope.t See if scope of for-control control variable(s) works
t/op/for-many.t See if n-at-a-time for loops work
t/op/for-over-scope.t See if scope of for-over cursor variable(s) works

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit [tests] New tests focused on scope behaviour of loop variables

The for (;;) variant is now called "for-control" and the for $var () variant is called "for-over" and/or "for-over cursor".
Did you come up with these names or are they based on something?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are based on loop behaviour:

  • for-control - for loop with control variable (C-like syntax)
  • for-over - loop over list

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preamble: when I'm looking at code/when reviewing I tend to be picky about names because I want names to be expressive and instantly make it clear what it is.

Let me clarify the question: did you come up with these names or are they already well established somewhere?

The reason why I'm asking: before this PR I would have no clue what a "for control" loop would mean (I - obviously - know the construct but I've never seen it being referred to as a "for control" loop. perlsyn for example refers to it as c-style loop but that also is not a great name). Same with "for over": I would not immediately link it with for (@list)

If the names are already established somewhere then it's my knowledge that is lacking and then the names are good.
If you came up with the names then it might be worthwhile to think about/discuss alternative/better names so that it is immediately clear to everyone what it means/what it is used for.

Just an initial thought for-list might be clearer then for-over.
Or even for-over-list would be better then for-over.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, now I understand your question.

No, these exact names were not used in code base so far.
I though about list variant as well but it is not always over list. More likely it is for-over-iterable-with-cursor (in future can be object "implementing interface iterable" or result of function containing "yield")

I also though about introducing new directory (eg: grammar/expression-for-control-scope.t) where tests will be named after respective grammar rule they are testing (in this case with suffix what is tested).

Regarding discussion: IMHO PR review is proper place. Rest kind of discussions tends to stale.

Few other ideas (used also LLM to suggest some as well):
for-control:

  • for-with-control
  • for-traditional
  • for-with-condition
  • for-threepart

for-over:

  • for-with-cursor
  • for-iterate
  • for-iterator
  • for-values
  • for-in

I wanted to avoid names like:

  • for-each - there are keywords foreach and each which may derail though process
  • for-list - word list represents special structure in language, which differs from array and hash

@bram-perl
Copy link

@bram-perl I've updated PR and commit messages, can you please look whether it is better described ?

Yes, they're a lot better.

(When merging I would also be in favor of using a merge commit for it but I don't think there is a good work flow for that yet.
I know khw uses the Use merge commit label for it and did create #23547 for it.)

Branislav Zahradník added 7 commits November 18, 2025 06:30
* Motivation

Introduce single assert function which can:
- use emulated named arguments
- evaluate code to build "got" value
- detect eval success/failure
- detect expected eval success/failure

`assume` function enforces most important part of test case,
assumed behaviour description (test message), first.
be it control variable of `for (;;)` variant or cursor variable(s)
of `for $var () { } continue { }`

Tests test:
- visibility of variable in for loop
- visibility of variable in continue loop
- behaviour after loop
Tests covering this changes:
- t/op/for-control-scope.t
- t/op/for-over-scope.t

* Motivation

Normalize where new lexical context starts so every variant
of for loop will start it at same position.

Such normalization will later allow reduction of code currently
duplicated.

* Change description

For statement already defined its own lexical scope in each its branch
though in different places for each branch (using <*> as a marker):

```
for (<*> $control = 0; ...)
for my <*> $cursor (...)
for my <*> ($cursor_a, $cursor_b) (...)
for $cursor (<*> ...)
for my \ <*> $cursor (...)
for \ my <*> $cursor (...)
for \ $cursor (<*> ... )
for (<*> ...)
```

This change formalizes every branch to start with:
```
for <*> ...
```

* Technical details

Lexical scope is started by non-terminal `remember` - empty with an action.

Bison's LALR(1) grammar evaluates action before looking which terminal token
follows, which means, that when `remember` occurs it must be either:
- preceded by unique sequence of terminal tokens (approach used before)
- be on same position for every possible branch (approach used by this change)

As far as `KW_FOR` (as statement) always starts new lexical scope,
it's possible to formalize it by starting it right after keyword,
when `for` statement is identified.
for_control - C-style with control expression
for_over - iterating over list using cursor variable(s)
Sequence `PERLY_PAREN_OPEN` `mexpr` `PERLY_PAREN_CLOSE` is repeated
in several places, add new non-terminal `clause_mexpr` to remove
this repetition.
Extract multiple ways how to declare cursor variable(s) of for-over
loop into dedicated non-terminal allowing to merge `for-over` branches.

Merged for-over branches of:
- `for $scalar (...)`
- `for my $scalar (...)`
- `for my ($foo, $bar) (...)`
- `for \ $scalar (...)`
- `for my \ $scalar (..)`

Branch `for ()` must still remain as far as LALR(1) grammar will run
into shift/reduce conflict with `for (;;)` - there will be two possible
paths between `KW_FOR` and `(`:
- `KW_FOR remember '('` - from `for (;;)` branch
- `KW_FOR remember clause_for_over_cursor '(` - from `for ()` (using default cursor)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants