Non-Greedy *? in SRFI-115 Matches Greedily #1020

@jrvieira

Description

The non-greedy *? operator in SRFI-115 matches greedily: it consumes as much input as possible instead of stopping at the shortest match. According to SRFI-115's specification, non-greedy patterns should follow leftmost-shortest semantics.

Steps to reproduce:

(import (chibi regexp))

(regexp-extract '(: "a" (*? any) "z") "a-z-z-a")

Expected output:

("a-z")

Actual output:

("a-z-z")

Additional Context

The issue was originally observed in Chez Scheme’s SRFI-115 implementation.

Discussion in #scheme IRC suggested this may be a broader issue with SRFI-115's reference implementation: https://paste.jrvieira.com/1743421156171

Relevant parts:

[22:11:04] <Zipheir> Also (regexp-extract (rx "a" (*? any) "-") "a-z-a") => ("a-z-"), which is definitely not what I'd expect.
[22:12:32] <Zipheir> CHICKEN's irregex returns ("a-"). I guess there's something going on with the SRFI 115 implementation.
[22:14:39] <Zipheir> chibi's (srfi 115) is also affected.
...
[22:50:15] <Zipheir> zzz: With cond-expand from (srfi :0) I get (cond-expand (regexp-non-greedy #t) (else #f)) => #f
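
The cond-expand result quoted above suggests that the implementation does not advertise the regexp-non-greedy feature. A minimal sketch of how calling code could guard against that, assuming (chibi regexp) is available as in the reproduction steps; the name first-a-to-z is hypothetical, and the fallback pattern is only one possible workaround, not a fix for the underlying behavior:

```scheme
(import (scheme base) (scheme write) (chibi regexp))

;; Pick a pattern depending on whether the implementation
;; advertises non-greedy repetition via the SRFI-115 feature
;; identifier regexp-non-greedy.
(cond-expand
  (regexp-non-greedy
   ;; Non-greedy repetition is advertised: use *? directly.
   (define first-a-to-z '(: "a" (*? any) "z")))
  (else
   ;; Assumed workaround: a greedy repetition over "anything
   ;; except z" cannot run past the first z, so it stops at
   ;; the same place a non-greedy match would.
   (define first-a-to-z '(: "a" (* (~ #\z)) "z"))))

(write (regexp-extract first-a-to-z "a-z-z-a"))
(newline)
```

The negated character set only works when the terminator is a single character; a multi-character terminator would need a different rewrite.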
