Consecutive Chunks not matched #9

@gmonaie

The chunker doesn't seem to match a rule built from consecutive chunks in this case. The tag string contains double spaces to preserve whitespace.

var tags = "It's/NN  on/IN  march/NN  5/CD th/DT ./. ";

var ranks = {
  ruleType: 'tokens',
  pattern: '[ { tag:CD } ] [ { word:/th|st|nd/ } ]',
  result: 'RANK'
};

var months = {
  ruleType: 'tokens',
  pattern: '[ { word:/[Jj]anuary|[Ff]ebruary|[Mm]arch|[Aa]pril|[Mm]ay|[Jj]une|[Jj]uly|[Aa]ugust|[Ss]eptember|[Oo]ctober|[Nn]ovember|[Dd]ecember/; tag:NNP? } ]',
  result: 'MONTH'
};

var dates = {
  ruleType: 'tokens',
  pattern: '[ { chunk:"MONTH" } ] [ { chunk:"RANK" } ]',
  result: 'DATE'
};

var chunks = chunker.chunk(
  tags,
  [months, ranks, dates]
);

console.log(chunks);

Output:

It's/NN  on/IN  [MONTH march/NN]  [RANK 5/CD th/DT] ./. 

It captured the two chunks MONTH and RANK, but it did not combine them into the DATE chunk that the dates rule defines for a MONTH chunk followed by a RANK chunk.
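Until the cause is found, the adjacent chunks can be combined in a post-processing step. The following is only a sketch of a possible workaround, not the library's own behaviour: `wrapConsecutiveChunks` is a hypothetical helper that is not part of the chunker, and the nested `[DATE [MONTH ...] [RANK ...]]` form is my assumption about what the result should look like. It tolerates any amount of whitespace between the two chunks, since the tag string uses double spaces.

```javascript
// Hypothetical helper, NOT part of the chunker library.
// Wraps a [FIRST ...] chunk that is directly followed by a [SECOND ...] chunk
// in a new outer chunk. Assumes both chunks are flat (no nested brackets)
// and that chunk names contain no regex metacharacters.
function wrapConsecutiveChunks(text, first, second, result) {
  var pattern = new RegExp(
    '(\\[' + first + ' [^\\[\\]]*\\])' +  // the first chunk
    '(\\s+)' +                            // one or more whitespace characters
    '(\\[' + second + ' [^\\[\\]]*\\])',  // the second chunk
    'g'
  );
  return text.replace(pattern, function (match, a, gap, b) {
    // Keep the original whitespace between the two inner chunks.
    return '[' + result + ' ' + a + gap + b + ']';
  });
}

// The output string reported above, used as input here.
var output = "It's/NN  on/IN  [MONTH march/NN]  [RANK 5/CD th/DT] ./. ";
var wrapped = wrapConsecutiveChunks(output, 'MONTH', 'RANK', 'DATE');
console.log(wrapped);
// It's/NN  on/IN  [DATE [MONTH march/NN]  [RANK 5/CD th/DT]] ./. 
```

This works on the output string only, so it says nothing about why the chunk-level pattern in the dates rule fails to match.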
