Hi! I'm currently debugging and updating lm_eval's implementation of minerva_math. Right now we basically use:
```python
gold = parse(normalize_final_answer(remove_boxed(last_boxed_only_string(doc["solution"]))))
answer = parse(raw_model_output)
verify(gold, answer)
```
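For context, here is a minimal, self-contained sketch of that pipeline, assuming the `math_verify` package (`pip install math-verify`) for `parse`/`verify`. The helpers below are simplified stand-ins for lm_eval's minerva_math utils, just enough to reproduce the flow on one example; the real `normalize_final_answer` does substantially more LaTeX rewriting than shown here:

```python
from math_verify import parse, verify

def last_boxed_only_string(s: str) -> str:
    """Return the last \\boxed{...} span, matching braces by depth."""
    start = s.rfind("\\boxed{")
    if start == -1:
        return ""
    depth = 0
    for i in range(start + len("\\boxed"), len(s)):
        if s[i] == "{":
            depth += 1
        elif s[i] == "}":
            depth -= 1
            if depth == 0:
                return s[start : i + 1]
    return s[start:]

def remove_boxed(s: str) -> str:
    """Strip the surrounding \\boxed{...} wrapper."""
    return s[len("\\boxed{") : -1]

def normalize_final_answer(s: str) -> str:
    """Stand-in only; the real helper normalizes units, spacing, etc."""
    return s.strip()

doc = {"solution": "Thus the answer is $\\boxed{\\dfrac{9}{7}}$."}
raw_model_output = "The answer is $\\boxed{\\frac{9}{7}}$."

gold = parse(normalize_final_answer(remove_boxed(last_boxed_only_string(doc["solution"]))))
answer = parse(raw_model_output)
# Prints False: the gold reduces to parse("\\dfrac{9}{7}") -> [],
# which is exactly the edge case described below.
print(verify(gold, answer))
```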
However, this has some edge cases. For example, for a gold answer (after removing the \boxed{}), parsing fails:
parse("\\dfrac{9}{7}")
>>> []
but with the \boxed{} wrapper it works correctly:
parse("\\boxed{\\dfrac{9}{7}}")
>>> [9/7, '\\frac{9}{7}']
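One workaround that seems to follow from this: since `parse` handles the \boxed{} wrapper fine, feed it the still-boxed string (i.e., skip `remove_boxed`, or re-wrap before parsing). A sketch under that assumption; `wrap_boxed` here is a hypothetical helper, not part of lm_eval, lighteval, or math_verify:

```python
from math_verify import parse, verify

def wrap_boxed(expr: str) -> str:
    """Hypothetical helper: re-wrap an extracted answer in \\boxed{...}."""
    return "\\boxed{" + expr + "}"

# parse("\\dfrac{9}{7}") returns [] (as shown above), but the
# re-wrapped form parses correctly.
gold = parse(wrap_boxed("\\dfrac{9}{7}"))   # [9/7, '\\frac{9}{7}']
answer = parse("The answer is $\\boxed{\\frac{9}{7}}$.")
print(verify(gold, answer))                 # True
```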
I was looking at how lighteval does it. Would this approach generally work on most MATH tasks, or do y'all handle subtasks differently? Should we also normalize before parsing?
Would appreciate any thoughts!