
Long-Range-Arena Evaluation #49

@ClashLuke

Description

Currently, we only know that our model is better than the baseline because it reaches a lower loss in less training time. However, we could run benchmarks such as LRA to see how well our long-context model performs in a real-world scenario. While LRA doesn't exercise our capabilities ideally (unlike, for example, #5 and #9), it would still give us preliminary evaluation results on a well-known benchmark suite.
This issue tracks the progress of integrating our model into LRA, even though the integration should happen in a separate codebase.
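
For reference, a minimal sketch of what the integration could look like: LRA tasks are sequence-classification problems, so wrapping the model mostly amounts to putting a classification head on top of the long-context backbone. Everything below is illustrative, not the actual integration code; `LRAClassifier`, `backbone`, `hidden_dim`, and `num_classes` are hypothetical names, and a PyTorch-style interface is assumed (the reference LRA codebase itself is JAX/Flax).

```python
# Minimal sketch, assuming a PyTorch-style backbone that maps
# token ids to per-position hidden states. All names here are
# hypothetical placeholders, not code from this repository.
import torch
import torch.nn as nn


class LRAClassifier(nn.Module):
    """Wraps a long-context backbone for an LRA-style classification task:
    encode the full sequence, pool over positions, and classify."""

    def __init__(self, backbone: nn.Module, hidden_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        hidden = self.backbone(tokens)  # (batch, seq_len, hidden_dim)
        pooled = hidden.mean(dim=1)     # mean-pool over the sequence axis
        return self.head(pooled)        # (batch, num_classes) logits
```

Mean pooling is just one option; the reference LRA configs make the pooling strategy (e.g. CLS token vs. mean) configurable per task, so whichever matches the baseline setup should be used for a fair comparison.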

Metadata

Assignees

No one assigned

Labels

ML: Requires machine-learning knowledge (can be built up on the fly)
downstream: Changes code wrapping the core model
engineering: Software-engineering problems that don't require ML expertise
