Gerbench #1

Draft
jphme wants to merge 4 commits into main from gerbench

@jphme (Collaborator) commented Feb 11, 2025

Datasets:

https://huggingface.co/datasets/ellamind/gerbench_sentence_errors

https://huggingface.co/datasets/ellamind/gerbench_next_word

Target:

  • Improve/test the benchmark code
  • Implement a version of the next-word task that does not rely on multiple choice: sample the continuation instead (investigate how to properly handle correct answers that differ in token length)
  • Optional: also add a perplexity benchmark (based on the satz_bis_zum_letzten_wort column, i.e. the sentence up to its last word) and compare the (relative) results
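
One possible way to approach the token-length question and the perplexity comparison, as a rough sketch only: score each candidate continuation by its mean per-token log-probability rather than the raw sum, and derive perplexity from the same quantity. The function names and the log-prob values below are hypothetical and not taken from the benchmark code; in practice the per-token log-probs would come from the model under test.

```python
import math
from typing import Sequence


def sequence_logprob(token_logprobs: Sequence[float]) -> float:
    """Sum of per-token natural-log probabilities of a continuation."""
    return sum(token_logprobs)


def normalized_logprob(token_logprobs: Sequence[float]) -> float:
    """Mean per-token log-probability, so longer answers are not penalized."""
    if not token_logprobs:
        raise ValueError("continuation has no tokens")
    return sum(token_logprobs) / len(token_logprobs)


def perplexity(token_logprobs: Sequence[float]) -> float:
    """exp of the negative mean log-probability over the scored tokens."""
    return math.exp(-normalized_logprob(token_logprobs))


# Hypothetical values: the correct word is split into 3 tokens,
# the distractor is a single token.
candidates = {
    "correct": [-0.9, -0.2, -0.1],
    "distractor": [-1.0],
}

# Raw sum favors the shorter candidate; the length-normalized score does not.
raw_pick = max(candidates, key=lambda k: sequence_logprob(candidates[k]))
norm_pick = max(candidates, key=lambda k: normalized_logprob(candidates[k]))

print(raw_pick, norm_pick)
```

With these made-up numbers the raw sum prefers the one-token distractor, while the normalized score prefers the three-token correct answer. Whether per-token normalization (as opposed to, say, per-character or per-byte normalization) is the right choice for German is exactly the open question in the target above.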

Models to test/eval at start:

  • Mistral Nemo 12b
  • Llama 3.1 8b
  • Llama 3.2 3b
  • Qwen 2.5 7b / 14b
  • Optional: GPT-4o / GPT-4o-mini / Claude (if possible?)

