the result of Ultravox-v0.5-LLaMA-3.1-8B #11

@Gpwner

I have tested Ultravox-v0.5-LLaMA-3.1-8B too, but my test results are slightly different from yours, especially on the SD-QA dataset.

| | AlpacaEval | CommonEval | SD-QA | OpenBookQA | IFEval | AdvBench |
|---|---|---|---|---|---|---|
| | Open-Ended QA | Open-Ended QA | Reference-Based QA | Multiple-Choice QA | Instruction Following | Safety |
| Samples | 199 | 200 | 553 | 455 | 345 | 520 |
| Ultravox0.5 LLama3.1 8B Instruct | 4.75 | 4.08 | 72.42 | 69.01 | 68.05 | 98.84 |
