Hi,
I am trying to reproduce your Llama-3.1-8B results from the paper. I followed the exact steps in the README, using your Docker image and the vLLM engine, and this is what I see at a sequence length of 131072:
|       | vt    | avg (cwe, fwe)          | avg (qa_1, qa_2)      |
|-------|-------|-------------------------|-----------------------|
| paper | 70.4  | 36.2                    | 58.8                  |
| repro | 88.36 | avg (0.04, 53.4) = 26.72 | avg (71.4, 42.6) = 57 |
Could you perhaps share more detail on how your paper setup differs from the setup outlined in the README here?