InternVL benchmark result #2

@nhhviet98

First, thank you very much for your contribution. 💯 💯 💯

In MathVerse, you have shown that most MLLMs solve problems by relying on "Text Redundancy".

I saw that InternVL scales up the vision encoder to reduce the gap between visual and textual information, and it has also achieved Top 1 on MathVista.

Could you provide the benchmark results of InternVL on the MathVerse dataset? I think they would add useful evidence for your hypothesis.

Reference paper:

https://arxiv.org/pdf/2312.14238.pdf
