Arxiv10 dataset split and code issues #1

@LaviRoars

Hi Ashkan,

  1. I'm trying to reproduce the Arxiv10 test results for my own learning, but the dataset shared on your GitHub page does not specify the train (80,000), validation (10,000), and test (5,000) splits. The example in dataloader.py only covers IMDB.csv.

  2. There are also some issues with the code base. In trainer.py, df_embeddings is not defined anywhere. I also can't locate the multi-objective self-learning part that uses the similarities of embedding prototypes to train the Protoformer framework after fine-tuning. Could you please point me in the right direction?
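
To make the first point concrete, this is roughly the kind of fixed split I have in mind. It is only a minimal sketch: the sizes are the ones quoted above, while `split_indices`, the seed, and the `n_rows` value are my own placeholders and not anything taken from the repository, so the resulting partitions will not match the ones used for the reported results.

```python
import random


def split_indices(n_rows, n_train, n_val, n_test, seed=0):
    """Return disjoint (train, val, test) index lists drawn from range(n_rows).

    Hypothetical helper for illustration; not part of the Protoformer code base.
    """
    if n_train + n_val + n_test > n_rows:
        raise ValueError("split sizes exceed the number of rows")
    rng = random.Random(seed)  # local RNG so the split is reproducible
    indices = list(range(n_rows))
    rng.shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:n_train + n_val + n_test]
    return train, val, test


# n_rows is a placeholder; replace it with the actual row count of the CSV.
train_idx, val_idx, test_idx = split_indices(
    n_rows=100_000, n_train=80_000, n_val=10_000, n_test=5_000, seed=42
)
```

The index lists could then be used to select rows from the dataset file, but without the original split (or the seed used to create it) the test numbers would still not be directly comparable.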
