Open
Description
Hi @piegu!
Thanks for processing the DocLayNet dataset into smaller portions. It really helps with fast experimentation!
It was especially useful to have the byte stream of the PDFs in the dataset, so one does not have to download all those files and write a script to align them with the dataset.
This is a notebook where the field still existed:
https://github.com/piegu/language-models/blob/master/processing_DocLayNet_dataset_to_be_used_by_layout_models_of_HF_hub.ipynb
Is there a reason for removing this field?
It would be really great to have it back!
Thanks & all the best!