In lwm/vision_generation.py, max_input_length is capped at 128 tokens: img_enc, img = generate_first_frame(prompts, max_input_length=128). As a result, any longer prompt provided via 'scripts/run_sample_image.sh' is silently truncated to 128 tokens. The suggestion is to update the 'generate_first_frame' function to use the tokenizer's 'longest' padding mode, which sizes the batch to the longest prompt dynamically instead of padding/truncating to a fixed length, i.e. inputs = prefix_tokenizer(prompts, padding='longest', return_tensors='np'), as sketched below.
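
A minimal sketch of the change, assuming 'generate_first_frame' currently tokenizes with a Hugging Face tokenizer using a fixed max_length (the exact current call is an assumption; prefix_tokenizer and prompts follow the snippet above):

    # Assumed current form: pads/truncates every prompt to a fixed
    # length, silently dropping tokens beyond max_input_length.
    inputs = prefix_tokenizer(
        prompts,
        padding='max_length',
        truncation=True,
        max_length=max_input_length,
        return_tensors='np',
    )

    # Suggested form: pad only up to the longest prompt in the batch,
    # so no prompt is truncated.
    inputs = prefix_tokenizer(
        prompts,
        padding='longest',
        return_tensors='np',
    )

One trade-off worth noting: with padding='longest' the input shape varies from batch to batch, which can trigger recompilation in JIT-compiled JAX code, whereas a fixed max_length keeps shapes static.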