
computing the inference FLOPs #4

@twwwwx

Description


Hello,
Thanks for your excellent work! I was trying to run your model on the test set, but I have some questions about how you computed FLOPs.
To achieve the impressive acceleration rates reported in the paper, is it necessary to train an end-to-end sparsified model, or is running your fine-tuning code for several epochs enough? As mentioned in the paper, "In the training procedure, Transkimmer does not prune the hidden state tensors as it does in the inference time." So at inference time, by what means do you prune the hidden state tensors to switch the model into inference mode? Can I use torchfile directly on the trained Transkimmer model to compute the inference FLOPs?
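For context on the numbers I would expect: here is a rough back-of-envelope FLOPs estimate for one Transformer encoder layer as a function of how many tokens survive skimming. This is only a sketch of mine, not code from the Transkimmer repo, and the sizes (BERT-base-like hidden of 768, FFN of 3072) are placeholders:

```python
# Rough per-layer forward-pass FLOPs for a Transformer encoder layer,
# as a function of how many tokens survive skimming.
# Sizes are illustrative placeholders (BERT-base-like), not values
# taken from the Transkimmer paper.

def layer_flops(seq_len, hidden=768, ffn=3072):
    """Approximate forward-pass FLOPs of one encoder layer."""
    # Q/K/V + output projections: 4 matmuls of (seq_len, hidden) @ (hidden, hidden),
    # counting 2 FLOPs per multiply-accumulate
    proj = 4 * 2 * seq_len * hidden * hidden
    # Attention scores and weighted sum: 2 matmuls quadratic in seq_len
    attn = 2 * 2 * seq_len * seq_len * hidden
    # Feed-forward network: two matmuls through the ffn dimension
    ff = 2 * 2 * seq_len * hidden * ffn
    return proj + attn + ff

full = layer_flops(512)      # all 512 tokens kept
skimmed = layer_flops(128)   # only 128 tokens survive skimming
print(full / skimmed)        # per-layer speedup from pruning
```

The projection and FFN terms scale linearly with the kept sequence length and the attention term quadratically, which is why pruning hidden states at inference time directly translates into FLOPs savings.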
Looking forward to your reply.
