Open
Description
Hello, I’m training on Kinetics-600 with V100 GPUs, using 17-frame clips at 128×128 resolution, a latent size of 5×16×16, and a per-GPU batch size of 4. I’m puzzled because nothing I try speeds up training: using multiple nodes does not help, adding more GPUs does not help, reducing the size of the sub-dataset does not make it faster, and gradient accumulation actually makes it slower. Have you encountered this issue?
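One way to narrow this down might be to compare samples per second instead of time per step, and to check how much of each step is spent waiting for data. The sketch below is only an illustration under assumptions: `throughput`, `split_step_time`, and `train_step` are hypothetical names, not part of this repository, and the timing is only meaningful on a GPU if `train_step` blocks until the device has finished its work.

```python
import time


def throughput(step_seconds, per_gpu_batch, num_gpus, accum_steps=1):
    """Samples per second for one optimizer step.

    The effective batch is per_gpu_batch * num_gpus * accum_steps, so a
    constant step time with more GPUs still means higher throughput.
    """
    effective_batch = per_gpu_batch * num_gpus * accum_steps
    return effective_batch / step_seconds


def split_step_time(batches, train_step):
    """Return (seconds waiting for data, seconds in compute) over one pass.

    `batches` is any iterable (for example a data loader) and `train_step`
    is a hypothetical callable that runs one forward/backward pass.
    Assumption: train_step is synchronous; with asynchronous GPU kernels
    it would need to wait for the device before returning.
    """
    data_s = 0.0
    compute_s = 0.0
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is data-loading wait
        except StopIteration:
            break
        t1 = time.perf_counter()
        train_step(batch)  # time spent here is compute
        t2 = time.perf_counter()
        data_s += t1 - t0
        compute_s += t2 - t1
    return data_s, compute_s
```

If the data-wait share dominates, adding GPUs or nodes would not be expected to help, since every worker is waiting on input. With gradient accumulation, each optimizer step runs several forward/backward passes, so a longer step time is expected, and throughput is the fairer comparison.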