Hi,
Has anybody run into this situation? During training, at the validation step, the sample_ode function uses the standard duration predictor to predict the output length for the input text. Sometimes the predicted length is even smaller than the ctx_start value for some samples, which crashes the training process because of invalid context position slicing or indexing.
This bad prediction produces a shorter-than-normal "out_length" value, and when execution reaches the DiTRFE2ETTSMultiTaskBackbone forward function, this happens:
    for i in range(b):
        if ctx_start[i] + ctx_length[i] > l:  # Of course! ctx_start is already larger than the length l from the duration predictor.
            ctx_length[i] = l - ctx_start[i]  # Oh! Here we get a negative ctx_length value!
How can this be fixed, given that at this point we already have a z_t with a bad length?
Our training dataset samples have lengths ranging from 0.8 to 30 seconds.
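One workaround I am considering, as a minimal sketch only: raise the predicted length to a floor before z_t is created, and clip the context window defensively inside the forward pass. The helper names `clamp_out_length` and `clamp_context` and the `min_new_frames` parameter are hypothetical and not part of the repository, and I am assuming lengths are plain integer frame counts. Whether the floor should include ctx_length is also an assumption on my side.

```python
def clamp_out_length(predicted_length, ctx_start, ctx_length, min_new_frames=1):
    """Hypothetical helper: make sure the output length covers the whole
    context window plus at least `min_new_frames` frames to generate."""
    floor = ctx_start + ctx_length + min_new_frames
    return max(predicted_length, floor)


def clamp_context(ctx_start, ctx_length, l):
    """Hypothetical helper: clip the window [ctx_start, ctx_start + ctx_length)
    into [0, l) so that ctx_length can never become negative."""
    start = min(max(ctx_start, 0), l)
    length = max(0, min(ctx_length, l - start))
    return start, length


# A predicted length of 50 frames with ctx_start=80 would otherwise give
# ctx_length = 50 - 80 = -30 in the loop above.
safe_l = clamp_out_length(50, ctx_start=80, ctx_length=20)
start, length = clamp_context(80, 20, 50)
```

The first helper avoids the bad z_t length at the source, while the second only keeps the indexing valid when the length is already too short, so it hides the problem rather than fixing the duration prediction.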