Is it a seq2seq-like model that you want to implement?
I ran into the same problem.
It seems that during training, the decoder inputs also have to be sorted by length.
At evaluation time, however, we have no prior knowledge of the lengths of the sentences we want to generate, so that information is not available.
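For the sorting part, here is a minimal sketch of one possible workaround, assuming a PyTorch version recent enough that `pack_padded_sequence` accepts `enforce_sorted=False`. The sizes, the toy batch, and the use of `nn.GRU` are my own assumptions for illustration, not taken from the code in this issue.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
vocab_size, embed_dim, hidden_dim = 20, 8, 16

embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
decoder_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)

# A padded batch of decoder inputs whose lengths are NOT sorted.
decoder_input = torch.tensor([
    [2, 5, 7, 0, 0],
    [2, 4, 9, 6, 3],
    [2, 8, 0, 0, 0],
])
lengths = torch.tensor([3, 5, 2])  # must live on the CPU

# enforce_sorted=False lets PyTorch sort internally and remember the permutation.
packed = pack_padded_sequence(
    embedding(decoder_input), lengths, batch_first=True, enforce_sorted=False
)
packed_output, hidden = decoder_rnn(packed)

# Unpacking restores the original batch order; padded steps come back as zeros.
output, output_lengths = pad_packed_sequence(
    packed_output, batch_first=True, total_length=decoder_input.size(1)
)
print(output.shape, output_lengths)
```

With this, the decoder batch can stay in the same order as the encoder batch, so no manual sorting and un-sorting is needed during training.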
Also, it seems that a seq2seq-like decoder can only be implemented as an RNNLM. Is that true? (See the code below.)

            t = 0
            while(t < self.max_sequence_length-1):
                if t == 0:
                    input_sequence = Variable(torch.LongTensor([self.sos_idx] * batch_size), volatile=True)
                    if torch.cuda.is_available():
                        input_sequence = input_sequence.cuda()
                        outputs        = outputs.cuda()

                input_sequence  = input_sequence.unsqueeze(1)
                input_embedding = self.embedding(input_sequence) # b * 1 * e
                output, hidden  = self.decoder_rnn(input_embedding, hidden)
                logits          = self.outputs2vocab(output) # b * 1 * v
                outputs[:,t,:]  = nn.functional.log_softmax(logits, dim=-1).squeeze(1)  # b * v
                input_sequence  = self._sample(logits)
                t += 1

            outputs = outputs.view(batch_size, self.max_sequence_length, self.embedding.num_embeddings)
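On the unknown lengths at evaluation time, one common approach is to let the decoder emit an end-of-sequence token and record the length of each sentence as it finishes. The following is only a rough sketch under my own assumptions: greedy argmax instead of `self._sample`, a zero initial hidden state standing in for the encoder output, and hypothetical `eos_idx` / `pad_idx` values that do not appear in the snippet above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes and special-token indices, for illustration only.
vocab_size, embed_dim, hidden_dim = 20, 8, 16
pad_idx, sos_idx, eos_idx = 0, 1, 2
max_sequence_length, batch_size = 10, 4

embedding = nn.Embedding(vocab_size, embed_dim)
decoder_rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
outputs2vocab = nn.Linear(hidden_dim, vocab_size)

# Stand-in for the state that would normally come from the encoder.
hidden = torch.zeros(1, batch_size, hidden_dim)

generated = torch.full((batch_size, max_sequence_length), pad_idx, dtype=torch.long)
lengths = torch.full((batch_size,), max_sequence_length, dtype=torch.long)
finished = torch.zeros(batch_size, dtype=torch.bool)
input_sequence = torch.full((batch_size,), sos_idx, dtype=torch.long)

with torch.no_grad():  # replaces the old Variable(..., volatile=True)
    for t in range(max_sequence_length):
        input_embedding = embedding(input_sequence.unsqueeze(1))  # b * 1 * e
        output, hidden = decoder_rnn(input_embedding, hidden)     # b * 1 * h
        logits = outputs2vocab(output).squeeze(1)                 # b * v
        next_token = logits.argmax(dim=-1)                        # greedy choice
        # Sentences that already ended only receive padding from now on.
        next_token = next_token.masked_fill(finished, pad_idx)
        generated[:, t] = next_token
        # Record the length of every sentence that ends at this step.
        just_finished = (next_token == eos_idx) & ~finished
        lengths[just_finished] = t + 1
        finished |= just_finished
        if finished.all():
            break
        input_sequence = next_token

print(generated)
print(lengths)
```

The lengths are then a by-product of generation rather than something that has to be known in advance; sentences that never emit the end token are simply cut off at `max_sequence_length`.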
