Training LSTM for classification #6

@denull

I'm having trouble understanding this part of the example code:

// for example lets assume we have binary classification problem
// so the output of the LSTM are the log probabilities of the
// two classes. Lets first get the probabilities:
var prob1 = R.softmax(out1.o);
var target1 = 0; // suppose first input has class 0
cost += -Math.log(probs.w[ix_target]); // softmax cost function

// cross-entropy loss for softmax is simply the probabilities:
out1.dw = prob1.w;
// but the correct class gets an extra -1:
out1.dw[ix_target] -= 1;

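In particular, what's going on with the cost variable? It's neither declared nor used anywhere else. The snippet also refers to probs and ix_target, although the variables defined just above are prob1 and target1. And I don't understand why out1 is used for training – shouldn't it be the last output, out3?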

I'm trying to solve a similar problem – feed a sequence of input vectors to the model, and then retrieve a single output vector. But I'm unsure how to train the LSTM correctly in this case.
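To make the question concrete, here is a minimal sketch of what the quoted snippet appears to compute, applied only to the last step of a sequence. It uses plain JavaScript arrays and does not call the library: `softmax`, `softmaxCrossEntropy`, `outputs`, and `target` are hypothetical names and values made up for illustration, and whether the loss should be attached only to the final output is exactly the assumption being asked about.

```javascript
// Plain-JS stand-ins for illustration; nothing here is the library's API.

// Numerically stable softmax over an array of raw scores (logits).
function softmax(logits) {
  var max = Math.max.apply(null, logits);
  var exps = logits.map(function (v) { return Math.exp(v - max); });
  var sum = exps.reduce(function (a, b) { return a + b; }, 0);
  return exps.map(function (e) { return e / sum; });
}

// Cross-entropy cost for one example, plus its gradient with respect
// to the logits: the probabilities, with 1 subtracted at the target class.
function softmaxCrossEntropy(logits, target) {
  var probs = softmax(logits);
  var grad = probs.slice();
  grad[target] -= 1;
  return { cost: -Math.log(probs[target]), grad: grad };
}

// Hypothetical raw outputs of the network at three time steps (two classes).
var outputs = [[0.2, -0.1], [0.5, 0.3], [1.0, -1.0]];
var target = 0; // assumed class label for the whole sequence

// Accumulator for the total cost, declared explicitly here.
var cost = 0;

// Assumption: only the final step is scored, so only it receives a gradient.
var last = outputs[outputs.length - 1];
var result = softmaxCrossEntropy(last, target);
cost += result.cost;

console.log('cost:', cost, 'gradient on last logits:', result.grad);
```

In this sketch `cost` is only a running total for monitoring. The values that would be written into the output's gradient before backpropagation are the entries of `result.grad`.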
