Hi~
I'm reading your code and noticed that the hidden activation of the MLP is tanh, while the output activation is softplus:

```python
pred_fun, loglike_fun, parser = build_mlp(layer_specs, output_activation=softplus)
```

```python
def build_mlp(layer_sizes, activation=np.tanh, output_activation=lambda x: x):
    ...
    def predict(weights, X):
        cur_X = copy(X.T)
        for layer in range(len(layer_sizes) - 1):
            cur_W = parser.get(weights, ('weights', layer))
            cur_B = parser.get(weights, ('biases', layer))
            cur_Z = np.dot(cur_X, cur_W) + cur_B
            cur_X = activation(cur_Z)
        return output_activation(cur_Z.T)

    def log_likelihood(weights, X, y):
        y_hat = predict(weights, X)
        return mse(y.T, y_hat.T)
```

The output of tanh ranges from -1 to 1, so if softplus is applied to a tanh output, the final prediction is bounded above by softplus(1) ≈ 1.31. But can't the average path length sometimes be greater than that?