The tanh activation seems unable to fit the range of the average path length

Hi~
I'm reading your code and noticed that the MLP's hidden activation is tanh and the output activation is softplus:
```python
pred_fun, loglike_fun, parser = build_mlp(layer_specs, output_activation=softplus)
```
```python
def build_mlp(layer_sizes, activation=np.tanh, output_activation=lambda x: x):
......
    def predict(weights, X):
        cur_X = copy(X.T)
        for layer in range(len(layer_sizes) - 1):
            cur_W = parser.get(weights, ('weights', layer))
            cur_B = parser.get(weights, ('biases', layer))
            cur_Z = np.dot(cur_X, cur_W) + cur_B
            cur_X = activation(cur_Z)
        return output_activation(cur_Z.T)

    def log_likelihood(weights, X, y):
        y_hat = predict(weights, X)
        return mse(y.T, y_hat.T)
```
The output of tanh lies in (-1, 1), so if softplus is applied to a tanh output, the final prediction is bounded as well: it lies roughly in (0.31, 1.31), since softplus(1) = ln(1 + e) ≈ 1.31. But can't the average path length be larger than that?
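
To make the concern concrete, here is a small sketch of my own (not code from the repository) that compares two readings of the last layer. It assumes `softplus(x) = log(1 + exp(x))`, and the grid `zs` is purely illustrative. I also notice that the quoted `predict` returns `output_activation(cur_Z.T)` rather than using `cur_X`, so Variant B below may be what actually runs; if so, the output would not be bounded and my concern would not apply.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)); assumed definition of softplus.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Illustrative pre-activations of the last layer: -20.0, -19.9, ..., 20.0
zs = [i / 10.0 for i in range(-200, 201)]

# Variant A: softplus applied to the tanh output (the case my question assumes).
variant_a = [softplus(math.tanh(z)) for z in zs]

# Variant B: softplus applied to the raw pre-activation (cur_Z in the quoted code).
variant_b = [softplus(z) for z in zs]

min_a, max_a = min(variant_a), max(variant_a)
max_b = max(variant_b)

print("Variant A range:", min_a, max_a)  # stays within (ln(1 + 1/e), ln(1 + e))
print("Variant B max:  ", max_b)         # grows with the pre-activation
```

Under Variant A the prediction can never exceed ln(1 + e), whereas under Variant B it grows with the last layer's pre-activation. Could you confirm which of the two is intended?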
