Question about quantizer #51

@fake-learn

Description

Hello, I noticed that the quantization step is computed as `w_bar = tf.round(tf.stop_gradient(w_hard - w_soft) + w_soft)`. However, `tf.round` is non-differentiable (its gradient is zero almost everywhere), so wrapping the whole expression in it blocks gradients from backpropagating into the encoder, and the encoder parameters are never updated during training. I believe the correct operation is `w_bar = tf.stop_gradient(w_hard - w_soft) + w_soft`, which evaluates to `w_hard` in the forward pass while letting the gradient flow through `w_soft`.
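To make the difference concrete, here is a minimal sketch of the straight-through estimator using a tiny forward-mode autodiff class (this is illustrative code, not the repository's actual implementation; `Dual`, `stop_gradient`, and `round_op` are stand-ins I wrote to mimic the TensorFlow ops):

```python
class Dual:
    """Tiny forward-mode AD value: a primal value plus a tangent (derivative)."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)
    def __sub__(self, other):
        return Dual(self.val - other.val, self.dot - other.dot)

def stop_gradient(x):
    # mimics tf.stop_gradient: same value, zero derivative
    return Dual(x.val, 0.0)

def round_op(x):
    # mimics tf.round: derivative is 0 almost everywhere
    return Dual(round(x.val), 0.0)

# seed the tangent on w_soft, i.e. we differentiate w.r.t. w_soft
w_soft = Dual(0.7, 1.0)
w_hard = Dual(1.0, 0.0)

# straight-through form: forward value is w_hard, gradient is 1 (flows to encoder)
w_bar = stop_gradient(w_hard - w_soft) + w_soft
print(w_bar.val, w_bar.dot)          # 1.0 1.0

# wrapping the result in round: same forward value, but the gradient collapses to 0
w_bar_round = round_op(stop_gradient(w_hard - w_soft) + w_soft)
print(w_bar_round.val, w_bar_round.dot)  # 1 0.0
```

Both expressions produce the hard value `1.0` in the forward pass, but only the first one carries a nonzero derivative with respect to `w_soft`, which is exactly what the encoder needs for its parameters to update.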
