v_loss += (v - R) ** 2 / 2

But the original paper just calculates the derivative of (V - R)^2, right?
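
For what it's worth, here is a minimal sketch comparing the gradients of the two forms. It assumes a scalar value estimate `v` and a fixed return `R`; the helper names (`grad_half_squared`, `grad_squared`, `numeric_grad`) are hypothetical and not taken from the code under discussion.

```python
def grad_half_squared(v, R):
    # Analytic derivative of (v - R) ** 2 / 2 with respect to v.
    return v - R


def grad_squared(v, R):
    # Analytic derivative of (v - R) ** 2 with respect to v.
    return 2 * (v - R)


def numeric_grad(f, v, eps=1e-6):
    # Central finite difference, used only as a sanity check.
    return (f(v + eps) - f(v - eps)) / (2 * eps)


# Illustrative values (assumed, not from the original code).
v, R = 1.5, 0.5

half = numeric_grad(lambda x: (x - R) ** 2 / 2, v)
full = numeric_grad(lambda x: (x - R) ** 2, v)

print(half, grad_half_squared(v, R))
print(full, grad_squared(v, R))
```

The two gradients differ only by a constant factor of 2, so both losses have the same minimizer; the `/ 2` in effect rescales the step size for the value loss.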