In the code snippet 'train_simple_network', the comment on optimizer.step() reads "updates all the parameters theta(k+1) = theta(k) eta gradient". The minus sign is missing: it should be theta(k+1) = theta(k) - eta * gradient, where eta is the learning rate.
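To make the sign in the corrected rule concrete, here is a minimal sketch in plain Python. It assumes plain gradient descent with a learning rate eta and no momentum or weight decay; the optimizer actually used in 'train_simple_network' may do more than this. The helper name sgd_step and the toy objective f(theta) = theta^2 are hypothetical and only for illustration.

```python
def sgd_step(theta, grad, eta):
    """Apply one update: theta(k+1) = theta(k) - eta * gradient, per element."""
    return [t - eta * g for t, g in zip(theta, grad)]


# Hypothetical toy problem: minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta = [4.0]
eta = 0.25  # learning rate (assumed value for the illustration)

for _ in range(3):
    grad = [2.0 * t for t in theta]     # gradient of theta^2 at the current theta
    theta = sgd_step(theta, grad, eta)  # subtracting moves theta toward the minimum at 0
```

With the minus sign, each step moves theta toward the minimum at 0. Without it, the update would not be a descent step.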