What are ``positive_document`` and ``negative_document`` in the training data format when training with the 20 Newsgroups dataset?