I noticed several giant paragraphs in the river from Thompson. It turns out they are paragraph items (not HTML or markdown), but each one holds three paragraphs of text inside a single JSON paragraph entity. The text is separated by newline characters, which I'm not processing, so the three paragraphs run together as one.
See: http://thompson.fed.wiki/view/welcome-visitors/view/moksha
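One possible fix is to split the text of each paragraph item on newlines before rendering. The sketch below is only an illustration: it assumes each item is a JSON object with `type` and `text` fields, and the function name `split_paragraph_item` and the sample item are hypothetical, not taken from the actual river code.

```python
import json


def split_paragraph_item(item: dict) -> list[str]:
    """Return the text of a paragraph item as a list of separate paragraphs.

    Assumption: the item has "type" and "text" keys. Items of any other
    type are returned untouched as a single-element list.
    """
    text = item.get("text", "")
    if item.get("type") != "paragraph":
        return [text]
    # splitlines() handles \n, \r\n and \r; blank pieces are dropped so that
    # consecutive newlines do not produce empty paragraphs.
    return [part.strip() for part in text.splitlines() if part.strip()]


# Hypothetical item shaped like the ones described above:
# one JSON paragraph entity holding three newline-separated paragraphs.
sample = json.loads(
    '{"type": "paragraph", "id": "abc123", "text": "First.\\nSecond.\\n\\nThird."}'
)
paragraphs = split_paragraph_item(sample)
```

Each string in `paragraphs` could then be rendered as its own paragraph in the river instead of one giant block.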