adsabs/ADSIngestParser#83 raised the issue of whether to capture the language of the overall document in the fulltext.language value of Document.json. In JATS at least, the attribute @xml:lang isn't declared within the body tag itself, although it is declared in subelements like caption.
If the metadata itself declares a language (for example in article-meta), we should try to capture that within Document.json, but I don't think we should populate the language key-value pair under fulltext with a language declaration that sits higher in the document hierarchy.
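To make the distinction concrete, here is a minimal sketch of the behaviour I have in mind. It is not the parser's actual code: the function name `extract_languages`, the top-level `document_language` key, and the sample XML (which assumes the language is declared on the root article element) are all hypothetical, used only for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def extract_languages(jats_xml: str) -> dict:
    """Hypothetical helper: record the document-level and body-level
    languages separately, without letting one fall back to the other."""
    root = ET.fromstring(jats_xml)
    body = root.find("body")
    return {
        # Language declared on the root element (assumed location).
        "document_language": root.get(XML_LANG),
        "fulltext": {
            # Only populated if <body> itself carries xml:lang;
            # no inheritance from ancestors, no promotion from subelements.
            "language": body.get(XML_LANG) if body is not None else None,
        },
    }


# Hypothetical sample: language on <article> and on a <caption>, none on <body>.
SAMPLE = """<article xml:lang="en">
  <front><article-meta/></front>
  <body>
    <p>Some text.</p>
    <fig><caption xml:lang="fr"><p>Une figure.</p></caption></fig>
  </body>
</article>"""

result = extract_languages(SAMPLE)
```

With this sample, the document-level language is captured, while the fulltext language stays empty because neither the declaration on the root element nor the one on caption is applied to body.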