peterhs73 edited this page Feb 1, 2021 · 2 revisions

The RefParse package provides a unified API over the crossref REST API and the arXiv API so that custom citation templates can be written once for both sources. The tables below list the supplied values; every value defaults to an empty string when it is not available. Not all metadata are included; more may be adopted in the future if necessary.

API description

Basic reference information

| RefParse API | crossref API | arXiv API | type |
|---|---|---|---|
| title | <title> | <title> | string (plain string) |
| title_latex | <title> | <title> | string (latex string) |
| title_html | <title> | <title> | string (html string) |
| author | | | list [surname, givenname] |
| abstract | | | string |
| journal_title_full | <full_title> | | string |
| journal_title_abbrev | <abbrev_title> | | string |
| journal_pages | | | list [first_page, last_page] |
| journal_volume | | | string |
| journal_issue | | | string |
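As a rough illustration of how these values might feed a custom citation template, the sketch below fills a format string from a metadata dictionary. The dictionary contents, the `format_citation` helper, and the template string are hypothetical and are not part of RefParse; the sketch also assumes that `author` is a list of `[surname, givenname]` pairs.

```python
from collections import defaultdict

# Hypothetical metadata using the field names from the table above.
# A defaultdict(str) mimics "missing values default to an empty string".
metadata = defaultdict(str, {
    "title": "An example article",
    "author": [["Doe", "Jane"], ["Smith", "John"]],  # assumed pair layout
    "journal_title_abbrev": "J. Ex.",
    "journal_pages": ["101", "110"],
    "print_year": "2020",
})

def format_citation(meta):
    """Fill a simple citation template (hypothetical helper)."""
    # Join authors as "givenname surname"; an empty value yields "".
    authors = ", ".join(f"{given} {surname}" for surname, given in meta["author"])
    # Join the non-empty page values with a hyphen.
    pages = "-".join(page for page in meta["journal_pages"] if page)
    fields = defaultdict(str, meta)
    fields.update(authors=authors, pages=pages)
    template = "{authors}, {title}, {journal_title_abbrev}, {pages} ({print_year})"
    # format_map looks keys up in the defaultdict, so absent keys become "".
    return template.format_map(fields)

citation = format_citation(metadata)
print(citation)
```

Using `format_map` with a `defaultdict(str)` keeps the template from raising `KeyError` when a field is absent, which matches the empty-string default described above.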

Some article titles contain html tags (subscripts, superscripts, etc.), so three title values are provided:

  • title: plain string with all html tags stripped, using .get_text(strip=True) from the BeautifulSoup package.
  • title_latex: all html tags converted to latex on a best-effort basis; a tag that cannot be converted is removed.
  • title_html: the html string is retained, with whitespace treated as closely as possible to how a browser would render it.
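The three title variants can be sketched with the standard library alone. This is only an approximation of the idea: RefParse itself uses BeautifulSoup, and the `LATEX_WRAPPERS` mapping, the `TitleConverter` class, and `convert_title` are hypothetical names chosen for illustration. Whitespace normalization for `title_html` is not attempted here.

```python
from html.parser import HTMLParser

# Hypothetical tag-to-latex mapping; the rules RefParse applies may differ.
LATEX_WRAPPERS = {
    "sub": ("$_{", "}$"),
    "sup": ("$^{", "}$"),
    "i": ("\\textit{", "}"),
    "b": ("\\textbf{", "}"),
}

class TitleConverter(HTMLParser):
    """Collect plain-text and latex versions of an html title."""

    def __init__(self):
        super().__init__()
        self.plain_parts = []
        self.latex_parts = []

    def handle_starttag(self, tag, attrs):
        # Known tags open a latex wrapper; unknown tags are dropped.
        if tag in LATEX_WRAPPERS:
            self.latex_parts.append(LATEX_WRAPPERS[tag][0])

    def handle_endtag(self, tag):
        if tag in LATEX_WRAPPERS:
            self.latex_parts.append(LATEX_WRAPPERS[tag][1])

    def handle_data(self, data):
        # Text content goes to both outputs unchanged.
        self.plain_parts.append(data)
        self.latex_parts.append(data)

def convert_title(title_html):
    parser = TitleConverter()
    parser.feed(title_html)
    parser.close()
    return {
        "title": "".join(parser.plain_parts).strip(),
        "title_latex": "".join(parser.latex_parts).strip(),
        "title_html": title_html,
    }

titles = convert_title("Growth of MoS<sub>2</sub> <span>films</span>")
print(titles["title"])        # Growth of MoS2 films
print(titles["title_latex"])  # Growth of MoS$_{2}$ films
```

The `<span>` tag has no entry in the mapping, so it is removed from the latex output while its text is kept, mirroring the "remove the tag if it cannot be converted" behavior.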

Publication dates

Ideally, publication dates would be converted directly to datetime objects. However, some journals do not provide the month and day in their crossref metadata, so the year, month and day are supplied as separate strings.

RefParse sorts dates into two categories, online and print. For arXiv articles, the <updated> time is used as the online date (rather than the <published> date). For crossref articles, <publication_date media_type="online"> is converted to the online date and <publication_date media_type="print"> is converted to the print date.

| RefParse API | crossref API | arXiv API | type |
|---|---|---|---|
| online_year | <publication_date media_type="online"> | | string |
| online_month | <publication_date media_type="online"> | | string |
| online_day | <publication_date media_type="online"> | | string |
| print_year | <publication_date media_type="print"> | | string |
| print_month | <publication_date media_type="print"> | | string |
| print_day | <publication_date media_type="print"> | | string |
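Because the date parts arrive as strings that may be empty, a caller who wants a date object has to handle incomplete dates. The sketch below shows one way to do that; the `parse_date` helper and the sample dictionary are hypothetical, and it assumes the month and day are numeric strings such as "02".

```python
from datetime import date

def parse_date(metadata, prefix):
    """Return a date for the 'online' or 'print' fields, or None if incomplete.

    Hypothetical helper; assumes numeric year, month and day strings.
    """
    year = metadata.get(f"{prefix}_year", "")
    month = metadata.get(f"{prefix}_month", "")
    day = metadata.get(f"{prefix}_day", "")
    # Any empty part means the full date is not available.
    if not (year and month and day):
        return None
    try:
        return date(int(year), int(month), int(day))
    except ValueError:
        # Non-numeric or out-of-range values cannot form a date.
        return None

# Sample values: a complete online date and a print date with only a year.
sample = {
    "online_year": "2021", "online_month": "02", "online_day": "01",
    "print_year": "2021", "print_month": "", "print_day": "",
}

online_date = parse_date(sample, "online")
print_date = parse_date(sample, "print")
```

Here `online_date` is a full `date` object, while `print_date` is `None` because the month and day are empty; a template can still fall back to `print_year` on its own.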

Additional

Additional values supply reference identifiers and flags.

| RefParse API | return value (crossref) | return value (arXiv) | type |
|---|---|---|---|
| url | url | url | string |
| reference | doi | arXiv ID | string |
| ref_type | "doi" | "arXiv_ID" | string |
| doi | doi | | string |
| arXiv ID | | arXiv ID | string |
| has_publication | True | False | Boolean |
| has_print | True / False | False | Boolean |

  • has_print indicates whether the journal has a print version. Returns True if <publication_date media_type="print"> exists and False otherwise.
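The flags can be used to decide which fields a template should read. The following is a minimal sketch under the assumption that the values are available in a dictionary keyed by the names above; `display_year`, `source_label`, and the sample dictionaries are hypothetical.

```python
def display_year(meta):
    """Prefer the print year when a print version exists, else the online year."""
    if meta.get("has_print"):
        return meta.get("print_year", "")
    return meta.get("online_year", "")

def source_label(meta):
    """Build a short identifier string from ref_type and reference."""
    kind = "doi" if meta.get("ref_type") == "doi" else "arXiv"
    return f"{kind}:{meta.get('reference', '')}"

# Hypothetical sample values; the identifiers are placeholders, not real records.
journal_article = {
    "ref_type": "doi", "reference": "10.0000/example",
    "has_publication": True, "has_print": True,
    "print_year": "2020", "online_year": "2019",
}
preprint = {
    "ref_type": "arXiv_ID", "reference": "0000.00000",
    "has_publication": False, "has_print": False,
    "online_year": "2021",
}

journal_year = display_year(journal_article)
preprint_year = display_year(preprint)
```

Checking `has_print` first avoids reading print fields that are always empty for arXiv entries.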
