The InPhO Corpus Builder matches a plaintext bibliography to volumes in the HathiTrust.
- Install anystyle.io and Ruby dependencies by following the directions:

      sudo apt-get install ruby-dev
      gem install --user-install anystyle

  Note: you might need to add the ~/.gem/ruby/2.3.0/bin dir to your PATH. The gem install will tell you.

- Install rython using distribute:

      pip install rython

- Use parse.py to parse a file to the JSON format for the browser:

      python parse.py FILENAME

- Launch the Corpus Builder:

      python server.py -p 9024

- Open the Corpus Builder in a browser: http://localhost:9024/

- When finished, use extractids.py to create a file with 1 HathiTrust ID per line for use with corpus download tools:

      python extractids.py www/out.json
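
For readers curious about the final step, the sketch below shows one way a JSON export could be reduced to one ID per line. It is not the actual extractids.py: the layout of www/out.json is not described here, so the code assumes a list of records that each carry a hypothetical "htid" key, and the sample IDs are made-up placeholders.

```python
import json


def extract_ids(records, key="htid"):
    """Collect non-empty values stored under `key`, keeping first-seen order.

    Both the record layout and the "htid" key are assumptions made for
    illustration; adjust them to match the real contents of out.json.
    """
    seen = set()
    ids = []
    for record in records:
        value = record.get(key)
        # Skip unmatched entries (missing or null ID) and repeated volumes.
        if value and value not in seen:
            seen.add(value)
            ids.append(value)
    return ids


# Made-up sample data standing in for the exported JSON file.
sample = json.loads("""
[
  {"citation": "First entry",     "htid": "xxx.0001"},
  {"citation": "Unmatched entry", "htid": null},
  {"citation": "Second entry",    "htid": "xxx.0002"},
  {"citation": "Repeated volume", "htid": "xxx.0001"}
]
""")

ids = extract_ids(sample)

# One ID per line, the shape that download tools typically expect.
print("\n".join(ids))
```

Dropping duplicates and empty matches is a design choice in this sketch, on the reasoning that a download list only needs each volume once.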