This repository consists primarily of rake tasks that create the data structures Mïmis consumes.
The heart of the program is a Neo4j database with the Awards and Series information from the Internet Speculative Fiction Database.
The work of populating the graph and writing to IPFS is done by various rake tasks.
    docker run --name argus -p7474:7474 -p7687:7687 -v $HOME/neo4j/data:/data -v $HOME/neo4j/logs:/logs --env NEO4J_AUTH=neo4j/neo4j2 neo4j:latest

docker run is used the first time only, subsequently use docker start argus

    git clone https://github.com/dhappy/argus
    cd argus
    rake neo4j:migrate
    alias isotime='date +%Y-%m-%d@%H:%M:%S%:z'
    function rlog() { rake $1 | tee log/$1.$(isotime).log; }
    screen
    rlog isfdb:awards
    ⌘^a c
    rlog isfdb:series
    ⌘^a c
    rlog isfdb:covers
    ⌘^a d
    screen -r # after many hours have passed and see how much data has been integrated into the graph.
    rake export:awards # after everything is loaded
Neo4j has an interactive console you can access by visiting http://localhost:7474.
The isfdb:awards task assumes that a dump of the Internet Speculative Fiction Database is loaded into MySQL in the isfdb database.
It saves the award year, category, and books into the graph. The format of the graph is:
(:Award)-[:IN]->(:Year)-[:FOR]->(:Category)-[:NOMINEE]->(:Book|:Movie)
- There is a result property on the Nominated relation that is either:
  - The number that they placed in the competition.
  - A text string like "Not on Ballot: Insufficient Nominations" describing a special situation.
  - NULL if the order is unspecified.
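The three cases for the result property can be sketched in Ruby as below. This is only an illustration: describe_result is a hypothetical helper, not part of the repository, and it assumes a placement may arrive either as an Integer or as a string of digits, since the stored type is not stated here.

```ruby
# Hypothetical helper (not from the repository): classify the `result`
# property read from a nomination relation.
def describe_result(result)
  case result
  when nil
    # NULL in the graph: the order is unspecified.
    'order unspecified'
  when Integer
    "placed #{result}"
  when /\A\d+\z/
    # Assumption: a placement may also be stored as a string of digits.
    "placed #{result.to_i}"
  else
    # Any other text describes a special situation.
    "special: #{result}"
  end
end
```

For example, describe_result('Not on Ballot: Insufficient Nominations') falls through to the special-situation branch, while describe_result(nil) reports an unspecified order.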
The isfdb:series task saves the series nesting, contents, and order into the graph. The format is:
(:Series)-[:CONTAINS*]->(:Series)-[:CONTAINS]->(:Book|:Movie)<-[:CREATED]-(:Creators)
- Creators represents all the creators for a work. Names are joined by an & sign because the uniqueness constraint doesn't work with arrays.
- There is a rank associated with Contains relations:
  MATCH (s:Series)-[c:CONTAINS]->(b:Book) RETURN s ORDER BY c.rank
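As a rough illustration of the joined Creators value, the sketch below builds a single string from a list of names. creators_key is a hypothetical helper, and the exact separator (a space on each side of the &) and the absence of sorting are assumptions; the repository may format the value differently.

```ruby
# Hypothetical helper (not from the repository): collapse a list of creator
# names into one string, so a uniqueness constraint can apply to a scalar
# property instead of an array.
def creators_key(names)
  # Assumption: names are trimmed, blanks dropped, and joined with " & ".
  names.map(&:strip).reject(&:empty?).join(' & ')
end
```

Because the names are not sorted in this sketch, the same set of creators listed in a different order would produce a different key.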
The isfdb:covers task saves each cover's ISBN and image URL into the graph. The format is:
(:Book)-[:PUBLICATION]->(:Version)-[:COVER]->(:Cover)
- The ISBN uniquely identifies a version.
Finds Content nodes with a URL but no IPFS ID, then downloads the URL and inserts it into IPFS. This works in conjunction with isfdb:covers to collect the cover images referenced in the database.
For books without a -[:REPO]-> link, check the directory ../.../trainpacks/ for files matching the pattern *#{author}*#{title}* or *#{title}*#{author}*.
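The two patterns above can be checked against a file name with Ruby's File.fnmatch, roughly as follows. candidate? is a hypothetical helper, and this sketch assumes the match is case-sensitive and applied to the bare file name rather than a full path.

```ruby
# Hypothetical helper (not from the repository): true when a file name
# matches either *author*title* or *title*author*.
def candidate?(filename, author, title)
  patterns = ["*#{author}*#{title}*", "*#{title}*#{author}*"]
  # File.fnmatch treats *, ?, [ and \ in the pattern as glob syntax, so an
  # author or title containing those characters would need escaping first.
  patterns.any? { |pattern| File.fnmatch(pattern, filename) }
end
```

Usage might look like candidate?('Ringworld (Larry Niven).zip', 'Niven', 'Ringworld'), which matches through the second pattern.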
The page has an ⏩ Injest ⏭ button for each found file that will copy the given file to ../.../book/by/#{author}/#{title}/.
Zip and rar files are uncompressed. If there is a single (html|epub|rtf|mobi|lit) file it is renamed to index.#{ext}.
index.htm is renamed to index.html, which has an acceptably small chance of breaking a multipage document.
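The renaming rules can be sketched as a pure function over the file names in an extracted directory. index_renames and DOCUMENT_EXTENSIONS are hypothetical names, and the sketch reads "a single file" as exactly one file across all five extensions, compared case-insensitively; both readings are assumptions.

```ruby
# Extensions named in the text above.
DOCUMENT_EXTENSIONS = %w[html epub rtf mobi lit].freeze

# Hypothetical helper (not from the repository): return a hash of
# { old_name => new_name } describing the renames to perform.
def index_renames(filenames)
  renames = {}
  documents = filenames.select do |name|
    DOCUMENT_EXTENSIONS.include?(File.extname(name).delete_prefix('.').downcase)
  end
  # Assumption: "single" means exactly one document file of any listed type.
  if documents.size == 1
    name = documents.first
    renames[name] = "index#{File.extname(name).downcase}"
  end
  # index.htm always becomes index.html.
  renames['index.htm'] = 'index.html' if filenames.include?('index.htm')
  # Drop no-op entries such as index.html => index.html.
  renames.reject { |old_name, new_name| old_name == new_name }
end
```

The sketch only plans the renames and does not touch the disk. It also does not resolve collisions, for instance a lone book.html sitting next to index.htm would map both to index.html.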