This project aims to uncover the trends and topics of the annual Cognitive Science Society conference using dynamic topic modeling.
Paper: Topics and Trends in Cognitive Science (2000-2017) - to be published in the Proceedings of the 40th Annual Conference of the Cognitive Science Society.
Website: Find similar papers!
Download PDFs from Archives of the Cognitive Science Society Conference Proceedings
Copy PDFs into text_data_new/
Process PDFs:
generate_dtm_input.py
- input: PDFs in text_data_new/volume_{}/
- output: dtm_input_data/dtm_input-mult.dat and dtm_input_data/dtm_input-seq.dat
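For orientation, the two files follow the plain-text input format read by the DTM code. Each line of the -mult.dat file describes one document as `unique_word_count id:count id:count ...`, with documents ordered by time slice. The -seq.dat file gives the number of time slices on its first line, followed by the number of documents in each slice. The sketch below produces that layout from already tokenized documents. It is not the code in generate_dtm_input.py: the function name `build_dtm_input`, the toy documents, and the way word ids are assigned are assumptions for illustration, and PDF text extraction and preprocessing are left out.

```python
from collections import Counter


def build_dtm_input(docs_by_slice):
    """Return (mult_lines, seq_lines, vocab) for tokenized documents.

    docs_by_slice: list of time slices in chronological order; each slice is
    a list of documents, and each document is a list of tokens.
    (Hypothetical helper, not part of this repository.)
    """
    vocab = {}                              # token -> integer id, in order of first appearance
    mult_lines = []
    seq_lines = [str(len(docs_by_slice))]   # first line: number of time slices
    for docs in docs_by_slice:
        seq_lines.append(str(len(docs)))    # then one document count per slice
        for tokens in docs:
            # Count occurrences per word id, assigning new ids as tokens appear.
            counts = Counter(vocab.setdefault(tok, len(vocab)) for tok in tokens)
            fields = [str(len(counts))]     # number of unique words in the document
            fields += [f"{word_id}:{n}" for word_id, n in sorted(counts.items())]
            mult_lines.append(" ".join(fields))
    return mult_lines, seq_lines, vocab


# Toy data: two time slices, with two documents and one document respectively.
toy_slices = [
    [["memory", "model", "memory"], ["model", "bayes"]],
    [["bayes", "learning", "learning", "memory"]],
]
mult_lines, seq_lines, vocab = build_dtm_input(toy_slices)
print("\n".join(mult_lines))
print("\n".join(seq_lines))
```

Writing `"\n".join(mult_lines)` and `"\n".join(seq_lines)` to the two .dat paths would give files in the same layout as the outputs listed above.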
Use our DTM version: alexanderrich/dtm
Ask the contact person for your local high-performance computing cluster how to set everything up.
Run the script run_all.s
@Alex: How did you create dtm_processed_output.p?
Exporting model output into CSV tables:
pickle_to_csv.ipynb
- input: output/dtm_processed_output.p
- output: output/csv/year_doc_topic.csv, output/csv/topicnames.csv, and output/csv/year_topic_word.csv
Exporting original data into CSV tables:
doc_word_freq.ipynb
- input: dtm_input_data/dtm_input-mult.dat
- output: output/csv/doc_word_freq.csv
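As a rough illustration of this step, the sketch below parses lines in the -mult.dat layout (`unique_word_count id:count ...`) into one row per document and word, then renders them as CSV text. It is not the code in doc_word_freq.ipynb: the function names and the column names `doc_id`, `word_id`, and `freq` are assumptions, and the actual columns of doc_word_freq.csv may differ.

```python
import csv
import io


def mult_to_rows(mult_lines):
    """Yield (doc_id, word_id, freq) for every id:count pair.

    doc_id is the zero-based line number, so blank lines are skipped
    without shifting the ids of later documents.
    (Hypothetical helper, not part of this repository.)
    """
    for doc_id, line in enumerate(mult_lines):
        fields = line.split()
        if not fields:
            continue
        n_unique, pairs = int(fields[0]), fields[1:]
        if n_unique != len(pairs):
            raise ValueError(f"line {doc_id}: expected {n_unique} pairs, got {len(pairs)}")
        for pair in pairs:
            word_id, freq = pair.split(":")
            yield doc_id, int(word_id), int(freq)


def rows_to_csv(rows):
    """Render rows as CSV text with an assumed header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["doc_id", "word_id", "freq"])  # assumed column names
    writer.writerows(rows)
    return buf.getvalue()


# Toy input in the -mult.dat layout (not real project data).
toy_lines = ["2 0:2 1:1", "1 3:4"]
rows = list(mult_to_rows(toy_lines))
print(rows_to_csv(rows))
```

To run this on the real file, pass an open file handle for dtm_input_data/dtm_input-mult.dat to `mult_to_rows` and write the returned text to the output path.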
See the R scripts in the folder R. They save all figures into the folder figures.
Because this project is exploratory, several scripts and figures did not make it into the paper.