Entity and relation annotation effort from a 2013-2014 NLP project, carried out during the first years of my Ph.D. at the Faculty of Computer Science, Iasi.
The QuoVadis.zip archive contains QuoVadis.xml, an annotated corpus of the famous novel Quo Vadis in Romanian. It carries many automatically added linguistic annotations (sentence segmentation, tokenization, part-of-speech tagging, lemmatization, and noun phrase chunking) and, most importantly, manual annotation of:
- 24636 Entities
- 22301 Referential relations
- 227 Kinship relations
- 229 Affect relations
- 298 Social relations
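
To see which annotation layers are present before writing a full parser, one option is to tally the element names in QuoVadis.xml after unzipping it. The sketch below is a minimal example and is not part of the Aggregator or CorefGraph projects. The class name ElementTally is hypothetical, and the code makes no assumption about the tag names used in the corpus: it only counts whatever elements it finds. Whether those counts line up with the totals listed above depends on how entities and relations are encoded in the file, which this README does not describe.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the original repository.
public class ElementTally {

    /** Counts how many times each element name occurs in an XML stream. */
    public static Map<String, Integer> tally(InputStream in) throws XMLStreamException {
        // TreeMap keeps the output sorted by element name.
        Map<String, Integer> counts = new TreeMap<>();
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            // Stream through the file so a large corpus is never loaded into memory at once.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    counts.merge(reader.getLocalName(), 1, Integer::sum);
                }
            }
        } finally {
            reader.close();
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Assumed default path: QuoVadis.xml extracted into the working directory.
        String path = args.length > 0 ? args[0] : "QuoVadis.xml";
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            tally(in).forEach((name, n) -> System.out.println(name + "\t" + n));
        }
    }
}
```

Running `java ElementTally.java QuoVadis.xml` (Java 11 or later) prints one line per element name with its count. A streaming reader is used here instead of a DOM parser because a fully annotated novel can be large.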
The Aggregator and CorefGraph directories contain the most important Java projects among my personal contributions to the project.
For more details, 2014 LPCL.pdf contains a book chapter describing the annotation process.