SuperRainbowNLP (RNLP) is a versatile tool for several natural language processing tasks, including supervised named entity recognition and relationship extraction.
RNLP is based on Hibernate and can easily be configured to use different data sources (for example, MySQL). Hibernate settings are in hibernate.cfg.xml; make sure connection.url, connection.username and connection.password are set correctly. Other application settings are in configuration.conf. Make sure the paths in that file are up to date and that they exist.
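As a rough illustration of the three connection properties named above, a MySQL setup in hibernate.cfg.xml might look like the fragment below. This is only a sketch: the host, port, database name (rnlp), user name and password are placeholder assumptions, and the driver class depends on the MySQL Connector/J version in use. The file shipped with RNLP may contain additional properties and mapping entries, so it is safer to edit the existing values there than to replace the file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-configuration PUBLIC
    "-//Hibernate/Hibernate Configuration DTD 3.0//EN"
    "http://www.hibernate.org/dtd/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
  <session-factory>
    <!-- Driver and dialect for MySQL; change both when using another database.
         Connector/J 8.x and later use com.mysql.cj.jdbc.Driver instead. -->
    <property name="connection.driver_class">com.mysql.jdbc.Driver</property>
    <property name="dialect">org.hibernate.dialect.MySQLDialect</property>

    <!-- Placeholder values (assumptions): adjust host, port, database and credentials -->
    <property name="connection.url">jdbc:mysql://localhost:3306/rnlp</property>
    <property name="connection.username">rnlp_user</property>
    <property name="connection.password">change_me</property>
  </session-factory>
</hibernate-configuration>
```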
The following are instructions on how to use RNLP for different applications.
For relationship extraction, the system first needs to load the text documents and the entity annotations.
- Load test/train documents into the framework using 'SimpleDocumentLoader', or create a new document loader by implementing the 'IDocumentAnalyzer' interface (for test/train sets).
- Load annotations by creating instances of 'Phrase' and 'PhraseLink', and make sure to save them with HibernateUtil (for test/train sets).
- Create machine learning examples by creating Phrase/PhraseLink and MLExample objects (for test/train sets).
- Calculate features for every machine learning example (MLExample objects).
- Train a machine learning model with the train examples.
- Evaluate the model using the test examples.
An example implementation of temporal relationship extraction, used for an I2B2 shared task submission (https://www.i2b2.org/NLP/TemporalRelations/), is available at https://github.com/latinstallion/temporal-relation.git
Please cite the following publication if you use SuperRainbowNLP in your experiments:
Emadzadeh, E.; Jonnalagadda, S.; Gonzalez, G., 'Evaluating Distributional Semantic and Feature Selection for Extracting Relationships from Biological Text,' 2011 10th International Conference on Machine Learning and Applications and Workshops (ICMLA), vol. 2, pp. 66-71, 18-21 Dec. 2011
Feel free to contact us with your questions or comments: Latinstallion (Latinstallion (at) gmail.com)