haakondr/NLP-Graphs
A thesis project on using dependency graphs as a representation of natural language text. Sentences are represented as graph objects, annotated with part-of-speech tags and the relations between tokens. This representation is used to measure the similarity between two sentences, which in turn is used for plagiarism detection.

The interesting part of the program is mainly GraphEditDistance.java, which is the focus of this thesis.

--------------------------------

Dependencies:

- Java 7
- Maven
- A MongoDB database running at the location specified in app.properties (required for a full run, but not for calculating the graph edit distance between two sentences with GED.java)

Usage:

Modify app.properties and select the appropriate folders for the data set, then run:

    mvn compile
    mvn exec:java
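
To make the idea of a graph edit distance between two sentences more concrete, here is a minimal, self-contained sketch. It is not the code in GraphEditDistance.java. The class name `GedSketch`, the `Node` type, and all cost values are assumptions chosen for illustration. Each token is reduced to its part-of-speech tag plus the labels of its outgoing dependency relations, and the distance is the cheapest way to substitute, delete, or insert nodes. The assignment is solved exactly with a bitmask search, which is only practical for short sentences (roughly 20 tokens or fewer in the second sentence).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class GedSketch {

    /** Hypothetical token node: a POS tag plus the labels of its outgoing dependency edges. */
    static final class Node {
        final String pos;
        final Set<String> relations;

        Node(String pos, String... relations) {
            this.pos = pos;
            this.relations = new HashSet<String>(Arrays.asList(relations));
        }
    }

    // Assumed costs, picked only for this illustration.
    static final double INSERT_DELETE_COST = 1.0;
    static final double POS_MISMATCH_COST = 1.0;
    static final double EDGE_MISMATCH_COST = 0.5;

    /** Cost of turning node a into node b: POS mismatch plus each relation label not shared. */
    static double substitutionCost(Node a, Node b) {
        double cost = a.pos.equals(b.pos) ? 0.0 : POS_MISMATCH_COST;
        Set<String> union = new HashSet<String>(a.relations);
        union.addAll(b.relations);
        Set<String> common = new HashSet<String>(a.relations);
        common.retainAll(b.relations);
        return cost + EDGE_MISMATCH_COST * (union.size() - common.size());
    }

    /** Minimum total cost of substituting, deleting, and inserting nodes. */
    static double distance(Node[] source, Node[] target) {
        int m = target.length;
        int states = 1 << m;
        // best[mask]: cheapest cost so far, where mask marks the target nodes already matched.
        double[] best = new double[states];
        Arrays.fill(best, Double.POSITIVE_INFINITY);
        best[0] = 0.0;

        for (Node s : source) {
            double[] next = new double[states];
            Arrays.fill(next, Double.POSITIVE_INFINITY);
            for (int mask = 0; mask < states; mask++) {
                if (Double.isInfinite(best[mask])) {
                    continue;
                }
                // Option 1: delete the source node.
                next[mask] = Math.min(next[mask], best[mask] + INSERT_DELETE_COST);
                // Option 2: substitute it for any target node that is still free.
                for (int j = 0; j < m; j++) {
                    if ((mask & (1 << j)) == 0) {
                        int used = mask | (1 << j);
                        next[used] = Math.min(next[used], best[mask] + substitutionCost(s, target[j]));
                    }
                }
            }
            best = next;
        }

        // Every target node left unmatched has to be inserted.
        double result = Double.POSITIVE_INFINITY;
        for (int mask = 0; mask < states; mask++) {
            int unmatched = m - Integer.bitCount(mask);
            result = Math.min(result, best[mask] + unmatched * INSERT_DELETE_COST);
        }
        return result;
    }

    public static void main(String[] args) {
        // "The cat sleeps" vs. "The dog sleeps quietly", with hand-written tags and relations.
        Node[] first = { new Node("DT"), new Node("NN", "det"), new Node("VB", "nsubj") };
        Node[] second = { new Node("DT"), new Node("NN", "det"),
                          new Node("VB", "nsubj", "advmod"), new Node("RB") };
        System.out.println(distance(first, second));
    }
}
```

With these assumed costs, the two example sentences differ by one inserted adverb and one extra relation on the verb, so a lower distance means more similar sentences. A plagiarism detector along the lines described above would compare such distances against a threshold; the threshold and the real cost model are not specified in this README.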