Bigram/Trigram Language Model

For the CISC220 (Data Structures) final project, students were asked to choose a data structure that had not been covered in class. I chose bigram/trigram/n-gram language models because of my strong interest in cognitive science and linguistics. The system is not polished, but it can produce rudimentary sentences. I also wrote a client that performs basic word prediction, similar to the feature found in many modern e-mail clients.
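To illustrate the idea, here is a minimal sketch of a bigram table with a simple "most frequent follower" prediction. It is not the code in main.cpp: the names BigramModel, addSentence, and predict are hypothetical, and it assumes the corpus has already been split into whitespace-separated sentences.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical illustration only; not the structure used in main.cpp.
struct BigramModel {
    // counts[prev][next] = number of times `next` directly followed `prev`.
    std::map<std::string, std::map<std::string, std::size_t>> counts;

    // Split a whitespace-separated sentence and record each adjacent word pair.
    void addSentence(const std::string& sentence) {
        std::istringstream in(sentence);
        std::string prev, word;
        if (!(in >> prev)) return;  // empty sentence: nothing to record
        while (in >> word) {
            ++counts[prev][word];
            prev = word;
        }
    }

    // Return the most frequent follower of `prev`, or "" if `prev` was never
    // seen with a follower. Ties go to the alphabetically first word because
    // std::map iterates in key order and only a strictly larger count wins.
    std::string predict(const std::string& prev) const {
        auto it = counts.find(prev);
        if (it == counts.end()) return "";
        std::string best;
        std::size_t bestCount = 0;
        for (const auto& entry : it->second) {
            if (entry.second > bestCount) {
                best = entry.first;
                bestCount = entry.second;
            }
        }
        return best;
    }
};
```

A trigram version could follow the same pattern by keying the outer map on the previous two words instead of one. Sentence production would then amount to calling predict repeatedly, feeding each returned word back in as the next `prev`.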

Usage:

Move the input text file (sentence corpus) into the same directory as the Makefile and main.cpp
Open a terminal and navigate to that directory
Build the program with $ make
Run the resulting executable with $ ./final [INPUT_TEXT_FILENAME] [OPTIONAL_SENTENCE_START_WORD]
Enter a sentence one word at a time, following the prompts in the terminal. Entering the stop character (".") terminates the program

Further improvement plans:

Debug the n-gram version of the structure
Isolate sentence start words to prevent fragments
Add support for spelling/forming words, rather than only sentences
Update the client to perform both functions (sentence generation and auto-complete) depending on user input
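
As a rough sketch of the "isolate sentence start words" item, one possible approach is to collect the first word of every sentence in the corpus, so that generation can begin only from words that really open a sentence. The helper name collectStartWords is hypothetical, and the sketch assumes whitespace-separated tokens where a sentence ends at any token whose last character is a period.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper, not part of main.cpp.
// Assumes a sentence ends at any token whose last character is '.'.
std::set<std::string> collectStartWords(const std::string& corpus) {
    std::set<std::string> starts;
    std::istringstream in(corpus);
    std::string token;
    bool atSentenceStart = true;  // the first token of the corpus opens a sentence
    while (in >> token) {
        const bool endsSentence = token.back() == '.';
        if (endsSentence) token.pop_back();  // strip the trailing period
        // A lone "." becomes empty after stripping and is never a start word.
        if (atSentenceStart && !token.empty()) starts.insert(token);
        atSentenceStart = endsSentence;
    }
    return starts;
}
```

The resulting set could be used to validate the optional start word given on the command line, or to pick a start word when none is supplied.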

All code in this repository is entirely my own.
