Sentiment Analysis with Naive Bayes in C

Overview

This project implements a Naive Bayes classifier in C to perform sentiment analysis on Amazon reviews. It processes textual data, constructs a vocabulary using a hashmap, and applies probabilistic calculations to classify reviews as positive or negative.

Features

- Efficient text preprocessing with tokenization and stop-word filtering.
- Hashmap-based vocabulary storage for fast lookups.
- Probabilistic sentiment classification using Naive Bayes.
- Logging support for debugging and analyzing results.
- 67.19% accuracy on the test dataset.
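
To illustrate the preprocessing feature, here is a minimal sketch of tokenization with stop-word filtering. It is not the project's actual code: the names `tokenize`, `is_stop_word`, `MAX_TOKEN_LEN`, and the short stop-word list are all hypothetical, and the sketch assumes tokens are runs of ASCII letters, lowercased before comparison.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKEN_LEN 64

/* Hypothetical stop-word list; the project's real list is not shown here. */
static const char *const STOP_WORDS[] = {
    "a", "an", "and", "is", "it", "the", "this", "to"
};
static const size_t STOP_WORD_COUNT = sizeof STOP_WORDS / sizeof STOP_WORDS[0];

/* Returns 1 if `word` is in the stop-word list, 0 otherwise. */
int is_stop_word(const char *word)
{
    for (size_t i = 0; i < STOP_WORD_COUNT; i++) {
        if (strcmp(word, STOP_WORDS[i]) == 0)
            return 1;
    }
    return 0;
}

/*
 * Splits `text` into lowercase alphabetic tokens, drops stop words, and
 * copies at most `max_tokens` tokens into `out`. Returns the number stored.
 * Tokens longer than MAX_TOKEN_LEN - 1 characters are truncated.
 */
size_t tokenize(const char *text, char out[][MAX_TOKEN_LEN], size_t max_tokens)
{
    assert(text != NULL && out != NULL);
    char token[MAX_TOKEN_LEN];
    size_t len = 0;
    size_t count = 0;

    for (const char *p = text; ; p++) {
        unsigned char c = (unsigned char)*p;
        if (isalpha(c)) {
            if (len < MAX_TOKEN_LEN - 1)
                token[len++] = (char)tolower(c);
        } else {
            /* Any non-letter (including the final '\0') ends the token. */
            if (len > 0) {
                token[len] = '\0';
                if (!is_stop_word(token) && count < max_tokens) {
                    strcpy(out[count], token);
                    count++;
                }
                len = 0;
            }
            if (c == '\0')
                break;
        }
    }
    return count;
}
```

Calling `tokenize("This is a GREAT product, and it works!", tokens, 8)` with `char tokens[8][MAX_TOKEN_LEN]` stores `great`, `product`, and `works`. Because any non-letter ends a token, a contraction such as "don't" splits into two tokens under this scheme; a real preprocessor may prefer to handle apostrophes explicitly.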

Requirements

- C compiler (e.g., GCC)
- Libraries:
  - cJSON for JSON parsing
  - Standard C libraries for string manipulation and file handling

How It Works

- Data Preprocessing: Reviews are tokenized, cleaned, and stored in a hashmap.
- Model Training: The vocabulary is populated with word frequencies for both positive and negative classes.
- Classification: The model calculates probabilities using Naive Bayes and logs classification results.
- Performance Metrics: Reports accuracy, failures, and zero-error cases for evaluation.

Dataset Information

The dataset is not included in the repository and must be obtained separately. It is described in the following paper:

@article{hou2024bridging,
  title={Bridging Language and Items for Retrieval and Recommendation},
  author={Hou, Yupeng and Li, Jiacheng and He, Zhankui and Yan, An and Chen, Xiusi and McAuley, Julian},
  journal={arXiv preprint arXiv:2403.03952},
  year={2024}
}
