- Download the dataset from huggingface
- Place the downloaded dataset in the `demo/input` directory.
- Download the models from:
  - GCN model: GCN
  - GCN without edges: GCN_no_edges
- Place the downloaded models in the `demo/models/` directory.
- Rename the folders as `undirected_gnn` and `no_edge_gnn` respectively.
- Set `PYTHONPATH` to the parent directory: `export PYTHONPATH=demo/`
- Navigate to the `demo` directory in your terminal.
- Install the required dependencies using: `pip install -r requirements.txt`
- Run the Streamlit app using: `streamlit run src/demo.py`

Alternatively, you can visit the hosted demo at: Demo
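The local setup steps above can be sketched as a shell session. The dataset and model downloads are manual (the README links them rather than giving URLs), so they are not scripted here:

```shell
# Demo setup, following the steps above. Download the dataset into
# demo/input and the models into demo/models (renamed to undirected_gnn
# and no_edge_gnn) before running this.
export PYTHONPATH=demo/          # make the demo package importable
cd demo
pip install -r requirements.txt  # install Streamlit and other deps
streamlit run src/demo.py        # opens the app in your browser
```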
- Download the dataset from huggingface
- Place the downloaded dataset in the `modeling/` directory.
- Install the required dependencies using: `pip install -r demo/requirements.txt`
- Navigate to the `modeling` directory in your terminal.
- Run `gnn.ipynb` or `heuristic_methods.ipynb` using Jupyter Notebook or Jupyter Lab.
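The modeling steps above, as shell commands (the dataset download is manual, per the README link):

```shell
# Modeling setup: place the downloaded dataset in modeling/ first.
pip install -r demo/requirements.txt  # shared requirements file
cd modeling
jupyter lab                           # then open gnn.ipynb or heuristic_methods.ipynb
```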
- Navigate to the `preprocessing` directory in your terminal.
- `preprocess.ipynb`: Contains all the preprocessing steps to create the dataset from raw Wikipedia articles collected using the Wikipedia API.
- `generate_embeddings.ipynb`: Contains steps to generate document embeddings for the articles using a pre-trained embedding model.
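As a rough illustration of what `generate_embeddings.ipynb` does — mapping each article to a fixed-size vector — here is a minimal, self-contained sketch. The real notebook uses a pre-trained embedding model; this example substitutes a random per-word lookup table so it runs without model downloads, and mean-pools word vectors into a document embedding:

```python
# Toy document-embedding sketch. A random per-word vector table stands in
# for the pre-trained model used in the actual notebook.
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # hypothetical embedding dimension

def embed_document(text: str, table: dict) -> np.ndarray:
    """Mean-pool per-word vectors into one document embedding."""
    vecs = []
    for word in text.lower().split():
        if word not in table:
            table[word] = rng.standard_normal(DIM)  # stand-in for a real model
        vecs.append(table[word])
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

table = {}
articles = ["graph neural networks on wikipedia", "wikipedia article graph"]
embeddings = np.stack([embed_document(a, table) for a in articles])
print(embeddings.shape)  # (2, 8)
```

With a real embedding model the lookup-and-pool step is replaced by a single encode call, but the output contract is the same: one fixed-length vector per article.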
- Metin Usta
- 504251504