This repository is the official implementation of the group polarization detection model proposed in our paper When "normal" becomes polarized: Heterogeneous graph clustering for non-ideological community conflicts.
- Python version: 3.10.13
- You can refer to `Data_description.txt` for more information about the `.csv`, `.xlsx`, `.pt`, and `.npy` files used in our code.
- Code files include 2 types:
  - Python scripts (`.py`) for collecting Sina Weibo data (`Sina_crawl`) and for some model-training-related functions
  - Jupyter files (`.ipynb`) for (a) data processing; (b) all experiments in the paper; and (c) visualization for HG-PD (i.e., exp3)
  - All Jupyter files are provided in 2 language versions, Chinese and English, for better understanding :D
- Python scripts (`.py`)
  - `Sina_crawl`: Used for crawling the data we need from Sina Weibo (you can also use it to crawl other Sina Weibo posts)
  - `userInter`: HomoG-based model framework for exp2
  - `mcr_HGPD`: HG-PD model framework for exp3
  - `mcrLoss`: MCR2 loss function
  - `augment`: Data augmentation
  - `other_func`: Used for constructing the membership matrix $\Pi$
  - `savePara`: Used for saving losses (`.csv`) and model states (`.pt`)
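As an illustration of what a membership matrix $\Pi$ can look like, here is a minimal sketch that builds a hard (one-hot) group-by-sample matrix from integer labels. It assumes one common encoding, in which entry `[j][i]` is 1 when sample `i` belongs to group `j`. The function name `build_membership` and the example labels are hypothetical, and the actual layout used by `other_func` may differ.

```python
from typing import List, Sequence


def build_membership(labels: Sequence[int], num_groups: int) -> List[List[float]]:
    """Return a (num_groups x n) hard membership matrix.

    Entry [j][i] is 1.0 if sample i is assigned to group j, else 0.0.
    Hypothetical helper for illustration; not the repository's `other_func`.
    """
    pi = [[0.0] * len(labels) for _ in range(num_groups)]
    for i, label in enumerate(labels):
        if not 0 <= label < num_groups:
            raise ValueError(
                f"label {label} at index {i} is outside [0, {num_groups})"
            )
        pi[label][i] = 1.0
    return pi


# Made-up labels for four samples spread over three groups.
labels = [0, 2, 1, 0]
pi = build_membership(labels, num_groups=3)
for row in pi:
    print(row)
```

With hard assignments, every column of the matrix sums to 1, which is a quick sanity check before passing it to a loss function.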
- Jupyter files (`.ipynb`)
  - `Data_processing`: Includes all data processing steps for the 3 experiments
  - `K-Prototype`: Implementation of exp1 in our paper; results are saved in `Train_record/KPrototype`
  - `Ablation`: Implementation of exp2 with related visualizations in our paper; training results are saved in `Train_record/Ablation` and visualizations in `Visualization`
  - `Model`: Implementation of exp3 in our paper; results are saved in `Train_record/Model`
  - `Analysis_visualize`: Visualizations of exp3 in our paper; figures are saved in `Visualization`. Note that some additional visualizations are generated within `Supplementary.ipynb` and saved automatically to the specified output paths.
  - `Abnormal_compar`: Model comparison experiments in our paper, including 5 GAD models for GP detection and the corresponding analysis and visualization
  - `Supplementary`: Includes supplementary visualizations and result analyses, applicable to both the ablation experiments and the GAD models
- About `Train_record`
  - We only include the best model state in the `Train_record` folder
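To see which saved artifacts are available, the following minimal sketch lists the `.pt` and `.csv` files in each sub-folder of `Train_record`. It assumes the sub-folder layout mentioned above (`KPrototype`, `Ablation`, `Model`) and uses only the standard library. The helper name `list_artifacts` is hypothetical and not part of this repository.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def list_artifacts(
    root: str, suffixes: Tuple[str, ...] = (".pt", ".csv")
) -> Dict[str, List[str]]:
    """Map each sub-folder of `root` to the sorted names of its files
    whose suffix is in `suffixes`.

    Hypothetical helper for illustration only. Returns an empty dict
    if `root` does not exist.
    """
    found: Dict[str, List[str]] = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return found
    for sub in sorted(p for p in root_path.iterdir() if p.is_dir()):
        names = sorted(
            f.name for f in sub.iterdir()
            if f.is_file() and f.suffix in suffixes
        )
        if names:
            found[sub.name] = names
    return found


# Run from the repository root; prints nothing if the folder is absent.
for folder, files in list_artifacts("Train_record").items():
    print(folder, files)
```

The `.pt` files found this way are PyTorch checkpoints, so they would typically be loaded with `torch.load` before being passed to the matching model class.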
- About `Abnormal_result`
  - Includes (a) detection results (embeddings, labels, and scores) from all 5 GAD models; (b) the CSV file involved in the analysis of all 5 GAD models; (c) t-SNE visualizations
Feel free to post any issues via GitHub.