diff --git a/README.md b/README.md
index 2951d92..2899e04 100644
--- a/README.md
+++ b/README.md
@@ -293,6 +293,20 @@ Please send contributions via github pull request. You can do this by visiting t
 ### English
+
+#### HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns (HateBenchSet)
+* Link to publication: [https://arxiv.org/abs/2501.16750](https://arxiv.org/abs/2501.16750)
+* Link to data: [https://github.com/TrustAIRLab/HateBench](https://github.com/TrustAIRLab/HateBench)
+* Task description: Binary (Hate, Not)
+* Details of task: This dataset is constructed to benchmark hate speech detectors on LLM-generated hate speech. It includes 7,838 samples generated by six widely used LLMs, covering 34 identity groups, each annotated by three labelers.
+* Size of dataset: 7,838
+* Percentage abusive: 46.45%
+* Language: English
+* Level of annotation: Posts
+* Platform: LLMs, i.e., GPT-3.5, GPT-4, Vicuna, Baichuan2, Dolly2, and OPT
+* Medium: Text
+* Reference: Xinyue Shen, Yixin Wu, Yiting Qu, Michael Backes, Savvas Zannettou, and Yang Zhang. HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns. In USENIX Security Symposium (USENIX Security). USENIX, 2025.
+
 #### Not All Counterhate Tweets Elicit the Same Replies: A Fine-Grained Analysis
 * Link to publication: [https://aclanthology.org/2023.starsem-1.8/](https://aclanthology.org/2023.starsem-1.8/)
 * Link to data: [https://github.com/albanyan/counterhate_reply](https://github.com/albanyan/counterhate_reply)