14 changes: 14 additions & 0 deletions README.md
@@ -293,6 +293,20 @@ Please send contributions via github pull request. You can do this by visiting t

<a id="English-header"></a>
### English

#### HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns (HateBenchSet)
* Link to publication: [https://arxiv.org/abs/2501.16750](https://arxiv.org/abs/2501.16750)
* Link to data: [https://github.com/TrustAIRLab/HateBench](https://github.com/TrustAIRLab/HateBench)
* Task description: Binary (Hate, Not)
* Details of task: This dataset is constructed to benchmark hate speech detectors on LLM-generated hate speech. It includes 7,838 samples generated by six widely used LLMs, covering 34 identity groups, with each sample annotated by three labelers.
* Size of dataset: 7,838
* Percentage abusive: 46.45%
* Language: English
* Level of annotation: Posts
* Platform: LLMs, i.e., GPT-3.5, GPT-4, Vicuna, Baichuan2, Dolly2, and OPT
* Medium: Text
* Reference: Xinyue Shen, Yixin Wu, Yiting Qu, Michael Backes, Savvas Zannettou, and Yang Zhang. HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns. In USENIX Security Symposium (USENIX Security). USENIX, 2025.
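
A minimal sketch of how a binary (Hate, Not) detector could be scored against data in this style. The field names and sample rows below are hypothetical placeholders, not the actual HateBench schema; consult the linked repository for the real data format.

```python
# Hypothetical HateBenchSet-style rows: LLM-generated text with a
# binary label ("Hate" / "Not") and the generating model.
samples = [
    {"text": "...", "label": "Hate", "model": "GPT-3.5"},
    {"text": "...", "label": "Not", "model": "Vicuna"},
    {"text": "...", "label": "Hate", "model": "OPT"},
]

def accuracy(predictions, rows):
    """Fraction of rows where the detector's binary label matches the annotation."""
    correct = sum(pred == row["label"] for pred, row in zip(predictions, rows))
    return correct / len(rows)

# Stand-in detector output for the three rows above.
preds = ["Hate", "Not", "Not"]
print(round(accuracy(preds, samples), 2))
```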

#### Not All Counterhate Tweets Elicit the Same Replies: A Fine-Grained Analysis
* Link to publication: [https://aclanthology.org/2023.starsem-1.8/](https://aclanthology.org/2023.starsem-1.8/)
* Link to data: [https://github.com/albanyan/counterhate_reply](https://github.com/albanyan/counterhate_reply)