Official Triton kernels for TopK and HierarchicalTopK Sparse Autoencoder decoders.
Updated Sep 29, 2025 - Python
Embedding language models in probability space via log-likelihood vectors
Source code of "Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers" EMNLP 2025
Evaluating Large Language Models for Detecting Antisemitism
ReviewEval: An Evaluation Framework for AI-Generated Reviews
Beyond One World: a benchmark for testing how well LLMs role-play version-specific characters (e.g., superheroes across universes). Covers 30 heroes and 90 canon variants through two tasks: Canon Events (factual recall) and Moral Dilemmas (ethical reasoning). Introduces the Think-Act Matching metrics.
Beyond Human Labels: A Multi-Linguistic Auto-Generated Benchmark for Evaluating Large Language Models on Resume Parsing [EMNLP 2025 Main Conference]
[EMNLP'25] FuzzAug: Data Augmentation by Coverage-guided Fuzzing for Neural Test Generation
Official implementation of "Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model Editing" (EMNLP 2025 Findings).
The official repo for the EMNLP 2025 paper "NormXLogit: The Head-on-Top Never Lies"
Code & reproducibility for the EMNLP paper “Profiling LLMs’ Copyright Infringement Risks under Adversarial Persuasive Prompting”: prompts, seed queries, and figure scripts.
This repository contains the code and detailed analysis for my competition entry and the system paper I will submit for MAHED 2025 Subtask 1 (hate and hope speech classification) at Arabic NLP, co-located with EMNLP.
Time to Revisit Exact Match (Findings of EMNLP 2025)