# SemEval 2026 Task 2: Predicting Variation in Emotional Valence and Arousal over Time from Ecological Essays
This repository contains the PyTorch implementation of our SemEval-2026 Task 2 submissions, which address three core problems in affective computing: state prediction, change forecasting, and long-term trajectory prediction.

To fit within consumer-grade hardware (8 GB VRAM), we rely on efficient "hybrid" architectures, notably a Siamese network we call the "Bifurcated Leviathan", together with a custom Concordance Correlation Coefficient (CCC) loss that prevents regression to the mean.
## 📁 Repository Structure

```text
├── paper/
│   └── SemEval_Paper.pdf
├── src/                                        # Source code for training and inference
│   ├── subtask1_longitudinal.py
│   ├── subtask2a_forecasting.py
│   └── subtask2b_disposition.py
├── LICENSE
├── predictions/                                # Output CSVs for submission
├── splits_subtask1/                            # Generated automatically
├── splits_subtask2a/                           # Generated automatically
├── splits_subtask2b/                           # Generated automatically
├── train_subtask1.csv                          # Raw dataset
├── train_subtask2a.csv                         # Raw dataset
├── train_subtask2b.csv                         # Raw dataset (main file for Subtask 2B)
├── train_subtask2b_detailed.csv
├── train_subtask2b_user_disposition_change.csv
├── README.md                                   # Project documentation
└── requirements.txt                            # Python dependencies
```
---
## 🎯 Task Definitions & Methodologies
### 1. Subtask 1: Longitudinal Affect Assessment
**The Task:**
Given a chronological sequence of **m** texts $e_1, e_2, \dots, e_m$, the model must produce **Valence & Arousal (V&A)** predictions for each text: $(v_1, a_1), \dots, (v_m, a_m)$.
* *Constraint:* The test split includes **Unseen Users** (zero-shot generalization) and **Seen Users** (temporal tracking).
**Our Solution: The Hybrid Early-Fusion Model**
* **Architecture:** `distilbert-base-uncased` + `BiLSTM`.
* **Innovation:** Instead of relying solely on text, we implement **Early Fusion**. An explicit `User Embedding` (dim=32) is concatenated with the text embedding *before* temporal processing. This allows the LSTM to condition its memory on the specific user identity.
* **Inference:** Uses a custom `SlidingWindowDataset` to prevent "context starvation" (forgetting history) during testing.
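The early-fusion idea above can be sketched as follows. This is a minimal illustration rather than the submitted implementation: the class name, LSTM hidden size, and pooling strategy are assumptions; only the DistilBERT feature width (768), the user-embedding size (32), and the concatenate-before-LSTM ordering come from the description above.

```python
import torch
import torch.nn as nn

class EarlyFusionAffectModel(nn.Module):
    """Sketch of early fusion: a learned user embedding is concatenated
    with per-text DistilBERT features *before* the BiLSTM, so the
    recurrent state is conditioned on user identity."""

    def __init__(self, num_users, text_dim=768, user_dim=32, lstm_hidden=128):
        super().__init__()
        self.user_emb = nn.Embedding(num_users, user_dim)
        self.lstm = nn.LSTM(text_dim + user_dim, lstm_hidden,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * lstm_hidden, 2)  # (valence, arousal)

    def forward(self, text_feats, user_ids):
        # text_feats: (batch, seq_len, 768) pooled encoder outputs
        # user_ids:   (batch,) integer user identifiers
        u = self.user_emb(user_ids)                           # (batch, 32)
        u = u.unsqueeze(1).expand(-1, text_feats.size(1), -1)  # broadcast over time
        fused = torch.cat([text_feats, u], dim=-1)            # early fusion
        out, _ = self.lstm(fused)
        return self.head(out)                                 # (batch, seq_len, 2)
```

For unseen users at test time, one straightforward option is to reserve a dedicated "unknown" index in the embedding table.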
### 2. Subtask 2A: Forecasting State Changes
**The Task:**
Given a sequence of texts and their V&A scores up to time $t$, predict the **immediate next-step change** in Valence (and, analogously, Arousal):
$$
\Delta_1 = v_{t+1} - v_t
$$
**Our Solution: The State-Aware Projector**
* **The Problem:** "The Drowning Problem." High-dimensional text vectors (768-dim) overwhelm low-dimensional scalar inputs (current state $v_t, a_t$).
* **The Fix:** A **Projection MLP** boosts the scalar state features into a higher-dimensional space (64-dim) before fusion.
* **Loss Function:** We replaced MSE (Mean Squared Error) with **CCC Loss**.
* *Observation:* MSE caused the model to predict "zero change" (flatline) to minimize error.
* *Result:* CCC forces the model to match the *variance* of the trajectory, improving correlation from **0.39** to **0.64**.
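A minimal sketch of a CCC loss of the kind described above. The function name and the `eps` stabilizer are our own; the formula is the standard Concordance Correlation Coefficient, with the loss defined as $1 - \mathrm{CCC}$:

```python
import torch

def ccc_loss(pred, target, eps=1e-8):
    """1 - CCC. Unlike MSE, this penalizes variance mismatch between
    prediction and target, discouraging 'flatline' predictions."""
    pred_mean, target_mean = pred.mean(), target.mean()
    pred_var = pred.var(unbiased=False)
    target_var = target.var(unbiased=False)
    covar = ((pred - pred_mean) * (target - target_mean)).mean()
    ccc = (2 * covar) / (
        pred_var + target_var + (pred_mean - target_mean) ** 2 + eps
    )
    return 1.0 - ccc
```

Note why this fixes the flatline failure mode: a constant prediction has zero variance and zero covariance, so its CCC is 0 and its loss is maximal (1.0), whereas MSE can reward it whenever the target hovers near its mean.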
### 3. Subtask 2B: Dispositional (Long-Term) Change
**The Task:**
Predict the change between the **mean observed affect** (past) and the **mean future affect** (future):
$$
\Delta_{\text{avg}} = \text{avg}(v_{t+1:n}) - \text{avg}(v_{1:t})
$$
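In code, this target reduces to a difference of means over the future and past halves of a user's trajectory. A minimal sketch (the helper name and zero-based split convention are our own):

```python
import numpy as np

def dispositional_change(scores, t):
    """Long-term change target: mean of future scores minus mean of
    observed scores, split at index t (zero-based)."""
    v = np.asarray(scores, dtype=float)
    return v[t:].mean() - v[:t].mean()
```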
**Our Solution: The "Bifurcated Leviathan"**
* **Architecture:** A Siamese Network with a shared `deberta-v3-large` backbone.
* **Sampling:** Implements a "Head-Tail" protocol, sampling the first 16 essays (Head) and last 16 essays (Tail) to model long-term drift.
* **Residual Learning:** We inject the arithmetic difference of the raw scores ("Naive Math") into the final layer. The network learns to *refine* this statistical trend rather than deriving it from scratch.
* **Bifurcation:** The network splits immediately after the backbone into separate **Valence** and **Arousal** heads to prevent noisy Arousal gradients from disrupting Valence learning.
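The residual-learning and bifurcation ideas can be sketched together. This is an illustrative reduction, not the submitted model: the class name and layer sizes are assumptions; what it preserves is a shared encoding feeding two fully separate heads, each learning only a correction on top of the naive arithmetic difference.

```python
import torch
import torch.nn as nn

class BifurcatedHead(nn.Module):
    """Sketch of the output stage: separate valence/arousal branches over a
    shared backbone encoding, each refining the 'Naive Math' baseline
    (raw head-tail score difference) via a residual connection."""

    def __init__(self, enc_dim=1024):
        super().__init__()
        self.valence_head = nn.Sequential(
            nn.Linear(enc_dim, 128), nn.GELU(), nn.Linear(128, 1))
        self.arousal_head = nn.Sequential(
            nn.Linear(enc_dim, 128), nn.GELU(), nn.Linear(128, 1))

    def forward(self, pooled, naive_delta_v, naive_delta_a):
        # pooled: shared encoding of the head/tail essay samples
        # naive_delta_*: arithmetic mean-difference of raw scores
        dv = naive_delta_v + self.valence_head(pooled).squeeze(-1)
        da = naive_delta_a + self.arousal_head(pooled).squeeze(-1)
        return dv, da
```

Because the heads share no parameters after the split, noisy arousal gradients cannot flow into the valence branch; and because each head only predicts an offset, the network never has to rediscover the trend the raw statistics already encode.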
---
## 📊 Results & Performance
| Task | Metric | Score (Pearson $r$) | Key Insight |
| :--- | :--- | :--- | :--- |
| **Subtask 1** | Valence (Seen) | **0.7026** | User Embeddings are critical for known users. |
| **Subtask 1** | Arousal (Seen) | **0.5186** | Arousal is notoriously harder to model than Valence using text. |
| **Subtask 2A** | Avg Correlation | **0.64** | CCC Loss outperformed MSE by ~27%. |
| **Subtask 2B** | Valence Change | **0.7031** | Residual learning ("Naive Math") prevents scale collapse. |
---
## 🚀 Setup & Usage
### Prerequisites
* Python 3.10+
* NVIDIA GPU (Minimum 8GB VRAM recommended for training)
### Installation
```bash
# Clone the repository
git clone https://github.com/YourUsername/SemEval-2026-Task2.git
cd SemEval-2026-Task2

# Install dependencies
pip install -r requirements.txt
```

### Running the Models

**Subtask 1:**
```bash
python src/subtask1_longitudinal.py
```
This script handles the "Seen/Unseen" user split automatically.

**Subtask 2A:**
```bash
python src/subtask2a_forecasting.py
```
This executes the V5 architecture (DeBERTa + Projection) with CCC Loss to replicate our best results.

**Subtask 2B:**
```bash
python src/subtask2b_disposition.py
```
This implements the "Bifurcated Leviathan" model with Head-Tail sampling.
## 🙏 Acknowledgments

The architectures presented here (including the "Leviathan" Siamese network and the Hybrid LSTM-Fusion model) are original contributions developed for this competition.
We gratefully acknowledge the open-source community. Specifically, initial data processing patterns and file handling structures were informed by the work of:
- ThickHedgehog (2025): Deep-Learning-project-SemEval-2026-Task-2. Available at: GitHub.
Note: While preprocessing logic was inspired by the above, the modeling strategies (Early vs. Late Fusion, usage of LSTM for Subtask 1, and CCC optimization) differ significantly in implementation and topology.
## 📖 Citation

If you use this code or our findings in your research, please cite:

```bibtex
@inproceedings{jumakhan2026longitudinal,
  title={Longitudinal Affective Forecasting: Architectures for Generalization, State Change, and Trajectory Prediction},
  author={Jumakhan, Haseebullah and Assad, Soud and Ahmad, Seyed Abdullah},
  booktitle={Proceedings of the 20th International Workshop on Semantic Evaluation (SemEval-2026)},
  year={2026}
}
```