Human–AI Interaction • Carnegie Mellon University • Final Project
Author: Janet Xie
FeedbackGlow is a prototype application designed to examine how artificial intelligence can support workplace feedback practices while preserving user agency. The system provides evaluative scores for written feedback messages along three dimensions—Tone, Clarity, and Constructiveness—and generates reflective prompts intended to guide users in revising their messages.
The goal of the system is not to automate feedback writing, but to scaffold users’ awareness of how their messages may be perceived by recipients. The project explores how lightweight machine learning models and generative AI suggestions can support reflective practice within communication tasks.
The scoring component is implemented using a supervised regression model trained on a dataset of approximately 150 workplace feedback messages. The dataset consists of:
- Authentic workplace feedback collected from participants,
- Additional synthetic examples constructed to increase coverage,
- Human ratings from multiple annotators across the three scoring dimensions.
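Since each message carries ratings from multiple annotators, the per-dimension targets for regression can be obtained by averaging across annotators. The sketch below is illustrative, not the project's actual preprocessing code; the column names and data are hypothetical.

```python
import pandas as pd

# Hypothetical annotation table: one row per (message, annotator) pair,
# with 0-5 ratings on each of the three dimensions.
annotations = pd.DataFrame({
    "message_id":       [1, 1, 2, 2],
    "annotator":        ["a1", "a2", "a1", "a2"],
    "tone":             [4.0, 5.0, 2.0, 1.0],
    "clarity":          [3.0, 4.0, 2.0, 3.0],
    "constructiveness": [5.0, 4.0, 1.0, 2.0],
})

# Average each dimension across annotators to obtain one continuous
# regression target per message.
targets = (annotations
           .groupby("message_id")[["tone", "clarity", "constructiveness"]]
           .mean())
```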
Feature extraction is performed using TF-IDF vectorization, supplemented with domain-specific lexicons for tone, clarity, and constructiveness. A Random Forest Regressor is used to predict continuous scores on a 0–5 range. These numerical outputs are mapped to categorical labels:
- Tone: Negative, Neutral, Positive
- Clarity: Unclear, Okay, Clear
- Constructiveness: Low, Medium, High
Although the model is limited by dataset size, it is sufficient for exploring interaction design patterns in Human–AI collaboration.
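The pipeline above can be sketched end to end: TF-IDF features feed a Random Forest Regressor, and continuous predictions are mapped to categorical labels. This is a minimal illustration, not the project's training script; the toy data, the use of `MultiOutputRegressor` for all three dimensions at once, and the label thresholds are assumptions (the real system also adds lexicon features).

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import make_pipeline

# Toy training data; the real dataset has ~150 annotated messages.
messages = [
    "Great job on the report, very clear and well organized.",
    "This is confusing and needs a lot of work.",
    "Nice effort; consider adding concrete next steps.",
]
# Columns: tone, clarity, constructiveness (0-5 scale).
scores = np.array([[4.5, 4.0, 3.0],
                   [1.0, 2.0, 1.5],
                   [4.0, 3.5, 4.5]])

# One pipeline predicting all three dimensions jointly; the project may
# instead train separate models per dimension.
model = make_pipeline(
    TfidfVectorizer(),
    MultiOutputRegressor(RandomForestRegressor(n_estimators=50, random_state=0)),
)
model.fit(messages, scores)

def to_label(dimension, value):
    """Map a continuous 0-5 score to the categorical labels shown in the UI."""
    bands = {
        "tone": ["Negative", "Neutral", "Positive"],
        "clarity": ["Unclear", "Okay", "Clear"],
        "constructiveness": ["Low", "Medium", "High"],
    }
    # Illustrative cut points; the actual thresholds are not specified.
    idx = 0 if value < 2.0 else (1 if value < 3.5 else 2)
    return bands[dimension][idx]

pred = model.predict(["Solid draft, though the summary could be clearer."])[0]
```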
Instead of rewriting the user’s message, the suggestion generator (implemented via the OpenAI API) produces brief reflective prompts. These prompts are designed to:
- Identify strengths in the user’s message,
- Highlight the area with the lowest predicted score,
- Encourage thoughtful revision without prescriptive instruction,
- Maintain user ownership of the revised text.
The design of these prompts draws on principles from the Human–AI Interaction literature, specifically the importance of maintaining user agency and avoiding over-automation.
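One way to realize these design goals in code is to build the system prompt around the lowest-scoring dimension and explicitly instruct the model not to rewrite the message. The sketch below assumes the `openai` Python client (v1+ `chat.completions` interface); the exact wording, model name, and function names are hypothetical, not FeedbackGlow's actual implementation.

```python
def build_reflection_prompt(scores):
    """Compose an instruction asking for a reflective prompt, not a rewrite.

    `scores` maps dimension name -> predicted 0-5 value.
    """
    weakest = min(scores, key=scores.get)
    return (
        "You are helping someone reflect on workplace feedback they wrote. "
        "Do not rewrite their message. Briefly note one strength, then ask "
        f"one open question that helps them improve its {weakest} "
        f"(their lowest-scoring dimension at {scores[weakest]:.1f}/5). "
        "Keep ownership of the revision with the writer."
    )

def get_reflective_suggestion(message, scores, model="gpt-4o-mini"):
    # Requires OPENAI_API_KEY in the environment; the model name is
    # illustrative. Imported here so the prompt builder above stays usable
    # without the openai package installed.
    from openai import OpenAI
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": build_reflection_prompt(scores)},
            {"role": "user", "content": message},
        ],
    )
    return response.choices[0].message.content
```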
The system is informed by the WeBuildAI framework (Lee et al.), which emphasizes participatory algorithm design and transparency. While this project does not implement community voting, it adopts several core principles:
- Treating users as evaluators of system behavior,
- Making scoring dimensions intelligible and interpretable,
- Encouraging users to reflect on their communication norms,
- Avoiding the substitution of human judgment with automated decisions.
The design aims to support a cooperative interaction between the user and the AI system, where the AI serves as a prompt for reflection rather than an authoritative evaluator.
- The user enters a feedback message into the interface.
- The system predicts Tone, Clarity, and Constructiveness scores using the trained regression model.
- The results are displayed in both numeric and categorical form.
- A reflective suggestion is shown on the first analysis pass.
- The user revises the message according to their own judgment.
- The user requests a re-analysis to observe changes.
- The user may begin a new session at any time.
The interaction design foregrounds iteration: users actively revise their writing and observe how the system responds, enabling a feedback loop that is both cognitive and behavioral.
```
feedbackglow/
│
├── app.py
├── requirements.txt
├── README.md
│
├── model/
│   ├── train_model.py
│   ├── model.pkl
│   ├── tone_lexicon.json
│   ├── clarity_lexicon.json
│   └── constructiveness_lexicon.json
│
├── static/
│   ├── style.css
│   ├── copy_icon.png
│   └── checkmark.png
│
└── templates/
    └── index.html
```
```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
Mac/Linux:

```
export OPENAI_API_KEY="your_api_key"
```

Windows (PowerShell):

```
$env:OPENAI_API_KEY = "your_api_key"
```
```
python3 app.py
```
The application will be available at the address Flask prints on startup (by default, http://127.0.0.1:5000).
- The dataset is modest in size, resulting in restricted model generalizability.
- Scoring outputs should be interpreted as directional cues rather than authoritative judgments.
- The system has not undergone extensive usability testing.
- Future work may incorporate:
  - Larger and more diverse datasets,
  - Cross-validation and hyperparameter optimization,
  - Expanded feature engineering and model comparison,
  - More comprehensive user studies focusing on longitudinal communication improvement.
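The cross-validation and hyperparameter work mentioned above could start with a simple grid search over the Random Forest. The sketch below uses synthetic stand-ins for the TF-IDF features and one dimension's scores; it is an assumed starting point, not part of the current codebase.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for TF-IDF features and 0-5 tone scores; the real
# pipeline would cross-validate on the annotated feedback dataset.
rng = np.random.default_rng(0)
X = rng.random((60, 20))
y = rng.uniform(0, 5, size=60)

# 5-fold cross-validated grid search; the parameter grid is illustrative.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 10]},
    cv=5,
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)
```

With only ~150 messages, cross-validated error estimates would also give a more honest picture of generalizability than a single train/test split.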
This project relies on the following open-source libraries:
- Flask
- scikit-learn (TF-IDF Vectorizer, RandomForestRegressor)
- NumPy
- Pandas
- Joblib
- NLTK
- OpenAI API
All additional code was developed specifically for this project.
The accompanying demonstration video provides:
- An overview of the interaction design,
- A discussion of the major design decision influenced by course readings,
- A summary of preliminary user feedback,
- A live demonstration of the scoring and revision workflow.
Janet Xie
Carnegie Mellon University
Master of Educational Technology and Applied Learning Sciences
Human–AI Interaction, Fall 2025