Skip to content

Sam103008/Email_AI_Automation

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Email Triage OpenEnv

OpenEnv HuggingFace Space Python 3.11+

A complete OpenEnv-compliant environment for training AI agents to manage a business email inbox. The agent learns to triage, label, reply, route, and escalate emails — a task humans perform every working day.


Motivation

Email triage is one of the most universal productivity bottlenecks in knowledge work. A skilled human triager must:

  • Identify urgent issues (production outages, angry customers, security incidents)
  • Ignore noise (newsletters, spam)
  • Craft appropriate replies
  • Route messages to the right team
  • Escalate issues that require human decision-making

This environment lets an AI agent learn these skills from structured reward signals.


Environment Description

The agent receives an inbox of realistic business emails and must process them through a sequence of actions. Three tasks of increasing difficulty are provided, each scored on a deterministic rubric (0.0 – 1.0).

Action Space

Field Type Description
action_type enum label / reply / archive / escalate / route / noop
email_id string ID of the target email
label enum urgent / high / normal / low / spam
reply_text string Draft reply (min 20 chars for reward)
route_to string Department: devops / legal / engineering / management
reason string Optional explanation

Observation Space

Field Type Description
task_id string Current task identifier
step int Steps taken so far
emails list[Email] Full inbox with sender, subject, body, timestamp
inbox_size int Number of emails
done bool Episode completion flag
message string Human-readable status
metadata dict max_steps, actions_taken

Reward Structure

Rewards are provided at every step (intermediate signals) and a final grade is computed at episode end.

Action Reward
Correct urgent label +0.05
Correct non-urgent label +0.02
Wrong label −0.02
Good reply (needed) +0.04
Unnecessary reply −0.03
Correct route +0.05
Correct escalation +0.05
Unnecessary escalation −0.03
noop −0.02

Tasks

Task 1 — Priority Labeling (Easy)

Objective: Label 5 emails with the correct priority level.

Scoring: Weighted label accuracy. Urgent emails count 2×.

Expected score for a good agent: 0.75 – 1.0

Max steps: 10


Task 2 — Triage and Reply (Medium)

Objective: Label 10 emails and draft replies to emails that require a response.

Scoring: 0.6 × label_accuracy + 0.4 × reply_coverage − 0.05 × unnecessary_replies

Expected score for a good agent: 0.55 – 0.80

Max steps: 25


Task 3 — Full Inbox Management (Hard)

Objective: Process 15 emails — label, reply, route to the correct department, and escalate compliance/critical issues.

Scoring: 0.40 × label + 0.35 × routing + 0.25 × escalation + reply_bonus

Critical emails (regulatory notices, partner outages) carry 3× label weight.

Expected score for a good agent: 0.40 – 0.70

Max steps: 40


Baseline Scores

Measured with meta-llama/Llama-3.3-70B-Instruct via HuggingFace Inference API:

Task Baseline Score
task1 ~0.70
task2 ~0.52
task3 ~0.41
Average ~0.54

Setup and Usage

Local Setup

git clone https://huggingface.co/spaces/<your-username>/email-triage-openenv
cd email-triage-openenv

pip install -r requirements.txt

uvicorn main:app --host 0.0.0.0 --port 7860 --reload

API will be available at http://localhost:7860.

Docker

docker build -t email-triage-openenv .
docker run -p 7860:7860 \
  -e API_BASE_URL=https://router.huggingface.co/v1 \
  -e MODEL_NAME=meta-llama/Llama-3.3-70B-Instruct \
  -e HF_TOKEN=hf_your_token \
  email-triage-openenv

Running the Baseline Inference Script

export API_BASE_URL=https://router.huggingface.co/v1
export MODEL_NAME=meta-llama/Llama-3.3-70B-Instruct
export HF_TOKEN=hf_your_token_here

python inference.py

Quick API Usage

import requests

BASE = "http://localhost:7860"

# Start task 1
obs = requests.post(f"{BASE}/reset", json={"task_id": "task1"}).json()

# Take an action
result = requests.post(f"{BASE}/step", json={
    "action": {
        "action_type": "label",
        "email_id": "t1_e1",
        "label": "urgent"
    }
}).json()
print(result["reward"])

# Get final score
grade = requests.post(f"{BASE}/grade").json()
print(grade["final_score"])

API Reference

Method Endpoint Description
GET /health Health check — returns 200
GET / Environment info
POST /reset Start new episode {"task_id": "task1"}
POST /step Take action {"action": {...}}
GET /state Current environment state
GET /tasks List all tasks with metadata
POST /grade Run grader, get final score

Environment Variables

Variable Description Default
API_BASE_URL LLM API endpoint https://router.huggingface.co/v1
MODEL_NAME Model identifier meta-llama/Llama-3.3-70B-Instruct
HF_TOKEN HuggingFace API key

Project Structure

email-triage-openenv/
├── main.py              # FastAPI server
├── inference.py         # Baseline agent (OpenAI client)
├── openenv.yaml         # OpenEnv spec metadata
├── requirements.txt
├── Dockerfile
├── README.md
└── env/
    ├── __init__.py
    ├── email_env.py     # EmailTriageEnv class (step/reset/state)
    ├── models.py        # Pydantic models (Observation, Action, Reward)
    ├── graders.py       # Task graders (deterministic scoring)
    └── data.py          # Email dataset (15 realistic emails)

License

MIT

About

AI-powered email triage system using LLMs to classify, route, and generate automated responses.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors