This repository contains the code and datasets for the NAACL 2025 paper: Communication Makes Perfect: Persuasion Dataset Construction via Multi-LLM Communication.
A comprehensive tool for generating multi-turn persuasion conversations between AI agents in different communication formats, together with a dataset of the generated persuasion dialogues.
Weicheng Ma¹, Hefan Zhang², Ivory Yang², Shiyu Ji³
Joice Chen³, Farnoosh Hashemi³, Shubham Mohole³, Ethan Gearey²
Michael Macy³, Saeed Hassanpour², Soroush Vosoughi²
¹Georgia Institute of Technology
²Dartmouth College
³Cornell University
This codebase was primarily developed and maintained by Hefan Zhang.
For code-related questions, please contact: hefan.zhang.gr@dartmouth.edu
This tool creates realistic persuasion dialogues by simulating conversations between:
- Persuader: An AI agent trying to convince another agent to take a specific action
- Persuadee: An AI agent maintaining its position and resisting persuasion
- Judge: An AI system evaluating conversation quality and outcomes
- Python 3.7+
- OpenAI API key
- NormBank.csv file (or custom dataset)
# View comprehensive usage guide
python usage_guide.py
# Run example commands (dry run mode)
python run_examples.py --api-key "your-api-key-here"
# Generate conversations (actual execution)
python generate_conversations.py --api-key "your-api-key-here" --format general --start 0 --end 5

| Parameter | Description | Example |
|---|---|---|
| --api-key | OpenAI API key (required) | --api-key "sk-proj-..." |
| Parameter | Description | Default | Choices |
|---|---|---|---|
| --format | Output conversation format | general | general, email, twitter |
| --start | Starting data index | 20 | Any integer |
| --end | Ending data index | 25 | Any integer |
| --llm | Language model for content generation | gpt-3.5-turbo | gpt-3.5-turbo, gpt-4, gpt-4-turbo |
| --judge_llm | Language model for judgment tasks | gpt-4 | gpt-3.5-turbo, gpt-4, gpt-4-turbo |
| --input-file | Input CSV file | NormBank.csv | Any valid CSV file |
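
The options in the table above map naturally onto an argparse parser. The following is a minimal sketch that assumes only the flags, defaults, and choices listed in the table; it is not copied from generate_conversations.py, and the helper name `build_parser` is hypothetical.

```python
import argparse

# Model names taken from the "Choices" column of the table above.
MODEL_CHOICES = ["gpt-3.5-turbo", "gpt-4", "gpt-4-turbo"]


def build_parser():
    """Build a parser mirroring the documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Generate persuasion conversations")
    parser.add_argument("--api-key", required=True, help="OpenAI API key")
    parser.add_argument("--format", default="general",
                        choices=["general", "email", "twitter"])
    parser.add_argument("--start", type=int, default=20, help="Starting data index")
    parser.add_argument("--end", type=int, default=25, help="Ending data index")
    parser.add_argument("--llm", default="gpt-3.5-turbo", choices=MODEL_CHOICES)
    parser.add_argument("--judge_llm", default="gpt-4", choices=MODEL_CHOICES)
    parser.add_argument("--input-file", default="NormBank.csv")
    return parser


# Parse an explicit argument list so the example runs without a real command line.
args = build_parser().parse_args(["--api-key", "your-key", "--format", "email"])
print(args.format, args.start, args.end, args.llm, args.judge_llm, args.input_file)
```

Note that argparse converts dashes in flag names to underscores, so `--api-key` and `--input-file` are read back as `args.api_key` and `args.input_file`.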
The tool supports separate model configuration for different tasks:
- --llm: Used for content generation (conversation responses, refinement)
- --judge_llm: Used for judgment tasks (topic relevance, agreement detection, quality assessment)
| Use Case | Generation Model | Judgment Model | Benefits |
|---|---|---|---|
| Cost-effective | gpt-3.5-turbo | gpt-4 | Lower cost, high-quality judgments |
| High Quality | gpt-4 | gpt-4 | Maximum quality for both tasks |
| Speed Optimized | gpt-3.5-turbo | gpt-3.5-turbo | Fastest processing |
| Research Grade | gpt-4-turbo | gpt-4 | Best generation with reliable judgments |
- Cost Optimization: Use cheaper models for high-volume generation tasks
- Quality Assurance: Use more capable models for critical judgment decisions
- Flexibility: Mix and match models based on your specific needs
- Research Control: Isolate the impact of different model capabilities
General format (--format general):
- Style: Direct, conversational dialogue
- Token Limit: 100 tokens
- Use Case: Baseline comparisons, general analysis

Email format (--format email):
- Style: Professional email correspondence with headers and signatures
- Token Limit: 1000 tokens
- Use Case: Business communication, formal persuasion scenarios

Twitter format (--format twitter):
- Style: Social media posts with hashtags and emojis
- Token Limit: 140 characters
- Use Case: Social media analysis, viral persuasion studies
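
The length limits above can be kept in a single lookup table. This is a minimal sketch, assuming the three descriptions above correspond, in order, to the general, email, and twitter values of --format; the names `FORMAT_LIMITS`, `get_limit`, and `fits_twitter` are hypothetical and do not come from the repository.

```python
# Output-length limits per format, as listed above (hypothetical mapping name).
FORMAT_LIMITS = {
    "general": (100, "tokens"),
    "email": (1000, "tokens"),
    "twitter": (140, "characters"),
}


def get_limit(format_type):
    """Return (limit, unit) for a format; raise ValueError on unknown formats."""
    try:
        return FORMAT_LIMITS[format_type]
    except KeyError:
        raise ValueError(f"Unknown format: {format_type!r}") from None


def fits_twitter(text):
    """Check a message against the character limit listed for Twitter."""
    return len(text) <= FORMAT_LIMITS["twitter"][0]


print(get_limit("email"))
print(fits_twitter("Short post #persuasion"))
```

Keeping the unit next to the number makes it explicit that the Twitter limit is counted in characters while the other two are counted in tokens.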
# Generate email conversations using GPT-4 for both generation and judgment
python generate_conversations.py --api-key "your-key" --format email --llm gpt-4 --judge_llm gpt-4 --start 0 --end 10
# Generate Twitter conversations with custom dataset (GPT-3.5 for generation, GPT-4 for judgment)
python generate_conversations.py --api-key "your-key" --format twitter --input-file "custom_data.csv" --llm gpt-3.5-turbo --judge_llm gpt-4 --start 5 --end 20
# Large-scale generation with premium model for generation, GPT-4 for judgment
python generate_conversations.py --api-key "your-key" --format general --start 0 --end 100 --llm gpt-4-turbo --judge_llm gpt-4
# Cost-effective setup: GPT-3.5 for generation, GPT-4 for judgment (default)
python generate_conversations.py --api-key "your-key" --format general --start 0 --end 50

The framework can be extended to support multi-agent conversations beyond the standard two-agent setup. generate_conversations-multiAgent.py demonstrates this capability with a three-agent configuration featuring two persuaders and one persuadee.
The multi-agent version includes:
- Persuader 1: First persuading agent
- Persuader 2: Second persuading agent (provides collaborative persuasion)
- Persuadee: Target agent being persuaded
- Judge: Evaluates conversation quality and outcomes
# Generate multi-agent conversations in general format
python generate_conversations-multiAgent.py --api-key "your-key" --format general --start 0 --end 10
# Multi-agent email conversations
python generate_conversations-multiAgent.py --api-key "your-key" --format email --llm gpt-3.5-turbo --judge_llm gpt-4
# Multi-agent Twitter conversations
python generate_conversations-multiAgent.py --api-key "your-key" --format twitter --start 5 --end 20

Multi-agent JSON files contain:
{
"Initial Setting": {
"Persuader1": "Initial persuader 1 prompt...",
"Persuadee": "Initial persuadee prompt...",
"Persuader2": "Initial persuader 2 prompt...",
"Judge": "Initial judge prompt..."
},
"Conversation History": {
"0": {
"Persuader1": "First persuader 1 message...",
"Persuadee": "First persuadee response...",
"Persuader2": "First persuader 2 message..."
}
},
"Result": {
"is_agreement": "Yes/No",
"winner": "0/1/Neither"
}
}

- General format: conversations_json_mainFramework_multiAgent/{index}.json
- Email format: conversations_json_mainFramework_multiAgent_formatControl/{index}_email.json
- Twitter format: conversations_json_mainFramework_multiAgent_formatControl/{index}.json
The generate_conversations-multiAgent.py file also shows how the source code can be extended beyond the standard two-agent setup: its three-agent configuration (two persuaders and one persuadee) can serve as a starting point for more complex conversation scenarios.
Key aspects of multi-agent extension:
- Agent Management: Each agent maintains its own conversation history
- Turn Management: Agents take turns responding to multiple previous messages
- Collaborative Persuasion: Multiple persuaders can work together
- Enhanced Judgments: Judge evaluates interactions between multiple agents
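
The agent-management and turn-management points above can be illustrated with a small round-robin loop. This is a sketch under stated assumptions, not the logic of generate_conversations-multiAgent.py: the speaking order (Persuader1, Persuadee, Persuader2) is inferred from the JSON layout shown earlier, and `run_round` and `echo_respond` are hypothetical names, with `echo_respond` standing in for a real LLM call.

```python
def run_round(agents, histories, respond):
    """Run one round: each agent speaks once, seeing everything said so far.

    agents    -- ordered list of agent names
    histories -- dict mapping each agent name to its own list of (speaker, message)
    respond   -- callable(agent, history) -> message; stands in for an LLM call
    """
    turn = {}
    for agent in agents:
        message = respond(agent, histories[agent])
        turn[agent] = message
        # Every agent records the new message in its own history.
        for other in agents:
            histories[other].append((agent, message))
    return turn


def echo_respond(agent, history):
    """Placeholder responder that reports how many messages the agent has seen."""
    return f"{agent} has seen {len(history)} message(s)"


agents = ["Persuader1", "Persuadee", "Persuader2"]
histories = {name: [] for name in agents}

# Keys are round indices as strings, matching the "Conversation History" layout.
conversation = {str(i): run_round(agents, histories, echo_respond) for i in range(2)}
print(conversation["0"])
```

Because each agent keeps a separate history, a later speaker in the same round responds to the messages already produced in that round, which is what lets the second persuader build on the first.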
To add a new conversation format (e.g., 'reddit', 'sms'):
- Update argument choices:
  parser.add_argument('--format', choices=['general', 'email', 'twitter', 'reddit'], ...)
- Add token limit:
  elif self.format_type == 'reddit':
      return 500  # Adjust as needed
- Add style instruction:
  elif format_type == 'reddit':
      return "Write in Reddit comment style with casual tone..."
- Add initial settings:
  elif format_type == 'reddit':
      initial_persuader = f"""Reddit-style persuader prompt..."""
      initial_persuadee = f"""Reddit-style persuadee prompt..."""
- Add output filename pattern:
  elif format_type == 'reddit':
      return f'conversations_json_mainFramework_formatControl/{index}_reddit.json'
- Prepare your CSV file with columns:
  - setting: Context/environment (e.g., "library", "office")
  - behavior: Action being discussed (e.g., "wear shoes", "use phone")
  - label: Ethical stance (0=unethical, 1=neutral, 2=ethical)
- Example CSV structure:
  setting,behavior,label
  "workplace","take breaks","1"
  "school","cheat on exams","0"
  "hospital","follow safety protocols","2"
- Run with custom file:
  python generate_conversations.py --api-key "your-key" --input-file "my_topics.csv" --format email
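
Before spending API calls on a custom dataset, it can help to check the file against the column requirements above. The following is a minimal sketch using only the standard library; `validate_rows` is a hypothetical helper, and the rule that labels must be 0, 1, or 2 is taken from the column description above rather than from the script itself.

```python
import csv
import io

REQUIRED_COLUMNS = {"setting", "behavior", "label"}
VALID_LABELS = {"0", "1", "2"}  # 0=unethical, 1=neutral, 2=ethical


def validate_rows(handle):
    """Return a list of problems found in a topics CSV (empty list means OK)."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if (row["label"] or "").strip() not in VALID_LABELS:
            problems.append(f"line {line_no}: invalid label {row['label']!r}")
        if not (row["setting"] or "").strip() or not (row["behavior"] or "").strip():
            problems.append(f"line {line_no}: empty setting or behavior")
    return problems


# In-memory sample; the last row carries a label outside the documented range.
sample = io.StringIO(
    'setting,behavior,label\n'
    '"workplace","take breaks","1"\n'
    '"school","cheat on exams","0"\n'
    '"hospital","follow safety protocols","5"\n'
)
print(validate_rows(sample))
```

To check a file on disk, open it with `open("my_topics.csv", newline="", encoding="utf-8")` and pass the handle to `validate_rows`.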
Generated JSON files contain:
{
"Initial Setting": {
"Persuader": "Initial persuader prompt...",
"Persuadee": "Initial persuadee prompt...",
"Judge": "Initial judge prompt..."
},
"Conversation History": {
"0": {
"Persuader": "First persuader message...",
"Persuadee": "First persuadee response..."
}
},
"Result": {
"is_agreement": "Yes/No",
"winner": "0/1/Neither"
}
}

- General format: conversations_json_mainFramework_general/{index}.json
- Email format: conversations_json_mainFramework_formatControl/{index}_email.json
- Twitter format: conversations_json_mainFramework_formatControl/{index}_twitter.json
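
The generated files can be read back with the standard json module. This is a minimal sketch that assumes only the JSON layout shown above ("Initial Setting", "Conversation History", "Result"); the helper names `load_conversations` and `agreement_rate` are hypothetical.

```python
import json
from pathlib import Path


def load_conversations(folder):
    """Yield each parsed conversation from the *.json files in a folder."""
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            yield json.load(handle)


def agreement_rate(conversations):
    """Return the fraction of conversations whose Result reports agreement."""
    total = agreed = 0
    for data in conversations:
        total += 1
        # "is_agreement" is documented as "Yes/No"; compare case-insensitively.
        if str(data["Result"]["is_agreement"]).strip().lower() == "yes":
            agreed += 1
    return agreed / total if total else 0.0


# Two in-memory records in the documented shape, used here instead of real files.
examples = [
    {"Conversation History": {"0": {"Persuader": "...", "Persuadee": "..."}},
     "Result": {"is_agreement": "Yes", "winner": "0"}},
    {"Conversation History": {"0": {"Persuader": "...", "Persuadee": "..."}},
     "Result": {"is_agreement": "No", "winner": "Neither"}},
]
print(agreement_rate(examples))
```

For the files on disk, pass the generator directly, for example `agreement_rate(load_conversations("conversations_json_mainFramework_general"))`.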
The conversations_json_mainFramework_general/ folder contains our pre-generated general format dataset with 200 conversations. These conversations were generated using behaviors from NormBank.csv and represent high-quality persuasive dialogues between AI agents. This dataset serves as a ready-to-use resource for research in computational persuasion and social communication.
- API Key Management:
  - Never commit API keys to version control
  - Use environment variables: export OPENAI_API_KEY="your-key"
  - Monitor usage to avoid unexpected charges
- Data Processing:
  - Start with small ranges to test setup
  - Use appropriate models for your needs
  - Monitor conversation quality
  - Consider cost vs. quality trade-offs when choosing models
  - For multi-agent conversations, expect longer processing times due to additional interactions
- Error Handling:
  - Script includes automatic retry for API errors
  - Off-topic conversations are logged and skipped
  - Check console output for processing status
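
One way to follow the environment-variable advice above is a small wrapper that reads OPENAI_API_KEY and passes it to the documented --api-key flag. This is a sketch, not part of the repository: `build_command` is a hypothetical helper, and it relies only on the flags documented in this README.

```python
import os
import sys


def build_command(env=None, fmt="general", start=0, end=5):
    """Build the CLI argument list, taking the key from OPENAI_API_KEY."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("Set OPENAI_API_KEY before running")
    return [
        sys.executable, "generate_conversations.py",
        "--api-key", api_key,
        "--format", fmt,
        "--start", str(start),  # a small range keeps the first test run cheap
        "--end", str(end),
    ]


# Demonstration with a stand-in environment so no real key is needed here.
command = build_command(env={"OPENAI_API_KEY": "your-key"})
print(command[1:])
```

The resulting list can be passed to `subprocess.run(command, check=True)`. The key no longer needs to be stored in a script or shell history, although it is still visible in the process arguments while the command runs.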
| Issue | Symptoms | Solutions |
|---|---|---|
| API Key Error | Authentication failed, 401 errors | Verify API key is correct and active |
| File Not Found | NormBank.csv not found | Ensure file exists or use --input-file |
| Rate Limiting | 429 errors, slow processing | Script includes automatic retry |
| Poor Quality | Off-topic conversations | Try GPT-4, adjust prompts |
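
The table above mentions automatic retry for rate-limit (429) errors. The script's own retry logic is not shown in this README, so the following is only a generic sketch of retrying with exponential backoff; `call_with_retry` and its parameters are hypothetical, and real code would normally catch a narrower exception type than `Exception`.

```python
import random
import time


def call_with_retry(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Delay doubles each attempt: base, 2*base, 4*base, ... plus small jitter.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1)
            sleep(delay)


# Demonstration: a function that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


recorded_delays = []
result = call_with_retry(flaky_request, sleep=recorded_delays.append)
print(result, attempts["count"], len(recorded_delays))
```

Passing `sleep` as a parameter lets the demonstration record the delays instead of actually waiting; in normal use the default `time.sleep` applies.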
- Comprehensive Guide: Run python usage_guide.py
- Example Commands: Run python run_examples.py --api-key "your-key"
- Help: Run python generate_conversations.py --help
This project is for research and educational purposes. Please ensure compliance with OpenAI's usage policies when using their API.
If you use this code or dataset in your research, please cite our paper:
@inproceedings{ma2025communication,
title={Communication Makes Perfect: Persuasion Dataset Construction via Multi-LLM Communication},
author={Ma, Weicheng and Zhang, Hefan and Yang, Ivory and Ji, Shiyu and Chen, Joice and Hashemi, Farnoosh and Mohole, Shubham and Gearey, Ethan and Macy, Michael and Hassanpour, Saeed and others},
booktitle={Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)},
pages={4017--4045},
year={2025}
}

Contributions are welcome! Please refer to the extension guides above for adding new formats or features.