LLM-based persuasion simulator

This repository contains the code and datasets for the NAACL 2025 paper: Communication Makes Perfect: Persuasion Dataset Construction via Multi-LLM Communication.

It provides a tool for generating multi-turn persuasion conversations between AI agents in different communication formats, together with a dataset of the generated persuasion dialogues.

Authors

Weicheng Ma¹, Hefan Zhang², Ivory Yang², Shiyu Ji³
Joice Chen³, Farnoosh Hashemi³, Shubham Mohole³, Ethan Gearey²
Michael Macy³, Saeed Hassanpour², Soroush Vosoughi²

¹Georgia Institute of Technology
²Dartmouth College
³Cornell University

Repository Maintainer

This codebase was primarily developed and maintained by Hefan Zhang.
For code-related questions, please contact: hefan.zhang.gr@dartmouth.edu

Overview

This tool creates realistic persuasion dialogues by simulating conversations between:

Persuader: An AI agent trying to convince another agent to take a specific action
Persuadee: An AI agent maintaining its position and resisting persuasion
Judge: An AI system evaluating conversation quality and outcomes

Quick Start

1. Prerequisites

Python 3.7+
OpenAI API key
NormBank.csv file (or custom dataset)

2. Basic Usage

# View comprehensive usage guide
python usage_guide.py

# Run example commands (dry run mode)
python run_examples.py --api-key "your-api-key-here"

# Generate conversations (actual execution)
python generate_conversations.py --api-key "your-api-key-here" --format general --start 0 --end 5

3. Required Parameters

Parameter	Description	Example
`--api-key`	OpenAI API key (required)	`--api-key "sk-proj-..."`

4. Optional Parameters

Parameter	Description	Default	Choices
`--format`	Output conversation format	`general`	`general`, `email`, `twitter`
`--start`	Starting data index	`20`	Any integer
`--end`	Ending data index	`25`	Any integer
`--llm`	Language model for content generation	`gpt-3.5-turbo`	`gpt-3.5-turbo`, `gpt-4`, `gpt-4-turbo`
`--judge_llm`	Language model for judgment tasks	`gpt-4`	`gpt-3.5-turbo`, `gpt-4`, `gpt-4-turbo`
`--input-file`	Input CSV file	`NormBank.csv`	Any valid CSV file
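
The flags in the two tables above could be declared with argparse roughly as follows. This is a sketch reconstructed from the documented names, defaults, and choices, not a copy of generate_conversations.py; the function name build_parser, the MODEL_CHOICES constant, and the help strings are assumptions.

```python
import argparse

# Model names copied from the "Choices" column above.
MODEL_CHOICES = ["gpt-3.5-turbo", "gpt-4", "gpt-4-turbo"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags and defaults (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Generate persuasion conversations")
    parser.add_argument("--api-key", required=True, help="OpenAI API key")
    parser.add_argument("--format", default="general",
                        choices=["general", "email", "twitter"],
                        help="Output conversation format")
    parser.add_argument("--start", type=int, default=20, help="Starting data index")
    parser.add_argument("--end", type=int, default=25, help="Ending data index")
    parser.add_argument("--llm", default="gpt-3.5-turbo", choices=MODEL_CHOICES,
                        help="Model for content generation")
    parser.add_argument("--judge_llm", default="gpt-4", choices=MODEL_CHOICES,
                        help="Model for judgment tasks")
    parser.add_argument("--input-file", default="NormBank.csv", help="Input CSV file")
    return parser


# Parse an explicit list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--api-key", "your-api-key-here", "--format", "email"])
print(args.format, args.start, args.end, args.llm, args.judge_llm, args.input_file)
```

Note that argparse exposes --api-key as args.api_key and --input-file as args.input_file, while --judge_llm keeps its underscore.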

Model Configuration

The tool supports separate model configuration for different tasks:

Generation vs. Judgment Models

--llm: Used for content generation (conversation responses, refinement)
--judge_llm: Used for judgment tasks (topic relevance, agreement detection, quality assessment)

Recommended Configurations

Use Case	Generation Model	Judgment Model	Benefits
Cost-effective	`gpt-3.5-turbo`	`gpt-4`	Lower cost, high-quality judgments
High Quality	`gpt-4`	`gpt-4`	Maximum quality for both tasks
Speed Optimized	`gpt-3.5-turbo`	`gpt-3.5-turbo`	Fastest processing
Research Grade	`gpt-4-turbo`	`gpt-4`	Best generation with reliable judgments
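
As a convenience, the table above can be captured as named presets that expand into a command line. The sketch below is not part of the repository; the preset keys and the preset_command helper are hypothetical names chosen for illustration, and only the model pairs come from the table.

```python
import shlex

# (generation model, judgment model) pairs copied from the table above.
PRESETS = {
    "cost-effective": ("gpt-3.5-turbo", "gpt-4"),
    "high-quality": ("gpt-4", "gpt-4"),
    "speed-optimized": ("gpt-3.5-turbo", "gpt-3.5-turbo"),
    "research-grade": ("gpt-4-turbo", "gpt-4"),
}


def preset_command(preset: str, api_key: str, fmt: str = "general") -> list:
    """Return the argv list for generate_conversations.py under a named preset."""
    llm, judge_llm = PRESETS[preset]
    return [
        "python", "generate_conversations.py",
        "--api-key", api_key,
        "--format", fmt,
        "--llm", llm,
        "--judge_llm", judge_llm,
    ]


# Print a shell-safe version of the command for copy and paste.
command = preset_command("research-grade", "your-key", fmt="email")
print(" ".join(shlex.quote(part) for part in command))
```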

Why Separate Models?

Cost Optimization: Use cheaper models for high-volume generation tasks
Quality Assurance: Use more capable models for critical judgment decisions
Flexibility: Mix and match models based on your specific needs
Research Control: Isolate the impact of different model capabilities

Format Types

General Format

Style: Direct, conversational dialogue
Token Limit: 100 tokens
Use Case: Baseline comparisons, general analysis

Email Format

Style: Professional email correspondence with headers and signatures
Token Limit: 1000 tokens
Use Case: Business communication, formal persuasion scenarios

Twitter Format

Style: Social media posts with hashtags and emojis
Length Limit: 140 characters
Use Case: Social media analysis, viral persuasion studies
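
The three limits above use different units, which is easy to overlook when post-processing output. A small lookup like the following keeps the unit next to the number; the names FORMAT_LIMITS, describe_limit, and fits_twitter are hypothetical and do not come from the repository.

```python
# Length budgets per format, copied from the list above.
# Units differ: tokens for general and email, characters for twitter.
FORMAT_LIMITS = {
    "general": (100, "tokens"),
    "email": (1000, "tokens"),
    "twitter": (140, "characters"),
}


def describe_limit(format_type: str) -> str:
    """Return a human-readable description of a format's length budget."""
    limit, unit = FORMAT_LIMITS[format_type]
    return f"{format_type}: at most {limit} {unit}"


def fits_twitter(message: str) -> bool:
    """Character-count check; only meaningful for the twitter format."""
    return len(message) <= FORMAT_LIMITS["twitter"][0]


for name in FORMAT_LIMITS:
    print(describe_limit(name))
```

Checking the token-based limits would need a tokenizer, which this sketch deliberately leaves out.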

Example Commands

# Generate email conversations using GPT-4 for both generation and judgment
python generate_conversations.py --api-key "your-key" --format email --llm gpt-4 --judge_llm gpt-4 --start 0 --end 10

# Generate Twitter conversations with custom dataset (GPT-3.5 for generation, GPT-4 for judgment)
python generate_conversations.py --api-key "your-key" --format twitter --input-file "custom_data.csv" --llm gpt-3.5-turbo --judge_llm gpt-4 --start 5 --end 20

# Large-scale run: GPT-4 Turbo for generation, GPT-4 for judgment
python generate_conversations.py --api-key "your-key" --format general --start 0 --end 100 --llm gpt-4-turbo --judge_llm gpt-4

# Cost-effective setup: GPT-3.5 for generation, GPT-4 for judgment (default)
python generate_conversations.py --api-key "your-key" --format general --start 0 --end 50

Multi-Agent Conversations

The framework can be extended to support multi-agent conversations beyond the standard two-agent setup. generate_conversations-multiAgent.py demonstrates this capability with a three-agent configuration featuring two persuaders and one persuadee.

Multi-Agent Setup

The multi-agent version includes:

Persuader 1: First persuading agent
Persuader 2: Second persuading agent (provides collaborative persuasion)
Persuadee: Target agent being persuaded
Judge: Evaluates conversation quality and outcomes

Multi-Agent Usage

# Generate multi-agent conversations in general format
python generate_conversations-multiAgent.py --api-key "your-key" --format general --start 0 --end 10

# Multi-agent email conversations
python generate_conversations-multiAgent.py --api-key "your-key" --format email --llm gpt-3.5-turbo --judge_llm gpt-4

# Multi-agent Twitter conversations
python generate_conversations-multiAgent.py --api-key "your-key" --format twitter --start 5 --end 20

Multi-Agent Output Structure

Multi-agent JSON files contain:

{
    "Initial Setting": {
        "Persuader1": "Initial persuader 1 prompt...",
        "Persuadee": "Initial persuadee prompt...", 
        "Persuader2": "Initial persuader 2 prompt...",
        "Judge": "Initial judge prompt..."
    },
    "Conversation History": {
        "0": {
            "Persuader1": "First persuader 1 message...",
            "Persuadee": "First persuadee response...",
            "Persuader2": "First persuader 2 message..."
        }
    },
    "Result": {
        "is_agreement": "Yes/No",
        "winner": "0/1/Neither"
    }
}

Multi-Agent File Locations

General format: conversations_json_mainFramework_multiAgent/{index}.json
Email format: conversations_json_mainFramework_multiAgent_formatControl/{index}_email.json
Twitter format: conversations_json_mainFramework_multiAgent_formatControl/{index}.json

Extending the Tool

Multi-Agent Extensions

The generate_conversations-multiAgent.py file shows one way to extend the two-agent setup: a three-agent configuration with two persuaders and one persuadee. The same pattern can serve as a starting point for more complex conversation scenarios.

Key aspects of multi-agent extension:

Agent Management: Each agent maintains its own conversation history
Turn Management: Agents take turns responding to multiple previous messages
Collaborative Persuasion: Multiple persuaders can work together
Enhanced Judgments: Judge evaluates interactions between multiple agents

Adding New Formats

To add a new conversation format (e.g., 'reddit', 'sms'):

Update argument choices:

parser.add_argument('--format', choices=['general', 'email', 'twitter', 'reddit'], ...)

Add token limit:

elif self.format_type == 'reddit':
    return 500  # Adjust as needed

Add style instruction:

elif format_type == 'reddit':
    return "Write in Reddit comment style with casual tone..."

Add initial settings:

elif format_type == 'reddit':
    initial_persuader = f"""Reddit-style persuader prompt..."""
    initial_persuadee = f"""Reddit-style persuadee prompt..."""

Add output filename pattern:

elif format_type == 'reddit':
    return f'conversations_json_mainFramework_formatControl/{index}_reddit.json'

Using Custom Topics

Prepare your CSV file with columns:
- setting: Context/environment (e.g., "library", "office")
- behavior: Action being discussed (e.g., "wear shoes", "use phone")
- label: Ethical stance (0=unethical, 1=neutral, 2=ethical)

Example CSV structure:

setting,behavior,label
"workplace","take breaks","1"
"school","cheat on exams","0"
"hospital","follow safety protocols","2"

Run with custom file:

python generate_conversations.py --api-key "your-key" --input-file "my_topics.csv" --format email
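
Before spending API calls on a custom file, it may help to check that it matches the column layout described above. The following validator is a sketch, not something shipped with the repository; validate_topics is a hypothetical name, and the checks cover only the three documented columns and the three documented label values.

```python
import csv
import io

REQUIRED_COLUMNS = ("setting", "behavior", "label")
VALID_LABELS = {"0", "1", "2"}  # 0=unethical, 1=neutral, 2=ethical


def validate_topics(handle) -> list:
    """Return a list of problems found in a topics CSV (empty list means OK)."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]
    problems = []
    for row_no, row in enumerate(reader, start=1):  # counts data rows, not the header
        if (row["label"] or "").strip() not in VALID_LABELS:
            problems.append(f"data row {row_no}: label {row['label']!r} is not 0, 1, or 2")
        for column in ("setting", "behavior"):
            if not (row[column] or "").strip():
                problems.append(f"data row {row_no}: empty {column}")
    return problems


# The example rows from this section, held in memory for the demo.
sample = io.StringIO(
    'setting,behavior,label\n'
    '"workplace","take breaks","1"\n'
    '"school","cheat on exams","0"\n'
    '"hospital","follow safety protocols","2"\n'
)
print(validate_topics(sample))
```

To check a file on disk, open it with open("my_topics.csv", newline="") and pass the handle to validate_topics.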

Output Structure

Generated JSON files contain:

{
    "Initial Setting": {
        "Persuader": "Initial persuader prompt...",
        "Persuadee": "Initial persuadee prompt...", 
        "Judge": "Initial judge prompt..."
    },
    "Conversation History": {
        "0": {
            "Persuader": "First persuader message...",
            "Persuadee": "First persuadee response..."
        }
    },
    "Result": {
        "is_agreement": "Yes/No",
        "winner": "0/1/Neither"
    }
}
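
A record of this shape can be turned into a flat, ordered transcript with a few lines of Python. The sketch below assumes round keys are stringified integers, as in the example; flatten_turns is a hypothetical helper, and the second round in the sample record is made up purely to show the ordering.

```python
import json

# Illustrative record following the structure above (round "1" is invented).
record = json.loads("""
{
    "Initial Setting": {"Persuader": "...", "Persuadee": "...", "Judge": "..."},
    "Conversation History": {
        "0": {"Persuader": "First persuader message...",
              "Persuadee": "First persuadee response..."},
        "1": {"Persuader": "Second persuader message...",
              "Persuadee": "Second persuadee response..."}
    },
    "Result": {"is_agreement": "No", "winner": "Neither"}
}
""")


def flatten_turns(record: dict, speakers=("Persuader", "Persuadee")) -> list:
    """Return (round, speaker, message) tuples in conversation order."""
    history = record["Conversation History"]
    turns = []
    for round_key in sorted(history, key=int):  # keys are strings: "0", "1", ...
        for speaker in speakers:
            if speaker in history[round_key]:  # tolerate a partial final round
                turns.append((int(round_key), speaker, history[round_key][speaker]))
    return turns


for round_no, speaker, message in flatten_turns(record):
    print(f"[{round_no}] {speaker}: {message}")
print("Agreement:", record["Result"]["is_agreement"])
```

For multi-agent files, passing speakers=("Persuader1", "Persuadee", "Persuader2") should work the same way, assuming that is the speaking order within a round.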

File Locations

General format: conversations_json_mainFramework_general/{index}.json
Email format: conversations_json_mainFramework_formatControl/{index}_email.json
Twitter format: conversations_json_mainFramework_formatControl/{index}_twitter.json
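
When scripting around the generator, the naming scheme above can be wrapped in a helper to find which indices have not produced a file yet. This is a sketch for the two-agent layout only; output_path and missing_outputs are hypothetical names, and treating the range as start-inclusive, end-exclusive is an assumption about how --start and --end behave.

```python
import os


def output_path(format_type: str, index: int) -> str:
    """Map a format and data index to the two-agent output path listed above."""
    if format_type == "general":
        return f"conversations_json_mainFramework_general/{index}.json"
    if format_type in ("email", "twitter"):
        return f"conversations_json_mainFramework_formatControl/{index}_{format_type}.json"
    raise ValueError(f"unknown format: {format_type!r}")


def missing_outputs(format_type: str, start: int, end: int) -> list:
    """List indices in [start, end) with no output file (range bounds are an assumption)."""
    return [i for i in range(start, end)
            if not os.path.exists(output_path(format_type, i))]


print(output_path("email", 7))
```

A missing file does not necessarily indicate a failure, since off-topic conversations are logged and skipped.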

Provided Datasets

The conversations_json_mainFramework_general/ folder contains our pre-generated general format dataset with 200 conversations. These conversations were generated using behaviors from NormBank.csv and represent high-quality persuasive dialogues between AI agents. This dataset serves as a ready-to-use resource for research in computational persuasion and social communication.
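
A quick way to get a first look at the dataset is to tally the "is_agreement" field across the folder. The helper below is a sketch that relies only on the documented output structure; tally_results is a hypothetical name, and files without a "Result" block are counted under "missing" rather than raising an error.

```python
import json
from collections import Counter
from pathlib import Path


def tally_results(folder: str) -> Counter:
    """Count the is_agreement values across every JSON file in a folder."""
    counts = Counter()
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            record = json.load(handle)
        counts[record.get("Result", {}).get("is_agreement", "missing")] += 1
    return counts


# Usage, from the repository root:
#   print(tally_results("conversations_json_mainFramework_general"))
```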

Best Practices

API Key Management:
- Never commit API keys to version control
- Use environment variables: export OPENAI_API_KEY="your-key"
- Monitor usage to avoid unexpected charges

Data Processing:
- Start with small ranges to test setup
- Use appropriate models for your needs
- Monitor conversation quality
- Consider cost vs. quality trade-offs when choosing models
- For multi-agent conversations, expect longer processing times due to additional interactions

Error Handling:
- Script includes automatic retry for API errors
- Off-topic conversations are logged and skipped
- Check console output for processing status
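
Since the script takes the key through --api-key, the environment-variable advice above needs a small bridge. One possible wrapper is sketched below; build_command and run_small_test are hypothetical helpers, and the 0 to 5 range simply follows the "start with small ranges" suggestion.

```python
import os
import subprocess
import sys


def build_command(start: int, end: int, fmt: str = "general") -> list:
    """Build the generate_conversations.py argv, taking the key from the environment."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running")
    return [
        sys.executable, "generate_conversations.py",
        "--api-key", api_key,
        "--format", fmt,
        "--start", str(start),
        "--end", str(end),
    ]


def run_small_test() -> int:
    """Run a small range first to check the setup; returns the script's exit code."""
    return subprocess.run(build_command(0, 5)).returncode
```

This keeps the key out of shell history and committed scripts, although it is still visible in the process list while the script runs.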

Troubleshooting

Issue	Symptoms	Solutions
API Key Error	Authentication failed, 401 errors	Verify API key is correct and active
File Not Found	NormBank.csv not found	Ensure file exists or use --input-file
Rate Limiting	429 errors, slow processing	Script includes automatic retry
Poor Quality	Off-topic conversations	Try GPT-4, adjust prompts
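
For readers wrapping their own API calls, the kind of automatic retry mentioned in the table is commonly implemented as exponential backoff. The following is a generic sketch of that technique, not the repository's retry code; call_with_retry, its parameters, and the simulated failure are all illustrative assumptions.

```python
import random
import time


def call_with_retry(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:  # in real use, narrow this to the client's rate-limit error
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))


# Demo with a stand-in for an API call that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated 429")
    return "ok"


# A no-op sleep keeps the demo instant; drop the argument to wait for real.
print(call_with_retry(flaky_call, sleep=lambda seconds: None))
```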

Documentation

Comprehensive Guide: Run python usage_guide.py
Example Commands: Run python run_examples.py --api-key "your-key"
Help: Run python generate_conversations.py --help

License

This project is for research and educational purposes. Please ensure compliance with OpenAI's usage policies when using their API.

Citation

If you use this code or dataset in your research, please cite our paper:

@inproceedings{ma2025communication,
  title={Communication Makes Perfect: Persuasion Dataset Construction via Multi-LLM Communication},
  author={Ma, Weicheng and Zhang, Hefan and Yang, Ivory and Ji, Shiyu and Chen, Joice and Hashemi, Farnoosh and Mohole, Shubham and Gearey, Ethan and Macy, Michael and Hassanpour, Saeed and Vosoughi, Soroush},
  booktitle={Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)},
  pages={4017--4045},
  year={2025}
}

Contributing

Contributions are welcome! Please refer to the extension guides above for adding new formats or features.
