HF-heaven/LLM-based_persuasion_simulator

LLM-based persuasion simulator

This repository contains the code and datasets for the NAACL 2025 paper: Communication Makes Perfect: Persuasion Dataset Construction via Multi-LLM Communication.

It provides a tool for generating multi-turn persuasion conversations between AI agents in different communication formats, together with a dataset of the generated persuasion dialogues.

Authors

Weicheng Ma¹, Hefan Zhang², Ivory Yang², Shiyu Ji³
Joice Chen³, Farnoosh Hashemi³, Shubham Mohole³, Ethan Gearey²
Michael Macy³, Saeed Hassanpour², Soroush Vosoughi²

¹Georgia Institute of Technology
²Dartmouth College
³Cornell University

Repository Maintainer

This codebase was primarily developed and maintained by Hefan Zhang.
For code-related questions, please contact: hefan.zhang.gr@dartmouth.edu


Overview

This tool creates realistic persuasion dialogues by simulating conversations between:

  • Persuader: An AI agent trying to convince another agent to take a specific action
  • Persuadee: An AI agent maintaining its position and resisting persuasion
  • Judge: An AI system evaluating conversation quality and outcomes
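
The interplay of the three roles can be pictured with a minimal sketch. It is not the repository's implementation: the `simulate` function, the `Agent` type, the `max_turns` parameter, and the stub agents are all hypothetical, and a real run would replace the stubs with LLM calls.

```python
from typing import Callable, Dict, List

# Hypothetical type: an agent maps the transcript so far to its next message.
History = List[Dict[str, str]]
Agent = Callable[[History], str]


def simulate(persuader: Agent, persuadee: Agent,
             judge: Callable[[History], bool],
             max_turns: int = 5) -> Dict[str, object]:
    """Alternate persuader and persuadee until the judge reports agreement."""
    history: History = []
    for _ in range(max_turns):
        turn = {"Persuader": persuader(history)}
        history.append(turn)
        # The persuadee sees the persuader's message for the current turn.
        turn["Persuadee"] = persuadee(history)
        if judge(history):
            return {"history": history, "is_agreement": "Yes"}
    return {"history": history, "is_agreement": "No"}


# Stub agents for demonstration only (assumptions, not repository code).
def demo_persuader(history: History) -> str:
    return "Please consider doing it."


def demo_persuadee(history: History) -> str:
    return "No." if len(history) < 2 else "Okay, I agree."


def demo_judge(history: History) -> bool:
    return "agree" in history[-1]["Persuadee"].lower()


result = simulate(demo_persuader, demo_persuadee, demo_judge)
```

With these stubs the persuadee gives in on the second turn, so the loop stops early instead of running all `max_turns` rounds.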

Quick Start

1. Prerequisites

  • Python 3.7+
  • OpenAI API key
  • NormBank.csv file (or custom dataset)

2. Basic Usage

# View comprehensive usage guide
python usage_guide.py

# Run example commands (dry run mode)
python run_examples.py --api-key "your-api-key-here"

# Generate conversations (actual execution)
python generate_conversations.py --api-key "your-api-key-here" --format general --start 0 --end 5

3. Required Parameters

| Parameter | Description | Example |
| --- | --- | --- |
| --api-key | OpenAI API key (required) | --api-key "sk-proj-..." |

4. Optional Parameters

| Parameter | Description | Default | Choices |
| --- | --- | --- | --- |
| --format | Output conversation format | general | general, email, twitter |
| --start | Starting data index | 20 | Any integer |
| --end | Ending data index | 25 | Any integer |
| --llm | Language model for content generation | gpt-3.5-turbo | gpt-3.5-turbo, gpt-4, gpt-4-turbo |
| --judge_llm | Language model for judgment tasks | gpt-4 | gpt-3.5-turbo, gpt-4, gpt-4-turbo |
| --input-file | Input CSV file | NormBank.csv | Any valid CSV file |
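
The flags and defaults listed above can be mirrored in a small `argparse` sketch. The flag names, defaults, and choices come from this README; the `build_parser` function and the help strings are assumptions, and the actual parser in generate_conversations.py may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command-line flags."""
    models = ["gpt-3.5-turbo", "gpt-4", "gpt-4-turbo"]
    parser = argparse.ArgumentParser(
        description="Persuasion conversation generator (illustrative sketch)")
    parser.add_argument("--api-key", required=True, help="OpenAI API key")
    parser.add_argument("--format", default="general",
                        choices=["general", "email", "twitter"],
                        help="Output conversation format")
    parser.add_argument("--start", type=int, default=20,
                        help="Starting data index")
    parser.add_argument("--end", type=int, default=25,
                        help="Ending data index")
    parser.add_argument("--llm", default="gpt-3.5-turbo", choices=models,
                        help="Model for content generation")
    parser.add_argument("--judge_llm", default="gpt-4", choices=models,
                        help="Model for judgment tasks")
    parser.add_argument("--input-file", default="NormBank.csv",
                        help="Input CSV file")
    return parser


# argparse converts dashes to underscores: --api-key -> args.api_key,
# --input-file -> args.input_file.
args = build_parser().parse_args(["--api-key", "sk-example"])
```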

Model Configuration

The tool supports separate model configuration for different tasks:

Generation vs. Judgment Models

  • --llm: Used for content generation (conversation responses, refinement)
  • --judge_llm: Used for judgment tasks (topic relevance, agreement detection, quality assessment)

Recommended Configurations

| Use Case | Generation Model | Judgment Model | Benefits |
| --- | --- | --- | --- |
| Cost-effective | gpt-3.5-turbo | gpt-4 | Lower cost, high-quality judgments |
| High Quality | gpt-4 | gpt-4 | Maximum quality for both tasks |
| Speed Optimized | gpt-3.5-turbo | gpt-3.5-turbo | Fastest processing |
| Research Grade | gpt-4-turbo | gpt-4 | Best generation with reliable judgments |

Why Separate Models?

  1. Cost Optimization: Use cheaper models for high-volume generation tasks
  2. Quality Assurance: Use more capable models for critical judgment decisions
  3. Flexibility: Mix and match models based on your specific needs
  4. Research Control: Isolate the impact of different model capabilities
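
One way to picture the split is a small routing function that picks a model per task. This is only a sketch: the task identifiers and the `model_for` function are hypothetical names derived from the task descriptions above, not identifiers from the source code.

```python
# Hypothetical task names, derived from the task lists in this README.
GENERATION_TASKS = {"conversation_response", "refinement"}
JUDGMENT_TASKS = {"topic_relevance", "agreement_detection", "quality_assessment"}


def model_for(task: str, llm: str = "gpt-3.5-turbo",
              judge_llm: str = "gpt-4") -> str:
    """Return the model that should handle the given task."""
    if task in GENERATION_TASKS:
        return llm
    if task in JUDGMENT_TASKS:
        return judge_llm
    raise ValueError(f"unknown task: {task}")
```

Keeping the choice in one place makes it straightforward to vary only the generation model or only the judgment model between experiments.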

Format Types

General Format

  • Style: Direct, conversational dialogue
  • Token Limit: 100 tokens
  • Use Case: Baseline comparisons, general analysis

Email Format

  • Style: Professional email correspondence with headers and signatures
  • Token Limit: 1000 tokens
  • Use Case: Business communication, formal persuasion scenarios

Twitter Format

  • Style: Social media posts with hashtags and emojis
  • Length Limit: 140 characters
  • Use Case: Social media analysis, viral persuasion studies
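
The per-format limits can be collected in one lookup table, as in the sketch below. The limits are the ones stated above; the `FORMAT_LIMITS` and `fits_limit` names are hypothetical, and counting whitespace-separated words is only a rough stand-in for token counting (a real tokenizer would usually report more tokens than words).

```python
# Limits as documented in this README: (value, unit) per format.
FORMAT_LIMITS = {
    "general": (100, "tokens"),
    "email": (1000, "tokens"),
    "twitter": (140, "characters"),
}


def fits_limit(text: str, format_type: str) -> bool:
    """Check a message against the length limit for its format."""
    limit, unit = FORMAT_LIMITS[format_type]
    if unit == "characters":
        return len(text) <= limit
    # Assumption: words approximate tokens; swap in a tokenizer for accuracy.
    return len(text.split()) <= limit
```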

Example Commands

# Generate email conversations using GPT-4 for both generation and judgment
python generate_conversations.py --api-key "your-key" --format email --llm gpt-4 --judge_llm gpt-4 --start 0 --end 10

# Generate Twitter conversations with custom dataset (GPT-3.5 for generation, GPT-4 for judgment)
python generate_conversations.py --api-key "your-key" --format twitter --input-file "custom_data.csv" --llm gpt-3.5-turbo --judge_llm gpt-4 --start 5 --end 20

# Large-scale generation with GPT-4 Turbo for generation and GPT-4 for judgment
python generate_conversations.py --api-key "your-key" --format general --start 0 --end 100 --llm gpt-4-turbo --judge_llm gpt-4

# Cost-effective setup: GPT-3.5 for generation, GPT-4 for judgment (default)
python generate_conversations.py --api-key "your-key" --format general --start 0 --end 50

Multi-Agent Conversations

The framework can be extended to support multi-agent conversations beyond the standard two-agent setup. generate_conversations-multiAgent.py demonstrates this capability with a three-agent configuration featuring two persuaders and one persuadee.

Multi-Agent Setup

The multi-agent version includes:

  • Persuader 1: First persuading agent
  • Persuader 2: Second persuading agent (provides collaborative persuasion)
  • Persuadee: Target agent being persuaded
  • Judge: Evaluates conversation quality and outcomes

Multi-Agent Usage

# Generate multi-agent conversations in general format
python generate_conversations-multiAgent.py --api-key "your-key" --format general --start 0 --end 10

# Multi-agent email conversations
python generate_conversations-multiAgent.py --api-key "your-key" --format email --llm gpt-3.5-turbo --judge_llm gpt-4

# Multi-agent Twitter conversations
python generate_conversations-multiAgent.py --api-key "your-key" --format twitter --start 5 --end 20

Multi-Agent Output Structure

Multi-agent JSON files contain:

{
    "Initial Setting": {
        "Persuader1": "Initial persuader 1 prompt...",
        "Persuadee": "Initial persuadee prompt...", 
        "Persuader2": "Initial persuader 2 prompt...",
        "Judge": "Initial judge prompt..."
    },
    "Conversation History": {
        "0": {
            "Persuader1": "First persuader 1 message...",
            "Persuadee": "First persuadee response...",
            "Persuader2": "First persuader 2 message..."
        }
    },
    "Result": {
        "is_agreement": "Yes/No",
        "winner": "0/1/Neither"
    }
}
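
For analysis it is often convenient to flatten this nested structure into one row per message. The sketch below assumes the key names shown in the example and assumes that speakers within a turn follow the order in which they appear there; `flatten_history` and `SPEAKER_ORDER` are hypothetical names.

```python
from typing import List, Tuple

# Assumption: within a turn, speakers follow the order shown in the example.
SPEAKER_ORDER = ("Persuader1", "Persuadee", "Persuader2")


def flatten_history(record: dict) -> List[Tuple[int, str, str]]:
    """Turn a multi-agent record into (turn, speaker, message) rows."""
    rows = []
    history = record["Conversation History"]
    # Turn keys are strings, so sort numerically to keep "10" after "2".
    for turn_key in sorted(history, key=int):
        turn = history[turn_key]
        for speaker in SPEAKER_ORDER:
            if speaker in turn:
                rows.append((int(turn_key), speaker, turn[speaker]))
    return rows
```

A record loaded with `json.load` can be passed to `flatten_history` directly.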

Multi-Agent File Locations

  • General format: conversations_json_mainFramework_multiAgent/{index}.json
  • Email format: conversations_json_mainFramework_multiAgent_formatControl/{index}_email.json
  • Twitter format: conversations_json_mainFramework_multiAgent_formatControl/{index}.json

Extending the Tool

Multi-Agent Extensions

The two-agent source code can be extended to conversations with more participants. The generate_conversations-multiAgent.py file shows one such extension: a three-agent configuration with two persuaders and one persuadee.

Key aspects of multi-agent extension:

  • Agent Management: Each agent maintains its own conversation history
  • Turn Management: Agents take turns responding to multiple previous messages
  • Collaborative Persuasion: Multiple persuaders can work together
  • Enhanced Judgments: Judge evaluates interactions between multiple agents

Adding New Formats

To add a new conversation format (e.g., 'reddit', 'sms'):

  1. Update argument choices:

    parser.add_argument('--format', choices=['general', 'email', 'twitter', 'reddit'], ...)
  2. Add token limit:

    elif self.format_type == 'reddit':
        return 500  # Adjust as needed
  3. Add style instruction:

    elif format_type == 'reddit':
        return "Write in Reddit comment style with casual tone..."
  4. Add initial settings:

    elif format_type == 'reddit':
        initial_persuader = f"""Reddit-style persuader prompt..."""
        initial_persuadee = f"""Reddit-style persuadee prompt..."""
  5. Add output filename pattern:

    elif format_type == 'reddit':
        return f'conversations_json_mainFramework_formatControl/{index}_reddit.json'

Using Custom Topics

  1. Prepare your CSV file with columns:

    • setting: Context/environment (e.g., "library", "office")
    • behavior: Action being discussed (e.g., "wear shoes", "use phone")
    • label: Ethical stance (0=unethical, 1=neutral, 2=ethical)
  2. Example CSV structure:

    setting,behavior,label
    "workplace","take breaks","1"
    "school","cheat on exams","0"
    "hospital","follow safety protocols","2"
  3. Run with custom file:

    python generate_conversations.py --api-key "your-key" --input-file "my_topics.csv" --format email
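
Before a long run, a custom file can be checked against the column layout described above. The sketch below is a hypothetical helper, not part of the repository; `validate_topics` and its messages are assumptions, and the line numbers it reports assume no field spans multiple lines.

```python
import csv
import io
from typing import List

REQUIRED_COLUMNS = ("setting", "behavior", "label")
VALID_LABELS = {"0", "1", "2"}  # 0=unethical, 1=neutral, 2=ethical


def validate_topics(csv_text: str) -> List[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]
    problems = []
    # The header is line 1, so the first data row is line 2.
    for line_no, row in enumerate(reader, start=2):
        for col in ("setting", "behavior"):
            if not (row[col] or "").strip():
                problems.append(f"line {line_no}: empty {col}")
        if (row["label"] or "").strip() not in VALID_LABELS:
            problems.append(f"line {line_no}: label must be 0, 1, or 2")
    return problems
```

To check a file on disk, read it first, for example `validate_topics(open("my_topics.csv", encoding="utf-8").read())`.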

Output Structure

Generated JSON files contain:

{
    "Initial Setting": {
        "Persuader": "Initial persuader prompt...",
        "Persuadee": "Initial persuadee prompt...", 
        "Judge": "Initial judge prompt..."
    },
    "Conversation History": {
        "0": {
            "Persuader": "First persuader message...",
            "Persuadee": "First persuadee response..."
        }
    },
    "Result": {
        "is_agreement": "Yes/No",
        "winner": "0/1/Neither"
    }
}
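
A folder of generated files can be summarized by reading the "Result" field of each one. The sketch below assumes that "is_agreement" holds the string "Yes" or "No", as the example suggests; `summarize_results` is a hypothetical helper.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_results(folder: str) -> Counter:
    """Count the "is_agreement" values across all JSON files in a folder."""
    counts: Counter = Counter()
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            record = json.load(f)
        counts[record["Result"]["is_agreement"]] += 1
    return counts
```

For example, `summarize_results("conversations_json_mainFramework_general")` would tally the outcomes of the general-format conversations.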

File Locations

  • General format: conversations_json_mainFramework_general/{index}.json
  • Email format: conversations_json_mainFramework_formatControl/{index}_email.json
  • Twitter format: conversations_json_mainFramework_formatControl/{index}_twitter.json
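
The naming scheme above can be expressed as a small helper for locating a conversation file. The folder and file patterns are the ones listed for the two-agent framework; the `output_path` function itself is a hypothetical sketch.

```python
def output_path(format_type: str, index: int) -> str:
    """Return the documented output location for a two-agent conversation."""
    if format_type == "general":
        return f"conversations_json_mainFramework_general/{index}.json"
    if format_type in ("email", "twitter"):
        return (f"conversations_json_mainFramework_formatControl/"
                f"{index}_{format_type}.json")
    raise ValueError(f"unsupported format: {format_type}")
```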

Provided Datasets

The conversations_json_mainFramework_general/ folder contains our pre-generated general format dataset with 200 conversations. These conversations were generated using behaviors from NormBank.csv and represent high-quality persuasive dialogues between AI agents. This dataset serves as a ready-to-use resource for research in computational persuasion and social communication.

Best Practices

  1. API Key Management:

    • Never commit API keys to version control
    • Use environment variables: export OPENAI_API_KEY="your-key"
    • Monitor usage to avoid unexpected charges
  2. Data Processing:

    • Start with small ranges to test setup
    • Use appropriate models for your needs
    • Monitor conversation quality
    • Consider cost vs. quality trade-offs when choosing models
    • For multi-agent conversations, expect longer processing times due to additional interactions
  3. Error Handling:

    • Script includes automatic retry for API errors
    • Off-topic conversations are logged and skipped
    • Check console output for processing status
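
Since the script takes the key through --api-key, the environment-variable practice above can be combined with it by reading OPENAI_API_KEY at launch time. The sketch below is one possible wrapper; `build_command` is a hypothetical helper, and the flags it passes are the ones documented in this README.

```python
import os
import sys
from typing import List, Mapping


def build_command(fmt: str = "general", start: int = 0, end: int = 5,
                  env: Mapping[str, str] = os.environ) -> List[str]:
    """Build the generate_conversations.py command with the key from the environment."""
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return [sys.executable, "generate_conversations.py",
            "--api-key", api_key,
            "--format", fmt,
            "--start", str(start),
            "--end", str(end)]


# To launch the run: subprocess.run(build_command(), check=True)
```

Starting with the small default range (0 to 5) also follows the advice to test the setup before a large run.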

Troubleshooting

| Issue | Symptoms | Solutions |
| --- | --- | --- |
| API Key Error | Authentication failed, 401 errors | Verify API key is correct and active |
| File Not Found | NormBank.csv not found | Ensure file exists or use --input-file |
| Rate Limiting | 429 errors, slow processing | Script includes automatic retry |
| Poor Quality | Off-topic conversations | Try GPT-4, adjust prompts |
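
For readers writing their own API calls around the tool, a common way to handle 429 errors is retrying with exponential backoff. The sketch below illustrates that general technique only; it is not the script's retry logic, and `with_retries`, `attempts`, and `base_delay` are hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T], attempts: int = 5,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run call(), retrying on failure with delays of 1x, 2x, 4x, ... base_delay."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            # In real code, narrow this to the client library's rate-limit error.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.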

Documentation

  • Comprehensive Guide: Run python usage_guide.py
  • Example Commands: Run python run_examples.py --api-key "your-key"
  • Help: Run python generate_conversations.py --help

License

This project is for research and educational purposes. Please ensure compliance with OpenAI's usage policies when using their API.

Citation

If you use this code or dataset in your research, please cite our paper:

@inproceedings{ma2025communication,
  title={Communication Makes Perfect: Persuasion Dataset Construction via Multi-LLM Communication},
  author={Ma, Weicheng and Zhang, Hefan and Yang, Ivory and Ji, Shiyu and Chen, Joice and Hashemi, Farnoosh and Mohole, Shubham and Gearey, Ethan and Macy, Michael and Hassanpour, Saeed and Vosoughi, Soroush},
  booktitle={Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)},
  pages={4017--4045},
  year={2025}
}

Contributing

Contributions are welcome! Please refer to the extension guides above for adding new formats or features.
