AIAA: AI Audio Authenticity

A machine learning system for detecting AI-generated audio using Benford's Law analysis and advanced audio feature extraction. The system employs ensemble learning with adaptive model updating capabilities.

Features

Multi-Model Ensemble: Uses Random Forest, Gradient Boosting, SGD, and Passive Aggressive classifiers
Benford's Law Analysis: Compares leading-digit frequency distributions against Benford's Law to flag AI-generated patterns
Comprehensive Audio Features: Extracts spectral, temporal, and compression-related features
Adaptive Learning: Supports incremental model updates with new data
Batch Processing: Parallel processing for large audio datasets
Spectrogram Generation: Creates and compares various types of spectrograms
Interactive CLI: User-friendly command-line interface

Supported Audio Formats

WAV (.wav)
MP3 (.mp3)
FLAC (.flac)
OGG (.ogg)
M4A (.m4a)
AAC (.aac)
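
As a rough illustration of how the format list above can be applied, the sketch below collects files with a supported extension from a directory tree. The names `SUPPORTED_EXTENSIONS` and `find_audio_files` are hypothetical and are not part of the aiaa package; the package's own file discovery may differ.

```python
from pathlib import Path

# Extensions copied from the list above; the constant name is hypothetical.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".ogg", ".m4a", ".aac"}


def find_audio_files(directory):
    """Return supported audio files under `directory`, sorted by path."""
    root = Path(directory)
    return sorted(
        p for p in root.rglob("*")
        # Compare the suffix case-insensitively so "clip.WAV" is also accepted.
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

The search is recursive, so nested dataset folders are included; use `root.glob("*")` instead if only the top level should be scanned.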

Installation

Option 1: Install from PyPI (Recommended)

pip install aiaa

Option 2: Install from Source

Clone the repository:

git clone https://github.com/ajprice16/AI_Audio_Detection.git
cd AI_Audio_Detection

Install dependencies:

pip install -r requirements.txt

System Dependencies

On Ubuntu/Debian:

sudo apt-get install libsndfile1 ffmpeg

On macOS:

brew install libsndfile ffmpeg

Quick Start

Training Initial Models

Prepare your data: Organize your audio files into two directories:
- human_audio/ - Human-generated audio files
- ai_audio/ - AI-generated audio files
Run the detector:

If installed from PyPI:

aiaa --interactive
# or
aiaa --predict-file path/to/audio.wav

If running from source:

python -m aiaa --interactive
# or
python -m aiaa --predict-file path/to/audio.wav

Choose option 1 to train new models and follow the prompts.

Command Line Usage

Train models:

aiaa --train --human-dir path/to/human/audio --ai-dir path/to/ai/audio

Predict single file:

aiaa --predict-file path/to/audio.wav

Predict batch:

aiaa --predict-batch path/to/audio/directory

Interactive mode:

aiaa --interactive

Predicting Single Files

Interactive mode:

aiaa --interactive
# Choose option 2 and enter the path to your audio file

Direct command:

aiaa --predict-file path/to/audio.wav

Batch Prediction

Interactive mode:

aiaa --interactive
# Choose option 3 and enter the directory path

Direct command:

aiaa --predict-batch path/to/audio/directory

Advanced Usage

Programmatic Usage

from pathlib import Path

import pandas as pd
from aiaa import AIAudioDetector

# Initialize detector
detector = AIAudioDetector(base_dir=Path.cwd())

# Train models
human_features = detector.extract_features_from_directory("human_audio/", is_ai_directory=False)
ai_features = detector.extract_features_from_directory("ai_audio/", is_ai_directory=True)

all_features = human_features + ai_features
df_results = pd.DataFrame(all_features)
training_results = detector.train_models(df_results)

# Make predictions
result = detector.predict_file("test_audio.wav")
print(f"Prediction: {'AI' if result['is_ai'] else 'Human'}")
print(f"Confidence: {result['confidence']:.3f}")

Adaptive Learning

The system supports adaptive learning to improve accuracy with new data:

# Add new AI data
detector.add_ai_data("new_ai_audio/", retrain_batch_models=True)

# Add new human data
detector.add_human_data("new_human_audio/", retrain_batch_models=True)

# Add mixed data batch
directories = [
    {'path': 'dataset1/', 'is_ai': True},
    {'path': 'dataset2/', 'is_ai': False}
]
detector.add_mixed_data_batch(directories, retrain_batch_models=True)

Features Extracted

Benford's Law Features

Chi-square test statistics
Kolmogorov-Smirnov test statistics
Mean absolute deviation from expected distribution
Maximum deviation
Entropy measures
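
To make the statistics above concrete, here is a minimal NumPy sketch that computes them from the leading digits of a set of values. Which audio-derived values the package feeds into this analysis is not stated here, so the input is left generic. The function names are hypothetical, and the KS statistic is computed as the maximum gap between the observed and expected cumulative digit distributions, which is an assumption about the definition used.

```python
import numpy as np

# Benford's expected probability for leading digits 1..9.
BENFORD = np.log10(1.0 + 1.0 / np.arange(1, 10))


def leading_digits(values):
    """Return the first significant digit (1-9) of each nonzero finite value."""
    v = np.abs(np.asarray(values, dtype=float)).ravel()
    v = v[np.isfinite(v) & (v > 0)]
    mantissa = v / 10.0 ** np.floor(np.log10(v))
    # Guard against floating-point rounding at exact powers of ten.
    mantissa = np.where(mantissa < 1.0, mantissa * 10.0, mantissa)
    mantissa = np.where(mantissa >= 10.0, mantissa / 10.0, mantissa)
    return mantissa.astype(int)


def benford_features(values):
    """Compare the observed leading-digit distribution with Benford's Law."""
    digits = leading_digits(values)
    if digits.size == 0:
        raise ValueError("need at least one nonzero finite value")
    counts = np.bincount(digits, minlength=10)[1:10]
    n = counts.sum()
    observed = counts / n
    deviation = np.abs(observed - BENFORD)
    nonzero = observed[observed > 0]
    return {
        "chi_square": float(np.sum((counts - n * BENFORD) ** 2 / (n * BENFORD))),
        "ks_statistic": float(
            np.max(np.abs(np.cumsum(observed) - np.cumsum(BENFORD)))
        ),
        "mean_abs_deviation": float(deviation.mean()),
        "max_deviation": float(deviation.max()),
        "entropy_bits": float(-np.sum(nonzero * np.log2(nonzero))),
    }
```

Values spread evenly across several orders of magnitude on a log scale follow Benford's Law closely and give small deviations, while values drawn uniformly from a single decade do not.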

Spectral Features

Spectral centroid, bandwidth, rolloff
MFCCs (13 coefficients + standard deviations)
Chroma features
Spectral contrast
Zero crossing rate
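
Two of the features above, spectral centroid and zero crossing rate, are simple enough to sketch in plain NumPy. The project lists librosa for audio processing, so this is not its implementation: the sketch works on the whole signal rather than frame by frame, and the function names are hypothetical.

```python
import numpy as np


def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of the whole signal."""
    x = np.asarray(signal, dtype=float)
    magnitudes = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    total = magnitudes.sum()
    return float((freqs * magnitudes).sum() / total) if total > 0 else 0.0


def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(np.asarray(signal, dtype=float))
    return float(np.mean(signs[1:] != signs[:-1]))
```

For a pure tone the centroid sits at the tone's frequency, and the zero crossing rate is about twice that frequency divided by the sample rate.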

Temporal Features

RMS energy (mean and standard deviation)
Tempo estimation
Spectral flatness
Dynamic range
Peak-to-RMS ratio
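
The energy-based features above can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: the frame and hop lengths are illustrative defaults, dynamic range is taken as the dB ratio between the loudest and quietest frame (one of several common definitions), and the function names are hypothetical. Tempo estimation and spectral flatness are omitted.

```python
import numpy as np


def frame_rms(signal, frame_length=2048, hop_length=512):
    """RMS energy of each frame of the signal."""
    x = np.asarray(signal, dtype=float)
    if len(x) < frame_length:
        # Zero-pad short signals so at least one full frame exists.
        x = np.pad(x, (0, frame_length - len(x)))
    starts = range(0, len(x) - frame_length + 1, hop_length)
    return np.array(
        [np.sqrt(np.mean(x[s:s + frame_length] ** 2)) for s in starts]
    )


def temporal_features(signal, eps=1e-10):
    """Return RMS statistics, dynamic range, and peak-to-RMS ratio."""
    x = np.asarray(signal, dtype=float)
    rms = frame_rms(x)
    overall_rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    return {
        "rms_mean": float(rms.mean()),
        "rms_std": float(rms.std()),
        # Assumed definition: loudest frame vs quietest frame, in dB.
        "dynamic_range_db": float(
            20.0 * np.log10((rms.max() + eps) / (rms.min() + eps))
        ),
        "peak_to_rms": float(peak / (overall_rms + eps)),
    }
```

A steady sine wave gives a peak-to-RMS ratio close to the square root of 2 and a dynamic range near 0 dB, which makes a convenient sanity check.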

Compression Features

Estimated bit depth
Clipping detection
DC offset
High frequency content ratio
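
A rough sketch of how these four features could be computed is shown below. All thresholds and definitions here are assumptions rather than the package's actual choices: samples are assumed to be floats in [-1, 1], clipping is measured as the fraction of samples at or above `clip_level`, the high-frequency ratio uses spectral power at or above `hf_cutoff_hz`, and bit depth is guessed by checking whether samples fall on an 8-, 16-, or 24-bit grid.

```python
import numpy as np


def estimate_bit_depth(x, candidates=(8, 16, 24)):
    """Return the smallest candidate bit depth whose grid fits every sample."""
    x = np.asarray(x, dtype=float)
    for bits in candidates:
        scaled = x * 2.0 ** (bits - 1)
        if np.allclose(scaled, np.round(scaled), rtol=0.0, atol=1e-6):
            return bits
    return 32  # no coarse grid fits; treat as full-resolution float audio


def compression_features(signal, sample_rate, hf_cutoff_hz=4000.0,
                         clip_level=0.999):
    """Return bit depth, clipping, DC offset, and high-frequency features."""
    x = np.asarray(signal, dtype=float)
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    total_power = power.sum()
    hf_power = power[freqs >= hf_cutoff_hz].sum()
    return {
        "estimated_bit_depth": estimate_bit_depth(x),
        "clipping_ratio": float(np.mean(np.abs(x) >= clip_level)),
        "dc_offset": float(x.mean()),
        "hf_content_ratio": float(hf_power / total_power)
        if total_power > 0 else 0.0,
    }
```

The bit-depth check only detects audio that was quantized and then loaded without further processing; resampling or lossy decoding will usually push the estimate to the fallback value.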

Model Architecture

The system uses an ensemble of four different models:

Incremental Models (for adaptive learning):
- SGD Classifier with log loss
- Passive Aggressive Classifier
Batch Models (for maximum accuracy):
- Random Forest (200 estimators)
- Gradient Boosting (200 estimators)

All features are standardized using StandardScaler, and final predictions use ensemble averaging.

Configuration

Modify config.yaml to customize:

Model parameters
Feature extraction settings
Processing options
Output directories

Command Line Options

1. Train new models - Initial training from audio directories
2. Predict single file - Analyze one audio file
3. Predict batch - Analyze all files in a directory
4. Update models - Adaptive learning with new data
5. Add AI data - Add new AI samples to existing models
6. Add Human data - Add new human samples to existing models
7. Batch directories - Add multiple directories at once
8. Training history - View model training history
9. Data balance - Check AI vs Human data balance
10. Create visualizations - Generate analysis plots
11. Generate spectrograms - Create spectrograms for audio files
12. Spectrogram comparison - Compare AI vs Human spectrograms

Output Files

models/aiaa.joblib - Trained models and metadata
training_results.csv - Detailed training data and features
ai_detection_analysis.png - Visualization plots
spectrograms/ - Generated spectrogram images
spectrogram_comparisons/ - Side-by-side comparisons

Performance Considerations

Multiprocessing: Automatically used for batches of more than 3 files
Memory Management: Spectrograms are generated efficiently with proper cleanup
Scalability: Incremental learning allows handling large datasets over time
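
The batch-size rule above can be sketched with the standard library. This is a hypothetical helper, not the package's code: small batches run serially to avoid process start-up cost, and larger ones are spread across worker processes. The `worker` argument stands in for whatever per-file function is used, and it must be defined at module level so it can be pickled.

```python
from concurrent.futures import ProcessPoolExecutor

# Taken from the text above: more than 3 files triggers multiprocessing.
PARALLEL_THRESHOLD = 3


def process_batch(paths, worker, max_workers=None):
    """Apply `worker` to every path, in parallel for larger batches."""
    paths = list(paths)
    if len(paths) <= PARALLEL_THRESHOLD:
        # Small batch: run in the current process.
        return [worker(path) for path in paths]
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(worker, paths))
```

On platforms that start workers with the spawn method (Windows, and macOS by default), call `process_batch` from under an `if __name__ == "__main__":` guard.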

Requirements

Python 3.7+
librosa (audio processing)
scikit-learn (machine learning)
pandas, numpy (data manipulation)
matplotlib (visualization)
scipy (statistical tests)

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Uses Benford's Law for detecting artificial patterns in audio
Built on librosa for robust audio feature extraction
Employs scikit-learn for machine learning capabilities

Citation

If you use this work in your research, please cite:

@software{aiaa,
  title={AIAA: AI Audio Authenticity - Machine Learning System for Detecting AI-Generated Audio},
  author={Alex Price},
  year={2025},
  url={https://github.com/ajprice16/AI_Audio_Detection}
}
