ASVspoof 2019 LightCNN Implementation

This repository contains a PyTorch implementation of LightCNN for voice anti-spoofing detection on the ASVspoof 2019 Logical Access (LA) dataset. The implementation follows the specifications from the original LightCNN paper and is designed for the HSE Voice Anti-spoofing homework assignment.

Task Overview

This project implements a Countermeasure (CM) system for voice anti-spoofing detection using the LightCNN (LCNN) architecture on the Logical Access partition of the ASVspoof 2019 Dataset. The goal is to distinguish between bonafide (genuine) and spoofed audio samples.

Key Requirements

LightCNN Architecture: Implemented according to the original paper specifications
ASVspoof 2019 LA Dataset: Logical Access partition
STFT Frontend: Short-Time Fourier Transform for feature extraction
Cross-Entropy Loss: Standard classification loss function
Dropout Regularization: As specified in the training recipe
EER Evaluation: Equal Error Rate as the primary metric

Architecture

LightCNN Model

The implementation follows the LightCNN architecture from the original paper with the following key components:

MFM Layers: Max-Feature-Map operations for feature selection
Batch Normalization: For training stability
Dropout: 0.75 dropout probability for regularization
STFT Frontend: Short-Time Fourier Transform with optimized parameters

Model Specifications

model:
  input_shape: [1, 863, 600]  # (channels, height, width)
  num_classes: 2               # bonafide vs spoof
  dropout_prob: 0.75          # dropout for regularization

Quick Start

1. Environment Setup

# Clone the repository
git clone https://github.com/Melodiz/LightCNN_ASVspoof2019.git
cd cd LightCNN_ASVspoof2019/

# Create virtual environment
python3 -m venv env
source env/bin/activate  # On Windows: env\Scripts\activate

# Install dependencies
pip install -r requirements.txt

2. Comet.ml Setup

This project uses Comet.ml for experiment tracking and logging. Follow these steps to set up Comet.ml:

Step 1: Create Comet.ml Account

Go to Comet.ml and create a free account
Verify your email address

Step 2: Get Your API Key

Log in to your Comet.ml account
Go to your profile settings (click on your avatar in the top right)
Navigate to "API Keys" section
Copy your API key

Step 3: Set Environment Variables

Set your Comet.ml API key as an environment variable:

On macOS/Linux:

export COMET_API_KEY="your_api_key_here"
export COMET_WORKSPACE="your_workspace_name"
export COMET_PROJECT_NAME="asvspoof-baseline"

On Windows (Command Prompt):

set COMET_API_KEY=your_api_key_here
set COMET_WORKSPACE=your_workspace_name
set COMET_PROJECT_NAME=asvspoof-baseline

On Windows (PowerShell):

$env:COMET_API_KEY="your_api_key_here"
$env:COMET_WORKSPACE="your_workspace_name"
$env:COMET_PROJECT_NAME="asvspoof-baseline"

Alternative: Create a .env file Create a .env file in your project root:

# .env file
COMET_API_KEY=your_api_key_here
COMET_WORKSPACE=your_workspace_name
COMET_PROJECT_NAME=asvspoof-baseline

Then load it in your shell:

source .env

Step 4: Verify Setup

Test your Comet.ml setup by running a quick training session:

python train.py trainer.n_epochs=1  # Quick test with 1 epoch

You should see Comet.ml initialization messages in the console output.

3. Dataset Setup

This project uses the ASVspoof 2019 Logical Access (LA) dataset. The ASVspoof 2019 dataset was created for the third Automatic Speaker Verification Spoofing and Countermeasures Challenge. This repository focuses on the Logical Access (LA) partition, which, for the first time in the challenge's history, includes all three major attack types: text-to-speech (TTS), voice conversion (VC), and replay attacks. The data is derived from the VCTK corpus and is split into training, development, and evaluation sets with no speaker overlap between them. A key feature of the evaluation set is the inclusion of "unknown attacks"—spoofing techniques not present in the training or development data—to rigorously test a model's ability to generalize.

Download the ASVspoof 2019 LA dataset and organize it as follows:

Preferred method (direct download):

curl -o ./LA.zip -# https://datashare.ed.ac.uk/bitstream/handle/10283/3336/LA.zip\?sequence\=3\&isAllowed\=y
unzip LA.zip

Alternative sources:

Original Source: ASVspoof 2019 dataset page
Kaggle Mirror: ASVpoof 2019 Dataset on Kaggle

Expected directory structure:

LA/
├── ASVspoof2019_LA_train/
│   ├── flac/
│   └── LICENSE.txt
├── ASVspoof2019_LA_dev/
│   ├── flac/
│   └── LICENSE.txt
├── ASVspoof2019_LA_eval/
│   ├── flac/
│   └── LICENSE.txt
└── ASVspoof2019_LA_cm_protocols/
    ├── ASVspoof2019.LA.cm.train.trn.txt
    ├── ASVspoof2019.LA.cm.dev.trl.txt
    └── ASVspoof2019.LA.cm.eval.trl.txt

4. Training

# Basic training with default settings
python train.py

# Custom training parameters
python train.py trainer.n_epochs=50 optimizer.lr=0.001

# CPU training
python train.py trainer.device=cpu

5. Evaluation

# Evaluate trained model
python evaluate.py

6. Download Pre-trained Checkpoint

The trained model checkpoint (122MB) is not included in this repository to keep it lightweight. You can download it from Comet.ml:

Method 1: Download from Comet.ml Web Interface

Go to Comet.ml Experiment
Navigate to the "Assets" tab
Find the best_model.pth file in the checkpoints/ folder
Click the download button to save the file
Place the downloaded file in the saved/ directory:
```
mkdir -p saved
mv best_model.pth saved/
```

Method 2: Download via Comet.ml API

# Install comet-ml if not already installed
pip install comet-ml

# Download the checkpoint using Python
python -c "
import comet_ml
api = comet_ml.API()
experiment = api.get_experiment('ivan-novosad', 'asvspoof-baseline', '4rhkga45el9k6gs00b0qs36qvscu475t')
experiment.download_asset('checkpoints/best_model.pth', 'saved/best_model.pth')
"

Method 3: Direct Download Link

If available, you can download directly from the experiment URL:

Visit the experiment page
Go to Assets → checkpoints
Right-click on best_model.pth and "Save link as..."
Save to saved/best_model.pth

Note: The checkpoint file is ~122MB. Make sure you have sufficient disk space and a stable internet connection.

Project Structure

template/
├── train.py                    # Modular training script
├── evaluate.py                 # Modular evaluation script
├── requirements.txt            # Dependencies
├── README.md                  # This file
├── src/
│   ├── configs/
│   │   ├── asvspoof_baseline.yaml  # Main configuration
│   │   ├── model/lightcnn.yaml     # Model configuration
│   │   ├── optimizer/adam.yaml      # Optimizer configuration
│   │   ├── scheduler/exponential.yaml # Scheduler configuration
│   │   └── writer/cometml.yaml      # Writer configuration
│   ├── datasets/
│   │   ├── asvspoof_dataset.py     # Dataset implementations
│   │   ├── data_utils.py           # Data utilities
│   │   ├── dataloader_utils.py     # Training data loading
│   │   └── eval_utils.py           # Evaluation data loading
│   ├── model/
│   │   └── lightcnn_original.py    # LightCNN model implementation
│   ├── trainer/
│   │   ├── asvspoof_trainer.py     # Main trainer class
│   │   └── evaluator.py            # Evaluation logic
│   ├── metrics/
│   │   └── eer_utils.py            # EER calculation functions
│   ├── utils/
│   │   ├── init_utils.py           # Initialization utilities
│   │   ├── model_utils.py          # Model loading utilities
│   │   └── results_utils.py        # Results management
│   └── logger/
│       └── cometml.py              # CometML writer
└── LA/                          # Dataset directory (not in repo)

Configuration

Main Configuration (`src/configs/asvspoof_baseline.yaml`)

# Training parameters
epochs: 16
batch_size: 16
learning_rate: 0.0005

# Model parameters
model:
  input_shape: [1, 863, 600]
  num_classes: 2
  dropout_prob: 0.75

# Optimizer
optimizer:
  lr: 0.0005

# Scheduler
scheduler:
  gamma: 0.98

Key Hyperparameters

Learning Rate: 0.0005 (Adam optimizer)
Batch Size: 16
Epochs: 16 (configurable)
Dropout: 0.75
Scheduler: ExponentialLR with gamma=0.98

Implementation Details

Data Processing

STFT Parameters: n_fft=1724, hop_length=130, win_length=1724
Feature Shape: [1, 863, 600] (channels, height, width)
Augmentation: SpecAugment with frequency and time masking
Class Balancing: WeightedRandomSampler for balanced training

Training Features

Multi-GPU Support: Current training/evaluation pipeline is adapted for multi-GPU computations (parallel)
Class Balancing: WeightedRandomSampler for balanced training
SpecAugment: Data augmentation for spectrograms
- Applying uniform noise to 50% of objects
- Frequency Masking
- Time Masking
Exponential LR Scheduler: Learning rate scheduling
Comet.ml Integration: Experiment tracking and logging

Training and Evaluation

Training Process

Data Loading: ASVspoof 2019 LA dataset with class balancing
Feature Extraction: STFT with optimized parameters
Model Training: LightCNN with Cross-Entropy loss
Evaluation: EER calculation on evaluation set
Logging: Comet.ml integration for experiment tracking

Evaluation Process

Model Loading: Load trained checkpoint
Score Generation: Generate scores for evaluation set
EER Calculation: Compute Equal Error Rate
Results Saving: Save predictions to CSV

Results and Logging

Comet.ml Integration

The project uses Comet.ml for experiment tracking:

Real-time Metrics: Live training and evaluation metrics
Experiment Management: Organized experiment tracking
Checkpoint Logging: Automatic model checkpoint logging
Hyperparameter Tracking: Configuration parameter logging

Live Training Logs: View on Comet.ml

Training Results

The model achieved excellent performance with EER of 2.13% after only 4 epochs of training. This result is close to the authors' reported EER value of 1.86% and significantly exceeds the HSE homework goal of 5.3% EER. The training was stopped early due to satisfactory performance.

Training Dashboards

Below are the training metrics and loss curves from the 4-epoch training run:

Training Loss vs Steps:

Training Metrics (Epoch Loss and EER):

Core Dependencies

torch==2.8.0
torchaudio==2.8.0
torchvision==0.23.0
soundfile==0.13.1
hydra-core==1.3.2
omegaconf==2.3.0
comet_ml==3.50.0
numpy==1.26.4
pandas==2.3.1
tqdm==4.67.1

References and Citations

Academic Papers

ASVspoof 2019: Evaluation Plan
LightCNN Paper: Speech Technology Center

Code and Templates

PyTorch Project Template: GitHub Repository by Petr Grinberg

Citation

If you use this implementation, please cite:

@software{asvspoof_lightcnn_2024,
  title={ASVspoof 2019 LightCNN Implementation},
  author={Novosad, Ivan},
  year={2025},
  url={https://github.com/Melodiz/LightCNN_ASVspoof2019},
  note={Voice anti-spoofing detection using LightCNN on ASVspoof 2019 LA dataset}
}

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
assets		assets
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
evaluate.py		evaluate.py
requirements.txt		requirements.txt
train.py		train.py

Folders and files

Latest commit

History

Repository files navigation