voiceprint-api

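A voiceprint recognition API service based on 3D-Speaker

Project Overview

This project is an HTTP voiceprint recognition service built with FastAPI. It uses the 3D-Speaker model to register and identify speakers, and it provides complete API documentation.

It is currently used by the xiaozhi-esp32-server project to identify who is speaking to a Xiaozhi device.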

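Main Features

Voiceprint registration
- Input: a speaker ID and a WAV audio file
- Output: whether registration succeeded
Voiceprint identification
- Input: a comma-separated list of candidate speaker IDs and a WAV audio file
- Output: the ID of the recognized speaker (empty if no speaker is recognized)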
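The identification contract above (a comma-separated candidate list in, one speaker ID or an empty result out) can be sketched as follows. This is a minimal illustration, not the service's actual code: the README does not say how embeddings are compared, so cosine similarity, the `identify` function, the `enrolled` dictionary, and the `threshold` value of 0.5 are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector matches nothing
    return dot / (norm_a * norm_b)


def identify(candidate_ids, enrolled, probe, threshold=0.5):
    """Return the best-matching candidate ID, or "" if none reaches the threshold.

    candidate_ids: comma-separated speaker IDs, as in the API input.
    enrolled: hypothetical dict mapping speaker ID -> stored embedding.
    probe: embedding extracted from the submitted WAV file.
    threshold: assumed cut-off; a real deployment would tune this value.
    """
    best_id, best_score = "", threshold
    for speaker_id in (s.strip() for s in candidate_ids.split(",")):
        vector = enrolled.get(speaker_id)
        if vector is None:
            continue  # skip IDs that were never registered
        score = cosine_similarity(vector, probe)
        if score >= best_score:
            best_id, best_score = speaker_id, score
    return best_id
```

Restricting the search to the caller's candidate list keeps the comparison small and prevents a match against speakers who are not relevant to the requesting device.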

Tech Stack

FastAPI: web framework
3D-Speaker: voiceprint recognition model
MySQL: data storage

Installation

Clone the project

git clone https://github.com/xinnan-tech/voiceprint-api.git
cd voiceprint-api

Install dependencies

conda remove -n voiceprint-api --all -y
conda create -n voiceprint-api python=3.11 -y
conda activate voiceprint-api

pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
pip install -r requirements.txt

Configure the database

Create the database

CREATE DATABASE voiceprint_db CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Create the table

CREATE TABLE voiceprints (
    id INT AUTO_INCREMENT PRIMARY KEY,
    speaker_id VARCHAR(50) UNIQUE,
    feature_vector LONGBLOB NOT NULL,
    INDEX idx_speaker_id (speaker_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
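Because `feature_vector` is a LONGBLOB column, an embedding has to be turned into bytes before it is stored and turned back into numbers when it is read. The sketch below shows one way to do that with the standard library only. The README does not describe the service's actual storage format, so the 32-bit float encoding and the helper names `vector_to_blob` and `blob_to_vector` are assumptions for illustration.

```python
from array import array


def vector_to_blob(vector):
    """Pack a sequence of floats into raw 32-bit floats (native byte order)."""
    return array("f", vector).tobytes()


def blob_to_vector(blob):
    """Unpack bytes produced by vector_to_blob back into a list of floats."""
    values = array("f")
    values.frombytes(blob)  # raises ValueError if len(blob) is not a multiple of 4
    return values.tolist()
```

Each value occupies 4 bytes, so the blob length also reveals the embedding dimension. Native byte order is used here, which means the writer and the reader should run on machines with the same endianness.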

Copy voiceprint.yaml to data/.voiceprint.yaml

Start the service

python app.py
