
@KirillKukharev

This PR adds a transcribe_bytes() method to the GigaAMASR class that transcribes audio directly from in-memory bytes, without any file I/O.

Why is this important?

Performance and latency:
- No file I/O: no temporary files are created
- No overhead from spawning an ffmpeg subprocess
- PCM16 bytes are converted to tensors directly via torch.frombuffer()

Technical details

transcribe_bytes() uses load_audio_from_bytes() instead of load_audio(), converting the incoming bytes directly; a sketch of how that conversion could look is shown below.
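For illustration only, a minimal sketch of the byte-to-tensor path; the actual load_audio_from_bytes() added in this PR may differ in signature and details:

import torch

def load_audio_from_bytes(audio_bytes: bytes) -> torch.Tensor:
    # Interpret the raw PCM16 samples in place (native byte order); no temp file, no ffmpeg.
    pcm = torch.frombuffer(bytearray(audio_bytes), dtype=torch.int16)
    # Scale int16 values to float32 in [-1.0, 1.0], the range ASR front-ends typically expect.
    return pcm.to(torch.float32) / 32768.0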
Usage example:
Before this PR (transcription goes through a file on disk):

result = model.transcribe("audio.wav")

After this PR (no file is created):

audio_bytes = receive_audio_from_network()  # placeholder: yields raw PCM16 mono audio at 16 kHz
result = model.transcribe_bytes(audio_bytes)
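A self-contained way to try this, assuming a mono 16-bit 16 kHz WAV file and an already loaded GigaAMASR model (the file name below is just a placeholder):

import wave

with wave.open("audio_16k_mono.wav", "rb") as wav:
    # readframes() returns the raw PCM16 payload without the WAV header.
    audio_bytes = wav.readframes(wav.getnframes())

result = model.transcribe_bytes(audio_bytes)
print(result)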
