This repository contains a server and client implementation for processing audio files. The server receives an audio file from the client, transcribes it to text, processes the text with a zero-shot classification model, converts the processed text back to speech, and sends the audio response back to the client. The service can be useful as a medical assistant. It is named Baymax, inspired by the movie Big Hero 6.
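The server-side flow described above can be pictured as three stages chained together. The sketch below is only an illustration of that shape, not the code in server.py: the `Pipeline` class and its field names are hypothetical, and each stage is passed in as a plain callable so that any speech-to-text, zero-shot classification, or text-to-speech backend could be plugged in.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Hypothetical container for the three server stages (names are assumptions)."""
    transcribe: Callable[[bytes], str]   # audio bytes -> transcribed text
    process: Callable[[str], str]        # text -> processed text (e.g. zero-shot step)
    synthesize: Callable[[str], bytes]   # processed text -> audio bytes

    def handle(self, audio: bytes) -> bytes:
        # Run the stages in the order the README describes.
        text = self.transcribe(audio)
        processed = self.process(text)
        return self.synthesize(processed)


# Stand-in stages so the sketch runs without any model installed.
demo = Pipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    process=lambda text: text.upper(),
    synthesize=lambda text: text.encode("utf-8"),
)
reply = demo.handle(b"i have a headache")
```

Keeping the stages as injected callables makes it possible to swap a model or to test the flow with stubs, as the `demo` object does here.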
- Python 3.10 or higher
- pip (Python package installer)
- Clone the repository:
git clone https://github.com/Skriller18/Baymax.git
- Install the required packages:
pip install -r requirements.txt
- Install the required models and packages for the zero-shot classification model and the speech synthesis model:
- Navigate to the root directory of the repository.
- Start the server:
python server.py
- Start the client to record the audio query you would like to ask Baymax:
python client.py

The client will record audio, send it to the server, receive the processed audio, and play it.
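The README does not say how audio travels between client and server. One common way to send a variable-length audio file over a TCP socket is length-prefixed framing, sketched below under that assumption. The helper names `send_blob`, `recv_exact`, and `recv_blob` are hypothetical and are not taken from client.py or server.py.

```python
import socket
import struct

# 4-byte big-endian unsigned integer holding the payload length (an assumed format).
HEADER = struct.Struct("!I")


def send_blob(sock: socket.socket, payload: bytes) -> None:
    """Send the payload preceded by its length."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv may return fewer."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_blob(sock: socket.socket) -> bytes:
    """Read one length-prefixed payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


# Local round trip over a connected socket pair, standing in for client and server.
client_end, server_end = socket.socketpair()
send_blob(client_end, b"fake audio bytes")
received = recv_blob(server_end)
client_end.close()
server_end.close()
```

The length prefix lets the receiver know when a file is complete without closing the connection, so the same socket can carry the query in one direction and the spoken reply in the other.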
- The server reads instructions from instructions.txt.
- The client and server IP and port can be configured in client.py and server.py.
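As an illustration of the last note, the sketch below shows one way a script might hold its IP and port in a single place, with optional overrides from the environment. This is an assumption about a possible layout, not a description of client.py or server.py: the function `load_endpoint`, the variables `BAYMAX_HOST` and `BAYMAX_PORT`, and the default values are all hypothetical.

```python
import os


def load_endpoint(default_host: str = "127.0.0.1", default_port: int = 5000):
    """Return (host, port), preferring environment overrides when present."""
    host = os.environ.get("BAYMAX_HOST", default_host)    # hypothetical variable name
    port = int(os.environ.get("BAYMAX_PORT", default_port))  # hypothetical variable name
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return host, port


host, port = load_endpoint()
```

Whatever mechanism is used, the client and server must agree on the same IP and port for the connection to succeed.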