NOTE: This repo is deprecated. Please refer to https://github.com/laac-LSCP/analysis-service instead.

A Python-based daemon that periodically checks and schedules tasks from Echolalia.
Create a virtual environment somewhere, or use your Conda environment, and install the project:

```bash
pip install git+ssh://git@github.com/LAAC-LSCP/analysis-daemon.git
```

Note that you need an SSH key associated with LAAC. To run the daemon:

```bash
echolalia --config [path to configuration toml file] run-daemon
```
To update the database:

```bash
echolalia --config [path to configuration toml file] run-migrations
```

To manage tasks directly from the CLI, say for bug-fixing purposes, see:

```bash
echolalia --config [path to configuration toml file] task-manager --help
```

An example configuration.toml is given below. You'll need to create a file like this on your own system. Note that file paths should be absolute, not relative.
```toml
log_directory = "/Users/me/Desktop/echolalia_log"
conda_executable = "/Users/me/miniconda3/bin/activate"
output_folder = "/Users/me/echolalia"
script_wrapper = "/Users/me/Desktop/script_wrapper.sh"

[database]
url = "sqlite:///database.db"

[http]
base_url = "ECHOLALIA_REMOTE_SERVER_URL"
client_id = "MY_ID"
client_secret = "SECRET"

[jobs]
handler = "slurm"
partition = "echolalia"
use_slurm = true # true by default

[[scripts]]
name = "run vtc"
python_script_path = "/Users/me/Desktop/scripts/run_vtc.py"
bash_script_path = "/Users/me/Desktop/scripts/apply_vtc.sh"
env_name = "pyannote"
model_name = "vtc"

[[scripts]]
name = "run alice"
python_script_path = "/Users/me/Desktop/scripts/run_alice.py"
bash_script_path = "/Users/me/Desktop/scripts/apply_alice.sh"
env_name = "alice"
model_name = "alice"
```
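For reference, here is a minimal sketch of how such a file can be parsed and sanity-checked; `load_config` is a hypothetical helper (not part of the daemon's API) and assumes Python 3.11+ for the stdlib `tomllib`:

```python
# Minimal sketch: parse configuration.toml and check that paths are absolute.
# load_config is a hypothetical helper, not part of the daemon's API.
import tomllib  # stdlib since Python 3.11
from pathlib import Path


def load_config(path: str) -> dict:
    with open(path, "rb") as f:  # tomllib requires a binary file handle
        config = tomllib.load(f)
    # The README requires absolute paths, so fail fast on relative ones.
    for key in ("log_directory", "conda_executable", "output_folder", "script_wrapper"):
        if not Path(config[key]).is_absolute():
            raise ValueError(f"{key} must be an absolute path, got {config[key]!r}")
    return config


config = load_config("/Users/me/configuration.toml")
print([s["name"] for s in config["scripts"]])  # ['run vtc', 'run alice']
```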
It is recommended that you use the scripts from the `scripts` folder in the repo.
While working on this system, we realised we couldn't run a per-file SLURM job, for example running VTC by launching a new job for each file. This was due to memory requirements: while the smaller models could use this pattern, W2V2 presented a problem because it is 1) very large and 2) designed to be run over countless tiny files; as a result, the cost of bootstrapping the model for each job would have been too high.
We have opted for a compromise that involves a few anti-patterns and requires careful reading if you want to add new scripts. Instead of asking SLURM for status updates, the daemon continuously checks a log file created by the running script. Running scripts must therefore take in a log directory. The daemon assumes scripts adhere to a strict interface that looks something like:

```bash
python3 vtc.py --task-id [task id] --bash-script [the .sh script used by the model] --input-folder [input_dir] --dataset [dataset name] --echolalia-folder [folder as in config] -i [file 1] -i [file 2] ...
```

Scripts create status logs (`status.log`) that record each file's failure or success in a specific format:
```
SUCCESS - [some descriptive string] - [absolute file path]
ERROR - [some descriptive string] - [absolute file path] - [stack trace]
```

This output format must be adhered to for the service to work; the service periodically checks these outputs for the running tasks.
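To make the contract concrete, here is a hedged sketch of a wrapper script honouring this interface and log format. The argument handling and the call into the bash script are illustrative only (the real scripts live in the repo's `scripts` folder), and the `status.log` location is inferred from the output examples further down:

```python
# Illustrative wrapper sketch, not one of the repo's actual scripts.
import subprocess
import traceback
from pathlib import Path

import click  # the extra dependency mentioned below


@click.command()
@click.option("--task-id", required=True)
@click.option("--bash-script", required=True)
@click.option("--input-folder", required=True)
@click.option("--dataset", required=True)
@click.option("--echolalia-folder", required=True)
@click.option("-i", "inputs", multiple=True, required=True)
def main(task_id, bash_script, input_folder, dataset, echolalia_folder, inputs):
    # status.log sits next to the outputs, matching the examples below.
    out_dir = Path(echolalia_folder) / "outputs" / dataset / task_id
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(out_dir / "status.log", "a") as log:
        for wav in inputs:
            try:
                # How arguments are passed to the model's bash script is an
                # assumption; a real script would also mirror input_folder's
                # structure inside out_dir (see the examples below).
                subprocess.run(["bash", bash_script, wav, str(out_dir)], check=True)
                log.write(f"SUCCESS - processed - {Path(wav).resolve()}\n")
            except Exception:
                trace = " | ".join(traceback.format_exc().splitlines())
                log.write(f"ERROR - failed - {Path(wav).resolve()} - {trace}\n")
            log.flush()  # the daemon polls this file, so flush after each entry


if __name__ == "__main__":
    main()
```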
Finally, to run any of the models, you must install the associated Conda environments; each model has its own. More information on getting the models to work:

- https://github.com/MarvinLvn/voice-type-classifier/ for VTC
- https://github.com/orasanen/ALICE for ALICE
Note that the Python wrapper scripts rely on some libraries of their own, so some dependencies may be missing from the model environments (typically only `click`). You just need to `pip install` them into your Conda environments, or change the Conda environment files to include them.
Another unfortunate pattern is the need for several nested scripts. The original bash scripts for the models are often clunky to work with. We created wrappers in bash itself, but these are hard to test, so we created Python wrappers for the bash scripts. And because the environment changes according to the model, each Python script must in turn be wrapped in a bash script that prepares the environment, which we call the "script wrapper" (see config).
Scripts write their outputs under the echolalia folder. If `output_folder` (see config) was set to `/echolalia`, then scripts push their outputs to `/echolalia/outputs/dataset_name/task_id/`, as in the examples below. The script API shown above also takes an input folder; this allows the scripts to faithfully reproduce the input folder's structure in their outputs.
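This mapping can be expressed as a small path computation; `output_path_for` below is a hypothetical helper written to match the examples that follow, not part of the daemon:

```python
# Hypothetical helper illustrating how outputs mirror the input structure.
from pathlib import Path


def output_path_for(input_file, input_folder, echolalia_folder, dataset, task_id, suffix):
    relative = Path(input_file).relative_to(input_folder)  # e.g. folder_1/folder_2/recording_3.wav
    return Path(echolalia_folder) / "outputs" / dataset / task_id / relative.with_suffix(suffix)


print(output_path_for(
    "/input_folder/folder_1/folder_2/recording_3.wav",
    "/input_folder", "/echolalia", "loann_2025",
    "601cb879-8f86-4153-8e1a-9a3a3f5c812e", ".rttm",
))
# /echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/folder_1/folder_2/recording_3.rttm
```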
For VTC:
```toml
input_folder = "/input_folder"
echolalia_folder = "/echolalia"
dataset = "loann_2025"
task_id = "601cb879-8f86-4153-8e1a-9a3a3f5c812e"

input_1 = "/input_folder/recording_1.wav"
input_2 = "/input_folder/recording_2.wav"
input_3 = "/input_folder/folder_1/folder_2/recording_3.wav"

# This means:
outputs = [
    "/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/recording_1.rttm",
    "/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/recording_2.rttm",
    "/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/folder_1/folder_2/recording_3.rttm",
    "/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/status.log"
]
```

These outputs are meant to be placed in the `/annotations/vtc/raw` folder in the ChildProject dataset, and then an importation is meant to be run (see the ChildProject docs).
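Copying a finished task into the dataset could look like the following sketch; the dataset path is a placeholder, and nothing in this README says the daemon does this step for you:

```python
# Sketch: copy a finished VTC task's outputs into annotations/vtc/raw,
# preserving subfolders and skipping status.log. Paths are placeholders.
import shutil
from pathlib import Path

task_dir = Path("/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e")
raw_dir = Path("/path/to/dataset/annotations/vtc/raw")  # hypothetical dataset location

for rttm in task_dir.rglob("*.rttm"):
    dest = raw_dir / rttm.relative_to(task_dir)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(rttm, dest)
```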
For ALICE:
```toml
input_folder = "/input_folder"
echolalia_folder = "/echolalia"
dataset = "loann_2025"
task_id = "601cb879-8f86-4153-8e1a-9a3a3f5c812e"

input_1 = "/input_folder/recording_1.wav"
input_2 = "/input_folder/recording_2.wav"
input_3 = "/input_folder/folder_1/folder_2/recording_3.wav"

# This means:
outputs = [
    "/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/recording_1.txt",
    "/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/recording_1_sum.txt",
    "/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/recording_2.txt",
    "/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/recording_2_sum.txt",
    "/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/folder_1/folder_2/recording_3.txt",
    "/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/folder_1/folder_2/recording_3_sum.txt",
    "/echolalia/outputs/loann_2025/601cb879-8f86-4153-8e1a-9a3a3f5c812e/status.log"
]
```

These outputs are meant to be placed in the `/annotations/alice/output/raw` folder in the ChildProject dataset, and then an importation is meant to be run (see the ChildProject docs).
The "sum" files must likewise be thrown into /annotations/alice/output/extra.