a university project in online information systems design. it models a microservice-oriented platform for scalable, durable, and consistent content conversion/transcoding. the system applies durable execution (conductor-oss), pub/sub state models (redis streams), message brokerage (nats jetstream), and an api gateway (traefik) for auth checks, request routing, and per-service load balancing. the frontend is built in angular v19.
a very rough-looking schema that will evolve with the project. currently it only serves as development guidance, so i don't steer off into the unknown.
deployment is supported via docker swarm (todo hihi) or docker compose.
prerequisites:
- docker
- docker model runner
- at least 16 GB of working memory (for all 4 services up and running without turning your pc into a ticking time bomb)
installation:
- clone the repo (`git clone https://github.com/komadiina/entask.git`)
- initialize `.env` with all required values (user/password credentials, hosts, ports)
- modify config files in `/core/*` according to the set envvars (still haven't migrated them into a generation script)
- run with `docker compose up` (use `-d` for detached)
- stop with `docker compose down -v --remove-orphans`
- if any `package.json` or `requirements.txt` changes haven't been reflected, pass the `--force-recreate` flag to `docker compose up`
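the full variable set lives in `default.dev.env`; a minimal `.env` sketch is below, using the vars documented further down. all values are placeholders (hypothetical, not working credentials) — substitute your own.

```shell
# hypothetical example .env -- adjust every value to your setup
PGADMIN_EMAIL=admin@example.com
POSTGRES_PASSWORD=changeme
FRONTEND_HOST=http://localhost:4200
CLIENT_SECRET_FILE=./secrets/client_secret.json
GOOGLE_OAUTH_CLIENT_ID=<your-client-id>.apps.googleusercontent.com
GOOGLE_KEYS_URL=https://www.googleapis.com/oauth2/v3/certs
DOCKER_MODEL_RUNNER_LISTEN=model-runner.docker.internal
LLM_SERVICE_SYSTEM_PROMPT="you are a strict, terse extraction assistant"
```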
`text-recognizer` converter requires an instantiated docker model (e.g. `ai/gemma3:latest`, `ai/gemma3-qat:latest`):
- go to docker desktop
- navigate to the Models (BETA) tab
- download any model (`gemma3` should suffice; if low on resources, you can use any other smaller-form quantized model)
- since it runs outside of the `entask` network, requests will need to target the docker internal network (`model-runner.docker.internal`, see `DOCKER_MODEL_RUNNER_LISTEN`)
- (note) to enable GPU inference, see the official docs
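docker model runner exposes an OpenAI-compatible chat-completions API on the internal host above. a minimal sketch of how a service could build such a request — the `/engines/v1/chat/completions` path and the default model tag are assumptions, verify against your model runner version:

```python
import json
import os


def build_chat_request(prompt: str, model: str = "ai/gemma3:latest") -> tuple[str, bytes]:
    """Build the URL and JSON body for an OpenAI-style chat-completions call
    to the docker model runner. Host comes from DOCKER_MODEL_RUNNER_LISTEN."""
    host = os.environ.get("DOCKER_MODEL_RUNNER_LISTEN", "model-runner.docker.internal")
    url = f"http://{host}/engines/v1/chat/completions"  # path is an assumption
    body = {
        "model": model,
        "messages": [
            # system prompt mirrors LLM_SERVICE_SYSTEM_PROMPT
            {"role": "system", "content": os.environ.get("LLM_SERVICE_SYSTEM_PROMPT", "")},
            {"role": "user", "content": prompt},
        ],
    }
    return url, json.dumps(body).encode()


# a service would then POST it, e.g. with httpx:
#   httpx.post(url, content=body, headers={"content-type": "application/json"})
```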
below is a list of some important env-vars. see `default.dev.env` for more info.
| envvar | description |
|---|---|
| `PGADMIN_EMAIL` | use this email with `POSTGRES_PASSWORD` to log into the pgAdmin console |
| `FRONTEND_HOST` | hardcoded, used for client-side redirects (`0.0.0.0` does not work here) |
| `CLIENT_SECRET_FILE` | your Google API client secrets file |
| `GOOGLE_OAUTH_CLIENT_ID` | extracted from the secret file or via the Google Cloud console |
| `GOOGLE_KEYS_URL` | public Google endpoint for fetching public keys (if provider == 'google') |
| `DOCKER_MODEL_RUNNER_LISTEN` | docker model runner host/listen address, used as the local LLM host |
| `LLM_SERVICE_SYSTEM_PROMPT` | characterize your LLM model w/ harsh instructions |
| ... | others are pretty self-explanatory |
take note that some require authenticated URLs (user:pass@host:port):
| docker-service | listen |
|---|---|
| MinIO | minio:9000 |
| NATS | 0.0.0.0:{4222, 6222, 8222} |
| PostgreSQL | postgres:5432 |
| pgBouncer | pgbouncer:6432 |
| Redis | redis:6379 |
| Traefik | 0.0.0.0:{80, 443, 8080} |
| Conductor | conductor-server:{5000, 8080} |
| angular client | frontend:4200 |
| auth-service | auth-service:5201 |
| user-details-service | user-details-service:5202 |
| file-service | file-service:5204 |
| conversion-service | conversion-service:5205 |
| notifier-service | notifier-service:5206 |
| llm-service | llm-service:5207 |
| thumbnailer-converter | thumbnailer-converter:7401 |
| waveformer-converter | waveformer-converter:7402 |
| term-extractor-converter | term-extractor-converter:7403 |
| text-recognizer-converter | text-recognizer-converter:7404 |
| ws-proxy | ws-proxy:9202 |
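for the services that expect authenticated `user:pass@host:port` URLs, a small helper like the one below keeps credentials safe when they contain special characters (the `REDIS_*` envvar names here are hypothetical, used only for illustration):

```python
import os
from urllib.parse import quote


def authed_url(scheme: str, host: str, port: int, user_env: str, pass_env: str) -> str:
    """Compose a user:pass@host:port URL from envvars; credentials are
    percent-encoded so characters like '@' or ':' survive intact."""
    user = quote(os.environ.get(user_env, ""), safe="")
    password = quote(os.environ.get(pass_env, ""), safe="")
    return f"{scheme}://{user}:{password}@{host}:{port}"


# e.g. for redis (envvar names are made up for this sketch):
os.environ.setdefault("REDIS_USER", "default")
os.environ.setdefault("REDIS_PASSWORD", "p@ss")
print(authed_url("redis", "redis", 6379, "REDIS_USER", "REDIS_PASSWORD"))
# redis://default:p%40ss@redis:6379
```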
review TODO for the roadmap, known issues, etc.
all converters use minio and httpx for file-based communication (no ftp yet)
thumbnailer uses:
waveformer uses:
term-extractor uses:
text-recognizer uses:


