Open
68 commits
9aeabd0
UPD: v1.0.1 -> v2.0.0
davidmeijer Sep 3, 2025
477a51b
WIP: updating docker installation
davidmeijer Sep 4, 2025
801ba0f
Merge branch 'webapp_dev' into webapp_dev_docker
davidmeijer Sep 4, 2025
38565eb
FIX: make sure new data files are included in build
davidmeijer Sep 4, 2025
14d648b
FIX: make sure new data files are included in build
davidmeijer Sep 4, 2025
b91dff0
ENH: don't push .DS_Store
davidmeijer Sep 4, 2025
2c10f7e
UPD: add psutils
davidmeijer Sep 4, 2025
5396836
ADD: logging
davidmeijer Sep 4, 2025
4acf74b
ENH: add ModelSpec + MultiModelLoader for lazy mmp'd model loading fo…
davidmeijer Sep 4, 2025
f621ec3
ADD: health check
davidmeijer Sep 4, 2025
b23bd36
ENH: update deployment with more resource controls and health check
davidmeijer Sep 4, 2025
cd78bae
FIX: use model loader everywhere; prevent fail on too many open files
davidmeijer Sep 4, 2025
1cfd21f
REF: not able to retrieve annotation submissions
davidmeijer Sep 4, 2025
72e0039
Make NRPSTransformer parser more lenient
BTheDragonMaster Sep 5, 2025
a82321a
Add code to pull substrate mapping from database
BTheDragonMaster Sep 5, 2025
6102556
Add parsers for pca data
BTheDragonMaster Sep 5, 2025
c85126e
Add random forest trainer script
BTheDragonMaster Sep 5, 2025
8589cd9
WIP: submitting annotations to github
davidmeijer Sep 5, 2025
82a6de5
WIP: submitting annotations to github
davidmeijer Sep 5, 2025
ceee04f
Add command line interface for PARAS
BTheDragonMaster Sep 5, 2025
03aaa4c
UPD: make sure no env files are being pushed
davidmeijer Sep 5, 2025
a08aed4
ENH: adding captcha
davidmeijer Sep 5, 2025
cf2a206
STY: add padding to titles in annotation submission form, make corner…
davidmeijer Sep 5, 2025
2eeb598
ENH: go back to data annotation start upon submission
davidmeijer Sep 5, 2025
06c70f6
Add PARASECT cli
BTheDragonMaster Sep 5, 2025
d003bf0
REF: remove unused function
davidmeijer Sep 5, 2025
7924227
FIX: switching back from NCBI input method resets input type accordingly
davidmeijer Sep 5, 2025
f006f87
STY: add check mark when domain is annotated
davidmeijer Sep 5, 2025
27910c4
ENH: open PR url in separate tab upon data submission
davidmeijer Sep 5, 2025
bcc6738
ENH: add and valdiate ORCID and published articles
davidmeijer Sep 5, 2025
e49c81c
ENH: forward orcid and references into github submission
davidmeijer Sep 5, 2025
ceed11a
FIX: make sure prUrl is mutable
davidmeijer Sep 5, 2025
cffe8ed
FIX: read out correct value for pr_url
davidmeijer Sep 5, 2025
5ab62d5
UPD: return json data in markdown format
davidmeijer Sep 5, 2025
f735586
ADD: button to data annotation page on home page
davidmeijer Sep 6, 2025
13176d4
STY: update capitalization to match other items in menu
davidmeijer Sep 6, 2025
9fbf7c8
UPD: make ORCID optional and at least one reference mandatory
davidmeijer Sep 6, 2025
b4e540c
Fix substrate
BTheDragonMaster Sep 8, 2025
3b8b6b9
Avoid duplicate substrates
BTheDragonMaster Sep 8, 2025
e714d56
Update database with substrate corrections
BTheDragonMaster Sep 8, 2025
7160687
Add code to pull domain corrections from github
BTheDragonMaster Sep 8, 2025
78e6546
Cleanup: remove TODO
BTheDragonMaster Sep 8, 2025
575ca51
Add code to automate database substrate corrections from github issues
BTheDragonMaster Sep 8, 2025
21ddd7f
close #59; Make substrate correction
BTheDragonMaster Sep 9, 2025
92c13e9
Add train test splitting on taxonomic rank
BTheDragonMaster Sep 14, 2025
1640795
Add taxonomy to database
BTheDragonMaster Sep 14, 2025
4b9e444
Add scripts for handing taxonomy
BTheDragonMaster Sep 14, 2025
7db57ce
Add option to train only on bacterial or fungal domains to train test…
BTheDragonMaster Sep 15, 2025
cdf5ebd
Finalise train test splitters
BTheDragonMaster Sep 15, 2025
9ec9f2f
Add code for training PARAS and PARASECT
BTheDragonMaster Sep 17, 2025
d7d8c7f
Add code for confusion matrix plotting
BTheDragonMaster Sep 17, 2025
dc3d7a0
Add dependencies for plotting
BTheDragonMaster Sep 17, 2025
d998c44
Update code for training and testing paras and parasect
BTheDragonMaster Sep 22, 2025
d9297e8
Write fingerprints to file when training random forest
BTheDragonMaster Sep 22, 2025
81a4d0b
Update fingerprints
BTheDragonMaster Sep 22, 2025
58ce77a
Update models
BTheDragonMaster Sep 22, 2025
c2ea125
Update setup.cfg
BTheDragonMaster Sep 22, 2025
b1b9396
Add init to model training module
BTheDragonMaster Sep 22, 2025
f6a36b8
Bugfix: Add hmm binaries to parasect package data
BTheDragonMaster Sep 22, 2025
0448568
Add bacterial PARASECT model
BTheDragonMaster Sep 29, 2025
7fc3979
Merge branch 'feature_webapp_esm' into webapp_dev_docker
davidmeijer Sep 30, 2025
d5caf9a
WIP: adding parasect bacterial mode to frontend
davidmeijer Sep 30, 2025
b4f9262
Merge branch 'feature_webapp_esm' into webapp_dev_docker
davidmeijer Sep 30, 2025
eda9d32
ENH: parasect bacterial mode functional
davidmeijer Sep 30, 2025
78b5380
STY: fixed typos
davidmeijer Sep 30, 2025
75aa240
WIP: add navigation query database page
davidmeijer Oct 6, 2025
a59f8c7
ENH: database querying
davidmeijer Oct 6, 2025
459ebda
FIX: corrected query for retrieving entries with signature
davidmeijer Oct 6, 2025
6 changes: 5 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -122,6 +122,7 @@ celerybeat.pid

# Environments
.env
.env.*
.venv
env/
venv/
@@ -162,4 +163,7 @@ cython_debug/

# do not push anythin in the condensed/out folder apart from .gitkeep
condensed/out/*
!condensed/out/.gitkeep
!condensed/out/.gitkeep

# macOS
.DS_Store
24 changes: 17 additions & 7 deletions app/Dockerfile.client
@@ -2,17 +2,27 @@
# Provides proxies / api requests

# Build React frontend
FROM node:16-alpine as build-step
FROM node:20-alpine as build
WORKDIR /app
ENV PATH /app/node_modules/.bin:$PATH

# Provide the site key at build time (public value)
ARG REACT_APP_TURNSTILE_SITE_KEY
ENV REACT_APP_TURNSTILE_SITE_KEY=${REACT_APP_TURNSTILE_SITE_KEY}

# Install dependencies with cache-friendly order
COPY ./src/client/package.json ./src/client/package-lock.json ./
RUN npm ci

# Build
COPY ./src/client/src ./src
COPY ./src/client/public ./public
RUN npm install
RUN npm run build
COPY ./src/client/deployment/nginx.default.conf /etc/nginx/conf.d/default.conf

# Build nginx container
# Nginx runtime
FROM nginx:stable-alpine
COPY --from=build-step ./app/build /usr/share/nginx/html
COPY --from=build-step ./etc/nginx/conf.d/default.conf /etc/nginx/conf.d/default.conf

COPY --from=build /app/build /usr/share/nginx/html
COPY ./src/client/deployment/nginx.default.conf /etc/nginx/conf.d/default.conf

EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
64 changes: 45 additions & 19 deletions app/Dockerfile.server
@@ -1,25 +1,51 @@
# This Dockerfile builds API only
FROM python:3.9
# Use Mambaforge for fast, reliable conda solves (includes mamba)
FROM condaforge/mambaforge:24.9.2-0

WORKDIR /app
COPY ./src/server/ ./
RUN pip install -r ./requirements.txt
ENV FLASK_ENV production
# Make non-interactive shells load conda env nicely
SHELL ["/bin/bash", "-lc"]

EXPOSE 4009
# Create a non-root user for saner file permissions
ARG USERNAME=app
ARG USER_UID=1000
ARG USER_GID=$USER_UID
RUN groupadd --gid $USER_GID $USERNAME \
&& useradd --uid $USER_UID --gid $USER_GID -m $USERNAME

# Install HMMER2
RUN apt-get update
RUN apt-get install -y hmmer2
# Workdir
WORKDIR /app

# Install MUSCLE
RUN mkdir muscle
WORKDIR /app/muscle
RUN wget https://drive5.com/muscle/muscle_src_3.8.1551.tar.gz
RUN tar -xvzf muscle_src_3.8.1551.tar.gz
RUN make
RUN cp muscle /usr/local/bin/muscle
# Copy env first to leverage Docker layer caching
COPY server-environment.yml /app/
COPY server-requirements.txt /app/

WORKDIR /app
# Create Conda env with mamba (faster) and clean caches
RUN mamba env create -n web -f server-environment.yml \
&& conda clean -afy

# Ensure `conda run -n web ...` works for ENTRYPOINT/CMD
RUN echo "conda activate web" >> /etc/bash.bashrc

# Copy the app code and give ownership to the non-root user
COPY --chown=${USER_UID}:${USER_GID} ./src/server/ /app/

# Ensure temp dir exists and is writable
RUN mkdir -p /app/temp && chown -R ${USER_UID}:${USER_GID} /app

# Runtime env: unbuffered logs, mmap for joblib, cap math threads
ENV PYTHONUNBUFFERED=1 \
LOG_LEVEL=INFO \
OMP_NUM_THREADS=1 \
OPENBLAS_NUM_THREADS=1 \
MKL_NUM_THREADS=1 \
NUMEXPR_NUM_THREADS=1 \
JOBLIB_MMAP_MODE=r

# Use the unprivileged user at runtime
USER $USERNAME

# Expose your service port
EXPOSE 4009

CMD ["sh", "-c", "gunicorn -b :4009 --timeout 120 api:app"]
# Run everything inside the conda env without needing manual activation
ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "web"]
CMD ["gunicorn", "-b", ":4009", "--worker-class", "gthread", "--workers", "1", "--threads", "4", "--preload", "--timeout", "120", "api:app"]
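
The Dockerfile comment above mentions memory-mapped model loading, and one commit title refers to lazy loading. The loader code itself is not part of this diff, so the following is only a minimal standard-library sketch of the general idea: the file is mapped read-only on first access rather than at start-up. `LazyMappedFile` and `demo` are hypothetical names and are not the PR's `MultiModelLoader`.

```python
import mmap
import os
import tempfile
from typing import Optional


class LazyMappedFile:
    """Hypothetical helper: memory-map a file read-only on first access."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._map: Optional[mmap.mmap] = None  # nothing is mapped yet

    @property
    def loaded(self) -> bool:
        return self._map is not None

    def data(self) -> mmap.mmap:
        # Map the file only when someone actually asks for its contents.
        if self._map is None:
            with open(self.path, "rb") as handle:
                self._map = mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)
        return self._map

    def close(self) -> None:
        if self._map is not None:
            self._map.close()
            self._map = None


def demo() -> bytes:
    # Write a small stand-in "model" file, then read its first bytes lazily.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"model-bytes")
    lazy = LazyMappedFile(tmp.name)
    assert not lazy.loaded  # construction alone does not touch the file
    head = bytes(lazy.data()[:5])
    lazy.close()
    os.remove(tmp.name)
    return head
```

Read-only mappings let the operating system share pages between threads and drop them under memory pressure, which fits a single-worker, multi-threaded gunicorn setup like the one in the `CMD` line.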
29 changes: 24 additions & 5 deletions app/docker-compose.yml
@@ -7,25 +7,44 @@ services:
dockerfile: Dockerfile.server
image: paras-server
restart: always
env_file:
- .env # Need secrets for submitting newly annotated domains
container_name: paras-server
volumes:
- ./models:/app/models
- ./models:/app/models:ro
environment:
- MODEL_DIR=/app/models
networks:
- paras-network
- LOG_LEVEL=INFO
# Also set in Dockerfile, can override here
- OMP_NUM_THREADS=1
- OPENBLAS_NUM_THREADS=1
- MKL_NUM_THREADS=1
- NUMEXPR_NUM_THREADS=1
- JOBLIB_MMAP_MODE=r
networks: [paras-network]
healthcheck:
# Use base Python; no curl/wget dependency needed
test: [ "CMD", "/opt/conda/bin/python", "-c",
"import urllib.request,sys; urllib.request.urlopen('http://localhost:4009/health', timeout=2); sys.exit(0)" ]
interval: 10s
timeout: 3s
retries: 10
start_period: 5s

paras-client:
build:
context: .
dockerfile: Dockerfile.client
args:
- REACT_APP_TURNSTILE_SITE_KEY=${REACT_APP_TURNSTILE_SITE_KEY}
image: paras-client
ports:
- "4010:80"
restart: always
container_name: paras-client
networks:
- paras-network
networks: [paras-network]
depends_on:
- paras-server

networks:
paras-network:
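
The `healthcheck` entry above probes `http://localhost:4009/health` with `urllib` and relies on `urlopen` raising for non-2xx responses. The server-side handler is not shown in this diff, so the sketch below is an assumption-laden stand-in: a standard-library HTTP server with a `/health` route plus a probe that mirrors the one-liner. `HealthHandler`, `probe`, `run_demo`, and the JSON body are all hypothetical.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the real API's /health route."""

    def do_GET(self) -> None:
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format: str, *args) -> None:
        pass  # keep the demo quiet


def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True when the URL answers with HTTP 200, False otherwise."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError and HTTPError are both OSError subclasses
        return False


def run_demo() -> Tuple[bool, bool]:
    server = HTTPServer(("127.0.0.1", 0), HealthHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        return probe(base + "/health"), probe(base + "/missing")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` probes one existing and one missing route. In the compose file the same effect comes from the interpreter exiting non-zero when `urlopen` raises.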
17 changes: 17 additions & 0 deletions app/server-environment.yml
@@ -0,0 +1,17 @@
name: web
channels:
- conda-forge
- bioconda
dependencies:
# Python runtime
- python=3.9

# Bio tools (from bioconda)
- hmmer # HMMER v3 (hmmsearch, hmmscan, hmmpress, etc.)
- hmmer2 # HMMER v2 (e.g., hmmpfam, hmmsearch from v2 toolset)
- muscle=3.8.1551 # MUSCLE aligner

# Extra PyPI deps
- pip
- pip:
- -r /app/server-requirements.txt
Original file line number Diff line number Diff line change
@@ -6,4 +6,5 @@ flask_sqlalchemy
gunicorn
joblib
apscheduler
git+https://github.com/BTheDragonMaster/parasect@webapp
psutil
git+https://github.com/BTheDragonMaster/parasect@webapp_dev_docker
93 changes: 60 additions & 33 deletions app/src/client/deployment/nginx.default.conf
Original file line number Diff line number Diff line change
@@ -1,37 +1,64 @@
# nginx configuration for Docker.
# nginx configuration for Docker (React SPA + API proxy)
server {
listen 80;
server_name localhost;
root /usr/share/nginx/html;
index index.html;
error_page 500 502 503 504 /50x.html;

location / {
try_files $uri $uri/ /index.html$is_args$args =404; # https://stackoverflow.com/questions/43555282/react-js-application-showing-404-not-found-in-nginx-server
add_header Cache-Control "no-cache";
}
listen 80;
server_name _;

location /static {
expires 1y;
add_header Cache-Control "public";
}
root /usr/share/nginx/html;
index index.html;

# Serve static assets with long cache (immutable)
location /static/ {
try_files $uri =404;
access_log off;
expires 1y;
add_header Cache-Control "public, max-age=31536000, immutable";
}

# API proxy
location /api/ {
proxy_pass http://paras-server:4009; # do not add slash at end!

# pass-thru headers
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;

# websockets / SSE (if you ever use them)
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;

location /api {
proxy_pass http://paras-server:4009;

# CORS headers
add_header Access-Control-Allow-Origin "*";
add_header Access-Control-Allow-Methods "GET, POST, OPTIONS";
add_header Access-Control-Allow-Headers "Content-Type, Authorization";

# Handle preflight requests (for POST requests)
if ($request_method = 'OPTIONS') {
add_header Access-Control-Allow-Origin "*";
add_header Access-Control-Allow-Methods "GET, POST, OPTIONS";
add_header Access-Control-Allow-Headers "Content-Type, Authorization";
add_header Content-Length 0;
add_header Content-Type text/plain;
return 204;
}
# (optional) timeouts for long requests
proxy_read_timeout 300s;
proxy_send_timeout 300s;

# CORS (only if you need cross-origin; if served via same origin, you can remove)
add_header Access-Control-Allow-Origin "*" always;
add_header Access-Control-Allow-Methods "GET, POST, PUT, PATCH, DELETE, OPTIONS" always;
add_header Access-Control-Allow-Headers "Content-Type, Authorization, X-Requested-With" always;

# Preflight
if ($request_method = OPTIONS) {
return 204;
}
}
}

# SPA fallback (client-side routing)
location / {
try_files $uri /index.html;
# Avoid caching index.html so new deploys show up
add_header Cache-Control "no-cache";
}

# Friendly error page
error_page 500 502 503 504 /50x.html;
location = /50x.html {
internal;
}
}

# Globals
map $http_upgrade $connection_upgrade {
default upgrade;
'' close;
}
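
To make the intent of the new `location` blocks easier to follow, here is a rough Python model of the routing they describe: `/api/` is proxied with the URI unchanged (no trailing slash on `proxy_pass`), `/static/` serves a file or returns 404, and everything else falls back to `index.html`. This is a simplified sketch, not nginx's actual matching algorithm; `route` and the `files` set are hypothetical.

```python
from typing import Set, Tuple


def route(path: str, files: Set[str]) -> Tuple[str, str]:
    """Return an (action, target) pair for a request path.

    `files` is the set of paths that exist under the web root.
    """
    if path.startswith("/api/"):
        # proxy_pass without a URI part forwards the original path as-is
        return ("proxy", "http://paras-server:4009" + path)
    if path.startswith("/static/"):
        # try_files $uri =404
        if path in files:
            return ("file", path)
        return ("error", "404")
    # try_files $uri /index.html (SPA fallback for client-side routes)
    if path in files:
        return ("file", path)
    return ("file", "/index.html")
```

For example, `route("/api/health", set())` yields a proxy target on `paras-server:4009`, while an unknown client-side path such as `/results/42` resolves to `/index.html`.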