Extract config dataclasses from openlrc.py and defer heavy imports
Problem
TranscriptionConfig and TranslationConfig are plain @dataclass classes that only depend on stdlib (dataclasses, pathlib.Path). But they live in openlrc/openlrc.py, so importing them triggers the entire module-level import chain:
Downstream projects that embed these dataclasses (e.g. via composition for their own config types) are forced to install and load ~3GB+ of dependencies just to use two stdlib-only classes. This also prevents packaging tools (Nuitka, PyInstaller) from producing lean binaries.
Proposed Changes
1. Move config dataclasses to a lightweight module
Move TranscriptionConfig and TranslationConfig to openlrc/config.py (or the existing openlrc/models.py which already only depends on stdlib). Re-export from __init__.py:
    # openlrc/__init__.py
    from openlrc.config import TranscriptionConfig, TranslationConfig  # stdlib only
    from openlrc.openlrc import LRCer  # heavy
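Once the heavy imports in openlrc.py are also deferred (step 2), from openlrc import TranscriptionConfig no longer loads any heavy dependency and becomes effectively instant. The change is non-breaking: all existing import paths still work, provided openlrc/openlrc.py imports the two classes from the new module so that from openlrc.openlrc import TranscriptionConfig keeps resolving.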
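A minimal sketch of what a stdlib-only config module and a downstream composition might look like. The field names and defaults below are hypothetical placeholders, not the actual openlrc fields, and `PipelineConfig` is an invented downstream type used only for illustration.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


# Sketch of openlrc/config.py: nothing here imports beyond the stdlib.
@dataclass
class TranscriptionConfig:
    whisper_model: str = "large-v3"  # hypothetical field and default
    device: str = "cuda"             # hypothetical field and default


@dataclass
class TranslationConfig:
    chatbot_model: str = "some-model"  # hypothetical field and default
    glossary: Optional[Path] = None    # hypothetical field


# Hypothetical downstream project embedding the configs via composition.
@dataclass
class PipelineConfig:
    transcription: TranscriptionConfig = field(default_factory=TranscriptionConfig)
    translation: TranslationConfig = field(default_factory=TranslationConfig)
    output_dir: Path = Path("out")


cfg = PipelineConfig(translation=TranslationConfig(glossary=Path("terms.yaml")))
```

Because the module depends only on `dataclasses`, `pathlib` and `typing`, a downstream project could construct and serialize such a config without torch or faster-whisper being installed.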
2. Defer heavy imports in openlrc.py to method level
Even from openlrc import LRCer (without calling any method) currently loads torch, spacy, faster-whisper, etc. These can be deferred to the methods that use them:
| Import | Used in | Pulls in |
|---|---|---|
| openlrc.preprocess.Preprocessor | pre_process() | torch, deepfilternet |
| openlrc.transcribe.Transcriber | transcriber property (already lazy-init) | faster_whisper, ctranslate2 |
| openlrc.translate.LLMTranslator | _translate() | openai, anthropic, google-genai |
| faster_whisper.transcribe.Segment | to_json() type hint | faster_whisper |
Lightweight internal imports (openlrc.context, openlrc.defaults, openlrc.logger, openlrc.models, openlrc.opt, openlrc.subtitle) can stay at module level.
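The deferral pattern described above could look roughly like the sketch below. To keep it runnable without the real dependencies, it writes a tiny stand-in module (`_heavy_standin`, an invented name) to a temporary directory and treats it as the "heavy" import. `LRCerSketch` is likewise hypothetical and only mirrors the method names from the table.

```python
import sys
import tempfile
import textwrap
from pathlib import Path
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; never executed at runtime,
    # so the type-hint import costs nothing when the module loads.
    from _heavy_standin import Segment

# Create a stand-in for a heavy dependency (assumption: illustration only).
_tmp_dir = tempfile.mkdtemp()
Path(_tmp_dir, "_heavy_standin.py").write_text(textwrap.dedent("""
    class Segment:
        pass

    class Preprocessor:
        def run(self, paths):
            return [p + ".enhanced" for p in paths]
"""))
sys.path.insert(0, _tmp_dir)


class LRCerSketch:
    def pre_process(self, paths):
        # Deferred import: the heavy module loads on first call, not at class import.
        from _heavy_standin import Preprocessor
        return Preprocessor().run(paths)

    @staticmethod
    def to_json(segments: "list[Segment]") -> dict:
        # The string annotation avoids needing Segment at runtime.
        return {"segments": len(segments)}
```

Importing the module that defines `LRCerSketch` leaves `_heavy_standin` out of `sys.modules`; it only appears there after `pre_process()` has run once. Subsequent calls pay just a dictionary lookup, since Python caches imported modules.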
Test compatibility: Tests in test_preprocess.py patch module-level names like @patch("openlrc.preprocess.enhance"). After deferral these would need to target the source module (e.g. @patch("df.enhance.enhance")). I'm aware this was the concern in #87 — happy to include test updates in the same PR.
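The patch-target change can be illustrated with a stdlib stand-in. In this sketch `json.dumps` plays the role of the deferred dependency and `render` is a hypothetical function; the point is only that a name imported inside the function body is looked up on the source module at call time, so that is where the patch has to be applied.

```python
from unittest.mock import patch


def render(data):
    # Deferred import: `dumps` is fetched from the json module on every call,
    # so there is no module-level name in this file for a test to patch.
    from json import dumps
    return dumps(data)


# Patch the source module, analogous to targeting "df.enhance.enhance".
with patch("json.dumps", return_value="patched"):
    patched_result = render({"a": 1})

# Outside the context manager the real function is back in place.
real_result = render({"a": 1})
```

Here `patched_result` is the mock's return value while `real_result` is the genuine JSON string, which matches the behaviour the updated tests would rely on.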
Benefits
from openlrc import TranscriptionConfig loads stdlib only
from openlrc import LRCer no longer triggers torch/spacy/whisper at import time
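One way to keep these properties from regressing is a small check that runs an import in a fresh interpreter and reports which watched modules ended up loaded. The helper below is a sketch; `modules_loaded_by` is an invented name, and the demonstration uses stdlib modules rather than openlrc itself.

```python
import json
import subprocess
import sys


def modules_loaded_by(statement, watched):
    """Run `statement` in a fresh interpreter; return the watched modules it loaded."""
    probe = (
        "import sys, json\n"
        f"{statement}\n"
        f"print(json.dumps(sorted(set({list(watched)!r}) & set(sys.modules))))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        check=True,
        capture_output=True,
        text=True,
    )
    # Use the last line in case the import itself printed something.
    return json.loads(result.stdout.strip().splitlines()[-1])


# Stdlib demonstration: importing json does not pull in colorsys.
unrelated = modules_loaded_by("import json", ["colorsys"])
direct = modules_loaded_by("import colorsys", ["colorsys"])
```

Applied to this proposal, a test might assert that `modules_loaded_by("from openlrc import TranscriptionConfig", ["torch", "spacy", "faster_whisper"])` returns an empty list once both changes are in place.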
Related
preprocess.py (deferred due to test patch concerns)