Description
Currently, pruna lists ~41 core dependencies, but once transitive dependencies are resolved, about 120+ packages are installed for users and contributors. This issue is meant to discuss how best to reduce dependency management overhead and improve the contributor experience. The approaches below focus on minimizing transitive deps and maintenance overhead:
- Feature-Scoped Dependency Groups
Restructure the dependencies in pyproject.toml so that only those required for the essential working of the package remain in core, and move all ML frameworks, optimization libraries, and metric libraries to optional groups. Ideally the core dependency count is < 10, but that number can be adjusted based on the features retained in the core package. A good example is optuna-integration, where a large set of optional deps is split off to reduce core dependencies; the same can be done inside this repo without creating separate Python packages.
- Telemetry Based Prioritization
Prioritize based on the optional telemetry data that reports algorithm usage, on usage data from inference providers like Replicate, or on total download data from HF Hub. If an algorithm falls below a certain usage threshold (e.g., <5% of runs over 90 days), it can be marked as deprecated in a minor release and removed after 2-3 minor release cycles.
- Runtime Guards for Feature Modules
Use guarded imports for feature modules. Importing the core package should not trigger loading of an entire framework, and a missing package should produce a clean error message. This can work alongside [1].
- Introduce a Dependency Policy
Consider a policy, or a set of policies, stating the conditions a new dependency must meet before it is added:
• Its functionality cannot reasonably be implemented in-house with custom code.
• It is actively maintained on GitHub.
• It is version-bounded on PyPI.
Note that such dependency policies should be adopted with broad consensus, not as one-off measures to manage dependencies.
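To make the runtime-guard idea concrete, here is a minimal sketch of a guarded import helper. The function name `require_optional` and the extra name passed to it are hypothetical and only for illustration; they are not existing pruna APIs or extras.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency lazily (hypothetical helper).

    If the module is missing, raise an ImportError that names the
    optional group to install. The extra names are assumptions.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature but is not installed. "
            f"Install it with: pip install 'pruna[{extra}]'"
        ) from exc
```

A feature module would call this helper inside the function that needs the framework rather than at module top level, so that importing the core package stays lightweight and the error only appears when the feature is actually used.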
Feel free to add any additional thoughts or feedback for this RFC.