[RFC] Reduce Core Dependencies with Decoupling of Features #542

@ParagEkbote

Pruna currently lists about 41 core dependencies, and a full install pulls in 120+ packages once transitive dependencies are counted, for users and contributors alike. This RFC discusses how to reduce that dependency-management overhead and improve the contributor experience. The following approaches focus on minimizing transitive dependencies and maintenance overhead:

  1. Feature-Scoped Dependency Groups

Restructure the dependencies in pyproject.toml so that only those required for the package's essential functionality remain in core, and move all ML frameworks, optimization libraries, and metric libraries to optional groups. Ideally the core dependency count is under 10, though that number can be adjusted based on which features stay in the core package. A good example is optuna-integration, where a large set of optional dependencies is split off to reduce core dependencies. The same can be done inside this repo without creating separate Python packages.

  2. Telemetry-Based Prioritization

Use the optional telemetry data on algorithm usage, usage data from inference providers such as Replicate, or total download data from the HF Hub to decide what to keep. If an algorithm falls below a usage threshold (e.g., <5% of runs over 90 days), it can be marked as deprecated in a minor release and removed after 2-3 minor release cycles.
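The threshold rule could be sketched roughly as follows. The `UsageRecord` type, the `deprecation_candidates` helper, and the sample numbers are all hypothetical. The sketch assumes run counts have already been aggregated over the chosen window (for example, 90 days) and does not reflect how pruna's telemetry is actually stored.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    """Hypothetical aggregate: runs of one algorithm within the window."""
    algorithm: str
    runs: int


def deprecation_candidates(records, threshold=0.05):
    """Return algorithms whose share of all runs is below the threshold."""
    total = sum(r.runs for r in records)
    if total == 0:
        # No usage data at all: flag nothing rather than everything.
        return []
    return sorted(r.algorithm for r in records if r.runs / total < threshold)


# Made-up counts for illustration only.
records = [
    UsageRecord("algo_a", 900),
    UsageRecord("algo_b", 80),
    UsageRecord("algo_c", 20),
]
print(deprecation_candidates(records))
```

Here only `algo_c` (2% of runs) falls below the 5% threshold. A real policy would probably also want a minimum sample size before flagging anything.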

  3. Runtime Guards for Feature Modules

Use guarded imports for feature modules. Importing the core package should not trigger loading an entire framework, and a missing package should produce a clear error message. This can work alongside approach 1 (Feature-Scoped Dependency Groups).
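A minimal sketch of such a guard, assuming a hypothetical `require` helper and a hypothetical extra named `quantization`. The heavy module is imported only when the feature is called, and a missing package is reported with an install hint.

```python
import importlib


def require(module_name: str, extra: str):
    """Import a module on demand, or fail with a hint naming the extra.

    `extra` is the (hypothetical) optional-dependency group that
    provides `module_name`.
    """
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f'Install it with: pip install "example-package[{extra}]"'
        ) from exc


def run_quantization():
    # The import happens here, not at package import time, so importing
    # the core package never loads the heavy framework.
    torch = require("torch", "quantization")
    return torch.__version__
```

Because the guard sits inside the feature function, a core-only install can still import the package. The error appears only when a user calls a feature whose extra is not installed.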

  4. Introduce a Dependency Policy

Consider a policy, or a set of policies, defining the conditions a new dependency must meet before it is added:

• It cannot reasonably be implemented in-house with custom code.

• It is actively maintained on GitHub.

• It is version bounded on PyPI.

It is also important that such dependency policies be adopted with broad consensus, not as one-off measures to manage dependencies.

Feel free to add any additional thoughts or feedback for this RFC.

cc: @davidberenstein1957, @minettekaum
