Pretrained pickle models are incompatible with newer scikit-learn and Python versions #30

@Nanguage

Description

The pretrained model files provided by Peakachu (e.g. high-confidence.100million.10kb.w6.pkl) are currently stored in the pickle format.
However, these files were pickled with older versions of scikit-learn and NumPy, and they fail to load in modern Python environments (e.g. Python 3.12 + scikit-learn ≥ 1.3) with the following error:

ValueError: node array from the pickle has an incompatible dtype:
- expected: {... 'missing_go_to_left' ...}
- got     : {...}

This error occurs because the internal structure of sklearn.tree._tree.Tree has changed between versions (the node array gained a missing_go_to_left field in scikit-learn 1.3), so a pickle written by one release cannot be read by another.
Furthermore, older versions of scikit-learn (≤ 0.22) generally cannot be installed on modern systems: building them relies on the deprecated numpy.distutils module, which is not available on Python 3.12 and later.
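
For anyone hitting this in the meantime, the sketch below shows one way to wrap the load call so that the failure is reported together with the installed scikit-learn and NumPy versions. The helper names (installed_version, load_or_explain) are hypothetical and not part of Peakachu. The default loader uses the standard-library pickle module only to keep the example self-contained; joblib.load can be passed in instead.

```python
import pickle
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or a placeholder if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def _pickle_load(path):
    # Default loader; any callable that takes a path works (e.g. joblib.load).
    with open(path, "rb") as fh:
        return pickle.load(fh)


def load_or_explain(path, loader=_pickle_load):
    """Return (model, None) on success or (None, message) on failure."""
    try:
        return loader(path), None
    except (ValueError, AttributeError, ModuleNotFoundError) as exc:
        # These are the exception types a version mismatch typically raises;
        # the list is an assumption, not an exhaustive contract.
        versions = ", ".join(
            f"{name} {installed_version(name)}"
            for name in ("scikit-learn", "numpy")
        )
        return None, f"Could not load {path!r} ({versions}): {exc}"
```

Calling load_or_explain("high-confidence.100million.10kb.w6.pkl", loader=joblib.load) would then return the dtype error above alongside the versions that produced it, which makes bug reports easier to triage.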

Steps to Reproduce

  1. Install Python 3.12 and the latest scikit-learn.
  2. Run: import joblib; model = joblib.load("high-confidence.100million.10kb.w6.pkl")

Suggested Solution

To improve compatibility and ensure long-term usability, it is recommended that the Peakachu project:

  • Convert pretrained models to ONNX format (see https://onnx.ai/sklearn-onnx/), or consider alternatives like skops for safer serialization.
  • Release model files as .onnx to allow loading and inference across Python/scikit-learn versions.
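
A rough sketch of what the conversion could look like with skl2onnx and onnxruntime follows. It assumes the export step is run once in an old environment where the pickle still loads, and that the number of input features (n_features here) is known to the maintainers. The function names and the "input" tensor name are illustrative choices, not Peakachu code.

```python
def export_to_onnx(model, n_features, out_path):
    """Run in the OLD environment, where the pickled model still loads."""
    # Imported here because only the export environment needs skl2onnx.
    from skl2onnx import convert_sklearn
    from skl2onnx.common.data_types import FloatTensorType

    # Declare one float32 input tensor with a free batch dimension.
    initial_types = [("input", FloatTensorType([None, n_features]))]
    onnx_model = convert_sklearn(model, initial_types=initial_types)
    with open(out_path, "wb") as fh:
        fh.write(onnx_model.SerializeToString())


def predict_with_onnx(onnx_path, X):
    """Run in the NEW environment; needs onnxruntime but not scikit-learn."""
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    # For a converted classifier the first output holds the predicted labels.
    features = np.asarray(X, dtype=np.float32)
    return session.run(None, {input_name: features})[0]
```

ONNX inference runs in float32, so it would be worth comparing the converted model's predictions against the original on a sample of real inputs before publishing the .onnx files.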
