Adhere to scikit-learn estimator interface #94

@kiudee

Rationale

Most of the learners implemented in cs-ranking already follow an interface similar to the one described in https://scikit-learn.org/stable/developers/develop.html#rolling-your-own-estimator,
i.e., they usually implement a fit and a predict method.
For users to be able to use all learners effortlessly in a scikit-learn pipeline.Pipeline or with model_selection.GridSearchCV, we should make sure that all additional requirements are also fulfilled.

To do

  • Use get_params and set_params to set parameters. This is important, since GridSearchCV or BayesSearchCV call set_params for hyperparameter optimization. sklearn.base.BaseEstimator implements basic versions of these. The current way we handle hyperparameters should be deprecated.
  • It is recommended not to do any parameter validation in __init__, but rather in fit itself. set_params is supposed to do exactly the same thing as __init__ with respect to parameters, so validation in __init__ would be bypassed by set_params.
  • Init parameters should be stored unchanged as attributes of the same name. All attributes generated during fit should have a trailing underscore (_).
  • There should be no mandatory parameters. The user should be able to run the learner without having to provide arguments.
  • Implement a score method. This is helpful, since hyperparameter optimizers call this function by default. Otherwise the user has to implement a custom one.
  • Make sure each learner can be cloned with sklearn.base.clone, which relies on get_params and the constructor.

Most of these changes are independent of each other and could be done using separate branches.
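
The to-do items above can be sketched together on a toy estimator. This is only a minimal illustration under stated assumptions: MeanScoreRanker, its shrinkage parameter, and the negative mean squared error used in score are hypothetical and not part of cs-ranking; only BaseEstimator and clone come from scikit-learn.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone


class MeanScoreRanker(BaseEstimator):
    """Hypothetical toy learner used only to illustrate the conventions."""

    def __init__(self, shrinkage=0.0):
        # Store the init parameter unchanged and do not validate it here.
        # Every parameter has a default, so no argument is mandatory.
        self.shrinkage = shrinkage

    def fit(self, X, y):
        # Validation lives in fit, so values set via set_params are checked too.
        if not 0.0 <= self.shrinkage <= 1.0:
            raise ValueError("shrinkage must lie in [0, 1]")
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        # Attributes generated during fit carry a trailing underscore.
        self.weights_ = (1.0 - self.shrinkage) * (X.T @ y) / len(y)
        return self

    def predict(self, X):
        return np.asarray(X, dtype=float) @ self.weights_

    def score(self, X, y):
        # Higher is better: negative mean squared error (an arbitrary choice).
        residual = np.asarray(y, dtype=float) - self.predict(X)
        return -float(np.mean(residual ** 2))


learner = MeanScoreRanker()             # runs without any arguments
learner.set_params(shrinkage=0.5)       # what GridSearchCV does internally
params = learner.get_params()           # inherited from BaseEstimator
copy = clone(learner)                   # unfitted copy with the same parameters

X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([2.0, 4.0])
learner.fit(X, y)
```

Because get_params and set_params are inherited from BaseEstimator, they work as long as __init__ only assigns its arguments to attributes of the same name. sklearn.base.clone builds on the same mechanism.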
