New approach to TRAFO

During an insightful conversation with Franz, Raphael and Jakob we arrived at the issue that our `$trafo` is missing the information about its *image*, i.e. the parameter set that it maps to. This information would be useful e.g. for the ["autotuner"](https://github.com/mlr-org/mlr3tuning/blob/master/R/AutoTuner.R) (i.e. a tuning wrapper for a Learner), because the autotuner would like to know which parameters the user should *not* set (because the tuner is setting them). My idea for a solution is mostly an extension of my suggestion in #215. To avoid confusion with the old `$trafo` slot, I am going to use different slot names for the new things I introduce, although one of these could well be named `$trafo`.

I think this is a cool design, for whatever that may be worth to you ;-)

The plan:
* Remove the current `ps$trafo` slot.
* `ParamSet` gets a method `ps$transform(x, context = list(), terminal = TRUE)` that takes a named list `x` that is a valid parameter configuration according to the `ParamSet`, and returns a named list with transformed parameter values. (Ignore `context` and `terminal` for now).
* `ParamSet` gets a method `ps$image(terminal = TRUE)` that returns a `ParamSet`. This is the `ParamSet` that all values of `ps$transform()` will conform to. (In fact, `ParamSet` checks that the return value of `transform()` conforms to this image and throws an error if it doesn't).
* The `ps$get_values()` method of `ParamSet` is extended with the parameters `transformed = TRUE` and `context = list()`. `ps$get_values(transformed = FALSE)` behaves just as `ps$get_values()` does currently. If `ps$get_values(transformed = TRUE, context = ctx)` is called, it returns the same as `ps$get_values(transformed = FALSE) %>% ps$transform(context = ctx)`. <sup>We could argue about renaming the function to `get_transformed_values()` or something, or having both functions and removing the `transformed` parameter</sup>
* `ParamSet` gets a method `ps$add_trafo(trafo, new_ps)`. `trafo` is a `function(x, param_set, context)`, `new_ps` is a `ParamSet`. The method "transmutes" `ps` into the `ParamSet` given in the `new_ps` argument. Given that `ps` does *not have a "trafo" yet*, the `trafo` function then takes inputs according to `new_ps` and gives outputs according to `ps`, i.e. the old `ps` becomes its image. Some examples:
  ```r
  # given:
  # * ps (ParamSet) that DOES NOT HAVE A TRAFO
  # * new_ps (ParamSet)
  # * trafo (function(x, param_set, context))
  # * x (named list)
  ps_old = ps$clone(deep = TRUE)  # keep the old ps for comparison
  ps$add_trafo(trafo, new_ps)

  # from the outside, ps now looks like new_ps
  all.equal(ps$params, new_ps$params)  # TRUE
  
  # the `ps$transform()` function calls the `trafo` function
  all.equal(trafo(x = x, param_set = ps, context = list()),
    ps$transform(x = x, context = list()))  # TRUE

  # the `ps$image()` is just the "old" ParamSet
  all.equal(ps$image()$params, ps_old$params)  # TRUE
  ```
* What if a `ps` already has a "trafo" and another one is added? It just stacks! In that case, the trafo that was added later is called first, then the earlier trafo is called, etc. Think of the different `ParamSet`s as a linked list, connected by "trafo" functions, where the image of each is the preimage of the next. This is where "`terminal`" comes in: We can choose to apply all transformations of a `ParamSet` in a row, or just one transformation to go one "step" ahead. Similarly, we can get the "terminal" image, i.e. the last image in the chain, or just the image of one transformation step. In code:
  ```r
  # given:
  # * ps_one, ps_two, ps_three (ParamSet that DO NOT HAVE A TRAFO)
  # * trafo_one_two, trafo_two_three (function(x, param_set, context))
  # * x, y, z (named lists)
  library(magrittr)  # provides `%>%`
  ps = ps_three$clone(deep = TRUE)
  ps$add_trafo(trafo_two_three, ps_two$clone(deep = TRUE))
  ps$add_trafo(trafo_one_two, ps_one$clone(deep = TRUE))

  # from the outside, ps now looks like ps_one
  all.equal(ps$params, ps_one$params)  # TRUE

  # images: ps_three is the "terminal" one, but ps_two is the "next" one
  all.equal(ps$image()$params, ps_three$params)  # TRUE
  all.equal(ps$image(terminal = FALSE)$params, ps_two$params)  # TRUE

  # can go along the linked list to reach terminal
  all.equal(ps$image(terminal = FALSE)$image(terminal = FALSE)$params,
    ps_three$params)  # TRUE

  # ps_three does not have a "trafo", btw, so its image is just itself
  all.equal(ps_three$image()$params, ps_three$params)  # TRUE

  # trafos: ps$transform() calls trafo_one_two, then trafo_two_three
  # but only if "terminal" is TRUE
  all.equal(ps$transform(x = x, context = list()),
    trafo_one_two(x = x, param_set = ps, context = list()) %>%
      trafo_two_three(param_set = ps$image(terminal = FALSE),
        context = list()))  # TRUE
  all.equal(ps$transform(x = x, context = list(), terminal = FALSE),
    trafo_one_two(x = x, param_set = ps, context = list()))  # TRUE

  # we could also go along the linked list here
  # (the piped value becomes the `x` argument of the second `$transform()`):
  all.equal(ps$transform(x = x, context = list()),
    ps$transform(x = x, context = list(), terminal = FALSE) %>%
      ps$image(terminal = FALSE)$
        transform(context = list(), terminal = FALSE))  # TRUE

  # ps_three does not have a "trafo", so its `$transform()` is the identity
  # (z being a valid configuration for ps_three)
  all.equal(ps_three$transform(x = z, context = list()), z)  # TRUE
  ```
* What about the `context`? It can optionally be given to the `ps$transform()` function as an argument, and it will be passed on to the `trafo` function given to `ps$add_trafo()`. It can contain information about how the transformation is to be performed. It could, for example, contain information about a task (number of features, number of samples), and the `trafo` could then make use of this information to transform a parameter value. This will work together with a **convention** that each `Learner` will always call `ps$get_values(transformed = TRUE, context = list(task = task))`. Now what happens is the following:
  * The learner is created with a vanilla `ParamSet`, so `ps$get_values(transformed = TRUE, [...])` when called in the `Learner`'s `$train()` function just gives the parameter values as given by the user.
  * If the user wants to add a transformation to the `ParamSet`, they call `learner$param_set$add_trafo(....)`. This changes how the `ParamSet` looks to the user from the outside. For example, maybe the new `ParamSet` contains a `mtry_pexp` parameter, while the `Learner`'s original `ParamSet` only had an `mtry` parameter.
  * When the `Learner` now calls `ps$get_values(transformed = TRUE, [...])`, the result conforms to the `ParamSet` that the `Learner` was created with (because `get_values` in this case gives a value conforming to the `$image`).
  * Because `context = list(task = task)` is given to the `$get_values()`, and hence to the `trafo()` function, the transformation can depend on properties of the task. It could, for example, do
    ```r
    x$mtry = context$task$nfeat ^ x$mtry_pexp
    ```
  * There may be other contexts, for example inside a prediction-aggregating `PipeOp`. These pipeops can call `get_values` with a different `context` argument. How they call `get_values` should be documented, so the user can choose to use `$add_trafo()` in a way that makes use of all information available. <sup>I am not sure yet if it is possible to build this behaviour into the (`Learner`, `PipeOp`, ...) class in some way to make it consistent, e.g. for all `Learner`s</sup>
  * `context` is basically what I called `env` in #215 / #225
* Note that this transformation can be performed either on the learner side or on the tuner side. I.e. if I have a `Learner` with parameters that I want to tune over, but with transformed values (say `tune_ps`, and trafo `tune_trafo`), I can do either of the following:
  1. Transformation happens in `Learner`
    ```r
    learner$param_set$add_trafo(tune_trafo, tune_ps)
    tune_learner(lrn = learner, ps = tune_ps, [...])
    ```
  2. Transformation happens in the tuner
    ```r
    total_tune_ps = learner$param_set$clone(deep = TRUE)
    total_tune_ps$add_trafo(tune_trafo, tune_ps)
    tune_learner(lrn = learner, ps = total_tune_ps, [...])
    ```
  Either of these could make sense in its own right. (1) is relevant if the transformation should be task-dependent, (2) is relevant if the tuning result parameters should be in a form that is naturally understandable to someone familiar with the learner.
* I am thinking about whether there should be a `ps$add_trafo(trafo, preimage_ps, image_ps)` function, so that we can add a transformation just on a subset of the `ParamSet`. E.g. if the `ParamSet` has the parameters `mtry`, `n.tree` and we just want to add a trafo for `mtry`, we could do
  ```r
  # the trafo must return the modified list, not just the assigned value
  ps$add_trafo(function(x, ...) { x$mtry = round(exp(x$mtry)); x },
    preimage_ps = ParamSet$new(list(ParamDbl$new("mtry", 0, 10))),
    image_ps = ParamSet$new(list(ParamInt$new("mtry", 0, Inf))))
  ```
  And the `trafo` function would only be called with the `"mtry"` part of the input parameter value. This would make subsetting easy. But that is a story for a different time :-)
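
To make the linked-list idea and the role of `context` a bit more tangible, here is a minimal sketch in plain R. It is not the paradox API: `make_ps()`, `ps_add_trafo()`, `ps_image()` and `ps_transform()` are hypothetical helpers, a "ParamSet" is reduced to a vector of parameter names, and `mtry_pexp_pct` is a made-up parameter. Because plain lists have value semantics, `ps_add_trafo()` returns the new set instead of modifying `ps` in place as the proposed R6 method would.

```r
# Toy "ParamSet": allowed parameter names plus an optional link
# (trafo function + image) to the next set in the chain. Hypothetical.
make_ps = function(ids) list(ids = ids, trafo = NULL, image = NULL)

# "Transmute" ps into new_ps; the old ps becomes the image of the result.
ps_add_trafo = function(ps, trafo, new_ps) {
  stopifnot(is.null(new_ps$trafo))  # new_ps must not carry a trafo itself
  new_ps$trafo = trafo
  new_ps$image = ps
  new_ps
}

# Next set in the chain, or the last one if terminal = TRUE.
ps_image = function(ps, terminal = TRUE) {
  if (is.null(ps$trafo)) return(ps)  # no trafo: the image is the set itself
  if (terminal) ps_image(ps$image, terminal = TRUE) else ps$image
}

# Apply one trafo step, or all of them if terminal = TRUE.
ps_transform = function(ps, x, context = list(), terminal = TRUE) {
  stopifnot(setequal(names(x), ps$ids))
  if (is.null(ps$trafo)) return(x)  # no trafo: identity
  y = ps$trafo(x = x, param_set = ps, context = context)
  # the result of each step has to conform to that step's image
  stopifnot(setequal(names(y), ps$image$ids))
  if (terminal) ps_transform(ps$image, y, context, terminal = TRUE) else y
}

ps_three = make_ps("mtry")           # what the learner understands
ps_two = make_ps("mtry_pexp")        # exponent of the number of features
ps_one = make_ps("mtry_pexp_pct")    # the same exponent, in percent

# task-dependent step: uses the task information passed via `context`
trafo_two_three = function(x, param_set, context) {
  list(mtry = round(context$task$nfeat ^ x$mtry_pexp))
}
trafo_one_two = function(x, param_set, context) {
  list(mtry_pexp = x$mtry_pexp_pct / 100)
}

ps = ps_add_trafo(ps_three, trafo_two_three, ps_two)
ps = ps_add_trafo(ps, trafo_one_two, ps_one)  # added later, called first

ctx = list(task = list(nfeat = 16))  # stand-in for a real task
x = list(mtry_pexp_pct = 50)

ps$ids                                      # "mtry_pexp_pct"
ps_image(ps, terminal = FALSE)$ids          # "mtry_pexp"
ps_image(ps)$ids                            # "mtry"
ps_transform(ps, x, ctx, terminal = FALSE)  # list(mtry_pexp = 0.5)
ps_transform(ps, x, ctx)                    # list(mtry = 4)
```

The conformity check inside `ps_transform()` only compares names here; a real implementation would presumably check the transformed values against the image `ParamSet` itself.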
  
