Copied from: tidymodels/recipes#636
A dataset containing a single NA value fails when passed to caret's train() through a recipe, whereas the same model fitted through the formula interface (caret alone, without recipes) runs without any issues.
library(caret)
#> Loading required package: lattice
#> Loading required package: ggplot2
library(recipes)
#> Loading required package: dplyr
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
#>
#> Attaching package: 'recipes'
#> The following object is masked from 'package:stats':
#>
#> step
data(cars)
cars$Mileage[100] <- NA
## Without recipes
train(Price ~ .,
      trControl = trainControl(
        method = 'CV',
        number = 3 # Reduced the number of CV folds. Otherwise we would get a bunch of warnings
      ),
      data = cars,
      tuneLength = 1,
      method = "xgbLinear",
      objective = "reg:squarederror",
      na.action = na.pass)
#> Warning in check.booster.params(params, ...): The following parameters were provided multiple times:
#> objective
#> Only the last value for each of them will be used.
#> Warning in check.booster.params(params, ...): The following parameters were provided multiple times:
#> objective
#> Only the last value for each of them will be used.
#> Warning in check.booster.params(params, ...): The following parameters were provided multiple times:
#> objective
#> Only the last value for each of them will be used.
#> Warning in check.booster.params(params, ...): The following parameters were provided multiple times:
#> objective
#> Only the last value for each of them will be used.
#> eXtreme Gradient Boosting
#>
#> 804 samples
#> 17 predictor
#>
#> No pre-processing
#> Resampling: Cross-Validated (3 fold)
#> Summary of sample sizes: 536, 536, 536
#> Resampling results:
#>
#> RMSE Rsquared MAE
#> 2442.675 0.9394766 1674.597
#>
#> Tuning parameter 'nrounds' was held constant at a value of 50
#> Tuning parameter 'alpha' was held constant at a value of 0
#> Tuning parameter 'eta' was held constant at a value of 0.3
## With recipes
rec <- recipe(Price ~ ., data = cars)
train(rec,
      data = cars,
      trControl = trainControl(
        method = 'CV',
        number = 3
      ),
      tuneLength = 1,
      method = "xgbLinear",
      objective = "reg:squarederror",
      na.action = na.pass)
#>
#> Attaching package: 'xgboost'
#> The following object is masked from 'package:dplyr':
#>
#> slice
#> Warning in check.booster.params(params, ...): The following parameters were provided multiple times:
#> objective
#> Only the last value for each of them will be used.
#> Warning: model fit failed for Fold1: lambda=0, alpha=0, nrounds=50, eta=0.3 Error in as.character(x) :
#> cannot coerce type 'closure' to vector of type 'character'
#> Warning in check.booster.params(params, ...): The following parameters were provided multiple times:
#> objective
#> Only the last value for each of them will be used.
#> Warning: model fit failed for Fold2: lambda=0, alpha=0, nrounds=50, eta=0.3 Error in as.character(x) :
#> cannot coerce type 'closure' to vector of type 'character'
#> Warning in check.booster.params(params, ...): The following parameters were provided multiple times:
#> objective
#> Only the last value for each of them will be used.
#> Warning: model fit failed for Fold3: lambda=0, alpha=0, nrounds=50, eta=0.3 Error in as.character(x) :
#> cannot coerce type 'closure' to vector of type 'character'
#> Warning in train_rec(rec = x, dat = data, info = trainInfo, method = models, :
#> There were missing values in resampled performance measures.
#> Something is wrong; all the RMSE metric values are missing:
#> RMSE Rsquared MAE
#> Min. : NA Min. : NA Min. : NA
#> 1st Qu.: NA 1st Qu.: NA 1st Qu.: NA
#> Median : NA Median : NA Median : NA
#> Mean :NaN Mean :NaN Mean :NaN
#> 3rd Qu.: NA 3rd Qu.: NA 3rd Qu.: NA
#> Max. : NA Max. : NA Max. : NA
#> NA's :1 NA's :1 NA's :1
#> Error: Stopping