Regression
- Ridge Regression imposes an L2 penalty on the size of the coefficients.
- The shrinkage makes it less impacted by outliers than ordinary least squares.
- As alpha tends toward zero, the coefficients found by Ridge regression stabilize towards the unregularised least-squares solution (similar to LinearRegression). For large alpha (strong regularisation) the coefficients become smaller (eventually converging to 0), leading to a simpler but more biased solution.
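A minimal sketch of this shrinkage behaviour; the data and the alpha values are made up for illustration and are not from the wiki itself:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Made-up data: five features with known coefficients plus small noise
rng = np.random.RandomState(0)
X = rng.randn(50, 5)
y = X @ np.array([3.0, -2.0, 0.5, 0.0, 1.0]) + 0.1 * rng.randn(50)

# Larger alpha -> stronger regularisation -> smaller coefficients
for alpha in (0.01, 1.0, 100.0):
    ridge = Ridge(alpha=alpha).fit(X, y)
    print(alpha, np.round(ridge.coef_, 3))
```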
- Lasso is a linear model that estimates sparse coefficients.
- It effectively reduces the number of regressors used to predict the target.
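A small sketch of the sparsity Lasso produces; the data and alpha=0.1 are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Made-up data: only the first two of ten features actually drive the target
rng = np.random.RandomState(0)
X = rng.randn(100, 10)
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.randn(100)

lasso = Lasso(alpha=0.1).fit(X, y)
print(lasso.coef_)  # most entries come out exactly 0.0, i.e. those regressors are dropped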
- Elastic-net is useful when there are multiple features which are correlated with one another. Lasso is likely to pick one of these at random, while elastic-net is likely to pick both.
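A quick comparison on two nearly identical (strongly correlated) features to illustrate the point above; the data and the alpha/l1_ratio values are made-up choices. Lasso tends to keep only one of the pair, while elastic-net tends to spread the weight across both:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

# Made-up data: the second feature is a near-copy of the first
rng = np.random.RandomState(0)
x = rng.randn(100)
X = np.column_stack([x, x + 0.01 * rng.randn(100)])
y = x + 0.1 * rng.randn(100)

print("Lasso:      ", Lasso(alpha=0.1).fit(X, y).coef_)
print("Elastic-net:", ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y).coef_)
```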
- Logistic Regression is a linear model for classification; it assumes a linear relationship between the features and the log-odds of the target
- y = e^(b0 + b1x) / (1 + e^(b0 + b1x))
- Returns class probabilities
- Hyperparameter: C, the inverse of regularisation strength (smaller C means stronger regularisation)
- Fundamentally a binary classifier; multiclass problems are handled via one-vs-rest or multinomial schemes
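A minimal binary-classification sketch; the dataset, the scaling step and C=1.0 are illustrative choices, not something the wiki prescribes:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Smaller C -> stronger regularisation (C is the inverse of the penalty strength)
clf = make_pipeline(StandardScaler(), LogisticRegression(C=1.0))
clf.fit(X_train, y_train)
print(clf.predict_proba(X_test[:3]))  # class probabilities, one row per sample
print(clf.score(X_test, y_test))      # accuracy on held-out data
```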
- Stochastic Gradient Descent & Passive Aggressive algorithms
- Simple and efficient approaches for fitting linear models
- Useful when the number of samples is very large (on the order of 10^5 or more)
- Supports partial_fit for out-of-core learning
- Both the algorithms support regression & classification
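A rough out-of-core sketch with SGDClassifier and partial_fit; the chunks of random data simply stand in for batches streamed from disk and are not from the wiki:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)
clf = SGDClassifier()               # linear model fitted by stochastic gradient descent
classes = np.array([0, 1])          # all classes must be declared on the first call

for _ in range(10):                 # pretend each iteration reads one chunk from disk
    X_chunk = rng.randn(1000, 20)
    y_chunk = (X_chunk[:, 0] + X_chunk[:, 1] > 0).astype(int)
    clf.partial_fit(X_chunk, y_chunk, classes=classes)

X_test = rng.randn(500, 20)
y_test = (X_test[:, 0] + X_test[:, 1] > 0).astype(int)
print(clf.score(X_test, y_test))
```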
- Robust regression aims at fitting a regression model in the presence of corrupt data: either outliers, or errors in the model.
- Three estimators supported by scikit-learn: RANSAC, Theil Sen and HuberRegressor
- Comparison of RANSAC, Theil Sen and HuberRegressor:
  - HuberRegressor should be faster than RANSAC and Theil Sen unless the number of samples is very large
  - Theil Sen and RANSAC are unlikely to be as robust as HuberRegressor with the default parameters
  - RANSAC deals better with large outliers in the y direction
  - RANSAC is faster than Theil Sen and scales much better with the number of samples
  - RANSAC is a good default option
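A sketch comparing the three robust estimators against plain LinearRegression on made-up 1-D data with outliers in the y direction; the true slope is 2, and all data and settings here are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import (HuberRegressor, LinearRegression,
                                  RANSACRegressor, TheilSenRegressor)

# Made-up line y = 2x + 1 with a few large outliers injected in y
rng = np.random.RandomState(0)
X = rng.uniform(-5, 5, size=(100, 1))
y = 2.0 * X.ravel() + 1.0 + rng.randn(100)
y[::10] += 30

for name, model in [("OLS", LinearRegression()),
                    ("Huber", HuberRegressor()),
                    ("Theil Sen", TheilSenRegressor(random_state=0)),
                    ("RANSAC", RANSACRegressor(random_state=0))]:
    model.fit(X, y)
    coef = model.estimator_.coef_ if name == "RANSAC" else model.coef_
    print(name, coef)  # the robust estimators stay much closer to the true slope of 2
```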
- Sometimes the relationship between the features and the target is polynomial rather than linear
- The PolynomialFeatures transformer can be used to expand the data with higher-degree terms
- A linear model fitted on the expanded features then estimates the coefficients of the polynomial terms
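A minimal sketch of this pattern with made-up quadratic data; degree=2 and the pipeline are illustrative choices:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Made-up quadratic relationship: y = 0.5*x^2 - x + 2 plus a little noise
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = 0.5 * X.ravel() ** 2 - X.ravel() + 2 + 0.1 * rng.randn(100)

model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)

# Coefficients for [1, x, x^2]; roughly [0, -1, 0.5] (the bias column gets ~0)
print(model.named_steps["linearregression"].coef_)
```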