Introduces correlation constraints for Linear, Ridge, and Kernel Ridge regression.
In their standard form, these models can be formulated as unconstrained optimization problems of the form

minimize L(y, yhat)

where y denotes the targets and yhat the predictions. Correlation-constrained regression adds a correlation constraint to this problem, yielding

minimize L(y, yhat)
subject to |corr(y, e)| <= correlation_bound

where corr is the Pearson correlation, e = y - yhat are the residuals, and correlation_bound is a hyperparameter that controls the maximum permissible correlation.
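To make the constraint concrete: the bounded quantity is simply the absolute Pearson correlation between the targets and the residuals. As a minimal NumPy sketch (the helper residual_correlation is illustrative, not part of the implementations described below):

import numpy as np

def residual_correlation(y, yhat):
    # absolute Pearson correlation between targets y and residuals e = y - yhat
    e = y - yhat
    return np.abs(np.corrcoef(y, e)[0, 1])

A fitted model satisfies the constraint when residual_correlation(y, yhat) <= correlation_bound.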
The resultant models have been implemented in both Python and Matlab.
The module correlation_constrained_regression.py provides three models: LinearRegression, Ridge, and KernelRidge.
They extend the eponymous models in Scikit-Learn with an additional parameter correlation_bound
(a value between 0 and 1) that specifies the maximum permissible correlation between targets and residuals.
The following code example illustrates the LinearRegression model:
import numpy as np
import correlation_constrained_regression as ccr
# create some regression data
X = np.array([[1, 1], [1, 2], [3, 2], [3, 3], [4, 3], [4, 4]])
y = np.dot(X, np.array([1, 2])) + np.array([0.1, 0.2, -0.1, -0.2, -0.1, -0.2])
# fit correlation constrained model and calculate residual correlation
reg = ccr.LinearRegression(correlation_bound=0.01).fit(X, y)
print('corr(y, e):', np.corrcoef(y, y - reg.predict(X))[0,1])
# instead of calculating the correlation by hand, we can use the built-in method:
print('corr(y, e):', reg.calculate_residual_correlation(X, y))
# theta_: the scaling factor determined during the constrained fit
print('theta:', reg.theta_)
# for comparison: train a standard linear regression model in Scikit-Learn and print the residual correlation
import sklearn.linear_model
reg = sklearn.linear_model.LinearRegression().fit(X, y)
print('corr(y, e):', np.corrcoef(y, y - reg.predict(X))[0,1])
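The theta_ attribute hints at the mechanics: rescaling the predictions changes the residual correlation in a predictable way, so a suitable scaling factor can bring it within the bound. The following standalone sketch only illustrates this effect; the toy data and the rescaling loop are our own and not part of the package API:

import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=100)
yhat = 0.6 * y + 0.3 * rng.normal(size=100)  # imperfect toy predictions

# residual correlation as a function of a hypothetical scaling factor theta
for theta in [0.5, 1.0, 1.5, 2.0]:
    e = y - theta * yhat
    print(f'theta={theta}: corr(y, e) = {np.corrcoef(y, e)[0, 1]:.3f}')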
Fitting Ridge and KernelRidge works analogously:

ridge = ccr.Ridge(correlation_bound=0.1, alpha=10).fit(X, y)
krr = ccr.KernelRidge(correlation_bound=0, kernel='rbf', gamma=1).fit(X, y)
print('corr(y, e):', ridge.calculate_residual_correlation(X, y))
print('corr(y, e):', krr.calculate_residual_correlation(X, y))

You can use the models in the same way as other Scikit-Learn models.
For instance, let us use GridSearchCV
to optimize the hyperparameters of the KernelRidge model:
import sklearn.model_selection

tune_KernelRidge = [
    {'kernel': ['rbf'], 'gamma': [100, 10, 1, 1e-1], 'alpha': [1e-3, 1e-2, 1e-1, 1, 10]},
    {'kernel': ['poly'], 'gamma': [100, 10, 1, 1e-1], 'alpha': [1e-3, 1e-2, 1e-1, 1, 10], 'degree': [2, 3, 4, 5], 'coef0': [0, 1]}
]
krr = sklearn.model_selection.GridSearchCV(ccr.KernelRidge(correlation_bound=0),
                                           param_grid=tune_KernelRidge,
                                           scoring='neg_mean_squared_error')
krr.fit(X, y)
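Since the estimators follow the Scikit-Learn API, they also plug into utilities such as cross_val_score. A brief sketch, reusing X and y from above (the hyperparameter values here are arbitrary):

from sklearn.model_selection import cross_val_score

scores = cross_val_score(ccr.Ridge(correlation_bound=0.1, alpha=1),
                         X, y, cv=3, scoring='neg_mean_squared_error')
print('CV scores:', scores)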
Regression models with correlation constraints are also implemented in the MVPA-Light toolbox for Matlab. The hyperparameter correlation_bound (a value between 0 and 1) specifies the maximum permissible correlation between targets and residuals.
The following code example illustrates the Ridge regression model:
% create some regression data
X = [1, 1; 1, 2; 3, 2; 3, 3; 4, 3; 4, 4];
y = X * [1, 2]' + [0.1, 0.2, -0.1, -0.2, -0.1, -0.2]';
% get hyperparameter struct
param = mv_get_hyperparameter('ridge');
% specify correlation bound
param.correlation_bound = 0.1;
% train model
model = train_ridge(param, X, y);
% scaling factor
fprintf('theta = %.4f\n', model.theta)

Linear regression is included as a special case of the ridge model (set param.lambda = 0). Fitting a kernel ridge model is analogous:
param = mv_get_hyperparameter('kernel_ridge');
param.correlation_bound = 0.1;
model = train_kernel_ridge(param, X, y);

The models can be used in a cross-validation framework using the mv_regress function:
cfg = [];
cfg.model = 'ridge';
cfg.metric = 'r_squared';
cfg.hyperparameter = [];
cfg.hyperparameter.correlation_bound = 0.1;
r2 = mv_regress(cfg, X, y);