This is a Python port of the CGI (Causality Graphical Inference) MATLAB toolbox.
CGI is a causal discovery algorithm that uses conditional independence tests based on Gaussian processes and kernel methods to identify causal relationships between variables.
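To make "conditional independence test" concrete, here is a minimal sketch of the simplest linear case, a partial correlation test with a Fisher z-transform. It is an illustration only, not the toolbox's GP- or kernel-based test. The helper name `partial_corr_test` and the simulated data are assumptions made for this example.

```python
import numpy as np
from scipy import stats


def partial_corr_test(x, y, z, alpha=0.05):
    """Hypothetical helper: test whether x is independent of y given z (linear case)."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    z = np.asarray(z, dtype=float).reshape(len(x), -1)

    # Regress x and y on z (with an intercept) and keep the residuals.
    design = np.column_stack([np.ones(len(x)), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]

    # The correlation of the residuals is the partial correlation.
    r = np.corrcoef(res_x, res_y)[0, 1]

    # Fisher z-transform: approximately standard normal under independence.
    n, k = len(x), z.shape[1]
    stat = np.sqrt(n - k - 3) * np.arctanh(r)
    p_value = 2.0 * stats.norm.sf(abs(stat))
    return p_value > alpha, p_value


# Simulated data: z drives both x and y, so x and y are correlated
# marginally but carry no extra information about each other once z is known.
rng = np.random.default_rng(0)
z = rng.normal(size=2000)
x = z + 0.5 * rng.normal(size=2000)
y = z + 0.5 * rng.normal(size=2000)

marginal_r = np.corrcoef(x, y)[0, 1]
independent, p_value = partial_corr_test(x, y, z)
print(f"marginal correlation: {marginal_r:.2f}")
print(f"conditionally independent given z: {independent} (p = {p_value:.3f})")
```

A linear test like this misses non-linear dependence, which is the gap that regression-based and kernel-based tests are meant to cover.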
Install in editable mode:

pip install -e .

Requirements:

- numpy >= 1.18.0
- scipy >= 1.5.0

import numpy as np
from CGI_py import find_genes_gci, load_data
# Load data (from .mat file)
data = load_data('normalized_Leukemia.mat')
# Run causal gene discovery
results = find_genes_gci(data, alpha=0.05)
# Get causal genes
causal_genes = results['found_genes']
print(f"Found {len(causal_genes)} causal genes")

- kernel(x, xKern, theta): Compute RBF kernel matrix
- dist2(x, c): Compute squared Euclidean distance
- paco_test(x, y, Z, alpha): Partial correlation test
- kcit(X, Y, Z, ...): Kernel conditional independence test
- fit_gpr(X, Y, cov, hyp, Ncg): Fit GP regression model
- find_genes_gci(data, alpha, cov, Ncg, hyp): Find causal genes
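
As a rough illustration of what the two lowest-level helpers compute, the sketch below builds pairwise squared Euclidean distances and an RBF kernel matrix with plain NumPy. The names `sq_dist` and `rbf_kernel` are hypothetical stand-ins, not the package's `dist2` and `kernel`. Treating `theta` as the kernel width in exp(-d² / (2·theta²)) is an assumption, and the toolbox may parameterize it differently.

```python
import numpy as np


def sq_dist(x, c):
    """Pairwise squared Euclidean distances between rows of x (n, d) and c (m, d)."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    c = np.atleast_2d(np.asarray(c, dtype=float))
    # Expand ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b for all pairs at once.
    d2 = (x ** 2).sum(axis=1)[:, None] + (c ** 2).sum(axis=1)[None, :] - 2.0 * x @ c.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.maximum(d2, 0.0)


def rbf_kernel(x, x_kern, theta):
    """RBF kernel matrix; theta is assumed to be the kernel width (an assumption)."""
    return np.exp(-sq_dist(x, x_kern) / (2.0 * theta ** 2))


points = np.array([[0.0, 0.0], [3.0, 4.0]])
D = sq_dist(points, points)               # off-diagonal entries are 3^2 + 4^2 = 25
K = rbf_kernel(points, points, theta=5.0)  # off-diagonal entries are exp(-25 / 50)
print(D)
print(K)
```

Computing all pairs through one matrix product avoids an explicit double loop, which matters when the kernel matrix is rebuilt many times during testing.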
- Original MATLAB implementation: CGI
- H. Zhang, C. Yan, Y. Xia, J. Guan and S. Zhou, "Causal Gene Identification Using Non-Linear Regression-Based Independence Tests," IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 20, no. 1, Jan.-Feb. 2023, doi: 10.1109/TCBB.2022.3149864.
- Zhang, K., Peters, J., Janzing, D., & Schölkopf, B. (2011). Kernel-based conditional independence test and application in causal discovery. arXiv:1202.2775

See the original CGI repository for license information.