MySurgeryRisk XAI is a full-stack explainable AI framework specifically designed for healthcare AI models. It aims to enhance transparency for diverse stakeholders: it offers model interpretability for AI experts and developers, while simultaneously providing model explainability for end users such as healthcare providers.
MySurgeryRisk XAI is an open-source initiative that comprises a collection of curated high-stakes datasets, models, and interpretability and explainability methods, together with a simple, easy-to-use API that enables researchers and practitioners to benchmark explanation methods in just a few lines of code.
- Real-world AI/ML-ready clinical dataset for surgical patients: MySurgeryRiskXAI provides a clinical dataset containing features derived from the electronic health records (EHR) of patients who underwent any major surgery at University of Florida Health Gainesville from 2014 to 2020. The outcome is the incidence of prolonged mechanical ventilation (MV) as a postoperative complication.
- Support for diverse stakeholders: MySurgeryRiskXAI provides interpretability, enabling developers and AI experts to gain crucial insight into the internal mechanics of the AI system through global feature importance, which examines the relationships between input features and output predictions. It provides explainability to end users, such as healthcare providers, by systematically addressing five fundamental questions: “Why?”, “Why not?”, “How?”, “What if?”, and “What else?”.
- “Why?”: why was a particular prediction made? This question is answered by providing local feature importance.
- “Why not?”: why was an alternative prediction not made? This question is answered by providing contrastive explanations, which highlight the differences between the actual prediction and the expected or desired alternative.
- “How?”: how was this model developed? This question is addressed by providing model cards enhanced with explainability results and bias and fairness assessment results.
- “What if?”: what if this feature were modified, and how would that affect the risk prediction? This question is first addressed by providing recommendations that list the changes required to alter the model’s decision, focusing on actionable parameters. Additionally, for user-friendliness, the real-system implementation proposes an interactive user interface that allows users to manipulate feature values and run inference on the perturbed data points.
- “What else?”: what else can we learn from similar patients? This question enhances the XAI framework with real-world context and is answered by providing the model’s predictions and the actual event ratios for similar patients.
- Open-source initiative: MySurgeryRisk XAI is open source and easily extensible.
- Supported data type: structured datasets.
- Supported AI model types: tree-based models, including random forest, XGBoost, and LightGBM; general PyTorch-based deep learning neural networks are supported as well. Only binary classifiers are supported.
- See tutorial/xai_for_continuous_categorical_dataset.ipynb for how to apply the MySurgeryRisk XAI framework to a tree-based model and a structured dataset mixing continuous and categorical features.
- See tutorial/deeplearning_xai_for_continuous_categorical_dataset.ipynb for how to apply the MySurgeryRisk XAI framework to a general deep neural network and a structured dataset mixing continuous and categorical features.
- See tutorial/model_card_generation.ipynb for how to use the MySurgeryRisk XAI framework to generate a model card.
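Since only binary classifiers are supported, any wrapped model must expose a two-class prediction. As an illustration of a compatible tree-based model, the sketch below trains a binary random forest on synthetic data; the feature names are placeholders, not columns of the actual UF Health dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in for an EHR-derived table: two continuous features
# and one binary categorical feature (hypothetical names).
X = pd.DataFrame({
    "age": rng.integers(18, 90, size=200),
    "hemoglobin": rng.normal(13.0, 1.5, size=200),
    "emergency_admission": rng.integers(0, 2, size=200),
})
# Binary outcome, e.g. prolonged mechanical ventilation (0/1).
y = (X["age"] > 65).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y)
proba = model.predict_proba(X)[:, 1]  # predicted risk of the positive class
```

A model like this can then be passed to the Explainer classes described below.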
ExplainerDataset is a class that collects information about the dataset, including the training and testing splits, feature names, outcome name, and categorical variables. It can output a summary of the dataset, including the data types, numerical values, and categorical levels of the features. It supports tree-based models; DeepExplainerDataset serves the same role for deep learning models. For a concrete example, the code snippet below shows how to use the class:
```python
from ExplainerDataset import ExplainerDataset

dataset = ExplainerDataset(X_train, y_train, X_test, y_test, feature_names, outcome_name, categorical_variables)
# get a summary of the dataset
meta_data = dataset.get_metadata()

# for deep learning neural networks
from DeepExplainerDataset import DeepExplainerDataset

dataset = DeepExplainerDataset(X_train, y_train, transform_func=lambda_transformer, test_X=X_test, test_y=y_test,
                               feature_names=feature_names, outcome_name=outcome_name, categorical_variables=categorical_variables)
# get tensor data for model development and validation
X_train_tensor, y_train_tensor, X_test_tensor, y_test_tensor = dataset.get_tensor_data()
```
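The exact signature expected for the transform_func argument is defined by DeepExplainerDataset; as one plausible shape, the sketch below (an assumption, independent of the library) builds a callable that z-scores the continuous columns using training-split statistics before the data is converted to tensors.

```python
import numpy as np

def make_standardizer(X_train, continuous_cols):
    """Return a transform that z-scores the given continuous columns using
    training-split statistics. Hypothetical example of a transform_func;
    the real expected signature may differ."""
    mu = X_train[:, continuous_cols].mean(axis=0)
    sigma = X_train[:, continuous_cols].std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant columns

    def transform(X):
        X = np.asarray(X, dtype=float).copy()
        X[:, continuous_cols] = (X[:, continuous_cols] - mu) / sigma
        return X

    return transform

# Column 0 is continuous, column 1 is a categorical flag left untouched.
X_train = np.array([[1.0, 0], [3.0, 1], [5.0, 0]])
lambda_transformer = make_standardizer(X_train, continuous_cols=[0])
X_scaled = lambda_transformer(X_train)
```

Fitting the statistics on the training split only avoids leaking test-set information into the transform.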
Explainer is a class that provides interpretability and explainability interfaces for tree-based models. DeepExplainer provides the same functions for general PyTorch-based deep learning models.
- It provides interpretability through global feature importance approaches, including SHAP, permutation-based importance, and the model's inherently provided importance. For deep learning models, only SHAP is provided. It outputs numerical importance scores and visualizations. For a concrete example, the code snippet below shows how to use the class:
```python
from ExplainerDataset import ExplainerDataset
from Explainer import Explainer

dataset = ExplainerDataset(X_train, y_train, X_test, y_test, feature_names, outcome_name, categorical_variables)
explainer = Explainer(model, dataset)
importance = explainer.interpretability(method='permutation_importance')
importance = explainer.interpretability(method='shap')
importance = explainer.interpretability(method='inherent importance')

# for deep learning neural networks
from DeepExplainerDataset import DeepExplainerDataset
from DeepExplainer import DeepExplainer

dataset = DeepExplainerDataset(X_train, y_train, transform_func=lambda_transformer, test_X=X_test, test_y=y_test,
                               feature_names=feature_names, outcome_name=outcome_name, categorical_variables=categorical_variables)
explainer = DeepExplainer(model, dataset)
importance = explainer.interpretability()
```
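Permutation importance scores a feature by shuffling its column and measuring how much a performance metric degrades. To make the mechanics concrete, the sketch below implements the idea from scratch with NumPy; it does not use the Explainer API, and the toy model and metric are illustrative only.

```python
import numpy as np

def permutation_importance(predict, X, y, metric, n_repeats=10, seed=0):
    """Mean drop in the metric when each column of X is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(X))
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])  # break the feature's link to the outcome
            drops.append(baseline - metric(y, predict(Xp)))
        scores[j] = np.mean(drops)
    return scores

# Toy model whose prediction depends only on the first feature.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = (X[:, 0] > 0).astype(int)
predict = lambda X: (X[:, 0] > 0).astype(int)
accuracy = lambda y_true, y_pred: np.mean(y_true == y_pred)

scores = permutation_importance(predict, X, y, accuracy)
# Only feature 0 carries signal, so only its score should be large.
```

A larger score means the model relied more heavily on that feature.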
- It supports answering the “Why?” question for explainability through local feature importance approaches, including SHAP and LIME. It outputs numerical importance scores and visualizations. For a concrete example, the code snippet below shows how to use the class:
```python
from ExplainerDataset import ExplainerDataset
from Explainer import Explainer

dataset = ExplainerDataset(X_train, y_train, X_test, y_test, feature_names, outcome_name, categorical_variables)
explainer = Explainer(model, dataset)
X_train, y_train, X_test, y_test = dataset.get_data()
test_sample = X_test.loc[[i], :]  # i-th sample
importance = explainer.explainability_why(test_sample, method='shap')
importance = explainer.explainability_why(test_sample, method='lime')

# for deep learning neural networks
from DeepExplainerDataset import DeepExplainerDataset
from DeepExplainer import DeepExplainer

dataset = DeepExplainerDataset(X_train, y_train, transform_func=lambda_transformer, test_X=X_test, test_y=y_test,
                               feature_names=feature_names, outcome_name=outcome_name, categorical_variables=categorical_variables)
explainer = DeepExplainer(model, dataset)
X_train, y_train, X_test, y_test = dataset.get_data()
test_sample = X_test.loc[[i], :]  # i-th sample
importance = explainer.explainability_why(test_sample, method='shap')
importance = explainer.explainability_why(test_sample, method='lime')
```
- It supports answering the “Why not?” question for explainability through contrastive explanation approaches, including DiCE. It outputs numerical importance scores and visualizations. For a concrete example, the code snippet below shows how to use the class:
```python
from ExplainerDataset import ExplainerDataset
from Explainer import Explainer

dataset = ExplainerDataset(X_train, y_train, X_test, y_test, feature_names, outcome_name, categorical_variables)
explainer = Explainer(model, dataset)
X_train, y_train, X_test, y_test = dataset.get_data()
test_sample = X_test.loc[[i], :]  # i-th sample
importance = explainer.explainability_whynot(test_sample)

# for deep learning neural networks
from DeepExplainerDataset import DeepExplainerDataset
from DeepExplainer import DeepExplainer

dataset = DeepExplainerDataset(X_train, y_train, transform_func=lambda_transformer, test_X=X_test, test_y=y_test,
                               feature_names=feature_names, outcome_name=outcome_name, categorical_variables=categorical_variables)
explainer = DeepExplainer(model, dataset)
X_train, y_train, X_test, y_test = dataset.get_data()
test_sample = X_test.loc[[i], :]  # i-th sample
importance = explainer.explainability_whynot(test_sample)
```
- It supports answering the “How?” question for explainability using model cards. It provides a JSON template for the model card and generates an HTML page displaying it. For a concrete example, the code snippet below shows how to use the class:
```python
from ExplainerDataset import ExplainerDataset
from Explainer import Explainer

dataset = ExplainerDataset(X_train, y_train, X_test, y_test, feature_names, outcome_name, categorical_variables)
explainer = Explainer(model, dataset)
explainer.explainability_how(content_json_path, output_path)
Explainer.explainability_how(content_json_path, output_path)  # the method can also be called without creating a class instance

# for deep learning neural networks
from DeepExplainerDataset import DeepExplainerDataset
from DeepExplainer import DeepExplainer

dataset = DeepExplainerDataset(X_train, y_train, transform_func=lambda_transformer, test_X=X_test, test_y=y_test,
                               feature_names=feature_names, outcome_name=outcome_name, categorical_variables=categorical_variables)
explainer = DeepExplainer(model, dataset)
explainer.explainability_how(content_json_path, output_path)
Explainer.explainability_how(content_json_path, output_path)  # the method can also be called without creating a class instance
```
- It supports answering the “What if?” question for explainability using the DiCE counterfactual explanation approach. It provides recommendations that list the changes required to alter the model’s decision, focusing on actionable parameters. For a concrete example, the code snippet below shows how to use the class:
```python
from ExplainerDataset import ExplainerDataset
from Explainer import Explainer

dataset = ExplainerDataset(X_train, y_train, X_test, y_test, feature_names, outcome_name, categorical_variables)
explainer = Explainer(model, dataset)
X_train, y_train, X_test, y_test = dataset.get_data()
test_sample = X_test.loc[[i], :]  # i-th sample
explainer.explainability_whatif(test_sample, features_to_vary, permitted_range)

# for deep learning neural networks
from DeepExplainerDataset import DeepExplainerDataset
from DeepExplainer import DeepExplainer

dataset = DeepExplainerDataset(X_train, y_train, transform_func=lambda_transformer, test_X=X_test, test_y=y_test,
                               feature_names=feature_names, outcome_name=outcome_name, categorical_variables=categorical_variables)
explainer = DeepExplainer(model, dataset)
X_train, y_train, X_test, y_test = dataset.get_data()
test_sample = X_test.loc[[i], :]  # i-th sample
explainer.explainability_whatif(test_sample, features_to_vary, permitted_range)
```
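In the DiCE convention, features_to_vary is a list of actionable features and permitted_range is a dict of allowed values, which keeps counterfactual recommendations clinically plausible (e.g. age cannot be changed). The sketch below shows one way to build these arguments; the feature names and ranges are hypothetical, purely for illustration.

```python
# Restrict the counterfactual search to actionable parameters
# (hypothetical feature names; substitute your dataset's columns).
features_to_vary = ["hemoglobin", "systolic_bp", "bmi"]

# Allowed ranges: continuous features get [min, max] bounds;
# categorical features would get a list of permitted levels.
permitted_range = {
    "hemoglobin": [10.0, 17.0],
    "systolic_bp": [90, 140],
    "bmi": [18.5, 30.0],
}
```

Features left out of features_to_vary are held fixed during the search.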
- It supports answering the “What else?” question for explainability by identifying similar instances, using feature importance-based similarity (cosine similarity and Euclidean distance similarity), proximity matrix, and principal feature matching methods. It provides the model's predictions and the actual event ratios for the similar patients. For deep learning models, embedding-based similarity (cosine similarity and Euclidean distance similarity) and principal feature matching are provided. For a concrete example, the code snippet below shows how to use the class:
```python
from ExplainerDataset import ExplainerDataset
from Explainer import Explainer

dataset = ExplainerDataset(X_train, y_train, X_test, y_test, feature_names, outcome_name, categorical_variables)
explainer = Explainer(model, dataset)
X_train, y_train, X_test, y_test = dataset.get_data()
test_sample = X_test.loc[[i], :]  # i-th sample
explainer.explainability_whatelse(test_sample, method='cosine_simility')
explainer.explainability_whatelse(test_sample, method='euclidean_distince_simility')
explainer.explainability_whatelse(test_sample, method='proximity_simility')
explainer.explainability_whatelse(test_sample, method='feature_matching', matching_func=matching_func)

# for deep learning neural networks
from DeepExplainerDataset import DeepExplainerDataset
from DeepExplainer import DeepExplainer

dataset = DeepExplainerDataset(X_train, y_train, transform_func=lambda_transformer, test_X=X_test, test_y=y_test,
                               feature_names=feature_names, outcome_name=outcome_name, categorical_variables=categorical_variables)
explainer = DeepExplainer(model, dataset)
X_train, y_train, X_test, y_test = dataset.get_data()
test_sample = X_test.loc[[i], :]  # i-th sample
explainer.explainability_whatelse(test_sample, method='cosine_simility', embedding_func=embedding_func)
explainer.explainability_whatelse(test_sample, method='euclidean_distince_simility', embedding_func=embedding_func)
explainer.explainability_whatelse(test_sample, method='feature_matching', matching_func=matching_func)
```
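The exact signatures expected for matching_func and embedding_func are defined by the Explainer classes; the sketch below shows one plausible shape for each, with hypothetical feature names. Here matching_func returns a boolean mask of training rows whose principal features equal the test sample's, and embedding_func is a placeholder min-max scaling in place of a learned representation.

```python
import numpy as np
import pandas as pd

def matching_func(X_train, test_sample, principal_features=("sex", "asa_class")):
    """Hypothetical principal-feature matcher: True for training rows whose
    principal features equal the test sample's (signature is an assumption)."""
    mask = pd.Series(True, index=X_train.index)
    for col in principal_features:
        mask &= X_train[col] == test_sample[col].iloc[0]
    return mask

def embedding_func(X):
    """Placeholder embedding: min-max scale features to [0, 1] so distances
    are comparable; a real deep model would expose a learned hidden layer."""
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # guard against constant columns
    return (X - X.min(axis=0)) / span

X_train = pd.DataFrame({"sex": [0, 1, 1, 0],
                        "asa_class": [2, 3, 3, 2],
                        "age": [40, 70, 65, 55]})
test_sample = pd.DataFrame({"sex": [1], "asa_class": [3], "age": [68]})
mask = matching_func(X_train, test_sample)  # rows 1 and 2 match
```

Similar patients selected this way can then be summarized with their predicted risks and observed event ratios.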
MySurgeryRisk XAI provides an interface to generate a model card for an AI model. It provides a JSON template for the model card containing the following sections: 1. Model Details gives an overview of the model; 2. Model Parameters describes the model architecture, training and testing datasets, and training details; 3. Quantitative Analysis provides the evaluation results, enhanced with explainability and bias/fairness measurement results. It also provides utility functions that help generate the content of the JSON file, for example generating the distribution of the dataset, computing performance metrics, and producing bias/fairness assessment results from both the dataset and the model. For a concrete example, the code snippet below shows how to use these functions:
```python
from Model_card.model_metrics import compute_performance_metrics, plot_AUC, plot_AUPRC
from Model_card.outcome_distribution import generate_outcome_distribution
from Bias_detection.data_bias_detection import detect_dataset_bias
from Bias_detection.model_bias_detection import detect_model_bias

performance_metrics = compute_performance_metrics(y_test, y_pred)
plot_AUC(y_test, y_pred)
plot_AUPRC(y_test, y_pred)

test_data = pd.concat([X_test, y_test], axis=1)
summary = generate_outcome_distribution(test_data, outcome_name, sensitive_attributes)

test_data = pd.concat([X_test, y_test, y_pred], axis=1)
summary = detect_dataset_bias(test_data, outcome_name, sensitive_attributes, privileged_groups)
summary = detect_model_bias(test_data, outcome_name, prediction_name, sensitive_attributes, privileged_groups)
```
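To show how the three sections fit together, the skeleton below mirrors that structure as a Python dict; the field names and values are illustrative assumptions, so consult the shipped JSON template for the authoritative schema.

```python
import json

# Illustrative skeleton of a model-card content file
# (field names are assumptions; see the shipped JSON template).
model_card = {
    "model_details": {
        "name": "Prolonged MV risk model",
        "version": "1.0",
        "intended_use": "Postoperative risk screening; not a standalone clinical decision tool.",
    },
    "model_parameters": {
        "architecture": "XGBoost binary classifier",
        "training_data": "University of Florida Health surgical cohort, 2014-2020",
        "evaluation_data": "Held-out test split",
    },
    "quantitative_analysis": {
        "performance_metrics": {"AUROC": None, "AUPRC": None},  # fill via compute_performance_metrics
        "explainability": {},   # e.g. global feature importance results
        "bias_fairness": {},    # e.g. detect_dataset_bias / detect_model_bias output
    },
}
content_json = json.dumps(model_card, indent=2)  # serialize for explainability_how
```

The serialized JSON can then be written to disk and passed as content_json_path when rendering the HTML model card.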
