app(flowerhub): Fed-OMOP Hospital readmission prediction on MIMIC-IV#6780

Open
manjahdani wants to merge 5 commits into flwrlabs:main from manjahdani:fedomop

@manjahdani manjahdani commented Mar 17, 2026

Issue

Description

Related issues/PRs

Proposal

Explanation

Checklist

  • Implement proposed change
  • Write tests
  • Update documentation
  • Make CI checks pass
  • Ping maintainers on Slack (channel #contributions)

Any other comments?

Copilot AI left a comment

Pull request overview

Adds a new fed-omop FlowerHub example for MIMIC-IV (v2.2) readmission prediction, including a preprocessing pipeline to generate tabular datasets, a Flower ServerApp/ClientApp implementation, and supporting docs/visualization.

Changes:

  • Introduce a MIMIC-IV preprocessing pipeline (extraction → feature processing → dataset generation → dataset build).
  • Add Flower ServerApp/ClientApp and model/dataset utilities for federated training/evaluation.
  • Add documentation and a results plotting script.

Reviewed changes

Copilot reviewed 26 out of 32 changed files in this pull request and generated 24 comments.

| File | Description |
| --- | --- |
| examples/fed-omop/pyproject.toml | Declares the example package and Flower app config/runtime dependencies. |
| examples/fed-omop/result_visualization.py | Plots per-round AUROC/AUPRC/accuracy from saved JSON results. |
| examples/fed-omop/fedomop/server_app.py | Server-side orchestration (FedAvg), centralized evaluation, persistence, plotting. |
| examples/fed-omop/fedomop/client_app.py | Client-side training/evaluation loop. |
| examples/fed-omop/fedomop/dataset.py | Loads generated CSVs and creates partitioned train/val/test dataloaders. |
| examples/fed-omop/fedomop/model.py | ResMLP model plus train/test metric computation. |
| examples/fed-omop/fedomop/task_utils.py | Dataset/model registry, seeding, metric aggregation helpers. |
| examples/fed-omop/fedomop/log_utils.py | Result JSON initialization and metric persistence. |
| examples/fed-omop/preprocess_MIMIC/** | End-to-end dataset generation pipeline (config, utils, and steps). |
| examples/fed-omop/docs/mimiciv.md | Detailed documentation for the MIMIC-IV preprocessing pipeline. |
| examples/fed-omop/README.md | Top-level instructions for dataset generation and running simulations/deployments. |
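
The summary lists "metric aggregation helpers" in task_utils.py, but their implementation is not shown in this conversation. As a rough illustration only, one common way to aggregate per-client evaluation metrics is an example-weighted mean. The function name `weighted_average` and the `(num_examples, metrics)` input shape below are assumptions made for this sketch, not the PR's actual API.

```python
from typing import Dict, List, Tuple


def weighted_average(results: List[Tuple[int, Dict[str, float]]]) -> Dict[str, float]:
    """Average each metric across clients, weighted by each client's example count.

    `results` holds one (num_examples, metrics) pair per client. Every client is
    assumed to report the same metric keys as the first client (hypothetical contract).
    """
    total = sum(num_examples for num_examples, _ in results)
    if total == 0:
        # No clients or no examples: nothing meaningful to aggregate.
        return {}
    aggregated = {}
    for key in results[0][1]:
        aggregated[key] = sum(n * metrics[key] for n, metrics in results) / total
    return aggregated


if __name__ == "__main__":
    # Two hypothetical clients with different dataset sizes.
    print(weighted_average([(100, {"auroc": 0.80}), (300, {"auroc": 0.70})]))
```

Weighting by example count keeps a small site from moving the aggregate as much as a large one, which matters when partitions are uneven.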

Comment on lines +83 to +85
tab_name = str(uuid1())
pd.DataFrame(list_rows_lab).to_csv(f"{tab_name}.csv")

Comment on lines +147 to +158
def save_to_json(self, path: str | None = None):
    """Saves this config at `path` if provided, else in the same place as `self.out_path`"""
    if path is None:
        path, _ = splitext(self.out_path)

    super().save_to_json(path + ".json")

def load_from_json(self, path: str | None = None):
    """Loads this config from `path` if provided, else from the same place as `self.out_path`"""
    if path is None:
        path, _ = splitext(self.out_path)

def read_d_labitems_table(mimic4_path):
    labitems = dataframe_from_csv(os.path.join(mimic4_path, 'hosp/d_labitems.csv.gz'), chunksize=1000)
    labitems.reset_index(inplace=True)
    return labitems[['itemid', 'label', 'category', 'lonic_code']]

# Define anchor_year corresponding to the anchor_year_group 2017-2019. This is later used to prevent consideration
# of visits with prediction windows outside the dataset's time range (2008-2019)

#[[group_col, visit_col, admit_col, disch_col]]
if use_ICU:

def read_labs(mimic4_path):
    labevents = read_labevents_table(mimic4_path)
    labitems = read_d_labitems_table(mimic4_path)
    return labevents(mimic4_path).merge(

if df_cohort.empty:
    df_cohort=chunk_merged
else:
    pd.concat(
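
Two of the quoted fragments are cut off mid-call, so their full arguments are unknown. In `read_labs`, `labevents` is already bound to the value returned by `read_labevents_table`, so calling it again as `labevents(mimic4_path)` would fail if that value is a DataFrame. The sketch below assumes both readers return pandas DataFrames that share an `itemid` column, and it shows a list-then-concat alternative to growing `df_cohort` chunk by chunk. The helper names `merge_labs` and `accumulate_chunks` are hypothetical and do not come from the PR.

```python
import pandas as pd


def merge_labs(labevents: pd.DataFrame, labitems: pd.DataFrame) -> pd.DataFrame:
    """Join lab events with their item descriptions.

    Assumes both frames carry an `itemid` column (an assumption; the original
    merge arguments are truncated in the review snippet).
    """
    # `labevents` is a DataFrame here, so merge on it directly instead of calling it.
    return labevents.merge(labitems, on="itemid", how="inner")


def accumulate_chunks(chunks) -> pd.DataFrame:
    """Collect non-empty chunks and concatenate them once at the end."""
    parts = [chunk for chunk in chunks if not chunk.empty]
    if not parts:
        return pd.DataFrame()
    # A single concat avoids repeatedly copying the growing frame and removes
    # the need for a special case on the first chunk.
    return pd.concat(parts, ignore_index=True)


if __name__ == "__main__":
    events = pd.DataFrame({"itemid": [1, 2, 3], "valuenum": [0.5, 1.5, 2.5]})
    items = pd.DataFrame({"itemid": [1, 2], "label": ["a", "b"]})
    print(merge_labs(events, items))
    print(accumulate_chunks([events.iloc[:2], events.iloc[2:]]))
```

Note that `pd.concat` returns a new frame, so its result has to be assigned; a bare `pd.concat(...)` call discards the combined data.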
Comment on lines +155 to +156
stat_all = stat_all.set_index('stay_id')
demo_all = demo_all.set_index('stay_id')
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI left a comment

Pull request overview

Adds a new examples/fed-omop Flower example implementing a MIMIC-IV (v2.2) preprocessing pipeline and a federated readmission prediction workflow (server/client apps, dataset loader, and result plotting) intended for Flower simulation/deployment.

Changes:

  • Introduces a full MIMIC-IV preprocessing + dataset building pipeline under preprocess_MIMIC/.
  • Adds a Flower ServerApp/ClientApp plus tabular ResMLP model and dataset utilities under fedomop/.
  • Adds documentation for running the preprocessing and FL experiments, plus plotting utilities.

Reviewed changes

Copilot reviewed 26 out of 32 changed files in this pull request and generated 20 comments.

| File | Description |
| --- | --- |
| examples/fed-omop/result_visualization.py | Plot metrics from saved JSON results and export figure |
| examples/fed-omop/pyproject.toml | Packaging + dependencies + Flower app configuration for the example |
| examples/fed-omop/preprocess_MIMIC/utils/uom_conversion.py | Utility to drop rows with infrequent units of measure per item |
| examples/fed-omop/preprocess_MIMIC/utils/readme.md | Brief description of preprocessing utility files |
| examples/fed-omop/preprocess_MIMIC/utils/outlier_removal.py | Outlier handling helper functions |
| examples/fed-omop/preprocess_MIMIC/utils/labs_preprocess_util.py | Imputation of missing hospital admission IDs for lab events |
| examples/fed-omop/preprocess_MIMIC/utils/icu_preprocess_util.py | ICU-specific preprocessing utilities (events/ICD pivoting, etc.) |
| examples/fed-omop/preprocess_MIMIC/utils/hosp_preprocess_util.py | Hospital (non-ICU) preprocessing utilities (labs/meds/proc/ICD) |
| examples/fed-omop/preprocess_MIMIC/utils/config.py | Dataclass-based preprocessing configuration + JSON save/load |
| examples/fed-omop/preprocess_MIMIC/steps/feature_selection.py | Feature extraction/cleaning/selection step orchestration |
| examples/fed-omop/preprocess_MIMIC/steps/extraction.py | Cohort extraction and labeling (mortality/readmission/LOS) |
| examples/fed-omop/preprocess_MIMIC/steps/disease_cohort.py | Disease-cohort extraction based on ICD mappings |
| examples/fed-omop/preprocess_MIMIC/steps/data_generation_icu.py | ICU time-series generation + CSV/dict outputs |
| examples/fed-omop/preprocess_MIMIC/steps/data_generation.py | Non-ICU time-series generation + CSV/dict outputs |
| examples/fed-omop/preprocess_MIMIC/steps/build_dataset.py | Consolidate generated CSVs into final X/Y outputs |
| examples/fed-omop/preprocess_MIMIC/generate_dataset.py | Main pipeline runner with checkpoints and CLI handling |
| examples/fed-omop/preprocess_MIMIC/config.json | Example configuration file for preprocessing run |
| examples/fed-omop/fedomop/task_utils.py | Dataset/model registry, seeding, and metric aggregation utilities |
| examples/fed-omop/fedomop/server_app.py | Flower ServerApp entrypoint + centralized evaluation + plotting |
| examples/fed-omop/fedomop/model.py | ResMLP model and train/test loops for tabular binary classification |
| examples/fed-omop/fedomop/log_utils.py | Result JSON initialization + per-round metric persistence |
| examples/fed-omop/fedomop/helpers.py | Save/load model components into Flower state helpers |
| examples/fed-omop/fedomop/dataset.py | Load/preprocess CSV dataset and provide partitioned DataLoaders |
| examples/fed-omop/fedomop/client_app.py | Flower ClientApp train/evaluate handlers |
| examples/fed-omop/fedomop/__init__.py | Package marker |
| examples/fed-omop/docs/mimiciv.md | Detailed documentation of the preprocessing pipeline and outputs |
| examples/fed-omop/README.md | Example-level README: install, generate dataset, run simulation/deployment |

cols_t = [x + "_"+str(t) for x in cols]

concat_cols.extend(cols_t)
X , Y = getXY_consolidated(hids, labels , concat_cols , concat , use_ICU)
def read_labs(mimic4_path):
    labevents = read_labevents_table(mimic4_path)
    labitems = read_d_labitems_table(mimic4_path)
    return labevents(mimic4_path).merge(

if df_cohort.empty:
    df_cohort=chunk_merged
else:
    pd.concat(
Comment on lines +155 to +156
stat_all = stat_all.set_index('stay_id')
demo_all = demo_all.set_index('stay_id')
Comment on lines +83 to +85
tab_name = str(uuid1())
pd.DataFrame(list_rows_lab).to_csv(f"{tab_name}.csv")

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@github-actions github-actions bot added the Contributor label (used to determine what PRs (mainly) come from external contributors) Mar 17, 2026
@yan-gao-GY yan-gao-GY left a comment

@manjahdani thanks for creating this PR! Just a quick note that Flower Hub currently doesn’t support .txt, .csv, and .png files. I noticed these files in your repo—would the app still run correctly if we exclude them from the upload?
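
To answer the question above, it can help to list which files in the app directory would be dropped. The sketch below walks a directory and reports files with the three extensions named in the comment. The function name `find_unsupported_files` is hypothetical, and the extension set is taken only from this comment, not from any official Flower Hub specification.

```python
from pathlib import Path

# Extensions named in the review comment above; treat this set as an assumption.
UNSUPPORTED_SUFFIXES = {".txt", ".csv", ".png"}


def find_unsupported_files(app_dir):
    """Return sorted POSIX-style relative paths of files with an unsupported suffix."""
    root = Path(app_dir)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in UNSUPPORTED_SUFFIXES
    )


if __name__ == "__main__":
    # Example: scan the current directory and print each match on its own line.
    for name in find_unsupported_files("."):
        print(name)
```

If any listed file is read at runtime (for example a CSV loaded by the dataset module), the app would need another way to obtain it before those files are excluded from the upload.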
