app(flowerhub): Fed-OMOP Hospital readmission prediction on MIMIC-IV #6780
manjahdani wants to merge 5 commits into flwrlabs:main
Pull request overview
Adds a new fed-omop FlowerHub example for MIMIC-IV (v2.2) readmission prediction, including a preprocessing pipeline to generate tabular datasets, a Flower ServerApp/ClientApp implementation, and supporting docs/visualization.
Changes:
- Introduce a MIMIC-IV preprocessing pipeline (extraction → feature processing → dataset generation → dataset build).
- Add Flower ServerApp/ClientApp and model/dataset utilities for federated training/evaluation.
- Add documentation and a results plotting script.
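
A checkpointed runner for the four preprocessing stages might look like the sketch below. The stage names come from the summary above; `run_pipeline`, the step callables, and the JSON checkpoint file are hypothetical and are not taken from the PR.

```python
import json
import tempfile
from pathlib import Path

# Stage names follow the PR summary; the checkpoint file layout is hypothetical.
STEPS = ["extraction", "feature_processing", "dataset_generation", "dataset_build"]


def run_pipeline(step_fns, checkpoint_path):
    """Run stages in order, skipping any that an earlier run already completed."""
    path = Path(checkpoint_path)
    done = json.loads(path.read_text()) if path.exists() else []
    executed = []
    for name in STEPS:
        if name in done:
            continue  # finished in an earlier run
        step_fns[name]()
        done.append(name)
        path.write_text(json.dumps(done))  # persist progress after every stage
        executed.append(name)
    return executed


with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "checkpoint.json"
    calls = []
    # Placeholder stages that only record that they were called.
    fns = {name: (lambda n=name: calls.append(n)) for name in STEPS}
    first_run = run_pipeline(fns, ckpt)
    second_run = run_pipeline(fns, ckpt)  # nothing left to do
```

Persisting the checkpoint after each stage means an interrupted run resumes at the first unfinished stage rather than repeating the extraction.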
Reviewed changes
Copilot reviewed 26 out of 32 changed files in this pull request and generated 24 comments.
| File | Description |
|---|---|
| examples/fed-omop/pyproject.toml | Declares the example package and Flower app config/runtime dependencies. |
| examples/fed-omop/result_visualization.py | Plots per-round AUROC/AUPRC/accuracy from saved JSON results. |
| examples/fed-omop/fedomop/server_app.py | Server-side orchestration (FedAvg), centralized evaluation, persistence, plotting. |
| examples/fed-omop/fedomop/client_app.py | Client-side training/evaluation loop. |
| examples/fed-omop/fedomop/dataset.py | Loads generated CSVs and creates partitioned train/val/test dataloaders. |
| examples/fed-omop/fedomop/model.py | ResMLP model plus train/test metric computation. |
| examples/fed-omop/fedomop/task_utils.py | Dataset/model registry, seeding, metric aggregation helpers. |
| examples/fed-omop/fedomop/log_utils.py | Result JSON initialization and metric persistence. |
| examples/fed-omop/preprocess_MIMIC/** | End-to-end dataset generation pipeline (config, utils, and steps). |
| examples/fed-omop/docs/mimiciv.md | Detailed documentation for the MIMIC-IV preprocessing pipeline. |
| examples/fed-omop/README.md | Top-level instructions for dataset generation and running simulations/deployments. |
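
The metric aggregation helpers listed for `task_utils.py` are not shown in this review. One common approach, sketched here purely as an assumption, is an example-weighted average of per-client metrics. The function name `weighted_average` and the `(num_examples, metrics)` input shape are illustrative and not the PR's code.

```python
def weighted_average(results):
    """Average each metric across clients, weighted by example count.

    `results` is a list of (num_examples, metrics_dict) pairs; this input
    shape is an assumption made for illustration.
    """
    total = sum(n for n, _ in results)
    if total == 0:
        raise ValueError("no examples to aggregate")
    keys = results[0][1].keys()
    return {k: sum(n * m[k] for n, m in results) / total for k in keys}


client_results = [
    (100, {"auroc": 0.70, "accuracy": 0.80}),
    (300, {"auroc": 0.80, "accuracy": 0.90}),
]
aggregated = weighted_average(client_results)
```

Averaging per-client AUROC values only approximates the AUROC over the pooled data, so centralized evaluation remains the more reliable number when it is available.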
    tab_name = str(uuid1())
    pd.DataFrame(list_rows_lab).to_csv(f"{tab_name}.csv")

    def save_to_json(self, path: str | None = None):
        """Saves this config at `path` if provided, else in the same place as `self.out_path`"""
        if path is None:
            path, _ = splitext(self.out_path)

        super().save_to_json(path + ".json")

    def load_from_json(self, path: str | None = None):
        """Loads this config from `path` if provided, else from the same place as `self.out_path`"""
        if path is None:
            path, _ = splitext(self.out_path)

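
The save/load pair quoted above derives its default JSON path from `out_path` by swapping the extension. A self-contained sketch of that round trip follows. `PreprocConfig`, its fields, and the `_json_path` helper are hypothetical stand-ins, since the PR's actual config class and its base class are not shown here.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from os.path import join, splitext


@dataclass
class PreprocConfig:
    """Hypothetical stand-in for the PR's config class; field names are assumed."""

    out_path: str
    use_icu: bool = False

    def _json_path(self, path=None):
        # Default to out_path with its extension swapped for ".json".
        base = splitext(self.out_path)[0] if path is None else path
        return base + ".json"

    def save_to_json(self, path=None):
        with open(self._json_path(path), "w") as fh:
            json.dump(asdict(self), fh)

    def load_from_json(self, path=None):
        with open(self._json_path(path)) as fh:
            for key, value in json.load(fh).items():
                setattr(self, key, value)


with tempfile.TemporaryDirectory() as tmp:
    cfg = PreprocConfig(out_path=join(tmp, "cohort.csv"), use_icu=True)
    cfg.save_to_json()  # writes <tmp>/cohort.json
    restored = PreprocConfig(out_path=cfg.out_path)
    restored.load_from_json()
```

As in the quoted snippet, an explicit `path` also gets `.json` appended, so callers would pass the path without an extension.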
    def read_d_labitems_table(mimic4_path):
        labitems = dataframe_from_csv(os.path.join(mimic4_path, 'hosp/d_labitems.csv.gz'), chunksize=1000)
        labitems.reset_index(inplace=True)
        return labitems[['itemid', 'label', 'category', 'lonic_code']]

    # Define anchor_year corresponding to the anchor_year_group 2017-2019. This is later used to prevent consideration
    # of visits with prediction windows outside the dataset's time range (2008-2019)
    #[[group_col, visit_col, admit_col, disch_col]]
    if use_ICU:

    def read_labs(mimic4_path):
        labevents = read_labevents_table(mimic4_path)
        labitems = read_d_labitems_table(mimic4_path)
        return labevents(mimic4_path).merge(

    if df_cohort.empty:
        df_cohort=chunk_merged
    else:
        pd.concat(

    stat_all = stat_all.set_index('stay_id')
    demo_all = demo_all.set_index('stay_id')

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Pull request overview
Adds a new examples/fed-omop Flower example implementing a MIMIC-IV (v2.2) preprocessing pipeline and a federated readmission prediction workflow (server/client apps, dataset loader, and result plotting) intended for Flower simulation/deployment.
Changes:
- Introduces a full MIMIC-IV preprocessing + dataset building pipeline under `preprocess_MIMIC/`.
- Adds a Flower `ServerApp`/`ClientApp` plus tabular `ResMLP` model and dataset utilities under `fedomop/`.
- Adds documentation for running the preprocessing and FL experiments, plus plotting utilities.
Reviewed changes
Copilot reviewed 26 out of 32 changed files in this pull request and generated 20 comments.
| File | Description |
|---|---|
| examples/fed-omop/result_visualization.py | Plot metrics from saved JSON results and export figure |
| examples/fed-omop/pyproject.toml | Packaging + dependencies + Flower app configuration for the example |
| examples/fed-omop/preprocess_MIMIC/utils/uom_conversion.py | Utility to drop rows with infrequent units of measure per item |
| examples/fed-omop/preprocess_MIMIC/utils/readme.md | Brief description of preprocessing utility files |
| examples/fed-omop/preprocess_MIMIC/utils/outlier_removal.py | Outlier handling helper functions |
| examples/fed-omop/preprocess_MIMIC/utils/labs_preprocess_util.py | Imputation of missing hospital admission IDs for lab events |
| examples/fed-omop/preprocess_MIMIC/utils/icu_preprocess_util.py | ICU-specific preprocessing utilities (events/ICD pivoting, etc.) |
| examples/fed-omop/preprocess_MIMIC/utils/hosp_preprocess_util.py | Hospital (non-ICU) preprocessing utilities (labs/meds/proc/ICD) |
| examples/fed-omop/preprocess_MIMIC/utils/config.py | Dataclass-based preprocessing configuration + JSON save/load |
| examples/fed-omop/preprocess_MIMIC/steps/feature_selection.py | Feature extraction/cleaning/selection step orchestration |
| examples/fed-omop/preprocess_MIMIC/steps/extraction.py | Cohort extraction and labeling (mortality/readmission/LOS) |
| examples/fed-omop/preprocess_MIMIC/steps/disease_cohort.py | Disease-cohort extraction based on ICD mappings |
| examples/fed-omop/preprocess_MIMIC/steps/data_generation_icu.py | ICU time-series generation + CSV/dict outputs |
| examples/fed-omop/preprocess_MIMIC/steps/data_generation.py | Non-ICU time-series generation + CSV/dict outputs |
| examples/fed-omop/preprocess_MIMIC/steps/build_dataset.py | Consolidate generated CSVs into final X/Y outputs |
| examples/fed-omop/preprocess_MIMIC/generate_dataset.py | Main pipeline runner with checkpoints and CLI handling |
| examples/fed-omop/preprocess_MIMIC/config.json | Example configuration file for preprocessing run |
| examples/fed-omop/fedomop/task_utils.py | Dataset/model registry, seeding, and metric aggregation utilities |
| examples/fed-omop/fedomop/server_app.py | Flower ServerApp entrypoint + centralized evaluation + plotting |
| examples/fed-omop/fedomop/model.py | ResMLP model and train/test loops for tabular binary classification |
| examples/fed-omop/fedomop/log_utils.py | Result JSON initialization + per-round metric persistence |
| examples/fed-omop/fedomop/helpers.py | Save/load model components into Flower state helpers |
| examples/fed-omop/fedomop/dataset.py | Load/preprocess CSV dataset and provide partitioned DataLoaders |
| examples/fed-omop/fedomop/client_app.py | Flower ClientApp train/evaluate handlers |
| examples/fed-omop/fedomop/__init__.py | Package marker |
| examples/fed-omop/docs/mimiciv.md | Detailed documentation of the preprocessing pipeline and outputs |
| examples/fed-omop/README.md | Example-level README: install, generate dataset, run simulation/deployment |
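
`uom_conversion.py` is described above as dropping rows with infrequent units of measure per item. The PR's implementation is not quoted in this review, so the following is only a stdlib sketch of that idea. The function name, the `itemid`/`uom` keys, and the share threshold are assumptions.

```python
from collections import Counter


def drop_infrequent_uom(rows, min_share=0.05):
    """Keep rows whose (itemid, uom) pair covers at least `min_share` of that item's rows.

    `rows` is a list of dicts with "itemid" and "uom" keys; the keys and the
    default threshold are assumptions, not values taken from the PR.
    """
    per_item = Counter(r["itemid"] for r in rows)
    per_pair = Counter((r["itemid"], r["uom"]) for r in rows)
    return [
        r
        for r in rows
        if per_pair[(r["itemid"], r["uom"])] / per_item[r["itemid"]] >= min_share
    ]


# Item 1: 19 rows in mg/dL and 1 row in g/L. Item 2: a single row.
rows = [{"itemid": 1, "uom": "mg/dL"}] * 19 + [{"itemid": 1, "uom": "g/L"}]
rows += [{"itemid": 2, "uom": "mmol/L"}]
kept = drop_infrequent_uom(rows, min_share=0.10)
```

Filtering per item rather than globally keeps a unit that is rare overall but dominant for its own lab item.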
    cols_t = [x + "_"+str(t) for x in cols]

    concat_cols.extend(cols_t)
    X , Y = getXY_consolidated(hids, labels , concat_cols , concat , use_ICU)

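
The quoted lines build one column name per feature and time step by suffixing the step index. A standalone sketch of that naming scheme is shown here. The enclosing loop over `t` is not visible in the snippet, so the loop, the function name, and the feature names `hr` and `sbp` are assumptions.

```python
def flatten_column_names(cols, n_steps):
    """Return one name per (time step, feature), grouped by time step."""
    concat_cols = []
    for t in range(n_steps):
        # Same "<feature>_<t>" suffixing as the quoted list comprehension.
        concat_cols.extend(f"{x}_{t}" for x in cols)
    return concat_cols


names = flatten_column_names(["hr", "sbp"], n_steps=3)
```

With this ordering, all features for step 0 come first, then step 1, and so on, which fixes the column layout of the flattened tabular matrix.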
    def read_labs(mimic4_path):
        labevents = read_labevents_table(mimic4_path)
        labitems = read_d_labitems_table(mimic4_path)
        return labevents(mimic4_path).merge(

    if df_cohort.empty:
        df_cohort=chunk_merged
    else:
        pd.concat(

    stat_all = stat_all.set_index('stay_id')
    demo_all = demo_all.set_index('stay_id')

    tab_name = str(uuid1())
    pd.DataFrame(list_rows_lab).to_csv(f"{tab_name}.csv")

yan-gao-GY left a comment
@manjahdani thanks for creating this PR! Just a quick note that Flower Hub currently doesn’t support .txt, .csv, and .png files. I noticed these files in your repo—would the app still run correctly if we exclude them from the upload?