
app(flowerhub): add FedXGBoost for financial fraud detection implementation#6807

Open
eo4929 wants to merge 2 commits into flwrlabs:main from eo4929:FedXGBoost

Conversation


@eo4929 eo4929 commented Mar 20, 2026

Summary

  • Implement FedXGBoost algorithm
  • Add training and evaluation pipeline
  • Ensure compatibility with Flower framework

Validation

  • ./framework/dev/format.sh
  • ./framework/dev/test.sh

I followed the instructions in https://www.notion.so/flowerlabs/Guide-How-to-Publish-Apps-on-Flower-Hub-1d1d8ccd59cf8073a742da2c83ceb89b.

Copilot AI review requested due to automatic review settings March 20, 2026 16:12
Contributor

Copilot AI left a comment


Pull request overview

Adds a new Flower Hub app implementing a federated XGBoost-based workflow for financial fraud detection, including client/server apps, data utilities, and ensemble aggregation.

Changes:

  • Introduces a Flower ServerApp/ClientApp training + evaluation loop for “FedXGBBagging”.
  • Adds dataset preprocessing, partitioning utilities, and XGBoost model (de)serialization helpers.
  • Adds a large fed_xgb_bagging.py module implementing bagging and similarity-based aggregation utilities.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
examples/FinancialFraudDetection-app/pyproject.toml Declares the app package metadata, dependencies, and Flower app entrypoints/config.
examples/FinancialFraudDetection-app/frauddetection/task.py Implements preprocessing, data loading/partitioning, training, evaluation, and model serialization helpers.
examples/FinancialFraudDetection-app/frauddetection/server_app.py Implements the federated orchestration loop, collects per-round client models, builds an ensemble, and runs evaluation.
examples/FinancialFraudDetection-app/frauddetection/client_app.py Implements per-client local training and evaluation handlers and transmits serialized boosters.
examples/FinancialFraudDetection-app/frauddetection/fed_xgb_bagging.py Adds ensemble/bagging and similarity-based utilities used server-side for prediction/evaluation.
examples/FinancialFraudDetection-app/frauddetection/__init__.py Adds package marker and module docstring.


Comment on lines +5 to +19
[project]
name = "federated-fraud-detection"
version = "1.0.0"
description = "Federated Financial Fraud Detection with XGBoost and Flower"
license = "Apache-2.0"
dependencies = [
"flwr[simulation]>=1.26.1",
"xgboost>=2.0.0",
"scikit-learn>=1.3.0",
"pandas>=2.0.0",
"numpy>=1.24.0",
]

[tool.hatch.build.targets.wheel]
packages = ["."]

Copilot AI Mar 20, 2026


packages = ["."] is very likely to produce an incorrect wheel (it may include unintended files and/or fail to package frauddetection as an importable package). Define the actual package directory (e.g., frauddetection) via Hatch's package selection, and consider adding requires-python, since the code uses modern typing syntax (e.g., dict | None in task.py) that requires Python 3.10+.

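A minimal sketch of the suggested pyproject.toml changes, assuming the package directory is named frauddetection as the file table above indicates:

```toml
[project]
name = "federated-fraud-detection"
# `dict | None` syntax in task.py needs Python 3.10 or newer.
requires-python = ">=3.10"

[tool.hatch.build.targets.wheel]
# Ship the actual package directory rather than the repo root,
# so `import frauddetection` works from the installed wheel.
packages = ["frauddetection"]
```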
@@ -0,0 +1,191 @@
"""frauddetection: XGBoost model training and data utilities."""

import json

Copilot AI Mar 20, 2026


json is imported but not used in this module. Removing it avoids dead imports and keeps the module lint-clean.

Suggested change
import json

Comment on lines +83 to +94
df = pd.read_csv(data_csv)
df = df.sample(frac=1, random_state=42).reset_index(drop=True)

n = len(df)
size = n // num_partitions
start = partition_id * size
end = start + size if partition_id < num_partitions - 1 else n
partition = df.iloc[start:end].reset_index(drop=True)

X, y = preprocess_df(partition)
X_train, X_test, y_train, y_test = _split(X, y)
return X_train, X_test, y_train, y_test

Copilot AI Mar 20, 2026


If len(df) < num_partitions, then size becomes 0 and most partitions will be empty (partition has 0 rows). That will cause train_test_split to throw (no samples) or produce invalid behavior. Consider handling size == 0 explicitly (e.g., cap num_partitions to n, or compute split indices using np.array_split, or raise a clear error).

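One way to handle the size == 0 case, sketched with np.array_split (the helper name is hypothetical, not from the PR):

```python
import numpy as np

def partition_indices(n_rows: int, num_partitions: int, partition_id: int) -> np.ndarray:
    # Fail fast with a clear error instead of silently producing empty partitions.
    if n_rows < num_partitions:
        raise ValueError(
            f"Cannot split {n_rows} rows into {num_partitions} non-empty partitions"
        )
    # np.array_split spreads the remainder over the first partitions,
    # so every partition receives at least one row.
    return np.array_split(np.arange(n_rows), num_partitions)[partition_id]
```

The returned index array can then be passed to df.iloc to select the partition.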
Comment on lines +145 to +153
# Send the *first* collected model to ``fraction_evaluate`` of clients
# as a representative model so they can report per-partition metrics.
n_eval = max(1, int(fraction_evaluate * n_clients))
eval_node_ids = node_ids[:n_eval]

# Load the first model as the "representative" model for evaluation
with open(all_model_paths[0], "rb") as f:
rep_model_bytes = f.read()
rep_model_array = np.frombuffer(rep_model_bytes, dtype=np.uint8).copy()

Copilot AI Mar 20, 2026


This will crash with IndexError when all_model_paths is empty (e.g., n_clients == 0, no replies, or if model collection fails earlier). Add an explicit guard before indexing and either skip evaluation or raise a clear runtime error when no models were collected.

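A sketch of such a guard (the function name is hypothetical):

```python
def load_representative_model(all_model_paths: list) -> bytes:
    # Guard against an empty collection (zero clients, no replies, or an
    # earlier collection failure) before indexing the first path.
    if not all_model_paths:
        raise RuntimeError(
            "No client models were collected this round; cannot run evaluation"
        )
    with open(all_model_paths[0], "rb") as f:
        return f.read()
```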
print("\n[Server] Central evaluation on held-out server data …")
X_test, y_test = _load_central_test(csv_path)
ensemble.test_data = X_test
y_pred, y_prob = ensemble.predict(X_test, y_true=y_test)

Copilot AI Mar 20, 2026


Passing y_true=y_test into ensemble.predict (which auto-selects a best-F1 threshold) tunes the decision threshold on the same held-out set you then report metrics on, inflating results (data leakage). For reporting central test metrics, use a fixed threshold (e.g., 0.5) or select the threshold on a separate validation split.

Suggested change
y_pred, y_prob = ensemble.predict(X_test, y_true=y_test)
y_pred, y_prob = ensemble.predict(X_test)

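For reporting, thresholding the predicted probabilities at a fixed cutoff avoids tuning on the test labels. A minimal sketch (function name hypothetical; assumes the ensemble can return probabilities without labels):

```python
import numpy as np

def hard_predictions(y_prob: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    # A fixed threshold keeps the reported test metrics honest: the
    # decision rule is chosen without looking at the test labels.
    return (y_prob >= threshold).astype(int)
```

If a tuned threshold is desired, it should be selected on a separate validation split and only then applied to the held-out test set.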
Comment on lines +360 to +367
self.config = config
self.result_path = result_path
self.history = []
self.test_data = None

self.models = [self._load_model(path) for path in model_paths]

logger.info(f"Initialized EachBankModel for bank_name_round_number={self.config['bank_name_round_number']}")

Copilot AI Mar 20, 2026


config defaults to None, but the constructor immediately subscripts it (self.config['bank_name_round_number']), which will raise TypeError when config is None and KeyError when the key is missing. Make config always a dict (e.g., self.config = config or {}) and use .get(...) with defaults in logs/filenames to keep the class usable as a general utility.

Suggested change
self.config = config
self.result_path = result_path
self.history = []
self.test_data = None
self.models = [self._load_model(path) for path in model_paths]
logger.info(f"Initialized EachBankModel for bank_name_round_number={self.config['bank_name_round_number']}")
# Ensure config is always a dictionary to avoid TypeError/KeyError when accessing it.
self.config = config or {}
self.result_path = result_path
self.history = []
self.test_data = None
self.models = [self._load_model(path) for path in model_paths]
bank_name_round_number = self.config.get("bank_name_round_number", "unknown")
logger.info(f"Initialized EachBankModel for bank_name_round_number={bank_name_round_number}")

Comment on lines +599 to +624
def save_metrics_history(self, metrics, detection_summary, test_time=None, output_path=None):
logger.info("Saving metrics and detection summary to history")

def convert_to_serializable(obj):
if isinstance(obj, (np.generic, np.int64, np.float32)):
return obj.item()
elif isinstance(obj, np.ndarray):
return obj.tolist()
elif isinstance(obj, pd.Series):
return obj.to_dict()
elif isinstance(obj, pd.DataFrame):
return obj.to_dict(orient="index")
return obj

def flatten_grouped_stats(grouped_stats_df: pd.DataFrame) -> dict:
result = {}
for group in grouped_stats_df.index:
stats_dict = {}
for (feature, stat), value in grouped_stats_df.loc[group].items():
stats_dict[f"{feature}__{stat}"] = value
result[group] = stats_dict
return result

metrics_path = output_path / "metrics.json"
with open(metrics_path, "w") as f:
json.dump(metrics, f, indent=4, default=convert_to_serializable)

Copilot AI Mar 20, 2026


output_path defaults to None but is treated as a Path (output_path / "metrics.json"), which will throw at runtime if the caller doesn't pass it. Either make output_path a required Path parameter (preferred), or add a guard/auto-create a default directory before using it.

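The guard variant could look like this sketch (helper name and the "results" default are assumptions, not from the PR):

```python
from pathlib import Path

def resolve_output_path(output_path=None) -> Path:
    # Fall back to a default directory rather than crashing on
    # `None / "metrics.json"`, and create it before any writes.
    path = Path(output_path) if output_path is not None else Path("results")
    path.mkdir(parents=True, exist_ok=True)
    return path
```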

def _metadata_similarity(meta_i: Dict, meta_j: Dict) -> float:

score_parts: List[Tuple[str, float, float]] = [] # (이름, 점수, 가중치)

Copilot AI Mar 20, 2026


There are multiple comments/log lines in this module that appear to be mojibake (garbled encoding) and/or non-English in a way that’s not readable in the repository’s context. Please normalize these to clear, UTF-8, developer-facing English comments/log messages so future maintainers can understand intent (and to avoid tooling/display issues).

Suggested change
score_parts: List[Tuple[str, float, float]] = [] # (이름, 점수, 가중치)
score_parts: List[Tuple[str, float, float]] = [] # (name, score, weight)

Comment on lines +3 to +4
import os


Copilot AI Mar 20, 2026


os is imported but not used in this file. Remove it to keep the module clean and avoid lint failures.

Suggested change
import os

github-actions bot added the Contributor label (used to determine what PRs (mainly) come from external contributors) Mar 20, 2026