Conversation
Pull Request Overview
This PR enables logging of evaluation metrics to Weights & Biases by propagating a new `report_to_wandb` flag and formatting results into a `wandb.Table`.
- Adds a `report_to_wandb` parameter through the evaluation and training pipeline.
- Implements a `to_wandb_table` utility to convert metric maps into table format.
- Updates the Grobid tagger CLI to accept and forward the `report_to_wandb` flag.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `delft/sequenceLabelling/wrapper.py` | Propagate `report_to_wandb` into evaluation and log tables to wandb |
| `delft/sequenceLabelling/trainer.py` | Rename wandb flag, implement `report_to_wandb` in `Scorer`, add helper |
| `delft/applications/grobidTagger.py` | Update `eval_` signature and forwarding of the `report_to_wandb` flag |
Comments suppressed due to low confidence (3)
`delft/sequenceLabelling/trainer.py:33`

- The constructor parameter is now `report_to_wandb`, but the instance attribute remains `self.enable_wandb`. Rename the attribute to `self.report_to_wandb` (or the parameter back to `enable_wandb`) for consistency.

```python
report_to_wandb = False
```
`delft/sequenceLabelling/trainer.py:497`

- This new utility function isn't covered by tests. Consider adding unit tests for `to_wandb_table` to verify correct table structure for various input maps.

```python
def to_wandb_table(report_as_map):
```
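A test along the suggested lines could pin down the row layout and rounding. The sketch below exercises a minimal pure-Python row builder mirroring the macro-row construction shown later in this review; the helper name and the input map shape are assumptions, since the real `to_wandb_table` returns a `wandb.Table` and its full signature isn't visible here.

```python
def build_macro_row(report_as_map):
    # Hypothetical helper mirroring the macro-row construction in the
    # review's diff; the real code builds this inside to_wandb_table.
    metrics = report_as_map["macro"]
    return [
        "all (macro avg.)",
        round(metrics["precision"], 4),
        round(metrics["recall"], 4),
        round(metrics["f1"], 4),
        int(metrics["support"]),
    ]

def test_macro_row():
    # Support may arrive as a float; precision/recall/f1 get 4 decimals.
    report = {"macro": {"precision": 0.91237, "recall": 0.8,
                        "f1": 0.853333, "support": 42.0}}
    assert build_macro_row(report) == \
        ["all (macro avg.)", 0.9124, 0.8, 0.8533, 42]

test_macro_row()
```

A real test for `to_wandb_table` itself would additionally need wandb installed (or `wandb.Table` mocked) to inspect the resulting table's columns and data.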
`delft/applications/grobidTagger.py:521`

- Passing the `wandb` module here will lead to a `NameError` and is semantically incorrect; this parameter expects a boolean. Replace `report_to_wandb=wandb` with `report_to_wandb=True` (or the appropriate flag) and ensure you `import wandb` if needed.

```python
report_to_wandb=wandb
```
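A minimal sketch of the fix, assuming an argparse-style boolean CLI flag (the actual grobidTagger argument name and the downstream call are hypothetical):

```python
import argparse

# Hypothetical CLI flag; the real grobidTagger option name may differ.
parser = argparse.ArgumentParser()
parser.add_argument("--report-to-wandb", action="store_true")
args = parser.parse_args(["--report-to-wandb"])

# Forward a plain bool to the evaluation call, not the wandb module:
#   eval_(..., report_to_wandb=args.report_to_wandb)
report_to_wandb = args.report_to_wandb
```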
```python
def to_wandb_table(report_as_map):
    columns = ["", "precision", "recall", "f1-score", "support"]
```
[nitpick] Using an empty string for the first column header is unclear. Consider renaming it to `"label"` or `"class"` to improve readability in the wandb table.
Suggested change:

```diff
-columns = ["", "precision", "recall", "f1-score", "support"]
+columns = ["label", "precision", "recall", "f1-score", "support"]
```
```python
micro = report_as_map['macro']
micro_row = [
    "all (macro avg.)",
    round(micro['precision'], 4),
    round(micro['recall'], 4),
    round(micro['f1'], 4),
    int(micro['support'])
]
data.append(micro_row)
```
[nitpick] Reusing the variable `micro` for macro-averaged metrics can be confusing. Rename it to `macro_metrics` (or similar) when handling the `'macro'` key.
Suggested change:

```diff
-micro = report_as_map['macro']
-micro_row = [
-    "all (macro avg.)",
-    round(micro['precision'], 4),
-    round(micro['recall'], 4),
-    round(micro['f1'], 4),
-    int(micro['support'])
-]
-data.append(micro_row)
+macro_metrics = report_as_map['macro']
+macro_row = [
+    "all (macro avg.)",
+    round(macro_metrics['precision'], 4),
+    round(macro_metrics['recall'], 4),
+    round(macro_metrics['f1'], 4),
+    int(macro_metrics['support'])
+]
+data.append(macro_row)
```
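Pieced together from the snippets in this review, the full helper might look roughly like the sketch below. The per-label layout of `report_as_map` is an assumption, and the sketch returns `(columns, data)` so it stays testable without wandb; the real helper presumably wraps these in `wandb.Table(columns=columns, data=data)`.

```python
def to_wandb_table(report_as_map):
    """Build (columns, data) rows for a wandb.Table from an
    evaluation report map. A sketch; the real helper in
    delft/sequenceLabelling/trainer.py may differ."""
    columns = ["label", "precision", "recall", "f1-score", "support"]
    data = []
    # Per-label rows; the 'labels' key and metric names are assumptions.
    for label, metrics in report_as_map.get("labels", {}).items():
        data.append([
            label,
            round(metrics["precision"], 4),
            round(metrics["recall"], 4),
            round(metrics["f1"], 4),
            int(metrics["support"]),
        ])
    # Macro-averaged summary row, as in the suggested change above.
    macro_metrics = report_as_map["macro"]
    data.append([
        "all (macro avg.)",
        round(macro_metrics["precision"], 4),
        round(macro_metrics["recall"], 4),
        round(macro_metrics["f1"], 4),
        int(macro_metrics["support"]),
    ])
    return columns, data
```

The caller would then log it with something like `wandb.log({"eval": wandb.Table(columns=columns, data=data)})`.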
This PR allows sending the evaluation results to wandb. When running `train_eval`, the results are attached to the training run, but with the `eval` command they are attached to a separate run in wandb. Nothing we can do about that.