Conversation
Thanks for the PR! @TheaperDeng I'm wondering whether it would be better to avoid merging notebooks directly into our repo. We could instead have a README file in this quickstart folder and link to externally hosted Colab notebooks. That way, users could start running the notebooks directly by clicking the Colab links. WDYT?
I suggest we store our .ipynb files directly in the repository. Since notebooks are text-based (JSON) rather than binary, they are compatible with version control, even if Git diffs can be a bit cluttered. This is standard practice in many major projects; see for example https://github.com/pytorch/pytorch/tree/main/functorch/docs/source/tutorials. To improve accessibility, we can include an "Open in Colab" badge at the top of each notebook using the following snippet:
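For reference, the standard Colab badge markdown looks roughly like this (the notebook path is a placeholder to be filled in once the files are merged):

```markdown
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/TRAIS-Lab/dattri/blob/main/<path-to-notebook>.ipynb)
```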
Ok. Sounds good.
Another comment: do we want to keep the package installation command in the notebook? First, we already provide an installation guide in our README, so we could just point users to that guide or repeat the instructions in a text block in the notebook. Second, the current one-line command (!pip install dattri) won't get all the dependencies ready anyway (e.g., PyTorch will still be missing), so it might be misleading.
That is a good point. The Colab notebook serves as a quick start for users to experience dattri firsthand, and a "one-click" (Run All) experience is much better, especially since PyTorch is pre-installed on Colab. I suggest we keep the installation block for convenience but clearly state that it is intended only for the Colab environment.
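For illustration, a minimal sketch of what such a Colab-only notebook cell could look like (exact wording up to you):

```python
# Colab-only setup cell: PyTorch is pre-installed on Colab, so installing dattri
# alone is enough to run this notebook end to end. For a local environment,
# follow the installation guide in the README instead.
!pip install dattri
```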
TheaperDeng left a comment
Thanks! Could you add a test script after https://github.com/TRAIS-Lab/dattri/blob/main/.github/workflows/examples_test.yml#L54 to test the two quick start notebooks?
A good way to do this is:
pip install jupyter nbconvert # add this line to "Install dependencies"
jupyter nbconvert --to script your_notebook.ipynb
python your_notebook.py
Added test scripts to test the two quick start notebooks.
TheaperDeng left a comment
Please also check the command that converts the .ipynb files to .py files in the example test.
| "id": "o2mEZymgc0a4" | ||
| }, | ||
| "source": [ | ||
| "Note: The installation block in the notebook is specifically designed for Google Colab and the use cases in this notebook. Standard installation instructions can me found in the [README](https://github.com/TRAIS-Lab/dattri/blob/main/README.md#quick-start)." |
Standard installation instructions can be found
| "id": "lAg59xgUpsGX" | ||
| }, | ||
| "source": [ | ||
| "Note: The installation block in the notebook is specifically designed for Google Colab and the use cases in this notebook. Standard installation instructions can me found in the [README](https://github.com/TRAIS-Lab/dattri/blob/main/README.md#quick-start)." |
Standard installation instructions can be found
| "source": [ | ||
| "LDS Score: used to evaluate the overall performance of a data attribution method.\n", | ||
| "\n", | ||
| "* A score near 1 means the attribution method accurately predicts the model's response to data changes\n", |
LDS close to 1 means ...
| "id": "w7x4js5WvpTN" | ||
| }, | ||
| "source": [ | ||
| "LDS Score: used to evaluate the overall performance of a data attribution method.\n", |
Linear Datamodeling Score (LDS) is a metric used to evaluate the performance of data attribution methods on the counterfactual estimation task of predicting model behavior given different subsets of the training set.
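For reference, a common formulation from the datamodeling literature (notation here is mine, not taken from the notebook): sample random training subsets $S_1, \dots, S_m$, retrain on each, and compare the attribution-predicted effect with the actual model output,

$$
\mathrm{LDS}(\tau, z) = \rho_{\text{spearman}}\!\left(\left\{f\big(z;\, \theta^*(S_j)\big)\right\}_{j=1}^{m},\; \left\{\sum_{i \in S_j} \tau(z)_i\right\}_{j=1}^{m}\right),
$$

where $f(z;\theta)$ is the model output of interest on example $z$, $\theta^*(S_j)$ is a model trained on subset $S_j$, and $\tau(z)_i$ is the attribution score of training point $i$ for $z$.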
| "id": "yKAFh2xKeVxo" | ||
| }, | ||
| "source": [ | ||
| "Dictionary to manage and intialize different influence function algorithms with their specific configurations. Each key is a specific arritbution method and the corresponding value is a class constructor with some of its arguments already pre-filled." |
Dictionary to manage different influence function algorithms with their specific configurations. Each key is a specific attribution method and the corresponding value is a class constructor with default arguments.
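As a minimal sketch of that pattern (dummy class names below, not dattri's actual attributor classes):

```python
from functools import partial

# Dummy stand-ins for the influence-function attributor classes used in the
# notebook; only the dictionary-of-preconfigured-constructors pattern matters here.
class ExplicitAttributor:
    def __init__(self, task=None, device="cpu"):
        self.task, self.device = task, device

class CGAttributor:
    def __init__(self, task=None, max_iter=10):
        self.task, self.max_iter = task, max_iter

# Each key names an attribution method; each value is a constructor with some
# arguments pre-filled, so only the remaining arguments are supplied at use time.
attributor_map = {
    "explicit": partial(ExplicitAttributor, device="cpu"),
    "cg": partial(CGAttributor, max_iter=20),
}

attributor = attributor_map["cg"](task="dummy_task")  # instantiate the chosen method
```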
| "\n", | ||
| "\n", | ||
| "* Higher influence indicates that a particular data point is problematic for the model.\n", | ||
| "* Mislabeled samples will exert a stronger, often negative, influence on the model's traning process.\n", |
Please introduce self-attribution here (what self-attribution is, and why a higher self-attribution score indicates a noisy label). Check https://arxiv.org/pdf/1703.04730 Sec. 5.4 as a reference.
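For example, the explanation could be accompanied by something along these lines (stand-in scores below, not dattri's actual API):

```python
import torch

# Self-attribution (self-influence) assigns each training point the influence it
# has on its own loss. Mislabeled points tend to get unusually high scores,
# because the model has to "memorize" them, so ranking by self-influence surfaces
# likely label noise (cf. Koh & Liang 2017, Sec. 5.4).
self_influence = torch.rand(1000)  # stand-in for real self-attribution scores

k = 20
suspects = torch.topk(self_influence, k).indices  # top-k most suspicious points
print("Training indices to inspect for label noise:", suspects.tolist())
```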
Please also fix the coding style issues flagged by the "Lint with Ruff" test.
Description
Corrected two Colab notebooks based on "influence_function_noisy_label.py" and "influence_function_lds.py". The notebooks have descriptive comments and are set up to run in under 5 minutes, making them efficient for new users of the dattri library.