chebi: use split file to create new data class

aditya0by0 · aditya0by0 · commit b83e5cdce49d · 2024-11-02T11:23:35.000+01:00
diff --git a/tutorials/data_exploration_chebi.ipynb b/tutorials/data_exploration_chebi.ipynb
@@ -840,6 +840,22 @@
     "The `splits.csv` file contains the saved data splits from previous runs, including the train, validation, and test sets. During subsequent runs, this file is used to reconstruct these splits by filtering the encoded data (`data.pt`) based on the IDs stored in `splits.csv`. This ensures consistency and reproducibility in data splitting, allowing for reliable evaluation and comparison of model performance across different run.\n"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "6dc3fd6c-7cf6-47ef-812f-54319a0cdeb9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Pass `splits_file_path` either as a literal path or, if another `chebi_class` instance is already defined,\n",
+    "# reuse that instance's `splits_file_path` attribute so both objects are built from the same splits.\n",
+    "chebi_class_with_splits = ChEBIOver50(\n",
+    "    chebi_version=231,\n",
+    "    # splits_file_path=\"data/chebi_v231/ChEBI50/processed/splits.csv\",  # Literal path option\n",
+    "    splits_file_path=chebi_class.splits_file_path  # Use path from an existing `chebi_class` instance\n",
+    ")"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "a5eb482c-ce5b-4efc-b2ec-85ac7b1a78ee",