
Commit 3eeb8b2

Author: sfluegel
Message: resolve merge conflicts
Parents: 2288b83 + 0176517


67 files changed (+2074 / −344 lines)

.github/workflows/black.yml

Lines changed: 1 addition & 1 deletion
@@ -7,4 +7,4 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v2
-      - uses: psf/black@stable
+      - uses: psf/black@stable

.gitignore

Lines changed: 5 additions & 0 deletions
@@ -161,3 +161,8 @@ cython_debug/
 #.idea/

 # configs/ # commented as new configs can be added as a part of a feature
+/.idea
+/data
+/logs
+/results_buffer
+electra_pretrained.ckpt

.pre-commit-config.yaml

Lines changed: 21 additions & 5 deletions
@@ -1,9 +1,25 @@
 repos:
-#- repo: https://github.com/PyCQA/isort
-#  rev: "5.12.0"
-#  hooks:
-#  - id: isort
   - repo: https://github.com/psf/black
     rev: "24.2.0"
     hooks:
-      - id: black
+      - id: black
+      - id: black-jupyter # for formatting jupyter-notebook
+
+  - repo: https://github.com/pycqa/isort
+    rev: 5.13.2
+    hooks:
+      - id: isort
+        name: isort (python)
+        args: ["--profile=black"]
+
+  - repo: https://github.com/asottile/seed-isort-config
+    rev: v2.2.0
+    hooks:
+      - id: seed-isort-config
+
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.6.0
+    hooks:
+      - id: check-yaml
+      - id: end-of-file-fixer
+      - id: trailing-whitespace
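
With these hooks in place, the same checks can be run locally before committing. A minimal sketch of the standard pre-commit workflow (stock pre-commit CLI commands, not specific to this repository):

```
pip install pre-commit
pre-commit install
pre-commit run --all-files
```

`pre-commit install` registers the hooks with git, so black, isort and the other checks run automatically on every commit.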

README.md

Lines changed: 23 additions & 6 deletions
@@ -1,8 +1,25 @@
 # ChEBai

-ChEBai is a deep learning library designed for the integration of deep learning methods with chemical ontologies, particularly ChEBI.
+ChEBai is a deep learning library designed for the integration of deep learning methods with chemical ontologies, particularly ChEBI.
 The library emphasizes the incorporation of the semantic qualities of the ontology into the learning process.

+## Note for developers
+
+If you have used ChEBai before PR #39, the file structure in which your ChEBI data is saved has changed. This means that
+datasets will be freshly generated. The data itself, however, is the same. If you want to keep the old data (including the old
+splits), you can use a migration script. It copies the old data to the new location for a specific ChEBI class
+(including the ChEBI version and other parameters). The script can be called by specifying the data module from a config:
+```
+python chebai/preprocessing/migration/chebi_data_migration.py migrate --datamodule=[path-to-data-config]
+```
+or by specifying the class name (e.g. `ChEBIOver50`) and arguments separately:
+```
+python chebai/preprocessing/migration/chebi_data_migration.py migrate --class_name=[data-class] [--chebi_version=[version]]
+```
+By default, the new dataset will generate random data splits (with a given seed).
+To reuse a fixed data split, you have to provide the path of the csv file generated during the migration:
+`--data.init_args.splits_file_path=[path-to-processed_data]/splits.csv`
+
 ## Installation

 To install ChEBai, follow these steps:
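
As a concrete illustration of the second migration command above, a hypothetical call for the `ChEBIOver50` class could look like this (the ChEBI version number is a placeholder chosen for illustration):

```
python chebai/preprocessing/migration/chebi_data_migration.py migrate --class_name=ChEBIOver50 --chebi_version=227
```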
@@ -21,7 +38,7 @@ pip install .

 ## Usage

-The training and inference is abstracted using the Pytorch Lightning modules.
+Training and inference are abstracted using PyTorch Lightning modules.
 Here are some CLI commands for the standard functionalities of pretraining, ontology extension, fine-tuning for toxicity and prediction.
 For further details, see the [wiki](https://github.com/ChEB-AI/python-chebai/wiki).
 If you face any problems, please open a new [issue](https://github.com/ChEB-AI/python-chebai/issues/new).

@@ -55,18 +72,18 @@ The `classes_path` is the path to the dataset's `raw/classes.txt` file that cont

 ## Evaluation

-An example for evaluating a model trained on the ontology extension task is given in `tutorials/eval_model_basic.ipynb`.
+An example for evaluating a model trained on the ontology extension task is given in `tutorials/eval_model_basic.ipynb`.
 It takes in the finetuned model as input for performing the evaluation.

 ## Cross-validation
-You can do inner k-fold cross-validation, i.e., train models on k train-validation splits that all use the same test
+You can do inner k-fold cross-validation, i.e., train models on k train-validation splits that all use the same test
 set. For that, you need to specify the total number of folds as
 ```
 --data.init_args.inner_k_folds=K
 ```
 and the fold to be used in the current optimisation run as
-```
+```
 --data.init_args.fold_index=I
 ```
-To train K models, you need to do K such calls, each with a different `fold_index`. On the first call with a given
+To train K models, you need to do K such calls, each with a different `fold_index`. On the first call with a given
 `inner_k_folds`, all folds will be created and stored in the data directory
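
Taken together, a sketch of a complete K-fold run is a shell loop over fold indices like the one below; the `python -m chebai fit` entry point and the config path are assumptions for illustration, not taken from this diff:

```
K=5
for I in $(seq 0 $((K - 1))); do
  python -m chebai fit --config=[path-to-config] \
    --data.init_args.inner_k_folds=$K \
    --data.init_args.fold_index=$I
done
```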

chebai/callbacks.py

Lines changed: 1 addition & 1 deletion
@@ -1,8 +1,8 @@
 import json
 import os

-from lightning.pytorch.callbacks import BasePredictionWriter
 import torch
+from lightning.pytorch.callbacks import BasePredictionWriter


 class ChebaiPredictionWriter(BasePredictionWriter):
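
The import reordering in this and the following Python files matches isort's black profile (standard library, third-party, and first-party imports in separate groups), which is exactly what the isort hook configured above enforces. The same ordering can be applied manually with the stock isort CLI, e.g.:

```
isort --profile black chebai/
```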

chebai/callbacks/prediction_callback.py

Lines changed: 3 additions & 2 deletions
@@ -1,8 +1,9 @@
-from lightning.pytorch.callbacks import BasePredictionWriter
-import torch
 import os
 import pickle

+import torch
+from lightning.pytorch.callbacks import BasePredictionWriter
+

 class PredictionWriter(BasePredictionWriter):
     def __init__(self, output_dir, write_interval):

chebai/loggers/custom.py

Lines changed: 3 additions & 3 deletions
@@ -1,11 +1,11 @@
-from datetime import datetime
-from typing import Literal, Optional, Union, List
 import os
+from datetime import datetime
+from typing import List, Literal, Optional, Union

+import wandb
 from lightning.fabric.utilities.types import _PATH
 from lightning.pytorch.callbacks import ModelCheckpoint
 from lightning.pytorch.loggers import WandbLogger
-import wandb


 class CustomLogger(WandbLogger):

chebai/loss/bce_weighted.py

Lines changed: 5 additions & 3 deletions
@@ -1,9 +1,11 @@
+import os
+import pickle
+
+import pandas as pd
 import torch
+
 from chebai.preprocessing.datasets.base import XYBaseDataModule
 from chebai.preprocessing.datasets.pubchem import LabeledUnlabeledMixed
-import pandas as pd
-import os
-import pickle


 class BCEWeighted(torch.nn.BCEWithLogitsLoss):

chebai/loss/semantic.py

Lines changed: 4 additions & 3 deletions
@@ -1,14 +1,15 @@
 import csv
+import math
 import os
 import pickle

-import math
 import torch
+
 from typing import Literal, Union

-from chebai.preprocessing.datasets.chebi import _ChEBIDataExtractor, ChEBIOver100
-from chebai.preprocessing.datasets.pubchem import LabeledUnlabeledMixed
 from chebai.loss.bce_weighted import BCEWeighted
+from chebai.preprocessing.datasets.chebi import ChEBIOver100, _ChEBIDataExtractor
+from chebai.preprocessing.datasets.pubchem import LabeledUnlabeledMixed


 class ImplicationLoss(torch.nn.Module):

chebai/models/base.py

Lines changed: 2 additions & 2 deletions
@@ -1,9 +1,9 @@
-from typing import Optional
 import logging
 import typing
+from typing import Optional

-from lightning.pytorch.core.module import LightningModule
 import torch
+from lightning.pytorch.core.module import LightningModule

 from chebai.preprocessing.structures import XYData
