
Rerunning fine tuning with normal cell type atlas samples #27

@alvinwt

Description

Hi, I am trying to fine-tune MethylBERT on a combination of colon and human leukocyte samples from the normal cell-type methylation atlas. I converted the .pat files to reads.csv using wgbs_atlas_simulation.

I combined the reads.csv files from several blood and colon samples (roughly as in the sketch below), but the code also fails when I use only a single sample.
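For reference, this is approximately how I combined the per-sample files (a minimal sketch; the colon file name is only a placeholder, and I am assuming the reads.csv files produced by wgbs_atlas_simulation are comma-separated and share the same header):

import pandas as pd

# Per-sample read tables produced by wgbs_atlas_simulation
# (the colon file name below is a placeholder).
files = [
    "GSM5652299_Blood-NK-Z000000TM.hg38_reads.csv",
    "colon_sample.hg38_reads.csv",
]

# Stack the tables row-wise, keep a single header, and write the
# combined training input.
combined = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)
combined.to_csv("combined_reads.csv", index=False)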

Are there additional steps needed to process the reads before fine-tuning, or am I running with the wrong parameters?

command:

methylbert finetune \
-c ~/software/wgbs_atlas_simulation/res/GSM5652299_Blood-NK-Z000000TM.hg38_reads.csv \
-t ~/software/methylbert/test/data/processed/test_seq.csv \
-o model/ \
-l 4 \
-s 161 \
-b 256 \
--gradient_accumulation_steps 4 \
-e 600 \
-w 2 \
--log_freq 1 \
--eval_freq 1 \
--warm_up 1 \
--lr 1e-4 \
--decrease_steps 200 \
--loss focal_bce \
--with_cuda > stdout.txt 2> stderr.txt

and got:

stdout.txt

MethylBERT v2.0.2
Create a tokenizer for 3-mers
Building Vocab
Vocab Size:  69
CPU info: 40 80
Loading Train Dataset: /home/alvin.ngwt/software/wgbs_atlas_simulation/res/GSM5652299_Blood-NK-Z000000TM.hg38_reads.csv
Total number of sequences :  122276
# of reads in each label:  [103658.  18618.]
122276 seqs with 1497 labels 
Loading Test Dataset: /home/alvin.ngwt/software/methylbert/test/data/processed/test_seq.csv
Total number of sequences :  612
# of reads in each label:  [319. 293.]
Creating Dataloader
Local step batch size :  64
Creating BERT Trainer
The model is loaded on GPU
Pre-trained MethylBERT model for 4 encoder blocks is selected.
Restore the pretrained model hanyangii/methylbert_hg19_4l
Focal loss assigned
Total Parameters: 49817090
Training Start
False

stderr.txt

File "/home/alvin.ngwt/miniconda3/envs/methylbert/bin/methylbert", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/cli.py", line 315, in main
    run_finetune(args)
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/cli.py", line 182, in run_finetune
    trainer.train(args.steps)
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/trainer.py", line 123, in train
    return self._iteration(steps, self.train_data, verbose)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/trainer.py", line 503, in _iteration
    for i, batch in enumerate(data_loader):
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
           ^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
    return self._process_data(data)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
    data.reraise()
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/_utils.py", line 706, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.

Original Traceback (most recent call last):
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
    return self.collate_fn(data)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 317, in default_collate
    return collate(batch, collate_fn_map=default_collate_fn_map)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 155, in collate
    clone.update({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 155, in <dictcomp>
    clone.update({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 142, in collate
    return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 213, in collate_tensor_fn
    out = elem.new(storage).resize_(len(batch), *list(elem.size()))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Trying to resize storage that is not resizable
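In case it helps with debugging: this error from default_collate typically appears when the tensors in a single batch do not all have the same shape. Below is a quick check I can run to see whether every read in the input has a consistent length (a minimal sketch; the dna_seq column name and the comma separator are assumptions about the reads.csv format, so adjust if the header differs):

import os
import pandas as pd

# Training input used in the command above; dna_seq is an assumed column name.
path = os.path.expanduser(
    "~/software/wgbs_atlas_simulation/res/GSM5652299_Blood-NK-Z000000TM.hg38_reads.csv"
)
reads = pd.read_csv(path)

# If more than one length is reported here, batches would contain tensors of
# different sizes, which matches the collate failure above.
print(reads["dna_seq"].str.len().value_counts())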
