Hi, I am trying to fine-tune MethylBERT on a combination of colon and human leukocyte samples from the normal cell-type methylation atlas. I converted the .pat files to reads.csv using wgbs_atlas_simulation. The code fails both when I combine the reads.csv files from several blood and colon samples and when I use just one sample.
Are there additional steps needed to process the reads before fine-tuning, or am I running with the wrong parameters?
Command:

```shell
methylbert finetune \
  -c ~/software/wgbs_atlas_simulation/res/GSM5652299_Blood-NK-Z000000TM.hg38_reads.csv \
  -t ~/software/methylbert/test/data/processed/test_seq.csv \
  -o model/ \
  -l 4 \
  -s 161 \
  -b 256 \
  --gradient_accumulation_steps 4 \
  -e 600 \
  -w 2 \
  --log_freq 1 \
  --eval_freq 1 \
  --warm_up 1 \
  --lr 1e-4 \
  --decrease_steps 200 \
  --loss focal_bce \
  --with_cuda > stdout.txt 2> stderr.txt
```
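Before running the command above, one thing worth ruling out is inconsistent read lengths in the input file, since batching errors like the one below are often triggered by per-item tensors of unequal size. A minimal sanity check, using only the standard library (the column name `dna_seq` is an assumption, adjust it to whatever header your reads.csv actually uses):

```python
import csv
from collections import Counter

def sequence_length_counts(path, column="dna_seq"):
    """Return a Counter mapping sequence length -> number of reads.

    If more than one length appears, the reads are not uniform and may
    need trimming/padding before fine-tuning.
    """
    counts = Counter()
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            counts[len(row[column])] += 1
    return counts

# Usage (hypothetical path):
# print(sequence_length_counts("GSM5652299_Blood-NK-Z000000TM.hg38_reads.csv"))
```

A single key in the resulting Counter means every read has the same length; multiple keys point at the mixed-length reads.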
and got:

stdout.txt

```text
MethylBERT v2.0.2
Create a tokenizer for 3-mers
Building Vocab
Vocab Size: 69
CPU info: 40 80
Loading Train Dataset: /home/alvin.ngwt/software/wgbs_atlas_simulation/res/GSM5652299_Blood-NK-Z000000TM.hg38_reads.csv
Total number of sequences : 122276
# of reads in each label: [103658. 18618.]
122276 seqs with 1497 labels
Loading Test Dataset: /home/alvin.ngwt/software/methylbert/test/data/processed/test_seq.csv
Total number of sequences : 612
# of reads in each label: [319. 293.]
Creating Dataloader
Local step batch size : 64
Creating BERT Trainer
The model is loaded on GPU
Pre-trained MethylBERT model for 4 encoder blocks is selected.
Restore the pretrained model hanyangii/methylbert_hg19_4l
Focal loss assigned
Total Parameters: 49817090
Training Start
False
```
stderr.txt

```text
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/bin/methylbert", line 7, in <module>
    sys.exit(main())
    ^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/cli.py", line 315, in main
    run_finetune(args)
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/cli.py", line 182, in run_finetune
    trainer.train(args.steps)
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/trainer.py", line 123, in train
    return self._iteration(steps, self.train_data, verbose)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/methylbert/trainer.py", line 503, in _iteration
    for i, batch in enumerate(data_loader):
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
    ^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1344, in _next_data
    return self._process_data(data)
    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1370, in _process_data
    data.reraise()
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/_utils.py", line 706, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
    ^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
    return self.collate_fn(data)
    ^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 317, in default_collate
    return collate(batch, collate_fn_map=default_collate_fn_map)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 155, in collate
    clone.update({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 155, in <dictcomp>
    clone.update({key: collate([d[key] for d in batch], collate_fn_map=collate_fn_map) for key in elem})
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 142, in collate
    return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/alvin.ngwt/miniconda3/envs/methylbert/lib/python3.11/site-packages/torch/utils/data/_utils/collate.py", line 213, in collate_tensor_fn
    out = elem.new(storage).resize_(len(batch), *list(elem.size()))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Trying to resize storage that is not resizable
```
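For context on the failure above: `default_collate` stacks the per-item tensors into one batch tensor, which only works when every item in the batch has an identical shape; in a worker process, a shape mismatch can surface as this "resize storage" error. A minimal sketch of the mechanism and of one common workaround, padding each batch to its longest item (shown on plain lists for clarity; this is not MethylBERT's actual code, and in PyTorch you would pass such a function as `collate_fn` to the `DataLoader`):

```python
def pad_collate(batch, pad_value=0):
    """Pad a batch of variable-length sequences to a common length.

    Stacking [1,2,3], [4,5], [6] directly would fail for the same reason
    the default collate fails on unequal tensor shapes; padding first
    makes every row the same length.
    """
    longest = max(len(item) for item in batch)
    return [list(item) + [pad_value] * (longest - len(item)) for item in batch]

print(pad_collate([[1, 2, 3], [4, 5], [6]]))
# -> [[1, 2, 3], [4, 5, 0], [6, 0, 0]]
```

The other quick thing worth trying is `-w 0` (no worker processes), which at least yields a clearer in-process error message than the worker-side reraise.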