Hi,
Thanks for the great tool. I'm exploring meta transcriptome assembly with Aletsch, and I'm trying to run it on a TCGA cohort of about 500 samples.
I first generate the profiles using:
aletsch --profile -i $bams -p $profile_folder -t 16 --max_group_size 200 --boost_precision > profiles_out.txt
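For context, $bams is a plain-text file listing one BAM path per line, which I build roughly like this (the path is a placeholder, and I may be misreading the expected list format, so treat this as a sketch of my setup):

# Collect the cohort's BAMs into the list file that -i points at.
# /data/tcga/bams is a placeholder, not my actual layout.
ls /data/tcga/bams/*.bam > bam_list.txt
bams=bam_list.txt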
Then I try to create the meta assembly using:
aletsch -i $bams -o $assembly_folder -p $profile_folder -d gtf_folder -t 16 --max_group_size 200 --boost_precision
Doing this works fine for a smaller test set of about 20 samples. The initial results seem quite messy compared to, for instance, the StringTie approach (individual assemblies followed by stringtie --merge), but after ranking them with score.py and -p 0.5 the results look promising. For the full set of 500 samples, however, I get an out-of-memory error. I run the job via Slurm on an HPC node with 250 GB of RAM, all of which I allocated to the job.
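In case the submission setup matters, this is roughly the Slurm script I use (paths and the walltime are placeholders; the two aletsch calls are the exact commands above):

#!/bin/bash
#SBATCH --job-name=aletsch_tcga
#SBATCH --cpus-per-task=16
#SBATCH --mem=250G            # all of the node's memory, as mentioned above
#SBATCH --time=48:00:00       # illustrative walltime

# Placeholder locations for the variables used below.
bams=bam_list.txt
profile_folder=profiles
assembly_folder=assembly

# Step 1: generate per-sample profiles.
aletsch --profile -i $bams -p $profile_folder -t 16 --max_group_size 200 --boost_precision > profiles_out.txt

# Step 2: meta assembly across all samples (this is the step that runs out of memory).
aletsch -i $bams -o $assembly_folder -p $profile_folder -d gtf_folder -t 16 --max_group_size 200 --boost_precision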
From the supplementary data of the manuscript, it looks like you were able to run the SC-H2 dataset, which contains about 1000 cells, with 25 GB of RAM.
Could you help me figure out what is going on here? Is --max_group_size 200 perhaps the wrong setting, or am I using --boost_precision incorrectly?
Also, what does --boost_precision actually do?
Any help is much appreciated!
Best, Mirko