KeyError: 'edge_26' #38

@jianshu93

Hi @AndreLamurias,
I have no problems following the preprocessing pipeline. However, when I randomly subsample the reads (25%) and then assemble them with metaFlye (same parameters), I get the following error after preprocessing.

(graphmb) bash-4.2$ graphmb --assembly ./assembly_input --outdir output_graphmb --numcores 12
logging to output_graphmb/20241026-095226graphmb_output.log
Running GraphMB 0.2.6
using cuda: False
setting seed to 1
setting tf seed
Reading cache from

DATASET STATS:
number of sequences: 1356
assembly length: 0.116 Gb
assembly N50: 0.258 Mb
assembly average length (Mb): 0.086 max: 3.596 min: 0.001
coverage samples: 1
Graph file found and read
graph edges: 522
contig paths: 1335
total ref markers sets: 58
total ref markers: 104
contigs with one or more markers: 472/1356
max SCGs on one contig: 104, average(excluding 0): 6.809
candidate k0s [33, 34, 35, 36, 37, 38]
SCG contig count min: 16 contigs
edges with overlapping scgs (max=20): [(14, 2), (6, 2), (1, 2)]

==============Running VAE model=====================
setting tf seed
edges with overlapping scgs (max=20): [(14, 2), (6, 2), (1, 2)]
deleted 6 edges with same SCGs
**** Num of edges: 1822
******* Running model: CCVAE **********
***** using edge weights: True ******
***** cluster markers only: False *****
***** self edges only: False *****
***** Using raw kmer+abund features: True
***** SCG neg pairs: (15328, 2)
***** input features dimension: (1356, 104)
Uncaught exception
Traceback (most recent call last):
  File "/home/jiz322/miniconda3/envs/graphmb/bin/graphmb", line 8, in <module>
    sys.exit(main())
  File "/home/jiz322/miniconda3/envs/graphmb/lib/python3.9/site-packages/graphmb/main.py", line 499, in main
    vae_embs, _, _ = train_ccvae.run_model_ccvae(dataset, args, logger, 0,
  File "/home/jiz322/miniconda3/envs/graphmb/lib/python3.9/site-packages/graphmb/train_ccvae.py", line 170, in run_model_ccvae
    cluster_labels, stats, _, hq_bins = compute_clusters_and_stats(
  File "/home/jiz322/miniconda3/envs/graphmb/lib/python3.9/site-packages/graphmb/evaluate.py", line 367, in compute_clusters_and_stats
    unresolved_contigs_with_scgs = np.array([n for i,n in enumerate(node_names)
  File "/home/jiz322/miniconda3/envs/graphmb/lib/python3.9/site-packages/graphmb/evaluate.py", line 368, in <listcomp>
    if labels[i] not in positive_clusters and len(dataset.contig_markers[n]) > 0])
KeyError: 'edge_26'
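
The last frame indexes `dataset.contig_markers` directly by contig name, so this error would appear if a contig such as `edge_26` has no entry in that mapping. The sketch below reproduces the pattern with made-up data. The function names, the marker IDs, and the dictionary layout are assumptions for illustration only, not GraphMB's actual code or data structures.

```python
import numpy as np

# Hypothetical stand-ins for the objects named in the traceback.
node_names = ["edge_1", "edge_2", "edge_26"]
labels = [0, 1, 1]
positive_clusters = {0}
# "edge_26" has no entry, mimicking a contig absent from the marker mapping.
contig_markers = {"edge_1": {"PF00001": 1}, "edge_2": {"PF00002": 1}}


def unresolved_strict(names, labels, positive, markers):
    # Same shape as the failing expression: direct indexing raises KeyError
    # as soon as a contig outside a positive cluster is missing from markers.
    return np.array([n for i, n in enumerate(names)
                     if labels[i] not in positive and len(markers[n]) > 0])


def unresolved_tolerant(names, labels, positive, markers):
    # Assumed workaround: treat a missing contig as having no markers.
    return np.array([n for i, n in enumerate(names)
                     if labels[i] not in positive and len(markers.get(n, {})) > 0])


# Contigs present in the node list but absent from the marker mapping.
missing = [n for n in node_names if n not in contig_markers]
```

Listing the `missing` contigs in this way may help tell whether the marker file and the assembly disagree on contig names after the re-assembly. Whether a `.get`-style fallback is the right fix depends on how GraphMB is expected to populate the mapping.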

Additionally, the data can be found here: https://drive.google.com/file/d/1ztlDGWfkPf7AZlH4Ey39RWUHg8Q8Js7u/view?usp=sharing

What could be the reason? I installed the most recent version from GitHub.

Thanks,
Jianshu
