swap IQ-TREE with piqtree? #1930

@GavinHuttley

Hi developers of the amazing nextstrain!

The developers of IQ-TREE have noticed that your workflows use the command-line iqtree application and wanted to bring piqtree to your attention (you can get our preprint here). piqtree is a Python package that binds the IQ-TREE routines, with a negligible difference in compute performance from the original tool. It integrates with Python through cogent3.

We wondered if you'd be interested in using piqtree instead? We think it will reduce your development burden.

I've taken a stab at that below by rewriting your build_iqtree() function. As you can see, it simplifies your code.

I'm happy to make this a formal PR if you'd like, but I thought I'd check here first.

# replacing the `build_iqtree` function with cogent3 alignment and tree objects and
# piqtree for the phylogeny estimation.
def build_iqtree(aln_file, out_file, substitution_model="GTR", nthreads=1, tree_builder_args=None):
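    """
    build tree using piqtree
    arguments:
        aln_file            file name of input alignment
        out_file            file name to write tree to
        substitution_model  name of the substitution model, e.g. "GTR"
        nthreads            number of threads to use
        tree_builder_args   extra options, passed to the app as other_options
    """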
    import cogent3 as c3

    aln = c3.load_aligned_seqs(aln_file, moltype="dna")
    # make sure the seq names won't cause problems with IQ-TREE algorithms
    seqname_aliases = {n: f"s{i}" for i, n in enumerate(aln.names)}
    rn_aln = aln.rename_seqs(lambda x: seqname_aliases[x])
    build_tree = c3.get_app(
        "piq_build_tree", model=substitution_model, num_threads=nthreads, other_options=tree_builder_args
    )
    tree = build_tree(rn_aln)
    # revert back to the original sequence names
    rn_tree = tree.renamed_nodes({a: n for n, a in seqname_aliases.items()})
    rn_tree.write(out_file)
    return rn_tree, aln.names
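
The renaming round trip used in the function above can be illustrated without cogent3 or piqtree. The following is a minimal sketch using only the standard library. `make_aliases` and `rename_tips` are hypothetical helpers introduced here for illustration, and the nested-tuple tree is a toy stand-in for a real tree object, not the cogent3 API.

```python
def make_aliases(names):
    """Map each original sequence name to a safe alias: s0, s1, ..."""
    return {name: f"s{i}" for i, name in enumerate(names)}


def rename_tips(tree, mapping):
    """Recursively rename the tips of a toy tree given as nested tuples."""
    if isinstance(tree, tuple):
        return tuple(rename_tips(child, mapping) for child in tree)
    return mapping[tree]


# Hypothetical names containing characters that can be awkward in tree files.
names = ["A/Perth/16/2009", "B/Brisbane/60/2008", "sample (3)"]
aliases = make_aliases(names)
# Invert the mapping so that aliases can be turned back into original names.
inverse = {alias: name for name, alias in aliases.items()}

original_tree = (("A/Perth/16/2009", "B/Brisbane/60/2008"), "sample (3)")
safe_tree = rename_tips(original_tree, aliases)    # what the tree builder sees
restored_tree = rename_tips(safe_tree, inverse)    # what the caller gets back
```

The inversion is lossless only because every alias is unique by construction, which is why the aliases are derived from the enumeration index rather than from the names themselves.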

Labels: enhancement (New feature or request)
