3. Advanced

Max Brown edited this page Jul 1, 2022 · 4 revisions

This page covers goat-cli features that go beyond basic usage.

goat-cli taxon search expressions

A user can pass an expression to goat-cli taxon search to filter the results of a query before they are returned. For a list of variables and the operators you can use with each, run goat-cli taxon search --print-expression. At the bottom of that output is a list of valid NCBI taxon ranks to pass to --tax-rank.
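The printed table is long, so it can help to pull out the operators for a single variable. The following is a minimal sketch, not part of goat-cli: goat_operators is a hypothetical helper, and it assumes that --print-expression writes to standard output a table laid out like the one shown on this page (rows delimited by `|`, borders starting with `+`, long cells wrapped over several lines).

```shell
# Hypothetical helper (not part of goat-cli): print the Operators/Keywords
# cell for one variable, reading the expression table from standard input.
# usage: goat-cli taxon search --print-expression | goat_operators VARIABLE
goat_operators() {
    awk -F'|' -v want="$1" '
        /^[+]/ { next }                        # skip border lines
        NF >= 5 {
            name = $2; ops = $4                # columns 1 and 3 of the table
            gsub(/^ +| +$/, "", name)          # trim padding
            gsub(/^ +| +$/, "", ops)
            if (name != "") current = name     # first line of a new row
            # Wrapped rows leave the name blank, so "current" carries over.
            if (current == want && ops != "") print ops
        }
    '
}
```

Wrapped cells are printed one line per table line, so a long category list keeps its original line breaks.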

To explain the table below briefly, take the first two rows as examples. The first row is the expression name (variable) assembly_date; any of the operators in the third column can be applied to this variable. Include the expression in a query with the -e flag. For example:

goat-cli taxon search -dt Arabidopsis -v "assembly_date" -e "assembly_date > 2020-02-27"

This finds assemblies more recent than the given date. The min and max functions (fourth column) can be wrapped around the assembly date:

goat-cli taxon search -dt Arabidopsis -v "assembly_date" -e "max(assembly_date) > 2020-02-27"

Here the comparison uses the result of the chosen function for each species, rather than the average of its dates.
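The pattern above (variable, operator, value, optionally wrapped in min or max) can be sketched as a small shell function that assembles the string passed to -e. goat_expr is a hypothetical helper, not part of goat-cli; it only checks the operator against the list shown in the table for numeric and date variables, and it does not check whether a given variable actually supports min or max.

```shell
# Hypothetical helper (not part of goat-cli): build an expression for -e.
# usage: goat_expr VARIABLE OPERATOR VALUE [min|max]
goat_expr() {
    var=$1 op=$2 val=$3 fn=${4:-}
    # Operators listed in the table for numeric and date variables.
    case "$op" in
        '!='|'<'|'<='|'='|'=='|'>'|'>=') ;;
        *) echo "unsupported operator: $op" >&2; return 1 ;;
    esac
    case "$fn" in
        '')      printf '%s %s %s\n' "$var" "$op" "$val" ;;
        min|max) printf '%s(%s) %s %s\n' "$fn" "$var" "$op" "$val" ;;
        *) echo "unsupported function: $fn" >&2; return 1 ;;
    esac
}

# Example use with the flags from the commands above (requires goat-cli):
# goat-cli taxon search -dt Arabidopsis -v "assembly_date" \
#     -e "$(goat_expr assembly_date '>' 2020-02-27 max)"
```

Quote the operator when calling the function, otherwise the shell treats > and < as redirections.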

The second row is assembly_level, which is categorical, so most of the comparison operators that apply to assembly_date make no sense here. For categorical variables only equality (==) is supported.

goat-cli taxon search -dt Arabidopsis -v "assembly_level" -e "assembly_level == chromosome"

Select multiple categories with a comma-separated list. This is equivalent to an OR (||) operator, which the parser does not allow:

goat-cli taxon search -dt Arabidopsis -v "assembly_level" -e "assembly_level == chromosome, scaffold"
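When the categories come from a script rather than being typed by hand, the comma-separated list can be assembled programmatically. A minimal sketch, assuming the ", " separator used in the example above; goat_any_of is a hypothetical helper, not part of goat-cli, and it does not check the categories against the table.

```shell
# Hypothetical helper (not part of goat-cli): build a categorical expression
# that matches any of the given categories.
# usage: goat_any_of VARIABLE CATEGORY [CATEGORY...]
goat_any_of() {
    if [ "$#" -lt 2 ]; then
        echo "usage: goat_any_of VARIABLE CATEGORY [CATEGORY...]" >&2
        return 1
    fi
    var=$1; list=$2
    shift 2
    # Append the remaining categories, separated by ", ".
    for item in "$@"; do
        list="$list, $item"
    done
    printf '%s == %s\n' "$var" "$list"
}

# Example use with the flags from the commands above (requires goat-cli):
# goat-cli taxon search -dt Arabidopsis -v "assembly_level" \
#     -e "$(goat_any_of assembly_level chromosome scaffold)"
```

Categories that contain spaces, such as complete genome, need to be quoted as a single argument.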

 Variable names in GoaT, with functional operator annotation.
+----------------------------------+------------------------------------+---------------------------------+--------------+
|         Expression Name          |            Display Name            |       Operators/Keywords        | Function(s)  |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|          assembly_date           |            Last updated            |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|          assembly_level          |           Assembly level           | == complete genome,             |              |
|                                  |                                    | chromosome, scaffold, contig    |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|          assembly_span           |           Assembly span            |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|            bioproject            |             bioproject             |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|            biosample             |             biosample              |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|        busco_completeness        |         BUSCO completeness         |     !=, <, <=, =, ==, >, >=     |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|          busco_lineage           |           BUSCO lineage            |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|           busco_string           |            BUSCO string            |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|             c_value              |              C value               |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|        c_value_cell_type         |         C value cell type          | == antennae, antennal gland,    |              |
|                                  |                                    | blood cells, brain, buccal      |              |
|                                  |                                    | epithelium, coelomocytes,       |              |
|                                  |                                    | corneal epithelium, digestive   |              |
|                                  |                                    | gland, dorsal fin clip, egg,    |              |
|                                  |                                    | embyro, epidermis, exopodite,   |              |
|                                  |                                    | fibroblasts, fin clips,         |              |
|                                  |                                    | germarium, gills, haemocytes,   |              |
|                                  |                                    | heart cells, individual         |              |
|                                  |                                    | chromosomes, intestine, kidney  |              |
|                                  |                                    |  cells, legs, leukocytes,       |              |
|                                  |                                    | liver, lung (culture), mantle,  |              |
|                                  |                                    |  midgut, muscle cells, ne, not  |              |
|                                  |                                    |  specified, oocytes, ovaries,   |              |
|                                  |                                    | pancreas, pharynx, polypide     |              |
|                                  |                                    | cells in suspension, red blood  |              |
|                                  |                                    |  cells, retinal cells,          |              |
|                                  |                                    | salivary gland, somatic cells,  |              |
|                                  |                                    |  sperm, spleen, tentacles,      |              |
|                                  |                                    | testes, thymus, tissue          |              |
|                                  |                                    | culture, various, ventral       |              |
|                                  |                                    | hypodermal chord, whole body,   |              |
|                                  |                                    | whole body squash               |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|          c_value_method          |           C value method           | == biochemical analysis, bulk   |              |
|                                  |                                    | fluorometric assay, complete    |              |
|                                  |                                    | genome sequencing, feulgen      |              |
|                                  |                                    | densitometry, feulgen image     |              |
|                                  |                                    | analysis densitometry, flow     |              |
|                                  |                                    | cytometry, flow karyotyping,    |              |
|                                  |                                    | fluorescence fading analysis,   |              |
|                                  |                                    | gallocyanin chrom alum          |              |
|                                  |                                    | densitometry, methyl green      |              |
|                                  |                                    | densitometry, not specified,    |              |
|                                  |                                    | static cell fluorometry,        |              |
|                                  |                                    | ultraviolet microscopy,         |              |
|                                  |                                    | unknown, biochemical analysis,  |              |
|                                  |                                    |  feulgen image analysis         |              |
|                                  |                                    | densitometry, flow cytometry,   |              |
|                                  |                                    | feulgen densitometry, feulgen   |              |
|                                  |                                    | densitometry & flow cytometry,  |              |
|                                  |                                    |  microdensitometry, pulse       |              |
|                                  |                                    | field gel electrophoresis,      |              |
|                                  |                                    | reassociation kinetics, whole   |              |
|                                  |                                    | genome sequencing               |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|        chromosome_number         |         Chromosome number          |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|            contig_n50            |             Contig N50             |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|           country_list           |            Country list            |            == gb, ie            |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|         ebp_metric_date          |          EBP metric date           |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|             echabs92             |     EC Habitats Directive 1992     | == echabs92_annex_iib,          |              |
|                                  |                                    | echabs92_annex_ivb,             |              |
|                                  |                                    | echabs92_annex_iva              |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      family_representative       |       family_representative        | == asg, cbp, ebpn, ein, cfgp,   |              |
|                                  |                                    | dtol, ebpn, endemixit, erga,    |              |
|                                  |                                    | eurofish, gaga, squalomix,      |              |
|                                  |                                    | metainvert, vgp, agi, arg,      |              |
|                                  |                                    | gap, gbr, omg, tsi, b10k        |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|            gc_percent            |             GC percent             |     !=, <, <=, =, ==, >, >=     |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|            gene_count            |             Gene count             |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|           genome_size            |            Genome size             |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|        genome_size_draft         |         Genome size draft          |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|         genome_size_kmer         |          Genome size kmer          |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|           habreg_2017            |   Conservation of Habitats and     |   == habreg-sch2, habreg-sch5   |              |
|                                  |   Species Regulations 2017         |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|          haploid_number          |           Haploid number           |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|           in_progress            |            in_progress             |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|            insdc_open            |             insdc_open             |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      isb_wildlife_act_1976       |   Irish Statute Book Wildlife      |     == iwa-nsch3, iwa-sch5      |              |
|                                  |   Act, 1976                        |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|            long_list             |             long_list              | == asg, cbp, ebpn, ein, cfgp,   |              |
|                                  |                                    | dtol, ebpn, endemixit, erga,    |              |
|                                  |                                    | eurofish, gaga, squalomix,      |              |
|                                  |                                    | metainvert, vgp, agi, arg,      |              |
|                                  |                                    | gap, gbr, omg, tsi, b10k        |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|          marhabreg-2017          |   Conservation of Offshore         |        == marhabreg-sch1        |              |
|                                  |   Marine Habitats and Species      |                                 |              |
|                                  |   Regulations 2017                 |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|   mitochondrion_assembly_span    |         mitochondrion span         |     !=, <, <=, =, ==, >, >=     |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|     mitochondrion_gc_percent     |         mitochondrion GC%          |     !=, <, <=, =, ==, >, >=     |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|            n_percent             |             N percent              |     !=, <, <=, =, ==, >, >=     |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|              nohit               |               No hit               |     !=, <, <=, =, ==, >, >=     |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|          odb10_lineage           |        Busco_odb10 lineage         | == aconoidasida_odb10,          |              |
|                                  |                                    | actinopterygii_odb10,           |              |
|                                  |                                    | agaricales_odb10,               |              |
|                                  |                                    | agaricomycetes_odb10,           |              |
|                                  |                                    | alveolata_odb10,                |              |
|                                  |                                    | apicomplexa_odb10,              |              |
|                                  |                                    | arachnida_odb10,                |              |
|                                  |                                    | arthropoda_odb10,               |              |
|                                  |                                    | ascomycota_odb10, aves_odb10,   |              |
|                                  |                                    | basidiomycota_odb10,            |              |
|                                  |                                    | boletales_odb10,                |              |
|                                  |                                    | brassicales_odb10,              |              |
|                                  |                                    | capnodiales_odb10,              |              |
|                                  |                                    | carnivora_odb10,                |              |
|                                  |                                    | cetartiodactyla_odb10,          |              |
|                                  |                                    | chaetothyriales_odb10,          |              |
|                                  |                                    | chlorophyta_odb10,              |              |
|                                  |                                    | coccidia_odb10,                 |              |
|                                  |                                    | cyprinodontiformes_odb10,       |              |
|                                  |                                    | diptera_odb10,                  |              |
|                                  |                                    | dothideomycetes_odb10,          |              |
|                                  |                                    | embryophyta_odb10,              |              |
|                                  |                                    | endopterygota_odb10,            |              |
|                                  |                                    | euarchontoglires_odb10,         |              |
|                                  |                                    | eudicots_odb10,                 |              |
|                                  |                                    | euglenozoa_odb10,               |              |
|                                  |                                    | eukaryota_odb10,                |              |
|                                  |                                    | eurotiales_odb10,               |              |
|                                  |                                    | eurotiomycetes_odb10,           |              |
|                                  |                                    | eutheria_odb10, fabales_odb10,  |              |
|                                  |                                    |  fungi_odb10, glires_odb10,     |              |
|                                  |                                    | glomerellales_odb10,            |              |
|                                  |                                    | helotiales_odb10,               |              |
|                                  |                                    | hemiptera_odb10,                |              |
|                                  |                                    | hymenoptera_odb10,              |              |
|                                  |                                    | hypocreales_odb10,              |              |
|                                  |                                    | insecta_odb10,                  |              |
|                                  |                                    | laurasiatheria_odb10,           |              |
|                                  |                                    | leotiomycetes_odb10,            |              |
|                                  |                                    | lepidoptera_odb10,              |              |
|                                  |                                    | liliopsida_odb10,               |              |
|                                  |                                    | mammalia_odb10, metazoa_odb10,  |              |
|                                  |                                    |  microsporidia_odb10,           |              |
|                                  |                                    | mollusca_odb10,                 |              |
|                                  |                                    | mucorales_odb10,                |              |
|                                  |                                    | mucoromycota_odb10,             |              |
|                                  |                                    | nematoda_odb10,                 |              |
|                                  |                                    | onygenales_odb10,               |              |
|                                  |                                    | passeriformes_odb10,            |              |
|                                  |                                    | plasmodium_odb10,               |              |
|                                  |                                    | pleosporales_odb10,             |              |
|                                  |                                    | poales_odb10,                   |              |
|                                  |                                    | polyporales_odb10,              |              |
|                                  |                                    | primates_odb10,                 |              |
|                                  |                                    | saccharomycetes_odb10,          |              |
|                                  |                                    | sauropsida_odb10,               |              |
|                                  |                                    | solanales_odb10,                |              |
|                                  |                                    | sordariomycetes_odb10,          |              |
|                                  |                                    | stramenopiles_odb10,            |              |
|                                  |                                    | tetrapoda_odb10,                |              |
|                                  |                                    | tremellomycetes_odb10,          |              |
|                                  |                                    | vertebrata_odb10,               |              |
|                                  |                                    | viridiplantae_odb10             |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|          other_priority          |           other_priority           | == asg, cbp, ebpn, ein, cfgp,   |              |
|                                  |                                    | dtol, ebpn, endemixit, erga,    |              |
|                                  |                                    | eurofish, gaga, squalomix,      |              |
|                                  |                                    | metainvert, vgp, agi, arg,      |              |
|                                  |                                    | gap, gbr, omg, tsi, b10k        |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      plastid_assembly_span       |            plastid span            |     !=, <, <=, =, ==, >, >=     |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|        plastid_gc_percent        |            plastid GC%             |     !=, <, <=, =, ==, >, >=     |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|              ploidy              |               Ploidy               |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|  protection_of_badgers_act_1992  |   Protection of Badgers Act 1992   |          == badgers92           |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|            published             |             published              |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|         sample_acquired          |          sample_acquired           |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|         sample_collected         |          sample_collected          |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|       sample_collected_by        |        sample_collected_by         |       == nhm, oxf, sanger       |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|         sample_location          |              location              |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|            sample_sex            |             Sample sex             |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|           scaffold_n50           |            Scaffold N50            |     !=, <, <=, =, ==, >, >=     |   min, max   |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|        sequencing_status         |         sequencing_status          | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_asg       |       sequencing_status_asg        | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_b10k      |       sequencing_status_b10k       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|     sequencing_status_berri      |      sequencing_status_berri       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_blax      |       sequencing_status_blax       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_cbp       |       sequencing_status_cbp        | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_cfgp      |       sequencing_status_cfgp       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_dtol      |       sequencing_status_dtol       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_durb      |       sequencing_status_durb       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_ebp       |       sequencing_status_ebp        | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_ebpn      |       sequencing_status_ebpn       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|   sequencing_status_endemixit    |    sequencing_status_endemixit     | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_erga      |       sequencing_status_erga       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|     sequencing_status_ergapi     |      sequencing_status_ergapi      | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|    sequencing_status_eurofish    |     sequencing_status_eurofish     | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_gaga      |       sequencing_status_gaga       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_lawn      |       sequencing_status_lawn       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|     sequencing_status_meier      |      sequencing_status_meier       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|   sequencing_status_metainvert   |    sequencing_status_metainvert    | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_misk      |       sequencing_status_misk       | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_rnd       |       sequencing_status_rnd        | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|   sequencing_status_squalomix    |    sequencing_status_squalomix     | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|      sequencing_status_vgp       |       sequencing_status_vgp        | == published, insdc_open,       |              |
|                                  |                                    | in_progress, sample_acquired,   |              |
|                                  |                                    | sample_collected                |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|        sex_determination         |         Sex Determination          |                                 |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|              target              |               Target               |     !=, <, <=, =, ==, >, >=     |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
|            waca_1981             | Wildlife and Countryside Act 1981  |     == waca-sch1, waca-sch5     |              |
+----------------------------------+------------------------------------+---------------------------------+--------------+
 NCBI taxon ranks:

 biotype, clade, class, cohort, family, forma, forma specialis, genotype, genus, infraclass,
 infraorder, isolate, kingdom, morph, no rank, order, parvorder, pathogroup, phylum, section, series, serogroup,
 serotype, species, species group, species subgroup, strain, subclass, subcohort, subfamily, subgenus, subkingdom,
 suborder, subphylum, subsection, subspecies, subtribe, superclass, superfamily, superkingdom, superorder, superphylum,
 tribe, varietas

Let's take a look at a few more examples. We can make a search query across all flowering plants for genome sizes:

goat-cli taxon search -dgt Magnoliopsida

This query returns over 12,000 results. We might only be interested in smaller genomes. Let's use an expression to filter our results:

goat-cli taxon search -dgt Magnoliopsida -e "genome_size < 100000000"

This returns all flowering plant species with genome sizes below 100 Mb. Combine several conditions in one expression with the AND keyword:

goat-cli taxon search -dgkt Magnoliopsida -e "genome_size < 100000000 AND chromosome_number = 40"
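Long runs of zeros are easy to mistype, so one option is to let the shell compute the threshold and build the expression string. The following is a minimal sketch, not part of goat-cli: the variable names `max_mb`, `max_bases` and `filter` are made up for illustration, it assumes 1 Mb means 1,000,000 bases, and it only prints the command so that it runs without goat-cli installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper variables; only the expression syntax comes from the text above.
max_mb=100
max_bases=$(( max_mb * 1000000 ))   # assumes 1 Mb = 1,000,000 bases

# Join the two conditions with the AND keyword.
filter="genome_size < ${max_bases} AND chromosome_number = 40"

# Print the command instead of running it, so it can be inspected first.
printf 'goat-cli taxon search -dgkt Magnoliopsida -e "%s"\n' "$filter"
```

Once the printed command looks right, it can be pasted into the terminal and run as usual.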

OR is only available for categorical variables, and is written as a comma-separated list:

goat-cli taxon search -dt 2759 -e 'waca_1981 == waca-sch1, waca-sch5'

This returns results matching either of those categories. If you want one category but want to exclude the other:

goat-cli taxon search -dt 2759 -e 'waca_1981 == waca-sch1, !waca-sch5'

Place an exclamation mark before the category you want to exclude. Note that, in Bash at least, the expression must be wrapped in single quotes. Double quotes may result in an error, because Bash treats the exclamation mark as a special character.
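The quoting rule can be checked without contacting GoaT. The sketch below stores the expression in a hypothetical variable named `filter` using single quotes, then prints it to confirm that the exclamation mark survived unchanged. It prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Single quotes keep every character literal, including the exclamation mark.
filter='waca_1981 == waca-sch1, !waca-sch5'

# Show exactly what goat-cli would receive as the expression.
printf '%s\n' "$filter"

# Print the full command (not executed here).
printf "goat-cli taxon search -dt 2759 -e '%s'\n" "$filter"
```

In an interactive Bash session the special treatment of the exclamation mark (history expansion) can also be switched off with `set +H`. Scripts run non-interactively have it disabled by default, which is why the problem usually only appears when typing commands by hand.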

Querying goat-cli taxon search with multiple species

Multiple species can be specified on the command line using a comma separated list to the -t flag. For example:

goat-cli taxon search -t "Arabidopsis thaliana, Zea mays, Anagallis arvensis"

Multiple species can also be specified in a separate file, with one taxon per line. The file may contain at most 500 taxa.

Arabidopsis thaliana
Zea mays
Anagallis arvensis
...
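A taxon file like this can be generated and checked against the 500-taxon limit before it is used. The sketch below is illustrative only: it writes the three example species to a temporary file (`taxa_file` and `n_taxa` are made-up names) and counts the non-empty lines. It does not call goat-cli; see `goat-cli taxon search --help` for the option that accepts the file.

```shell
#!/usr/bin/env bash
# Write one taxon per line to a temporary file (hypothetical location).
taxa_file="$(mktemp)"
printf '%s\n' "Arabidopsis thaliana" "Zea mays" "Anagallis arvensis" > "$taxa_file"

# Count non-empty lines and compare against the 500-taxon limit.
n_taxa=$(grep -c . "$taxa_file")
if [ "$n_taxa" -gt 500 ]; then
    echo "too many taxa: $n_taxa (limit is 500)" >&2
else
    echo "$n_taxa taxa, within the limit"
fi
```

For longer lists, `split -l 500` is one way to break a file into chunks that each stay under the limit.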

goat-cli taxon search variable control

As shown in the examples above, the -v, --variables flag gives tighter control over which variables are returned. For example, if you want only assembly span data and nothing else, you must specify it here. Running:

goat-cli taxon search -dat Magnoliopsida

returns assembly_level and assembly_span by default:

[-]     For search query Magnoliopsida, size specified (50) was less than the number of results returned, (1414).
taxon_id        taxon_rank      scientific_name assembly_level  assembly_span
74524   species Trifolium occidentale   Chromosome      500542700
3847    species Glycine max     Chromosome      990037049
3827    species Cicer arietinum Chromosome      511683526
...

Whereas goat-cli taxon search -dt Magnoliopsida -v "assembly_span" will only return assembly_span:

[-]     For search query Magnoliopsida, size specified (50) was less than the number of results returned, (1216).
taxon_id        taxon_rank      scientific_name assembly_span
74524   species Trifolium occidentale   500542700
3847    species Glycine max     990037049
3827    species Cicer arietinum 511683526
304102  forma   Medicago truncatula f. tricycla 419412056
...
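Restricting the output to one variable makes it easier to post-process with standard tools. The sketch below assumes the columns are tab-separated, which the listing above suggests but which should be checked against real output. It rebuilds the four rows shown above as sample data, leaving out the status line, and uses awk to find the species with the largest assembly_span. The names `sample` and `largest` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sample rows copied from the listing above, joined with tabs (an assumption).
# printf repeats the four-field format for each group of four arguments.
sample=$(printf '%s\t%s\t%s\t%s\n' \
    taxon_id taxon_rank scientific_name assembly_span \
    74524 species "Trifolium occidentale" 500542700 \
    3847 species "Glycine max" 990037049 \
    3827 species "Cicer arietinum" 511683526 \
    304102 forma "Medicago truncatula f. tricycla" 419412056)

# Skip the header row, then keep the name with the largest value in column 4.
largest=$(printf '%s\n' "$sample" |
    awk -F '\t' 'NR > 1 && $4 + 0 > max { max = $4 + 0; name = $3 } END { print name }')

echo "$largest"
```

With real data, the `printf` block would be replaced by piping the goat-cli output into the same awk command.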
