fdog.assembly runs without errors, but .phyloprofile contains only reference species, ignoring all input assemblies

Hi there,  
I'm using `fdog.assembly` to search for orthologs of the gene Xrcc2 across ~67 assemblies. The run completes very quickly (~16 sec!) and without any error messages. However, the resulting `.phyloprofile` file contains only the reference species (`NASVI@7425@2`); no orthologs are detected in any other taxon. Because it runs so fast, I suspect that `fdog.assembly` is actually ignoring my assemblies. I added the assemblies manually; I did not use `fdog.addAssembly`.

My test script looks like this:

```
fdog.assembly \
  --gene Xrcc2 \
  --refSpec NASVI@7425@2 \
  --assemblyPath /mnt/ceph-hdd/workspaces/ws/scc_ubzo_sscheu/u17921-bachelor_project/tools/fDOG/fdog/data/assembly_dir \
  --dataPath /mnt/ceph-hdd/workspaces/ws/scc_ubzo_sscheu/u17921-bachelor_project/tools/fDOG/fdog/data \
  --coregroupPath /mnt/ceph-hdd/workspaces/ws/scc_ubzo_sscheu/u17921-bachelor_project/data/fdog_input/orthologs/nasvi/core_orthologs \
  --out /mnt/ceph-hdd/workspaces/ws/scc_ubzo_sscheu/u17921-bachelor_project/results/ \
  --augustus \
  --augustusRefSpec tribolium2012 \
  --checkCoorthologsRef \
  --parallel \
  --gff \
  --isoforms \
  --force
```


Output:

```
(fdog_env) ./test_assembly.sh 
Gene: Xrcc2
fDOG reference species: NASVI@7425@2 

Building a consensus sequence
	 ...finished

/mnt/ceph-hdd/workspaces/ws/scc_ubzo_sscheu/u17921-bachelor_project/results//Xrcc2//tmp/Xrcc2.con
Building a block profile ...
	 ...finished 

/mnt/ceph-hdd/workspaces/ws/scc_ubzo_sscheu/u17921-bachelor_project/results//Xrcc2//tmp/Xrcc2.prfl
Searching for orthologs ...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 67/67 [00:10<00:00,  6.39it/s]
	 ...finished 

Calculating FAS scores ...
	 ...finished 

fDOG-Assembly finished completely in 16.10099506378174seconds.
Group preparation: 0.07499980926513672 	 Ortholog search: 10.596382856369019 	 FAS: 5.388456344604492 
```
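
To confirm which taxa actually ended up in the profile, the rows per taxon can be counted. This is a minimal sketch, assuming the `.phyloprofile` file is tab-separated, starts with a header line, and holds the taxon ID in the second column. The function name `count_taxa` and the path in the usage line are placeholders, not part of fDOG.

```bash
#!/usr/bin/env bash
# count_taxa: print "<row count> <taxon ID>" for a tab-separated profile file.
# Assumptions: the first line is a header and the taxon ID is in column 2.
count_taxa() {
  local profile="$1"
  # Skip the header, keep column 2, then count identical taxon IDs.
  tail -n +2 "$profile" | cut -f2 | sort | uniq -c | awk '{print $1, $2}'
}
```

Usage, with a placeholder path: `count_taxa path/to/Xrcc2.phyloprofile`. If only one taxon ID is listed, the profile really does contain nothing but the reference species.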


The structure of my --dataPath looks like this:


```
├── annotation_dir -> /mnt/ceph-hdd/workspaces/ws/scc_ubzo_sscheu/u17921-bachelor_project/fdog_output/annotation_dir
├── assembly_dir
├── coreTaxa_dir -> /mnt/ceph-hdd/workspaces/ws/scc_ubzo_sscheu/u17921-bachelor_project/fdog_output/coreTaxa_dir
├── searchTaxa_dir -> /mnt/ceph-hdd/workspaces/ws/scc_ubzo_sscheu/u17921-bachelor_project/fdog_output/searchTaxa_dir

```
For example, I want to find the ortholog of the gene from my reference species (NASVI@7425@2) in ACHCO@229769@15102025:

```
└── assembly_dir
    └── ACHCO@229769@15102025
        ├── blast_dir
        │   ├── ACHCO@229769@15102025.fa.ndb
        │   ├── ACHCO@229769@15102025.fa.nhr
        │   ├── ACHCO@229769@15102025.fa.nin
        │   ├── ACHCO@229769@15102025.fa.njs
        │   ├── ACHCO@229769@15102025.fa.nog
        │   ├── ACHCO@229769@15102025.fa.nos
        │   ├── ACHCO@229769@15102025.fa.not
        │   ├── ACHCO@229769@15102025.fa.nsq
        │   ├── ACHCO@229769@15102025.fa.ntf
        │   └── ACHCO@229769@15102025.fa.nto
        └── ACHCO@229769@15102025.fna

annotation_dir/NASVI@7425@2.json 
coreTaxa_dir/NASVI@7425@2
├── NASVI@7425@2.fa
├── NASVI@7425@2.fa.checked
├── NASVI@7425@2.fa.fai
├── NASVI@7425@2.pdb
├── NASVI@7425@2.phr
├── NASVI@7425@2.pin
├── NASVI@7425@2.pjs
├── NASVI@7425@2.pot
├── NASVI@7425@2.psq
├── NASVI@7425@2.ptf
└── NASVI@7425@2.pto
searchTaxa_dir/NASVI@7425@2
├── NASVI@7425@2.fa
├── NASVI@7425@2.fa.checked
└── NASVI@7425@2.fa.fai
```


All other assemblies have the same structure.
If it matters: the header of every assembly .fa looks similar to this:  
`>JAQFYK010000001.1 Achipteria coleoptrata isolate Gr-Ori-00870 scaffold1_size53908, whole genome shotgun sequence`
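
To verify that every assembly directory follows the layout shown above, a loop like the following can report what each one contains. This is only a sketch: the function name `check_assembly_dirs` is made up for illustration, and it reports which FASTA extension (`.fa` or `.fna`) and BLAST index files are present without assuming anything about what fDOG itself expects.

```bash
#!/usr/bin/env bash
# check_assembly_dirs: for each subdirectory of the given assembly_dir, report
# whether a non-empty FASTA file named after the directory exists (.fa or .fna)
# and whether blast_dir holds at least one file starting with that name.
check_assembly_dirs() {
  local root="$1" dir name ext f fasta blast
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")

    # Look for <name>.fa first, then <name>.fna.
    fasta="missing"
    for ext in fa fna; do
      if [ -s "${dir}${name}.${ext}" ]; then
        fasta="${name}.${ext}"
        break
      fi
    done

    # Any file named <name>.* inside blast_dir counts as an index file.
    blast="missing"
    for f in "${dir}blast_dir/${name}".*; do
      if [ -e "$f" ]; then
        blast="present"
        break
      fi
    done

    printf '%s\tfasta=%s\tblast_dir=%s\n' "$name" "$fasta" "$blast"
  done
}
```

Usage: `check_assembly_dirs /path/to/data/assembly_dir` prints one tab-separated line per assembly, which makes it easy to spot directories that deviate from the rest.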

Do you have any idea what could be causing this?







