@SophieS9 SophieS9 commented Aug 5, 2025

WORK IN PROGRESS

See AWGL/DragenQC#7 for more details on running.

Note for testing: bcftools is currently not installed on the dragen. Hopefully we can fix this once the dragen has some internet access, but for now the script will crash after the SV calling has finished. To work around this, manually copy the sv_calling folder over from the dragen and run the merge on the landing server as follows:

#ON THE DRAGEN
cp -r /staging/data/results/<RUNID>/<PANEL>/sv_calling /mnt/Data-MSA/results/<RUNID>/<PANEL>

#ON VSTOR
cd /Data-MSA/results/<RUNID>/<PANEL>
/Data-MSA/diagnostics/apps/bcftools-1.22/bcftools merge -m none -F x sv_calling/*.vcf.gz > <RUNID>.sv.vcf
/Data-MSA/diagnostics/apps/bcftools-1.22/htslib-1.22/bgzip <RUNID>.sv.vcf
/Data-MSA/diagnostics/apps/bcftools-1.22/htslib-1.22/tabix <RUNID>.sv.vcf.gz
rm -r sv_calling
scp <RUNID>.sv.vcf.gz* dragen@192.168.1.19:/staging/data/results/<RUNID>/<PANEL>

Then go back to the dragen and make a patch script for DragenGE in the NTC folder that re-runs just the end of the script, i.e. everything after the bcftools step.
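
Until bcftools is available on the dragen, one option would be to check for it before the merge, so the script stops with a clear message rather than crashing. This is only a sketch: the `merge_sv_vcfs` helper and the exit code 3 are made up for illustration and are not part of DragenGE; the merge command itself is the one from the manual steps above.

```shell
#!/usr/bin/env bash
# Sketch only: merge_sv_vcfs and exit code 3 are hypothetical, not part of DragenGE.

merge_sv_vcfs() {
    local bcftools_bin="$1"   # path to (or name of) the bcftools executable
    local run_id="$2"         # used to name the merged VCF

    # command -v fails if the tool is missing or not executable
    if ! command -v "$bcftools_bin" >/dev/null 2>&1; then
        echo "bcftools not found at '${bcftools_bin}'; leaving sv_calling/ for a manual merge" >&2
        return 3
    fi

    # Same merge command as the manual step on the landing server
    "$bcftools_bin" merge -m none -F x sv_calling/*.vcf.gz > "${run_id}.sv.vcf"
}
```

The caller could then treat exit code 3 as "merge skipped" and carry on to the copy step, leaving the manual merge to be done on the landing server.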

@josephhalstead josephhalstead left a comment

Looks good! Only a few additions

  1. Consider using a trap to delete the marker file on crash.
  2. Create a marker when we first make the results dir, at /mnt/Data-MSA/results/seqid/sync_required. This will let a future cron job on the landing server know to start syncing that directory to the bucket.

fi

# Remove lock file from dragen
rm /mnt/Data-MSA/raw/dragen_markers/${seqId}_*_locked

Wonder if we could use the TRAP function in bash to delete the marker file if the script crashes? Not sure
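
For reference, a minimal sketch of what that could look like. The names `marker_dir`, `lock_file` and `run_with_lock` are hypothetical stand-ins (the real lock files are under /mnt/Data-MSA/raw/dragen_markers/), and a temporary directory is used so the snippet can be tried safely:

```shell
#!/usr/bin/env bash
# Sketch only: marker_dir, seqId value, lock_file and run_with_lock are
# hypothetical stand-ins for the real marker handling.

marker_dir="$(mktemp -d)"
seqId="EXAMPLE_RUN"
lock_file="${marker_dir}/${seqId}_example_locked"

run_with_lock() {
    (
        touch "$lock_file"
        # The EXIT trap fires when this subshell ends, whether the
        # command succeeded or failed, so the lock is always removed.
        trap 'rm -f "$lock_file"' EXIT
        "$@"
        exit "$?"
    )
}
```

Something like `run_with_lock bash some_step.sh` would then remove the lock even if the step fails. Note that an EXIT trap does not run if the process is killed with SIGKILL or the machine loses power.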

@josephhalstead

Also, we might be able to update python by transferring the rpm directly:

# On landing server
wget http://mirror.centos.org/centos/7/updates/x86_64/Packages/python-2.7.5-92.el7_9.x86_64.rpm # or whatever python 2 version fixes it
scp python-2.7.5-92.el7_9.x86_64.rpm root@192.168.1.19:/tmp/

# On the dragen
sudo rpm -Uvh /tmp/python-2.7.5-92.el7_9.x86_64.rpm

@SophieS9

  • I've managed to upgrade the dragen using a variation of that suggestion with the rpm files.
  • I've found a solution to get bcftools working too, via a conda environment in the mount.
  • For your comments:
    • Consider using trap function to delete marker file on crash.
      I actually think we want to keep the marker file on crash? Otherwise another run might kick off and there won't be space for it on the dragen?
    • Make a marker when we first make the results dir on /mnt/Data-MSA/results/seqid/sync_required. Will let future cron on landing server know to start syncing that directory to bucket.
      I've added this to the DragenQC script (which is now in the landing_server_infra repository) so that the sync kicks off immediately.
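
To illustrate the marker idea (this is not the DragenQC implementation, just a sketch of the pattern): the function names and `results_root` are hypothetical, and a temporary directory stands in for /mnt/Data-MSA/results.

```shell
#!/usr/bin/env bash
# Sketch only: results_root, make_results_dir and list_dirs_needing_sync are
# hypothetical names; the real results live under /mnt/Data-MSA/results/.

results_root="$(mktemp -d)"

# Pipeline side: create the run's results directory and flag it for syncing.
make_results_dir() {
    local run_dir="${results_root}/$1"
    mkdir -p "$run_dir"
    touch "${run_dir}/sync_required"
}

# Cron side: print every run directory that still carries the marker.
list_dirs_needing_sync() {
    local marker
    for marker in "${results_root}"/*/sync_required; do
        # If nothing matches, the glob stays literal and this test skips it
        [ -e "$marker" ] && dirname "$marker"
    done
    return 0
}
```

A cron job on the landing server could loop over the output of `list_dirs_needing_sync` and sync each directory to the bucket; when the marker should be removed is left open here.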

@SophieS9
SophieS9 commented Aug 21, 2025

All changes done and now ready for re-review @josephhalstead

I've put run 250606_A00748_0707_BHTLW3DRX5 on the landing server for you to test.

If you want to test a different run, here are the instructions:
Log into the landing server as awmgs@10.69.115.25 (password in manager as rwmbvsrvmgstor1.cymru.nhs.uk [10.69.115.25] VSTOR). You need to use L:\Bioinformatics\New_Putty to do this as old putty doesn't work.
Copy a WES run from the webserver mount to the landing server mount (this is pretending to be the sequencer) as follows

scp -r root@10.69.115.27:/mnt/wren/wren_archive/novaseq/<RUNID> /Data-MSA/raw/novaseq

(Note this has to be the root user on the webserver, scp won't work with other users due to the size of the message when you log in to the webserver. Bizarre right?)

Rename the _RTAComplete.txt so the kick off script thinks it's not been processed.

mv /Data-MSA/raw/novaseq/<RUNID>/_RTAComplete.txt /Data-MSA/raw/novaseq/<RUNID>/RTAComplete.txt

Kick off the pipeline with the new cron job

bash /Data-MSA/diagnostics/pipelines/landing_server_infra/landing_server_infra-development/start_pipelines_cron.sh

Analysis will run on dragen1. The log file is written to /Data-MSA/raw/logs.
Data will be copied to /Data-MSA/results/ after each sample is analysed, and then again at the end of the run.
