76 changes: 76 additions & 0 deletions docs/pipelines/DeepClone.md
@@ -0,0 +1,76 @@
# DeepClone pipelines

## Introduction

This page summarizes the usage of the pipelines and tools used within the context of DeepClone.
The main steps are: the duplex library preparation protocol, deepUMIcaller, the generation of duplex metrics, and deepCSA.

The documentation and basic information regarding DeepClone can be found in the protocols paper:
[protocols.io link](https://www.protocols.io/view/deepclone-an-end-to-end-protocol-to-study-somatic-dm6gp1jodgzp/v2)

You will find the basic list of steps on the website and in the main version of the manuscript. For a more detailed
explanation of all the steps, check the supplementary document, also available on protocols.io.

There are some internal definitions of how we use the pipelines, but access to this information is restricted and
should be requested internally from the PROMINENT team.

## Duplex protocol

The steps are described in the protocol, and there is an alternative, more useful version in the supplementary material.
We recommend using the supplementary material version.

## deepUMIcaller

We use the code available in [deepUMIcaller](https://github.com/bbglab/deepUMIcaller.git), generally the dev branch,
since it contains the most up-to-date version of the code and is generally stable.

We run it via the Seqera Platform so that we have a full record of the runs and can coordinate the different projects.

We always put the work directory in /scratch, and the outputs can go to S3, nobackup or nobackup2.

If you need to access S3, either to save concats or to store the output of deepUMIcaller there,
check the [S3 entry](https://bbglab.github.io/bbgwiki/Cluster_basics/s3/#terminal) in this wiki.

## Metrics

These metrics help us understand two key aspects: whether our duplex libraries have worked properly, and how much
sequencing output should be requested to avoid undersequencing and, most importantly, oversequencing.

### When should I run metrics?

Every time you do a new deepUMIcaller run, and before running deepCSA.

### Why?

1. To validate that you have included all GBs of data available for that library (all lanes and reseqs)
2. To check whether your library has been sequenced optimally or whether an additional reseq needs to be requested
3. To continue the effort of compiling these metrics to keep improving our understanding of the duplex protocol

### How?

You can find the instructions on how to run them and additional documentation on metrics in our internal
duplex documentation.

## deepCSA

We use the code available in [deepCSA](https://github.com/bbglab/deepCSA.git), generally the dev branch,
since it contains the most up-to-date version of the code and is generally stable.

We run it via the Seqera Platform so that we have a full record of the runs and can coordinate the different projects.

Add the irbcluster profile when running the pipeline so that the default structural parameters are automatically set.

We always put the work directory in /scratch and the outputs usually in nobackup or nobackup2.
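
Our runs are launched from the Seqera Platform, so the snippet below is only an illustrative sketch of how the
settings described above (dev branch, irbcluster profile, work directory in /scratch) would map onto a plain
`nextflow run` command line. The helper name `build_nextflow_command`, the example paths, and the `--outdir`
pipeline parameter are assumptions made for illustration; check the pipeline's own documentation for the
parameters it actually expects.

```python
import shlex


def build_nextflow_command(pipeline, work_dir, outdir, revision="dev", profile=None):
    """Assemble a `nextflow run` command as a list of arguments (hypothetical helper)."""
    # -r selects the branch/revision of the pipeline repository.
    cmd = ["nextflow", "run", pipeline, "-r", revision]
    if profile:
        # -profile activates a configuration profile such as irbcluster.
        cmd += ["-profile", profile]
    # -work-dir sets where Nextflow writes its intermediate files.
    cmd += ["-work-dir", work_dir]
    # --outdir is ASSUMED here to be the pipeline's output parameter.
    cmd += ["--outdir", outdir]
    return cmd


# Example paths are placeholders, not real project locations.
deepcsa_cmd = build_nextflow_command(
    pipeline="bbglab/deepCSA",
    work_dir="/scratch/my_project/work",
    outdir="/path/to/nobackup/my_project/deepcsa",
    profile="irbcluster",
)
print(shlex.join(deepcsa_cmd))
```

The same helper can be reused for deepUMIcaller by passing `bbglab/deepUMIcaller` as the pipeline and omitting
the profile if it does not apply.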

## References

Duplex library prep. protocol:

- Morena Pinheiro
- Erika López-Arribillaga
- Nuría Samper

Computational pipelines:

- Ferriol Calvet (main developer)
- Elisabet Figuerola (owns extensive internal documentation)
- Rocío Chamorro (in particular for metrics)
- Miguel Grau (developer)