Added covariate modelling in DESeq2 formula and PCA plots. by RFillinger · Pull Request #57 · vari-bbc/rnaseq_workflow

RFillinger · 2026-02-27T21:26:08Z

No description provided.

kew24 · 2026-02-27T21:49:53Z

resources/deseq_template.Rmd

-dds <- DESeqDataSet(se, design = ~ group)
+
+
+if (gsub(" ", "", params$group_reg_formula ) == ""){


Just for ease of reading, I might replace this with something like `if (params$group_reg_formula %in% c("", " ", NA)) {`.
But test it out and see what makes sense here with potential user input.
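
A minimal sketch of one way to write that check, assuming the parameter may arrive as `NULL`, `NA`, an empty string, or any run of whitespace. `is_blank_formula` is a hypothetical helper name, not something in the template. Note that `%in% c("", " ", NA)` on its own would not match a value made of two or more spaces, which is why the sketch trims first.

```r
# Hypothetical helper: TRUE when the user supplied no usable formula.
is_blank_formula <- function(x) {
  # NULL or zero-length input counts as blank
  if (is.null(x) || length(x) == 0) return(TRUE)
  # Only the first element is considered; NA counts as blank
  x <- x[[1]]
  if (is.na(x)) return(TRUE)
  # trimws() strips leading/trailing whitespace, so "   " becomes ""
  !nzchar(trimws(as.character(x)))
}
```

With this in place the branch could read `if (is_blank_formula(params$group_reg_formula)) {`, falling back to the default `~ group` design.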

kew24 · 2026-02-27T21:51:17Z

resources/deseq_template.Rmd

+
 ```

+#### The model formula used: `formula`


If you want this to be evaluated, you might need `r formula`, but test and fact-check me.

kew24 · 2026-02-27T21:52:31Z

resources/deseq_template.Rmd

+**Note, again, that covariates can be included in this analysis**; 
+Include a new column in the `units.tsv` file, 
+then add a new comparison to the `comparisons.tsv` file.
+If you have a more complex experimental setup with additional covariates and/or confounders, 


I'm realizing this type of comment would be helpful to include in the README – perhaps add that as an addition to this PR too. By the time our audience reads this, it feels a little late.

Agree that this text is not needed here. What is needed is to emphasize the design used, especially if we are 1) enforcing a default design of `~ group` and 2) allowing different designs for each row in `comparisons.tsv`. This is already kind of implemented at https://github.com/vari-bbc/rnaseq_workflow/pull/57/changes#diff-22a091377f50349798518be7ff92f0fb7f05c174d48a7a65793a4c327c998666L159 but it could be useful to make it super obvious.

The link above doesn't work well. I was referring to this line (L159):

Coefficients in the model are: `r paste(resultsNames(dds), collapse = ", ")`.

Also, print an obvious warning if a variable is being treated as numeric in case it was meant to be categorical. This should be explained in the README also.
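
A rough sketch of what such a warning could look like, assuming the design is available as a formula and the sample metadata as a data frame (the same idea should carry over to `colData(dds)`, but that is untested here). `warn_numeric_covariates` is a hypothetical name, not an existing function in the workflow.

```r
# Hypothetical helper: returns the names of design variables stored as numeric
# and emits one warning per variable so it shows up in the rendered report.
warn_numeric_covariates <- function(design, col_data) {
  # Variables named in the design that actually exist in the metadata
  vars <- intersect(all.vars(design), colnames(col_data))
  is_num <- vapply(vars, function(v) is.numeric(col_data[[v]]), logical(1))
  numeric_vars <- vars[is_num]
  for (v in numeric_vars) {
    warning(
      sprintf("'%s' is numeric and will be modelled as continuous; convert it to a factor if it is categorical.", v),
      call. = FALSE
    )
  }
  numeric_vars
}
```

Returning the variable names as well as warning makes it easy to print them next to the coefficients line in the report.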

kew24 · 2026-02-27T21:53:54Z

resources/deseq_template.Rmd

-
  df <- cbind(pcaData$x, col_data)
-
+  df[,cols] = factor(df[,cols])


For consistency in code style, I might recommend changing the `=` to a `<-`.

I know that's super nitpicky and won't really change anything though :)
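
Separately from the assignment operator, `factor(df[, cols])` may not behave as intended once `cols` names more than one column, and it would also convert a numeric covariate. A minimal sketch that handles each column on its own and leaves numeric columns alone; `factorize_cols` is a hypothetical helper name.

```r
# Convert each non-numeric column named in `cols` to a factor, one at a time.
# Levels follow order of first appearance, matching the unique() call used
# elsewhere in this PR.
factorize_cols <- function(df, cols) {
  for (col in cols) {
    if (!is.numeric(df[[col]])) {
      df[[col]] <- factor(df[[col]], levels = unique(df[[col]]))
    }
  }
  df
}
```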

kew24 · 2026-02-27T21:56:09Z

resources/deseq_template.Rmd

    out_patchwork <- c(out_patchwork, list(ggplot(df, aes(!!!gg_args)) +
                                             geom_point(size=1) +
-                                             scale_color_manual(values = setNames(c("#440154FF","#2A788EFF", ggsci::pal_npg()(length(levels(in_dds@colData[[cols[i]]]))-2)),
+                                             scale_color_manual(values = setNames(c("#440154FF","#2A788EFF", ggsci::pal_npg()(length(levels(factor(in_dds@colData[[cols[i]]], levels = unique(in_dds@colData[[cols[i]]]))))-2)),


I'm hopeful that there's a better way to write this. Perhaps my `levels` addition wasn't the best earlier... try just replacing the whole thing with your `unique` call instead. Something like:

(length(unique(in_dds@colData[[cols[i]]])) - 2)

kew24 · 2026-02-27T21:56:53Z

resources/deseq_template.Rmd

-(make_PCA(vsd, ntop=10000, cols="group") | 
-    (make_PCA(vsd, ntop=10000, cols="group", PCx = 3, PCy = 4))) +
+
+param_list = strsplit(gsub("~", "", params$group_reg_formula), "[+]")[[1]]


Same nitpicky code-style comment as earlier: replace `=` with `<-`.

In addition to Kaitlyn's comment, check compatibility with spaces in the formula, for example `"~ batch + condition"`.
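
One option that sidesteps the whitespace question is to let R parse the formula instead of splitting the string. With the `gsub`/`strsplit` approach, `"~ batch + condition"` yields `" batch "` and `" condition"`, spaces included. The sketch below uses base R's `as.formula()` and `all.vars()`; `design_vars` is a hypothetical helper name.

```r
# Hypothetical helper: extract the variable names from a formula string.
# Parsing the formula makes spacing irrelevant.
design_vars <- function(formula_string) {
  all.vars(stats::as.formula(formula_string))
}

design_vars("~ batch + condition")
design_vars("~batch+condition")
```

Both calls return the same two names, `batch` and `condition`. For an interaction term such as `a:b`, `all.vars()` returns the individual variables, which is probably what the PCA colouring needs anyway.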

kew24 · 2026-02-27T21:58:16Z

I like this! I've added a few comments to things to change – after those changes + after I actually test out your branch, I'll approve the PR.

genomics-kl · 2026-02-28T18:05:32Z

resources/deseq_template.Rmd

    out_patchwork <- c(out_patchwork, list(ggplot(df, aes(!!!gg_args)) +
                                             geom_point(size=1) +
-                                             scale_color_manual(values = setNames(c("#440154FF","#2A788EFF", ggsci::pal_npg()(length(levels(in_dds@colData[[cols[i]]]))-2)),
+                                             scale_color_manual(values = setNames(c("#440154FF","#2A788EFF", ggsci::pal_npg()(length(levels(factor(in_dds@colData[[cols[i]]], levels = unique(in_dds@colData[[cols[i]]]))))-2)),


This assumes the added covariates are categorical. It is uncommon for us in the BBC, but a numeric covariate is possible and can be desirable.

This is a good point. @RFillinger, you should be able to assign a discrete vs. continuous color scale depending on the type of covariate (or if this is too crazy, you can just stick with the ggplot defaults). Just make sure to keep group colors consistent between this and the heatmap (line 462).

group_lvls <- unique(ht_col_annot$group)
ht_col_colors <- list(group=setNames(c("#440154FF","#2A788EFF", ggsci::pal_npg()(length(group_lvls)-2)), nm=group_lvls))
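
One way to keep the colors consistent would be to build the named color vector once and pass it to both the PCA and the heatmap. The sketch below is only an illustration: `make_level_colors` is a hypothetical helper, and `grDevices::hcl.colors()` stands in for `ggsci::pal_npg()` so the example runs without extra packages. It also guards against variables with fewer than two levels, since the expression above subtracts 2 from the level count. Numeric covariates would skip this helper and use a continuous scale instead.

```r
# Hypothetical helper: one named colour vector per categorical variable,
# built once and reused by both the PCA and the heatmap.
make_level_colors <- function(values) {
  lvls <- unique(as.character(values))
  # The two fixed colours used for the first two levels in the template
  base_cols <- c("#440154FF", "#2A788EFF")
  n_extra <- max(length(lvls) - 2, 0)
  # Stand-in palette for any further levels (the template uses ggsci::pal_npg())
  extra <- if (n_extra > 0) {
    grDevices::hcl.colors(n_extra, palette = "Dark 3")
  } else {
    character(0)
  }
  # Truncate to the number of levels so a single-level variable still works
  setNames(c(base_cols, extra)[seq_along(lvls)], lvls)
}
```

The result could then be passed to `scale_color_manual(values = ...)` for the PCA and reused as the `group` entry of `ht_col_colors` for the heatmap.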

Added covariate modelling in DESeq2 formula and PCA plots.

3f25b01

RFillinger requested review from genomics-kl and kew24 February 27, 2026 21:26

kew24 linked an issue Feb 27, 2026 that may be closed by this pull request

add custom model formula to config #56

Open

Added default formula column to comparisons.tsv

a5ddf72

kew24 reviewed Feb 27, 2026


genomics-kl reviewed Feb 28, 2026




3 participants