Skip to content

fix(diann): Add q-value filtering to DIANN big clean function#11

Merged
tonywu1999 merged 2 commits intodevelfrom
fix-diann
Feb 8, 2026
Merged

fix(diann): Add q-value filtering to DIANN big clean function#11
tonywu1999 merged 2 commits intodevelfrom
fix-diann

Conversation

@tonywu1999
Copy link
Contributor

@tonywu1999 tonywu1999 commented Feb 6, 2026

Summary by CodeRabbit

  • New Features

    • Added three new optional quality value cutoff parameters to the DIANN converter function (global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff), each with a default value of 0.01 for backward compatibility.
  • Documentation

    • Updated documentation to describe the new quality filtering parameters and their defaults.

@coderabbitai
Copy link

coderabbitai bot commented Feb 6, 2026

Warning

Rate limit exceeded

@tonywu1999 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 24 minutes and 13 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

This pull request extends the DIANN converter pipeline to support three new q-value cutoff parameters (global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff), propagating them through bigDIANNtoMSstatsFormat to downstream functions. Function signatures are updated and parameters are explicitly forwarded, with documentation added.

Changes

Cohort / File(s) Summary
DIANN Converter Pipeline
R/clean_DIANN.R, R/converters.R
Added three new q-value cutoff parameters to bigDIANNtoMSstatsFormat and forwarded them through reduceBigDIANN to cleanDIANNChunk. Replaced positional arguments with explicit named parameters in MSstatsClean call.
Function Documentation
man/bigDIANNtoMSstatsFormat.Rd
Updated documentation to describe the three new cutoff parameters (global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff), their defaults, and expanded quantificationColumn description for DIANN 2.x behavior.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested reviewers

  • Rudhik1904

Poem

🐰 Three cutoffs hop along the way,
Through converter tunnels, they gently sway,
From big to small, they journey far,
Each parameter a guiding star,
The pipeline flows, the workflow's tight! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main change: adding q-value filtering parameters (global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff) to the DIANN conversion function, which is clearly evident across all modified files.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix-diann

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@R/converters.R`:
- Around line 186-190: The intermediate output path is built by concatenating
strings which breaks when output_file_name contains directories; update the
calls around reduceBigDIANN to construct the path with
file.path(dirname(output_file_name), paste0("reduce_output_",
basename(output_file_name))) (and the analogous construct used at the other
occurrence on line ~194) so the intermediate file is created in the same
directory as output_file_name; change both references to use dirname(...) and
basename(...) wrapped with file.path(...) instead of paste0("reduce_output_",
output_file_name).
🧹 Nitpick comments (2)
R/clean_DIANN.R (1)

25-26: Consider using named arguments in the closure as well.

The cutoff parameters are passed positionally to cleanDIANNChunk. For consistency with the named-argument style adopted at the MSstatsClean call site (line 53-60), consider using named arguments here too to guard against future parameter reordering.

♻️ Suggested change
-  diann_chunk <- function(x, pos) cleanDIANNChunk(x, output_path, MBR, quantificationColumn, pos,
-                     global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff)
+  diann_chunk <- function(x, pos) cleanDIANNChunk(
+    input = x, output_path = output_path, MBR = MBR,
+    quantificationColumn = quantificationColumn, pos = pos,
+    global_qvalue_cutoff = global_qvalue_cutoff,
+    qvalue_cutoff = qvalue_cutoff,
+    pg_qvalue_cutoff = pg_qvalue_cutoff
+  )
R/converters.R (1)

189-190: Consider using named arguments for the reduceBigDIANN call.

The q-value cutoff parameters are passed positionally. Named arguments would make the call more readable and resilient to parameter reordering.

♻️ Suggested change
   reduceBigDIANN(input_file, 
-                 paste0("reduce_output_", output_file_name),
-                 MBR,
-                 quantificationColumn,
-                 global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff)
+                 intermediate_file,
+                 MBR = MBR,
+                 quantificationColumn = quantificationColumn,
+                 global_qvalue_cutoff = global_qvalue_cutoff,
+                 qvalue_cutoff = qvalue_cutoff,
+                 pg_qvalue_cutoff = pg_qvalue_cutoff)

Comment on lines 186 to +190
reduceBigDIANN(input_file,
paste0("reduce_output_", output_file_name),
MBR,
quantificationColumn)
quantificationColumn,
global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use file.path() for intermediate file path construction.

paste0("reduce_output_", output_file_name) will produce a broken path when output_file_name contains directory components (e.g., "results/output.csv""reduce_output_results/output.csv"). Both references (lines 187 and 194) should be fixed.

🐛 Proposed fix
-  reduceBigDIANN(input_file, 
-                 paste0("reduce_output_", output_file_name),
+  intermediate_file <- file.path(dirname(output_file_name),
+                                 paste0("reduce_output_", basename(output_file_name)))
+  reduceBigDIANN(input_file, 
+                 intermediate_file,
                  MBR,
                  quantificationColumn,
                  global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff)
  
   # Preprocess the cleaned data (feature selection, etc.)
   msstats_data <- MSstatsPreprocessBig(
-    paste0("reduce_output_", output_file_name),
+    intermediate_file,
     output_file_name, backend, max_feature_count,

Based on learnings: "In R converters across the MSstatsBig package, replace direct concatenation with file names by constructing intermediate file paths using file.path(dirname(output_file_name), paste0("prefix_", basename(output_file_name))). This ensures the intermediate file is created in the correct directory even when output_file_name contains directories."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
reduceBigDIANN(input_file,
paste0("reduce_output_", output_file_name),
MBR,
quantificationColumn)
quantificationColumn,
global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff)
intermediate_file <- file.path(dirname(output_file_name),
paste0("reduce_output_", basename(output_file_name)))
reduceBigDIANN(input_file,
intermediate_file,
MBR,
quantificationColumn,
global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff)
🤖 Prompt for AI Agents
In `@R/converters.R` around lines 186 - 190, The intermediate output path is built
by concatenating strings which breaks when output_file_name contains
directories; update the calls around reduceBigDIANN to construct the path with
file.path(dirname(output_file_name), paste0("reduce_output_",
basename(output_file_name))) (and the analogous construct used at the other
occurrence on line ~194) so the intermediate file is created in the same
directory as output_file_name; change both references to use dirname(...) and
basename(...) wrapped with file.path(...) instead of paste0("reduce_output_",
output_file_name).

@tonywu1999 tonywu1999 merged commit 42627ab into devel Feb 8, 2026
2 checks passed
@tonywu1999 tonywu1999 deleted the fix-diann branch February 8, 2026 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant