fix(diann): Add q-value filtering to DIANN big clean function#11
fix(diann): Add q-value filtering to DIANN big clean function#11tonywu1999 merged 2 commits intodevelfrom
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughThis pull request extends the DIANN converter pipeline to support three new q-value cutoff parameters (global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff), propagating them through bigDIANNtoMSstatsFormat to downstream functions. Function signatures are updated and parameters are explicitly forwarded, with documentation added. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@R/converters.R`:
- Around line 186-190: The intermediate output path is built by concatenating
strings which breaks when output_file_name contains directories; update the
calls around reduceBigDIANN to construct the path with
file.path(dirname(output_file_name), paste0("reduce_output_",
basename(output_file_name))) (and the analogous construct used at the other
occurrence on line ~194) so the intermediate file is created in the same
directory as output_file_name; change both references to use dirname(...) and
basename(...) wrapped with file.path(...) instead of paste0("reduce_output_",
output_file_name).
🧹 Nitpick comments (2)
R/clean_DIANN.R (1)
25-26: Consider using named arguments in the closure as well.The cutoff parameters are passed positionally to
cleanDIANNChunk. For consistency with the named-argument style adopted at theMSstatsCleancall site (line 53-60), consider using named arguments here too to guard against future parameter reordering.♻️ Suggested change
- diann_chunk <- function(x, pos) cleanDIANNChunk(x, output_path, MBR, quantificationColumn, pos, - global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff) + diann_chunk <- function(x, pos) cleanDIANNChunk( + input = x, output_path = output_path, MBR = MBR, + quantificationColumn = quantificationColumn, pos = pos, + global_qvalue_cutoff = global_qvalue_cutoff, + qvalue_cutoff = qvalue_cutoff, + pg_qvalue_cutoff = pg_qvalue_cutoff + )R/converters.R (1)
189-190: Consider using named arguments for thereduceBigDIANNcall.The q-value cutoff parameters are passed positionally. Named arguments would make the call more readable and resilient to parameter reordering.
♻️ Suggested change
reduceBigDIANN(input_file, - paste0("reduce_output_", output_file_name), - MBR, - quantificationColumn, - global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff) + intermediate_file, + MBR = MBR, + quantificationColumn = quantificationColumn, + global_qvalue_cutoff = global_qvalue_cutoff, + qvalue_cutoff = qvalue_cutoff, + pg_qvalue_cutoff = pg_qvalue_cutoff)
| reduceBigDIANN(input_file, | ||
| paste0("reduce_output_", output_file_name), | ||
| MBR, | ||
| quantificationColumn) | ||
| quantificationColumn, | ||
| global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff) |
There was a problem hiding this comment.
Use file.path() for intermediate file path construction.
paste0("reduce_output_", output_file_name) will produce a broken path when output_file_name contains directory components (e.g., "results/output.csv" → "reduce_output_results/output.csv"). Both references (lines 187 and 194) should be fixed.
🐛 Proposed fix
- reduceBigDIANN(input_file,
- paste0("reduce_output_", output_file_name),
+ intermediate_file <- file.path(dirname(output_file_name),
+ paste0("reduce_output_", basename(output_file_name)))
+ reduceBigDIANN(input_file,
+ intermediate_file,
MBR,
quantificationColumn,
global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff)
# Preprocess the cleaned data (feature selection, etc.)
msstats_data <- MSstatsPreprocessBig(
- paste0("reduce_output_", output_file_name),
+ intermediate_file,
output_file_name, backend, max_feature_count,Based on learnings: "In R converters across the MSstatsBig package, replace direct concatenation with file names by constructing intermediate file paths using file.path(dirname(output_file_name), paste0("prefix_", basename(output_file_name))). This ensures the intermediate file is created in the correct directory even when output_file_name contains directories."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| reduceBigDIANN(input_file, | |
| paste0("reduce_output_", output_file_name), | |
| MBR, | |
| quantificationColumn) | |
| quantificationColumn, | |
| global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff) | |
| intermediate_file <- file.path(dirname(output_file_name), | |
| paste0("reduce_output_", basename(output_file_name))) | |
| reduceBigDIANN(input_file, | |
| intermediate_file, | |
| MBR, | |
| quantificationColumn, | |
| global_qvalue_cutoff, qvalue_cutoff, pg_qvalue_cutoff) |
🤖 Prompt for AI Agents
In `@R/converters.R` around lines 186 - 190, The intermediate output path is built
by concatenating strings which breaks when output_file_name contains
directories; update the calls around reduceBigDIANN to construct the path with
file.path(dirname(output_file_name), paste0("reduce_output_",
basename(output_file_name))) (and the analogous construct used at the other
occurrence on line ~194) so the intermediate file is created in the same
directory as output_file_name; change both references to use dirname(...) and
basename(...) wrapped with file.path(...) instead of paste0("reduce_output_",
output_file_name).
Summary by CodeRabbit
New Features
Documentation