-
Notifications
You must be signed in to change notification settings - Fork 2
feat: Allow comparison of XVector objects (part 2 of 2) #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: devel
Are you sure you want to change the base?
Conversation
…sing C stack overflow
Hi Aidan, Sorry for the slow response. Comparing 2 DNAString objects (or other XString derivatives) with
I made this decision a long time ago, based on my feeling at the time that this semantic would be more useful than a vectorized letter-wise comparison. The letter-wise comparison can be useful too, and you can obtain it by coercing the 2 objects to
I don't know if this is documented somewhere but it probably should. Had I made the decision to go the other way around (i.e. have Note that XRaw objects (which XString objects derive from) also compare atomically:
Anyways, for better or worse, we're stuck with the atomic comparison 😉 This means that Note that all these comparisons are already implemented for XStringSet derivatives so an easy way to make them work on XString objects is to coerce the latter to the former (this coercion is a no-copy operation so is very cheap). Something like this:
Note that there's no need to define the
Thanks for working on this. The S4Vectors/IRanges/XVector/Biostrings code base is a maze with 2 levels, the R level and the C level, each of them split across 4 packages. So 8 regions! It can be hard to navigate and understand the interactions between the 8 regions. It's great that you were able to find your way from the R level in Biostrings all the way down to the C level in S4Vectors. This was a good exercise that will help you in the future. Thanks again and let me know if you have questions, H. |
Sorry for the slow response -- thanks for the feedback!
Makes sense. I can change this to match that implementation. At the very least, there's an
Also makes sense. I can open a PR with that change after I fix these PRs. I'll work on updating these tomorrow. The only remaining issue is what to do in a 1:2 == 1:2
## [1] TRUE TRUE
XInteger(2,1:2) == XInteger(2,1:2)
## [1] TRUE
1:2 == XInteger(2,1:2)
## [1] TRUE TRUE (?)
## [1] TRUE (?) I'm thinking the best implementation is to just coerce the function(x_vec, y_any){
x == as(y_any, class(x))
} While this breaks consistency with the base R types, it does preserve the expected behavior of comparing XInteger(2,1:2) == as("A", "XInteger")
## Error in as("A", "XInteger") :
## no method or default for coercing “character” to “XInteger” which is probably a nice functionality. Maybe I'll also add in some documentation updates to man page to address the element-wise comparison you mentioned; I agree that it should probably be documented somewhere. |
New update in response to these comments -- the following comparisons now work:
I've also left in the
I've also updated the relevant documentation files to add a description of both of these functionalities for users. Some brief notes on decisions made here:
Example of faulty setMethod("order", "XVector",
function(..., na.last=TRUE, decreasing=FALSE, method=c("auto", "shell", "radix")){
args <- list(...)
if (length(args) == 1L) {
x <- args[[1L]]
SharedVector.order(x, decreasing)
} else {
args <- unname(args)
do.call(order, c(args, list(na.last=na.last,
decreasing=decreasing,
method=method)))
}
}
)
x <- XInteger(10)
y <- XInteger(10)
order(x,y) ## errors using do.call, works using lapply(args, order, decreasing=decreasing) |
e6df994
to
4198784
Compare
4198784
to
6e6e640
Compare
couple force pushes to make commit amends, no major changes from last comment |
Hi Aidan, There's a lot going on here and I think it's going to be easier/simpler if we leave aside the XInteger comparison for now. I think the PR should just focus on XString objects. More precisely, it should focus on:
Right now most of these comparisons are broken, as documented in Biostrings' TODO file here: https://github.com/Bioconductor/Biostrings/blob/44e8aa36353658a53a8ff7141082c024b6b39b9c/TODO#L62-L114 Comparisons between XInteger objects or between XDouble objects are also broken in a similar way, and they will need to be addressed at some point. However, fixing them will have additional challenges. Plus these comparisons are not related to or used by Biostrings, so I'd say it's ok to leave them alone for now. So do you think you can modify this PR, or create a new PR (up to you), that focuses on (a) and (b) above? Given my previous comment from Jan 18 above, the new (or modified) PR should be much simpler. In particular, I don't anticipate the need to touch anything at the C level. Hope this makes sense, Thanks, |
Sorry for the slow followup -- haven't had a lot of bandwidth outside of finishing up my dissertation. I've adapted this PR into a Biostrings-specific one available here: Bioconductor/Biostrings#130 This addresses these issues specifically for XStrings (and XString-character comparisons). That should be sufficient for now; feel free to close this PR or leave it open for reference when the aforementioned issues with XInteger/XDouble need to be addressed. |
This is part 2 of a PR, you can find part 1 here: Bioconductor/S4Vectors#127
Background is covered in that PR description; this one will just cover was wasn't mentioned there.
This PR implements the XVector methods required to allow comparisons between XVectors. After merging both these PRs, the following are now supported:
And similarly for
XRaw
andXDouble
objects.Known Issues:
SharedVector.order
orXVector.order
. That implementation will depend on the backend being implemented in S4Vectors (see this comment).sameAsPreviousROW
function exists forSharedVector
objects, onlyXVector
. Do we need one forSharedVector
? Since the class isn't exported I'm not sure if it's necessary.