Dealing with memory and speed issues in permutation analysis for large datasets #2

Open

areyesq89 wants to merge 4 commits into tengmx:master from areyesq89:integerFix

Conversation

@areyesq89

Hi @tengmx,

I ran into the problem reported here and dug into debugging the error. It is related to the gcapcPeaks function. It turns out that, for some datasets, the vector of permuted values can be extremely large (>50 billion elements), and the functions quantile and density simply break on it. I solved this by adding an option to draw a uniform random sample of the permuted values; a new parameter permsamp= specifies the fraction of permuted values to use as the sample size.
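
Roughly, the idea looks like the sketch below (a hypothetical subsamplePerms helper shown for illustration only, not the exact code in this PR):

```r
## Simplified sketch of the subsampling idea: instead of running
## quantile()/density() on the full vector of permuted values, draw a
## uniform random subsample whose size is a fraction `permsamp` of the
## full length, and compute the summaries on that subsample.
subsamplePerms <- function(perms, permsamp = 0.05) {
  n <- length(perms)
  k <- max(1L, floor(n * permsamp))
  perms[sample.int(n, size = k)]
}

## quantile() and density() then operate on a vector of manageable size:
perms <- runif(1e6)  # stand-in for the permuted values
cutoff <- quantile(subsamplePerms(perms), probs = 0.95)
```

Since the subsample is drawn uniformly at random, the quantile and density estimates should be essentially unchanged for any reasonable permsamp, while the memory footprint drops by that same fraction.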

I also modified some lines of code to speed them up. For the example dataset, these changes shorten the run by only ~5 seconds, but for larger datasets they make a more substantial difference.

This version passes R CMD check without problems. Let me know if these suggestions make sense!

Alejandro
