Skip to content

rlewinter/RcppPkgDemo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

About

This project uses Rcpp to create an R package implemented in C++. I use Roxygen2 to manage documentation, which turns specially formatted comments into .Rd files. These are the help files displayed when you run ?foo() in the R console, where foo() is a function.

The package consists of C++ functions for generating random numbers from the Poisson distribution using algorithms described in the book Non-Uniform Random Variate Generation by Luc Devroye.

Random number generators for various distributions are needed for many applications, such as sampling or modeling stochastic processes.

This project is intended as an exercise in package development with C++ for R. It is also a good reference and reusable foundation, since the options and setup reflect the type of requirements I'll most likely encounter in future projects.

RcppPkgDemo uses standard R Documentation accessible at the console with the ? prefix, e.g. ?hello displays documentation for hello()

Package Development Tips

Some useful keyboard shortcuts for package authoring:

  • Build and Reload Package: 'Ctrl + Shift + B'

  • Check Package: 'Ctrl + Shift + E'

  • Test Package: 'Ctrl + Shift + T'

This project uses Rcpp to create an R package implemented in C++. Most of the behind-the-scenes work is done automatically by devtools.

Note: after creating a project with initial files for making a package, run usethis::use_rcpp() to set up the package for Rcpp.

To set up the package for testing with testthat, run usethis::use_testthat().

To create a skeleton test file, run usethis::use_test("name").

To run all tests for the package, run devtools::test().

Never use library() or require(), since these modify the search path.

Use the DESCRIPTION file to specify package requirements.

Calling usethis::use_package() automatically updates DESCRIPTION.

To make C++ classes available as reference classes in R with Rcpp modules, call Rcpp::loadRcppModules().

I use roxygen2 to manage documentation, which turns specially formatted comments into .Rd files. These are the help files displayed when you run ?foo() in the R console, where foo() is a function.

Use devtools::document() to process roxygen2 comments.

Run Rcpp::compileAttributes() whenever functions are added, removed, or have their signatures changed. This is automatically done when the package is built if you are building with RStudio or devtools.

Use devtools::build() to bundle up the package in a .tar.gz

Rcpp's functionality is provided through a set of C++ classes that wrap R data structures: NumericVector, IntegerVector, CharacterVector, LogicalVector, int, double, bool, String, IntegerMatrix, NumericMatrix, LogicalMatrix, CharacterMatrix, List, DataFrame, etc.

Rcpp knows how to convert from many STL data structures to their R equivalents, so you can return them from your functions without explicitly converting to R data structures.

Rcpp also provides functions that work similarly to the equivalents in R: any(), all(), head(), tail(), abs(), ceil(), choose(), factorial(), mean(), sum(), sd(), cumsum(), match(), duplicated(), unique(), dnorm(), runif(), etc.

Learn more about Rcpp at:

Some examples and information in this demo are adapted from the following sources:

Installation

The package is available for download from GitHub, which is easily done with devtools (see code chunk below).

Since this project has to do with choosing algorithms for generating random numbers, I'll be using the R package microbenchmark to compare implementations, so the following code chunk also installs and loads the microbenchmark package.

if (!"devtools" %in% rownames(installed.packages())) {
  install.packages("devtools")
}
devtools::install_github("rlewinter/RcppPkgDemo")
library(RcppPkgDemo)

if (!"microbenchmark" %in% rownames(installed.packages())) {
  install.packages("microbenchmark")
}
library(microbenchmark)

Example comparison of vector summation functions

In RcppPkgDemo, I've implemented 2 versions of a function that loops through a numeric vector and returns the sum of all items.

  • sumC is implemented in C++ and uses a standard loop

  • sumI is implemented in C++ and uses an iterator

Now let's write a function sumR in R that uses the same strategy as sumC:

sumR <- function(x) {
  total <- 0
  for (i in seq_along(x)) {
    total <- total + x[i]
  }
  total
}

And now compare the performance results on a 1000 item vector:

x <- runif(1e3)
knitr::kable(summary(microbenchmark(sumC(x), sumI(x), sumR(x))), caption="Comparison of Vector Summations (Microseconds)")

Comparison of RcppPkgDemo functions for generating Poisson values

Functions:

  • pois_knuth - For each value generated, algorithmic complexity is proportional to parameter lambda. Underflow occurs for lambda greater than 103.972, resulting in junk data.

  • pois_junhao - Based on Knuth. This algorithm can handle values of lambda up to 88.722, after which point it fails to terminate. This is strange because it is intended to handle larger lambda than Knuth.

  • pois_iss - Uses inversion by sequential search. Fails to terminate for lambda greater than 102.873.

  • pois_mult - Uses multiplication of uniform random variates. Underflows for lambda greater than 103.972, resulting in junk data.

  • pois_eit - Uses exponential inter-arrival times. Works for large values of lambda, presumably due to use of an exponential random variate instead of uniform, like every other function, since this does not require storage of very small numbers.

knitr::kable(summary(microbenchmark(pois_mult(1, 100000), pois_knuth(1, 100000), pois_eit(1, 100000), pois_iss(1, 100000), pois_junhao(1, 100000))), caption = "Comparison of Poisson Generators (Milliseconds)")

About

An example of creating an R package with functions implemented in C++ using Rcpp

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors