R functions I wish I knew earlier
Much of my time spent programming in R goes into finding ways to automate the creation of shiny reports. Over the past two years, I’ve found the following functions particularly useful.
1. browser and shinyjs::logjs - debugging functions
browser interrupts a function and opens an interactive R session with access to the environment that browser was called from, while logjs sends objects to the (web) browser console.
Both tools are invaluable for debugging and understanding code. I find logjs to be a good backup to browser: for example, I may not have RStudio installed on a production server, but I still need better diagnostics than error logs.
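As a minimal sketch of how browser can be dropped into your own code, the function below only pauses when something looks wrong, and only in an interactive session, so it does nothing when run from a script. The name safe_ratio is hypothetical and used purely for illustration.

```r
# Hypothetical helper for illustration: divide x by y and pause for
# inspection if the result contains non-finite values (Inf, -Inf, NaN, NA).
safe_ratio <- function(x, y) {
  ratio <- x / y
  # Only open the debugger in an interactive session, and only when needed.
  if (interactive() && any(!is.finite(ratio))) {
    browser()
  }
  ratio
}

# No non-finite values here, so the debugger is never opened.
safe_ratio(c(1, 2), c(2, 4))

# In an interactive session this call would stop inside safe_ratio():
# safe_ratio(1, 0)
```

At the browser prompt, n steps to the next line, c continues execution, Q quits, and ls() lists the objects in the function environment.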
One great use of browser is to add it to a function you want to understand better. For instance, let’s look at using it in Vectorize:
Vectorize
## function (FUN, vectorize.args = arg.names, SIMPLIFY = TRUE, USE.NAMES = TRUE)
## {
## arg.names <- as.list(formals(FUN))
## arg.names[["..."]] <- NULL
## arg.names <- names(arg.names)
## vectorize.args <- as.character(vectorize.args)
## if (!length(vectorize.args))
## return(FUN)
## if (!all(vectorize.args %in% arg.names))
## stop("must specify names of formal arguments for 'vectorize'")
## collisions <- arg.names %in% c("FUN", "SIMPLIFY", "USE.NAMES",
## "vectorize.args")
## if (any(collisions))
## stop(sQuote("FUN"), " may not have argument(s) named ",
## paste(sQuote(arg.names[collisions]), collapse = ", "))
## rm(arg.names, collisions)
## (function() {
## FUNV <- function() {
## args <- lapply(as.list(match.call())[-1L], eval,
## parent.frame())
## names <- if (is.null(names(args)))
## character(length(args))
## else names(args)
## dovec <- names %in% vectorize.args
## do.call("mapply", c(FUN = FUN, args[dovec], MoreArgs = list(args[!dovec]),
## SIMPLIFY = SIMPLIFY, USE.NAMES = USE.NAMES))
## }
## formals(FUNV) <- formals(FUN)
## environment(FUNV) <- parent.env(environment())
## FUNV
## })()
## }
## <bytecode: 0x0000000011e2dc00>
## <environment: namespace:base>
Boy, that’s pretty complicated. But now, maybe we can add a call to browser() in the code to better understand what is happening, line by line.
Vectorize_browser <- function(FUN, vectorize.args = arg.names,
                              SIMPLIFY = TRUE, USE.NAMES = TRUE)
{
  # pause here so the rest of the body can be stepped through line by line
  browser()
  arg.names <- as.list(formals(FUN))
  arg.names[["..."]] <- NULL
  arg.names <- names(arg.names)
  vectorize.args <- as.character(vectorize.args)
  if (!length(vectorize.args))
    return(FUN)
  if (!all(vectorize.args %in% arg.names))
    stop("must specify names of formal arguments for 'vectorize'")
  collisions <- arg.names %in% c("FUN", "SIMPLIFY",
                                 "USE.NAMES", "vectorize.args")
  if (any(collisions))
    stop(sQuote("FUN"), " may not have argument(s) named ",
         paste(sQuote(arg.names[collisions]), collapse = ", "))
  FUNV <- function() {
    args <- lapply(as.list(match.call())[-1L], eval, parent.frame())
    names <- if (is.null(names(args)))
      character(length(args))
    else names(args)
    dovec <- names %in% vectorize.args
    do.call("mapply", c(FUN = FUN, args[dovec], MoreArgs = list(args[!dovec]),
                        SIMPLIFY = SIMPLIFY, USE.NAMES = USE.NAMES))
  }
  formals(FUNV) <- formals(FUN)
  FUNV
}
Try this in your own session!
## not run
Vectorize_browser(stats::integrate, vectorize.args = c('lower', 'upper'))
2. dput
dput does the reverse of writing code: for most R objects, it prints code that recreates the object. Consider:
x <- list(a=1:7)
dput(x)
## list(a = 1:7)
In addition, dput() can write to files and can preserve quoted expressions so that their evaluation is delayed. I find this useful when you want an end user to be able to read and change a bit of code, but you also want to treat that code as an object (such as a specification list for shiny applications). (Also, see ?.deparseOpts for various ways to style dput output, including options to preserve quoted expressions.)
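The following is a small sketch of that workflow, not a definitive recipe. The specification list and its field names (title, filter, n_tabs) are made up for illustration. It writes the list to a temporary file with dput, reads it back with dget, and shows that a quoted expression survives the round trip when control = "all" is used.

```r
# Hypothetical specification list: the filter is a quoted (unevaluated) call.
spec <- list(
  title  = "Monthly report",
  filter = quote(cyl > 4),
  n_tabs = 2L
)

spec_file <- tempfile(fileext = ".R")

# control = "all" includes "quoteExpressions", so the call is written
# as quote(cyl > 4) rather than as bare code.
dput(spec, file = spec_file, control = "all")

# dget() parses and evaluates the file, giving back the list.
spec_back <- dget(spec_file)

# The filter is still an unevaluated call...
is.call(spec_back$filter)

# ...and can be evaluated later against whatever data is at hand.
rows <- eval(spec_back$filter, datasets::mtcars)
sum(rows)
```

With the default control settings, the call would be written as plain code, and dget would try to evaluate it immediately instead of keeping it quoted.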
3. rlang::call2 and splicing with !!!
Sometimes you want to give an end user the ability to swap out the functions that are executed on data. This can happen when the data, filtering, grouping, and merging stay fixed, but the functions applied need to change. While dplyr has many programmatic options for this, data.table does not. Fortunately, using call2 and splicing simplifies this procedure.
Consider the “assign by reference” function data.table::`:=`. This function adds or updates columns of a data.table by reference, that is, in place, without copying the table. It’s basically mutate from dplyr, but more efficient, and it works together with joins and data.table’s i argument. How can we allow a user to construct their own calls to :=?
library(data.table)
library(rlang)
data(mtcars)
setDT(mtcars)
user_functions <- list(var1 = c('sum', 'hp'),
var2 = c('prop.table', 'mpg'))
# if you don't trust the user, make sure to approve the functions
# they may use first - may only want to allow sum, unique, prop.table, etc...
calls <- lapply(user_functions,
function(x) call(x[[1]], as.name(x[-1])))
mtcars[,eval(call2(':=', !!!calls ))]
## var1 var2
## 1: 4694 0.03266449
## 2: 4694 0.03266449
## 3: 4694 0.03546430
What we’ve done is splice language objects into :=. Before splicing, it was hard to construct these sorts of calls because the arguments to := have to be written out one by one inside the call.
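To see the splicing step in isolation, here is a hedged sketch that leaves data.table out and evaluates the spliced call against a plain data frame. The helper build_call, the allowlist allowed_functions, and the column names total_hp and mean_mpg are all hypothetical names introduced for this example. The allowlist check is one possible way to act on the "approve the functions first" comment in the code above.

```r
library(rlang)

# Hypothetical allowlist of function names a user may request.
allowed_functions <- c("sum", "mean", "prop.table")

# Turn c("sum", "hp") into the call sum(hp), refusing unknown functions.
build_call <- function(spec, allowed = allowed_functions) {
  fun_name <- spec[[1]]
  if (!fun_name %in% allowed) {
    stop("function not on the allowlist: ", fun_name)
  }
  # Splice the column names in as symbols: sum(hp), mean(mpg), ...
  call2(fun_name, !!!lapply(spec[-1], as.name))
}

user_spec <- list(total_hp = c("sum", "hp"),
                  mean_mpg = c("mean", "mpg"))
calls <- lapply(user_spec, build_call)

# Splice the named calls into a single call:
# list(total_hp = sum(hp), mean_mpg = mean(mpg))
combined <- call2("list", !!!calls)

# Evaluate with the data frame columns in scope.
result <- eval(combined, datasets::mtcars)
result$total_hp
```

The names of the spliced list become argument names in the resulting call, which is the same mechanism that produced var1 and var2 in the := example.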
However, even this problem can be overcome with do.call.
4. do.call - a combination of call and eval
Let’s say you want to programmatically add a certain number of tabs to a shiny report. Perhaps you get stuck because shiny::tabsetPanel takes only ..., so you know it’s expecting tabs to be added sequentially, i.e. shiny::tabsetPanel(tabPanel('tab1'), tabPanel('tab2')). You can always overcome this difficulty with do.call.
my_tabs <- c('tab1', 'tab2')
my_tabPanels <- lapply(my_tabs, shiny::tabPanel)
# won't work - shiny::tabsetPanel(... = my_tabPanels)
do.call(shiny::tabsetPanel, args = my_tabPanels)
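The same idea can be tried without shiny. The sketch below uses paste as a stand-in for any function that takes its inputs through ..., and also spells out the "call plus eval" reading of do.call from the heading. The variable names are illustrative only.

```r
labels <- list("tab1", "tab2", "tab3")

# do.call() passes each list element as a separate argument, so this is
# the same as paste("tab1", "tab2", "tab3", sep = " | ").
joined <- do.call(paste, c(labels, sep = " | "))

# The same thing in two steps: build the call, then evaluate it.
paste_call <- as.call(c(list(as.name("paste")), labels, sep = " | "))
joined_again <- eval(paste_call)

identical(joined, joined_again)
```

Named elements of the list, such as sep here, are matched to named arguments of the function, and unnamed elements are passed by position.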
5. system.file
system.file is the answer to self-referencing within your own R package. Given just a package name, it finds a file or folder inside that package’s installed directory, wherever that is on the user’s machine.
system.file('help', package='base')
## [1] "C:/PROGRA~1/R/R-41~1.2/library/base/help"
For those who make heavy use of the ‘inst’ directory and auxiliary JavaScript and CSS files, it’s nice to have a way to access them that works from machine to machine.
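A short sketch of how this behaves in practice, using the stats package so that it runs anywhere. The package name myreports and the file www/custom.css in the final comment are hypothetical.

```r
# A file that ships with the stats package.
desc_path <- system.file("DESCRIPTION", package = "stats")
file.exists(desc_path)

# A missing file gives an empty string rather than an error...
missing_path <- system.file("no-such-file.css", package = "stats")
identical(missing_path, "")

# ...unless mustWork = TRUE, which turns a missing file into an error.
strict_result <- tryCatch(
  system.file("no-such-file.css", package = "stats", mustWork = TRUE),
  error = function(e) "error raised"
)

# Hypothetical: in your own package 'myreports', a file kept at
# inst/www/custom.css in the source tree would be located after
# installation with:
# system.file("www", "custom.css", package = "myreports")
```

Checking for the empty string, or using mustWork = TRUE, catches a misspelled path early instead of passing "" on to code that expects a real file.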