10 Debugging in R
Debugging in the narrow sense is the finding and fixing of errors in the code. In the broader sense, debugging encompasses the activities that help us understand, analyze, improve, and fix code. This can take on many forms: analysis of log files, memory dumps, reading the code (!), control-flow analysis, profiling, debug messages, interactive debugging, and so on.
Profiling tools, for example, break down the time spent in a program by function calls, even down to individual lines of code. In order to optimize code for speed, you need to profile it first. You might be surprised where a program spends most of the execution time. To reduce run time you can then focus code optimization on those parts of the program.
Not all bugs are created equal.
- Any defect related to security or data privacy is critical.
- In analytic software, wrong-number-bugs are critical bugs.
- The dreaded silent wrong-number bug is the worst. When you compute an \(R^2\) and report it as 1,530, most educated users of the program would recognize this as an obvious bug, since almost everyone knows that \(0 \le R^2 \le 1\). Still, it is a bad bug. A silent wrong-number bug is to compute the \(R^2\) as 0.75 when it should be 0.6. There is no indication to the user of the software that the reported number is incorrect.
10.1 Profiling
Profiling code gives you an idea of how much time is spent executing the lines of code. This sounds simple, but is actually quite tricky. First, you need a way to reliably measure time, and you need to distinguish different definitions of “time” on a computer. In most situations we are interested in wall-clock time: the amount of time elapsed, measured by taking the difference between the time of day just before and just after the run.
Measuring Execution Time
To time the code we could wrap it in calls to Sys.time and take the difference:
st <- Sys.time()
for (i in 1:200) { mad(rnorm(1000)) }
diff <- Sys.time() - st
diff
Time difference of 0.01803303 secs
Sys.time returns the absolute date and time of day; the precision is on the order of milliseconds but depends on the operating system.
In order to measure the execution time of code, it is recommended to use proc.time() instead, which measures the time the R process has spent on execution. proc.time() returns three numbers:
- the CPU time charged for the execution of the user instructions,
- the system time charged for execution by the system on behalf of the calling process,
- the elapsed (wall-clock) time.
proc.time()
user system elapsed
0.339 0.030 0.386
Wrapping the loop we timed previously with Sys.time in calls to proc.time results in the following:
st <- proc.time()
for (i in 1:200) { mad(rnorm(1000)) }
diff <- proc.time() - st
diff
user system elapsed
0.016 0.001 0.017
You can simplify this operation by using system.time instead of proc.time. system.time makes calls to proc.time at the beginning and the end of the code execution and reports the difference:
system.time(for(i in 1:200) mad(rnorm(1000)))
user system elapsed
0.016 0.000 0.016
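The distinction between CPU time and elapsed (wall-clock) time becomes obvious when timing a call that merely waits. Sys.sleep consumes essentially no CPU, so user and system times stay near zero while the elapsed time is about one second:
# Sys.sleep() waits without doing work: elapsed ~1 second, user/system near 0
system.time(Sys.sleep(1))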
Micro-benchmarking
Benchmarking is the comparison of code alternatives. Sometimes, one of the alternatives is an established set of tests or conditions against which a product is evaluated. TPC, for example, is a non-profit organization that establishes benchmark tests for databases. Right or wrong, if you develop a new database, eventually you will have to evaluate the database against the various TPC benchmarks.
More often benchmarking comes into play in evaluating alternative ways of accomplishing the same result. Which one is faster? Which one requires less memory? Does it matter? This form of benchmarking uses packages designed to evaluate code execution with sub-millisecond precision. One such package in R is microbenchmark. It gives a more accurate comparison than the frequently seen system.time(replicate(1000, expr)) expression. However, the package is not available for all versions of R. For example, it failed to install under R 4.3.0 on my M2 Mac. Another package is bench, which we will use here.
The idea of micro-benchmarking is to run a piece of code multiple times, either a fixed number of times or for a certain length of time and to describe the statistical properties of the run times (min, max, mean, std. deviation, median) as well as other characteristics (memory consumption, garbage collection, etc.).
Consider the following three approaches to compute the square root of the elements of a vector:
x <- rep(1,1000)
s1 <- sqrt(x)
s2 <- x^0.5
s3 <- exp(log(x) * 0.5)
Mathematically, they are identical. Numerically, they are equivalent in the sense that they lead to the same answer within the limits of finite precision:
sum(s1-s2)
[1] 0
sum(s1-s3)
[1] 0
To see which of the three methods is fastest, we can run a micro-benchmark with bench::mark():
library(bench)
bm <- mark(
  sqrt(x),
  x^0.5,
  exp(log(x)*0.5),
  time_unit='us',
)
bm
# A tibble: 3 × 6
expression min median `itr/sec` mem_alloc `gc/sec`
<bch:expr> <dbl> <dbl> <dbl> <bch:byt> <dbl>
1 sqrt(x) 0.943 1.31 602282. 7.86KB 60.2
2 x^0.5 3.36 3.73 242654. 7.86KB 24.3
3 exp(log(x) * 0.5) 5.21 5.90 164260. 7.86KB 32.9
plot(bm, type="violin")
Figure 10.1: Violin plot of the run-time distributions from the micro-benchmark.
The time_unit='us' option requests that all times are reported in microseconds. If you do not specify this parameter, the times can be reported in different units for the code alternatives. By default, bench::mark runs each expression for at least (min_time=) 0.5 seconds and for up to (max_iterations=) 10,000 iterations.
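You can override these defaults. A minimal sketch, assuming the vector x from above and the parameter names as documented for bench::mark:
# Run each expression for at least 1 second, but at most 5,000 iterations
bm2 <- mark(
  sqrt(x),
  x^0.5,
  min_time = 1,
  max_iterations = 5000,
  time_unit = 'us'
)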
The built-in sqrt function is faster than the power computation and much faster than exponentiating the logs.
The violin plot of the results shows that the distribution of the run times is heavily skewed to the right (Figure 10.1). Some individual function calls can exceed the majority of the calls by orders of magnitude—note that the horizontal axis of the plot is on the log scale! You should avoid benchmarking based on the mean run time and instead use the median run time.
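As an illustration (treating the column name as an assumption about the bench result object), the individual run times are stored in the list-column time of bm, so you can compare the two summaries directly:
# Run times (in seconds) for the first expression, sqrt(x)
times <- as.numeric(bm$time[[1]])
mean(times)     # inflated by the few very slow runs
median(times)   # a better summary of the typical run time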
Visual Profiler in RStudio
Wrapping code in system.time is convenient if you want to know the (user, system, elapsed) time for a chunk of code. Micro-benchmarking is helpful to evaluate alternative ways of writing expressions. To measure the performance of larger pieces of code, down to the level of function calls and individual lines of code, we use a profiler.
To get a detailed analysis of how much time is spent on lines of user-written code, and how much memory is allocated/de-allocated for those lines of code, you can use the profiler tool that is built into RStudio. You first need to load the profvis library, then wrap the code you wish to profile in a call to profvis({code-goes-here}). After the profiling run RStudio opens a window with two tabs. The flame graph provides a visual representation of the time spent in code execution, stacking functions that call each other. The Data tab provides a tabular breakdown—many users find that easier to comprehend than the flame graph.
The following example calls the profiler for code that computes the histogram, mean, and standard deviation of 1,000 bootstrap samples of the trimmed mean in a sample of \(n=20,000\) observations from a \(G(0,2)\).
library(profvis)
profvis({
  set.seed(542)
  n <- 20000
  x <- rnorm(n, mean=0, sd=sqrt(2))

  bagger <- function(x, b=1000) {
    n <- length(x)
    estimates <- rep(0, b)
    for (i in 1:b) {
      # Draw a bootstrap sample
      bsample <- sample(n, n, replace=TRUE)
      # Compute the trimmed mean of the bootstrap sample
      estimates[i] <- mean(x[bsample], trim=0.15)
    }
    # Compute mean and standard deviation of the estimator
    mn = mean(estimates)
    sdev = sd(estimates)
    hist(estimates)
    return (list(B=b, mean=mn, sd=sdev))
  }
  bagger(x, b=1000)
})
Each block in the flame graph represents a call to a function, or possibly multiple calls to the same function. The width of the block is proportional to the amount of time spent in that function. When a function calls another function, a block is added on top of it in the flame graph. Memory reported in the Data view shows allocated memory as positive values and de-allocated memory as negative values (in megabytes).
The profiler is a great tool for long-running code, and can point you at opportunities for optimizing execution time. However, you need to be aware of how it works and the resulting issues:
- The profiler samples the code. Every few milliseconds the profiler stops the interpreter, records which function call it is in, and traces back through the call stack. You will get different results from run to run, depending on which lines of code and which functions are being sampled. The variability in profiling results affects quickly executing functions the most—they might get sampled once or a few times in one profiling run and might be skipped entirely in another. Fortunately, those are the functions of least interest in code optimization—you focus on the code where the program spends most of the time.
- If your code runs fast, the profiler might not hit any lines of code. On the other hand, you should be able to reliably hit the long-running parts of the code with the profiler.
- The profiler does not record some built-in R functions or code written in other languages. Many routines call into C, C++, or even Fortran code. The profiler records an overall time for that code but does not provide a breakdown.
- The call stack might appear to be in a reverse order from the code. This is because R's lazy evaluation model might call a routine only when it is needed, rather than where it is specified in the code.
- You might see a GC entry in the profiler. This represents the garbage collector, which frees resources no longer needed. Garbage collection is unpredictable, depends on the overall state of the system, and is also time consuming—it can throw off the profiling results if it occurs during a run. However, if you consistently find garbage collection in the profiling run, it can be a sign that the code is spending a lot of time freeing and reallocating memory. There will be corresponding memory allocations and this represents an opportunity to optimize the code to be more memory efficient. For example, instead of adding elements to a vector inside a loop, pre-allocating the required memory and filling in the elements will trigger fewer reallocations of memory.
The following code adds elements to a vector in a loop. R has to increase the size of the vector several times, leading to memory allocation and garbage collection (freeing) of previously allocated memory.
profvis({
  x <- integer()
  for (i in 1:10000) {
    x <- c(x, i)
  }
})
You could create a vector of the same length with a single memory allocation and without garbage collection with this code:
x <- rep(1, 10000)
If you fill values in a loop, pre-allocate the result vector to avoid garbage collection:
x <- integer(10000)
for (i in 1:length(x)) { x[i] <- i }
10.2 Tracebacks
A traceback is a listing of the call stack of the program. Tracebacks are helpful to see which functions were called when an error occurs. Consider the following example from “Advanced R” by Wickham (2019). Function f calls function g, which calls function h, and so forth.
f <- function(a) g(a)
g <- function(b) h(b)
h <- function(c) i(c)
i <- function(d) {
  if (!is.numeric(d)) {
    stop("`d` must be numeric", call. = FALSE)
  }
  d + 10
}
When we call f with a non-numeric argument, an error occurs.
f("a")
Error: `d` must be numeric
If you are in RStudio, you can now click the “Show Traceback” button next to the error message (Figure 10.2).
Alternatively, you can see the traceback with the traceback() function.
traceback()
10.3 The Interactive Debugger
The RStudio IDE has an interactive debugging tool. Interactive debugging allows you to stop code execution at breakpoints, step through the code, interact with function evaluations, see and change the value of variables, and so forth. It is a valuable tool to find problems in programs and to learn how code works. It is highly recommended to step into the functions you frequently use and examine the code line by line—you learn a lot about how your favorite packages work.
Suppose you want to run a classification model using adaptive boosting on the banana quality data. This data set contains ratings of the fruit quality (Good, Bad) and fruit attributes such as size, weight, sweetness, etc., for 4,000 bananas.
library("duckdb")
con <- dbConnect(duckdb(), dbdir="ads.ddb", read_only=TRUE)
ban_train <- dbGetQuery(con, "SELECT * FROM banana_train")
dbDisconnect(con)
head(ban_train)
Size Weight Sweetness Softness HarvestTime Ripeness
1 1.706644 -0.03692624 -4.46344950 -1.51004720 4.5640225 -0.04171263
2 3.703947 1.11884890 -3.04337740 0.02899801 -0.9705806 -1.46996620
3 -3.888493 1.32609500 0.04608246 2.25944950 0.5068415 0.73710510
4 -3.052952 -0.58796227 -1.63466790 1.04902060 -0.2098602 -1.81881750
5 1.965954 -1.37867620 -3.14279270 -3.24607060 -0.6116899 1.81526600
6 2.306789 -3.69348800 -0.04255614 0.70101670 2.7060213 3.96723440
Acidity Quality
1 4.3266883 Good
2 -0.5881153 Good
3 2.5832198 Bad
4 3.9573660 Bad
5 -0.5000507 Bad
6 -0.9244213 Good
Suppose we want to debug the code for adaptive boosting in the ada package. The following statements load the library and invoke the ada::ada function on the banana training data. Prior to executing the code, execute debugonce(ada::ada) in the Console. This will set a breakpoint at the entry point of the function.
library(ada)
adab <- ada(Quality ~ .,
            data=ban_train,
            control=rpart.control(maxdepth=1, cp=-1, minsplit=2, xval=0)
            )
When the code above is executed, a browser window pops up when the breakpoint at ada::ada() is hit. The browser shows the R source code of the function and highlights the current line of the code (Figure 10.3). The frame of the Console window now shows five icons through which you can navigate the code (Figure 10.4).
- Next (n): Execute the current line of code and move to the next line.
- Step Into (s): Step into the function called on the current line.
- Step Out (f): Run to the end of the current function (or loop) and step back out to the calling function.
- Continue (c): Stop interactive debugging and continue execution of the code.
- Stop (Q): Stop interactive debugging and terminate execution of the code.
Figure 10.5 shows the result of stepping into the ada function. This brings you to the ada.formula function, where the model expression is parsed. Advancing with several Next steps lands the cursor on the call to ada.default. If you now click Step Into, the source file for ada.default is loaded and you can advance through that function. If you instead click Next, the call to ada.default completes and debugging resumes on line 22.
During interactive debugging the Environment pane shows the values of the variables and objects in the code and the call stack (Figure 10.6).
You can invoke the interactive debugger through other methods as well:
- Setting a breakpoint in the code by either clicking on the line number or with Shift-F9. I have found this method to not work reliably in .Rmd and .qmd documents. It does work in R scripts (.R files).
- Invoking the browser() command anywhere in the code.
- Using debug or debugonce (see the sketch after this list). The former sets a breakpoint that invokes the interactive debugger at every invocation of the function, until the undebug function removes the breakpoint. debugonce sets a breakpoint for a single execution.
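The last two approaches can be sketched as follows, using the built-in functions mean and sd purely for illustration:
# browser(): pause execution at a specific point in your own code
f <- function(x) {
  browser()      # the interactive debugger opens here
  x + 1
}

# debug()/undebug(): break on every call to a function
debug(mean)      # every call to mean() now starts the debugger
mean(1:10)
undebug(mean)    # remove the breakpoint

# debugonce(): break only on the next call
debugonce(sd)
sd(1:10)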
Debugging works on the interpreted R statements. Many functions call compiled code in other languages, C and C++, for example. The interactive RStudio debugger cannot step through this compiled code.