Lately I've been working with C++ code. I've got it to work, I've got it right, and now I need to make it fast. The first step of optimization is always profiling, except that I'm working on an R package (and Python) which calls the C++ code. So how do I profile across different languages?

Fortunately there's a useful Stack Overflow answer, on which I'm basing most of this post. We'll assume you're developing an R package in the directory ~/pkg, with a src sub-directory that contains your C++ code. Simply follow these steps:

1. Install the gperftools library: follow instructions for your particular system (for me it's a simple pacman -S gperftools).
2. Add -lprofiler to the PKG_LIBS line of the Makevars file in ~/pkg/src. Mine looks like PKG_LIBS = $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) -lprofiler.
3. Create a new file in ~/pkg/src with the following contents:

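#include <Rcpp.h>
#include "gperftools/profiler.h"

using namespace Rcpp;

// Start the gperftools CPU profiler; samples are written to the file named by `str`.
// [[Rcpp::export]]
SEXP start_profiler(SEXP str) {
    ProfilerStart(as<const char*>(str));
    return R_NilValue;
}

// Stop the profiler and flush the collected samples to the output file.
// [[Rcpp::export]]
SEXP stop_profiler() {
    ProfilerStop();
    return R_NilValue;
}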

5. Profile by wrapping the R code you want to measure in calls to the two functions:

start_profiler("/tmp/profile.out")
run_your_cpp_stuff()
stop_profiler()

start_profiler("/tmp/another-thing.out")
that_other_thing(1:1000)
stop_profiler()


where the argument to start_profiler is the file in which to save the profile. You can collect multiple profiles; just save each one to its own file.

6. Read the profile with pprof --text pkg/src/pkg.so /tmp/profile.out. My stuff is small enough that just reading the text output is sufficient to identify my bottlenecks. There are also fancier views, such as pprof --gv pkg/src/pkg.so /tmp/profile.out, which displays the annotated call graph graphically.
7. Optimize and repeat.
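
The steps above start and stop the profiler from R. If you would rather profile a single region inside the C++ code itself, one option is a small scope guard that starts profiling when it is constructed and stops when it goes out of scope, even if the region throws. What follows is only a sketch: ScopedProfile and demo_profile_events are hypothetical names of my own, not part of gperftools or Rcpp, and the start and stop actions are passed in as callbacks so the example compiles without the profiler library.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of gperftools or Rcpp): calls start(path) on
// construction and stop() on destruction, so the stop call also runs when the
// profiled region exits through an exception.
class ScopedProfile {
public:
    using StartFn = std::function<void(const std::string&)>;
    using StopFn = std::function<void()>;

    ScopedProfile(const StartFn& start, StopFn stop, const std::string& path)
        : stop_(std::move(stop)) {
        start(path);
    }

    ~ScopedProfile() {
        if (stop_) stop_();
    }

    // Non-copyable: exactly one stop call per start call.
    ScopedProfile(const ScopedProfile&) = delete;
    ScopedProfile& operator=(const ScopedProfile&) = delete;

private:
    StopFn stop_;
};

// Stand-in demo: records the calls instead of touching a real profiler.
// In a package linked with -lprofiler, the two lambdas would instead call
// ProfilerStart(path.c_str()) and ProfilerStop() from gperftools/profiler.h.
std::vector<std::string> demo_profile_events() {
    std::vector<std::string> events;
    try {
        ScopedProfile guard(
            [&events](const std::string& path) { events.push_back("start " + path); },
            [&events]() { events.push_back("stop"); },
            "/tmp/profile.out");
        throw std::runtime_error("simulated failure in the profiled region");
    } catch (const std::runtime_error&) {
        events.push_back("caught");
    }
    return events;
}
```

Calling demo_profile_events() from a main function or a test shows the order of events: the guard's destructor runs during stack unwinding, so the stop callback fires before the catch block does. Injecting the callbacks is just a convenience for the sketch; in the package you could hard-code the two gperftools calls instead.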