Lately I've been working with C++ code. I've got it to work, I've got it right, and now I need to make it fast. The first step of optimization is always profiling. Except my C++ code is called from an R package (and from Python). So how do I profile across different languages?

Fortunately, there's a useful Stack Overflow answer on which I'll base most of this post. We'll assume you're developing an R package in the directory ~/pkg, with a src sub-directory that contains your C++ code. Simply follow these steps:

  1. Install the gperftools library: follow instructions for your particular system (for me it's a simple pacman -S gperftools).
  2. Add -lprofiler to the PKG_LIBS variable in the Makevars file in ~/pkg/src. Mine looks like PKG_LIBS = $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS) -lprofiler.
  3. Create a new C++ source file in ~/pkg/src with the following contents:

    #include <Rcpp.h>
    #include "gperftools/profiler.h"
    
    using namespace Rcpp;
    
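    // Start the gperftools CPU profiler; samples go to the file named by `str`.
    // [[Rcpp::export]]
    SEXP start_profiler(SEXP str) {
      ProfilerStart(as<const char*>(str));
      return R_NilValue;
    }
    
    // Stop the profiler and flush the collected samples to that file.
    // [[Rcpp::export]]
    SEXP stop_profiler() {
      ProfilerStop();
      return R_NilValue;
    }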
    
  4. Recompile and reload your package.
  5. Profile by wrapping the R code you care about in calls to the two new functions:

    start_profiler("/tmp/profile.out")
    run_your_cpp_stuff()
    stop_profiler()
    
    start_profiler("/tmp/another-thing.out")
    that_other_thing(1:1000)
    stop_profiler()
    

    where the argument to start_profiler is the name of the file in which to save the profile. You can collect multiple profiles; just save each one to its own file.

  6. Read the profile with pprof --text ~/pkg/src/pkg.so /tmp/profile.out. My stuff is small enough that reading the text output is sufficient to identify my bottlenecks. There are also fancier views, such as the graphical call graph from pprof --gv ~/pkg/src/pkg.so /tmp/profile.out.
  7. Optimize and repeat.
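
If you'd rather profile a single C++ function from the inside instead of bracketing R calls, a small RAII guard can pair the start and stop calls so the profiler is stopped even when the function exits early. This is only a minimal sketch and not part of the SO answer: ScopedProfile is a hypothetical helper of my own naming, and it takes the start and stop functions as arguments so that it compiles without gperftools. In the package you would pass ProfilerStart and ProfilerStop, assuming their usual signatures (int ProfilerStart(const char*), returning nonzero on success, and void ProfilerStop()).

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper (not part of gperftools or Rcpp): calls `start(path)`
// when constructed and `stop()` when destroyed, so the two always pair up.
class ScopedProfile {
public:
  ScopedProfile(const std::string& path,
                std::function<int(const char*)> start,
                std::function<void()> stop)
    : stop_(std::move(stop)),
      // Assumption: `start` returns nonzero on success, like ProfilerStart.
      active_(start(path.c_str()) != 0) {}

  // Only stop the profiler if it was actually started.
  ~ScopedProfile() {
    if (active_) stop_();
  }

  // Copying would stop the profiler twice, so forbid it.
  ScopedProfile(const ScopedProfile&) = delete;
  ScopedProfile& operator=(const ScopedProfile&) = delete;

  bool active() const { return active_; }

private:
  std::function<void()> stop_;
  bool active_;
};
```

Inside an exported function this would look something like ScopedProfile guard("/tmp/profile.out", ProfilerStart, ProfilerStop); as the first line, with the rest of the function body left unchanged.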