This semester I'll experimenting with an app called TagTime to track how I spend my time.


See this post from the creators1 for more details, but essentially the TagTime app provides an easy way for me to track how I'm spending my time. It pings me at random times (more on this later) at which time I record exactly what I'm doing through the use of tags. For example I'll have tags such as PhD for general school work, TA for my office hours and grading, ADA for my ADA project, hw for homework, etc.


As a graduate student, I have a lot of flexibility and independence in how I spend my time. While I greatly enjoy the freedom this brings, I also need to be vigilant to make sure that I am making good use of my time. As a statistician, the first step to deciding how I spend my time is to collect data 2.

I refuse to use a traditional approach such as keeping a journal or clocking in/out of tasks because I lack the dedication and attentiveness to maintain those types of records. I need something relatively automatic and effortless.

Because TagTime pings me, I have an instant reminder to keep up with the records. I don't expect I'll have to spend more than a few seconds recording my tags as I'm shooting for a big picture accounting of my time rather than a fine-grained record.

Statistical Background

Beyond the benefits to my productivity, I also greatly enjoy this tool because it appeals to my inner (and outer) stats geek.


The times between pings are drawn iid from an exponential distribution with a known rate. Thus the pings follow a Poisson process, which gives us the following interesting result.

See this paper for the proof, but the basic result is that the fraction of pings in which I'm performing a certain activity will converge in probability to the fraction of times I'm performing that activity 3. Thus I'll have a consistent estimate of the fraction of time I spend on each activity.

Confidence Intervals

The result above is nice, because it allows me to decide how to characterize uncertainty in my estimates. Since I'm fitting a sample proportion I'll use a binomial model for each of the tags.

Thus for my point estimate I'll have \(\hat{p}\) the fraction of times I observe a certain tag, with variance \(\hat{p}(1 - \hat{p})/N\).

Memoryless Property

Another cool aspect is that the exponential distribution is memoryless. In mathematical terms,

\[ \Pr(X > t + \epsilon \mid X > t) = \Pr(X > \epsilon) \]

which simply means that however long I've waited for a ping, the distribution of time I'll have to wait for the next ping is the same as it was at the start. This means I can't anticipate the next ping 4.

Future Musings

I'd like to think about how to incorporate known lengths of time into my estimates. For example, I know exactly how long my lectures and office hours run, so I should be able to measure that time with very little uncertainty.



These are the same brilliant minds that created Beeminder which you should check out


Obligatory business adage: "What gets measured, gets managed"


Technically, this is only the case if the fraction of the time I'm performing an activity converges in probability. However, I can get this convergence if I model my time as a regenerative process, basically treating my days (or more realistically weeks) as independent cycles.


Intellectually I know this to be the case, but I wonder if I'll still anticipate the pings anyways. I imagine the brain is wired to anticipate events as relatively few things in nature can be modeled as exponentials.