When deciding which program to join for my PhD I came across Jerzy's
blog. His post on the 1st semester was very helpful in confirming that
CMU was the place for me. You definitely should read all of them: 1st,
2nd, 3rd, and 4th thus far. Anyways, as an act of blatant imitation,
here's my own review of the first year.
Attention Conservation Notice [1]: Of interest mostly to other
PhD students. Sample size of one; may not replicate.
Before I start in on the presumably boring details, I found it
helpful to distill the entire experience into a few short "lessons".
There's no doubt that the first year of grad school is a humbling
experience. You're exposed to the many things you don't know and
you must work very hard to start correcting that situation. Without
a good sense of your limitations this is incredibly exhausting. So
knowing when you're tired and frustrated and knowing what to do to
fix that is essential. I find that exercise and a book are helpful.
That being said, it's also nice to "internalize that you are, in
fact, a badass", to borrow a phrase from Cosma Shalizi. It's quite
a morale boost to realize just how far you've come in a relatively
short amount of time. So while in the middle of a measure theory
problem set, just remember that only a few months ago you
couldn't even understand the statement of the problem you're
Know your tools
On a more quotidian level, this year has been an education in the
mechanics of getting work done. With problem sets due every week,
homework to be graded, papers to read, and research to be done, you
need to get on top of things. Investing a little time in improving
the process by which you accomplish these tasks will pay off in the
long run. All that to say, my emacs-fu has gotten much stronger.
I'll just list a few of the tools and commands which I now find
emacs, especially elfeed (rss), mu4e (email) and above all
One of the aspects of CMU which had escaped my notice before
deciding to attend was the wonderful culture. The students are very
friendly and the professors helpful. This is nice because you
occasionally hear stories of the academic version of
Thunderdome: "two men enter, one leaves" (with a
diploma). Instead, things are very cooperative, which is fortunate
since you can benefit from specialization of knowledge: you help me
understand real analysis and I help you understand programming. On
a less transactional level, you spend quite a bit of time with
these people and, assuming you stay in the field, will never
quite escape them, so it's just as well that you befriend them.
Now for the nitty-gritty of the first year. A great deal of my time and
effort went into coursework. CMU front-loads the
coursework so that we're done with required courses very early. Thus
the courses are meant to get you up to speed for research.
10-701 Machine Learning
A whirlwind tour of various algorithms ranging from the familiar
linear and penalized regression to the new-to-me perceptron and
SVM. My biggest gripe was that they required us to code in
Matlab's open-source cousin Octave.
36-699 Statistical Immigration
This was excellent. It was not a class, merely a scheduled time
for each of the faculty to give a short talk about their work. It
was extremely helpful, not only so I could recognize all of the
faculty but also so that I had a first approximation of what everyone
was doing. Several professors also took this time to advertise ADA
projects, which was useful.
36-705 Intermediate Statistics
Coming from a heavily Bayesian school (Duke) I had a lot to
learn/refresh about classical statistics. Larry Wasserman's course
was the way to quickly get up to speed. More or less following the
content of his All of Statistics, it was the perfect way
to get into the swing of things.
36-707 Regression Analysis
All of the other courses were heavy on math and theory, so 707
served as our connection with "real" data analysis. I thought I
had regression all figured out, but this class helped fill in some
of the gaps in my education.
36-702 Statistical Machine Learning
Where 701 was a whirlwind introduction to several methods, 702
went back and looked at the theory behind those methods. Co-taught
by Larry Wasserman and Ryan Tibshirani, this course was my
favorite thus far.
36-752 Advanced Probability
This was our introduction to measure theory. I definitely
benefited from being a math major in college. The actual content
of my undergrad courses was not particularly helpful, but the
mental patterns for figuring out proofs were crucial.
36-757 Advanced Data Analysis
This was also not exactly a course so much as a scheduled time to
give short talks about our progress on our ADA projects. It's been
great to have a variety of opportunities to practice speaking. The
only way to get better is to practice, and this course gave us
plenty of feedback on our skills.
Of course, not everything is about coursework.
All PhD students in the department have an Advanced Data Analysis
(ADA) project which they typically complete over the course of
their first spring and second fall. Mine is working with recordings
of neural activity in Parkinsonian mice. The experience working
with real data has been educational and it serves to connect all of
the work we've been doing in our courses with actual data analysis.
I was a TA for 36-225 in the fall and 36-217 in the spring. While
not necessarily the most exciting aspect of the program, I've
framed the work as practice at conveying statistical ideas to
non-statisticians. Office hours are enjoyable, although grading
papers is a grind.
Over the summer I worked as a TA for our summer research program
for undergraduates. This was a great deal of fun. Once a week I ran
an R tutorial for the entire program and the rest of the time I
worked with a small group on their project. Their dataset was
gene-expression levels in autistic and control brains and the goal
was finding differentially-expressed genes. It was an excellent
context in which to bring up multiple-comparison testing,
permutation tests, and correcting for confounding variables.
Whether it's playing intramurals, walking in Schenley Park, or
getting books from the public library, it's good to get out of the
department. I've particularly enjoyed setting up my hammock on
Flagstaff Hill and relaxing with a good book from the Carnegie
Library of Pittsburgh.
A lot of the time the department has come along, as our cohort has
been very good at socializing outside of the department. I've even
hosted a few board game nights at my apartment.
[1] As further evidence of my lack of originality with this
post, I borrowed the Attention Conservation Notice from Cosma Shalizi,
although in the course of writing this I actually followed his link
and apparently he got it from Bruce Sterling.