Variability In World Cup Ski Racers
[Note: This article first appeared here.]
Kris Freeman recently commented on his FasterSkier.com blog, “I am a serious contender for the most volatile and inconsistent skier on the world cup”, in reference to his disappointing races at the Vancouver Olympics.
Every cross-country ski racer knows that you can’t race at your best all the time. Some days you just feel better than others. Often there’s an obvious reason (sickness, fatigue, overtraining etc.) but sometimes not. Racers work very hard to condition their bodies to perform at very high levels, repeatedly, throughout a season. However, it is inevitable that there are some differences from race to race.
These observations lead naturally to a topic that’s not very “sexy”, but it’s what stats geeks think about all the time: variation. Let’s look at some data regarding variability in ski racing and see what we can learn.
Here are the FIS points for individual distance races in the ’03-’04 season for Andrei Golovko and the ’93-’94 season for Jari Raesaenen. Golovko was all over the place and Raesaenen was quite consistent.
Calculating a standard deviation (SD) of the FIS points is one way to quantify variability. In my example above, Raesaenen had a SD of 7.1 while Golovko’s was 23.6. Clearly I’ve picked some extreme examples here, so we might ask some follow-up questions: How much variation is normal? Are some racers unusually consistent? Are some racers unusually inconsistent?
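As a minimal sketch of that calculation, the snippet below computes an SD for two seasons of FIS points. The numbers are made up for illustration and are not Golovko's or Raesaenen's actual results, and the choice of the sample SD (n - 1 denominator) is an assumption, since the article does not say which form was used.

```python
from statistics import mean, stdev

# Hypothetical FIS points for two invented seasons (NOT real results);
# chosen only to contrast a steady season with an erratic one.
steady_season = [42.0, 48.5, 39.0, 51.0, 45.5, 44.0, 50.0, 47.0, 41.0]
erratic_season = [20.0, 75.0, 35.0, 90.0, 28.0, 60.0, 15.0, 82.0, 45.0]


def season_sd(fis_points):
    """Sample standard deviation (n - 1 denominator) of one season's FIS points."""
    return stdev(fis_points)


for label, season in [("steady", steady_season), ("erratic", erratic_season)]:
    print(f"{label}: mean = {mean(season):.1f}, SD = {season_sd(season):.1f}")
```

The two invented seasons have similar averages, so the difference in SD reflects only how spread out the results are, not how fast the skier was.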
Let’s look at these questions using data from the distance events at major international ski races: World Cups (WC), Olympics (OWG) and World Championships (WSC). Here’s a more precise description of what I did and why. If you don’t care for technical details, feel free to skip the rest of this paragraph. For each athlete, I looked for seasons where they had received FIS points of less than 150 in at least nine WC, OWG or WSC races. Why less than 150 FIS points and at least nine races? First, there are some athletes with enormously high FIS point races. These results are going to seriously cloud the issue. (Trivia: care to bet what the highest FIS point score in my database is?) Second, some athletes may only race in 1-2 of these events in a season. That will make a racer look very, very consistent! So I’m trying to weed out things like this that will cloud the data. As it is, nine is a small number of points to use for a SD.
In other words, for each athlete we find seasons where they had at least 9 WC, OWG or WSC races with less than 150 FIS points and then calculate the SD of these results. That’s one data point. Repeat for each athlete and we end up with several hundred SDs.
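The filtering procedure just described might be sketched as follows. The tuple layout `(athlete, season, fis_points)`, the function name `within_athlete_sds` and the constant names are all hypothetical, since the article does not describe how the database is organized, and the records in the usage example are invented.

```python
from collections import defaultdict
from statistics import stdev

MAX_FIS_POINTS = 150.0  # results at or above this value are ignored
MIN_RACES = 9           # minimum qualifying races per athlete-season


def within_athlete_sds(results):
    """Return {(athlete, season): SD} for athlete-seasons that qualify.

    `results` is an iterable of (athlete, season, fis_points) tuples,
    a hypothetical layout rather than the author's actual schema.
    """
    by_athlete_season = defaultdict(list)
    for athlete, season, fis_points in results:
        if fis_points < MAX_FIS_POINTS:
            by_athlete_season[(athlete, season)].append(fis_points)
    # Keep only athlete-seasons with enough qualifying races.
    return {
        key: stdev(points)
        for key, points in by_athlete_season.items()
        if len(points) >= MIN_RACES
    }


# Made-up records for illustration only.
example = [("A", "2009-10", 40.0 + i) for i in range(9)]
example.append(("A", "2009-10", 300.0))  # excluded: 150 points or more
example += [("B", "2009-10", 60.0 + i) for i in range(8)]  # only 8 races
sds = within_athlete_sds(example)
print(sds)
```

In this toy input, athlete A qualifies on nine sub-150 results while athlete B is dropped for having only eight, so the returned dictionary holds a single SD.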
What do we get? The average standard deviation is 17.7 FIS points for men and 18.9 FIS points for women. Now, what the heck does this mean? Suppose, through a stunning and miraculous chain of events, I ended up on the World Cup circuit and an average race for me yields ~50 FIS points. If my SD is a “typical” 17.7, I would expect most of my races to fall between ~15 and ~85 FIS points, i.e. two SDs below and two SDs above my average 50 point race. Anything outside of that range would be fairly unusual.
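The arithmetic behind that range is simply the average plus or minus two SDs. A small sketch, with `typical_range` as a hypothetical helper name:

```python
def typical_range(mean_points, sd, n_sds=2.0):
    """Interval of mean +/- n_sds standard deviations."""
    return mean_points - n_sds * sd, mean_points + n_sds * sd


# A 50-point average race with the men's average SD of 17.7.
low, high = typical_range(50.0, 17.7)
print(f"{low:.1f} to {high:.1f} FIS points")
```

This works out to roughly 14.6 to 85.4 FIS points. If race results were approximately normally distributed, which is an assumption and not something the data here establishes, about 95% of races would land inside that interval.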
So a SD of around 18-19 is typical, but just how typical is it? To answer this we’ll consider the following two histograms of all the SDs for men and women respectively:
These histograms give us a sense of the variation in variability. Confused? That’s ok, here’s the deal. We started out looking at how variable a single athlete’s results are over the course of a season, which we’ll call “within-athlete variation”. The histograms are plots of several hundred examples of within-athlete variation. This gives us a rough sense of just how variable this within-athlete variation might be and helps us to see how typical or unusual different levels of within-athlete variability are. For example, an unusually small amount of variation might correspond to a SD of less than 10, while an unusually large amount of variation might correspond to a SD of at least 28-30.
Finally, let’s return to Kris Freeman’s comments that I noted above. Has he, in fact, been unusually variable this season? Well, his SD for this season (with only six races) is 54.2. However, that is almost entirely due to one race, the Olympic 30k Pursuit. Removing that one race drops his SD for the season down to 27.1. Still high, but not frighteningly high.
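To see how strongly one bad day can dominate an SD built from only a handful of races, here is a small sketch. The six values are invented and are not Freeman's actual FIS points, and `sd_without_worst` is a hypothetical helper.

```python
from statistics import stdev

# Hypothetical six-race season (NOT real results): five ordinary races
# plus one very bad day.
season = [30.0, 45.0, 22.0, 55.0, 38.0, 160.0]


def sd_without_worst(fis_points):
    """SD after dropping the single highest (worst) FIS point result."""
    trimmed = sorted(fis_points)[:-1]
    return stdev(trimmed)


print(f"all races:      SD = {stdev(season):.1f}")
print(f"worst race cut: SD = {sd_without_worst(season):.1f}")
```

With so few values, the single large result accounts for most of the spread, which is the same effect seen when the Olympic 30k Pursuit is removed from Freeman's season.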
Has he been getting more inconsistent over time? The table on the right displays Freeman’s SDs for the past several seasons.
I don’t see much of a pattern there. Is it reasonable to say that Freeman is in contention for being one of the most “volatile” racers on the World Cup circuit this season? Perhaps; but if Freeman is significantly less consistent than other skiers, it is almost certainly due to blood sugar issues. As he mentioned in his blog post, maintaining a correct insulin dose is extraordinarily difficult and these doses can change over time. One slip-up can result in a single catastrophically bad race that will make an entire season look very “volatile”.
And given that Freeman has had several other quite consistent seasons (2003-2004, 2006-2007, 2008-2009) it might be fairer to say that when he’s on top of his blood sugar management he’s as consistent as, or perhaps more consistent than, his peers.
I should also note that the manner in which I’ve examined skier variability so far is fairly crude, as it relies on calculating SDs based upon anywhere from 6 to 15 values at a time and, as I noted, an SD built from so few values is fairly sensitive to a single outlying result. An alternative approach would be to employ some heavier artillery and model skiers’ variability over the course of their entire career. I may revisit this topic later to address this, but for the moment you’ll just have to trust me that it doesn’t dramatically alter the picture with respect to Kris Freeman specifically.
I’ll close with the reminder that skiing consistently is not the same as skiing fast. Go back to my first example and ask yourself which season you’d rather have had, Raesaenen’s or Golovko’s?