Effect of start order on women’s WSC 10k

This topic has been covered elsewhere but I thought I’d add my two cents, and it turned out to be slightly longer than Twitter could accommodate.

A lot of wacky things went on that day, as you’d expect when the weather and waxing are tricky and change dramatically during the race. I haven’t watched the TV coverage of the race myself, so I’m at a bit of a disadvantage here since I don’t have any sense of how things progressed and how the athletes looked except for what I’ve read online.

Basically, it started snowing shortly after the race started, which changed the conditions dramatically. This made conditions inherently more challenging for later starters, and on top of that some nations (e.g. Norway) just flat out missed the wax and had terrible, terrible skis.

So naturally we’re interested in whether we can see direct evidence of this start order effect in the results. My approach is actually quite simple (from the perspective of all the machinery I’ve built up over the years in the form of code written to push skiing data around). I’m just going to take the basic data in the graph I Tweeted earlier and rework it a bit.

The idea in that original graph is that I’m just taking each skier’s percent behind the median skier and showing a rough “confidence interval” for perspective (it’s actually just the 25th and 75th percentile of their races over the previous 1-2 years). It already suggests strongly that a lot of the people at the top of the results sheet had “surprisingly good” races, relative to their prior results, as shown by the gap between the red dot and the horizontal bar. We can just take the difference (scaled by the racer’s inherent level of variability, i.e. the width of their bar) and then plot the results relative to start order.
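The scaling step described above can be sketched roughly as follows. This is a minimal illustration, not the code behind the graph: the function name, the minimum-history cutoff, and the choice to measure the difference from the median of a skier's prior results (rather than from an edge of the bar) are all assumptions on my part.

```python
from statistics import quantiles

# Hypothetical cutoff: skiers with fewer prior results than this are skipped.
MIN_PRIOR_RESULTS = 5


def surprise_score(race_pct_behind, prior_pct_behind):
    """Scaled gap between a skier's typical result and this race.

    Both arguments are percent behind the median skier (lower is better).
    Positive scores mean a better than expected race. Returns None when
    there is too little history, or no spread in it, to scale by.
    """
    if len(prior_pct_behind) < MIN_PRIOR_RESULTS:
        return None
    # 25th, 50th and 75th percentiles of the skier's prior races.
    q25, q50, q75 = quantiles(prior_pct_behind, n=4, method="inclusive")
    width = q75 - q25  # the width of the skier's "bar"
    if width == 0:
        return None
    # Assumption: "expected" is taken to be the median of prior results.
    return (q50 - race_pct_behind) / width
```

Dividing by the bar width means an erratic skier needs a much bigger deviation than a very consistent one to register as "surprising".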



On the x axis, positive values are better than expected results, negative values are worse than expected. There were 4-5 athletes (no one notable) that I dropped entirely since they had too few results for meaningful numbers. The red dashed line is my rough guess-timate (again, based only on this graph; I didn’t watch the race) on where things changed. My placement is rather aggressively toward the back of the field; you could arguably say that between starters 25-40 things had stabilized somewhat, and then finally the conditions had really nosedived after that.

And of course as you would expect the relationship isn’t perfect. There are certainly folks at the back of the field that had good races, for them. But this seems like very strong evidence to me that it was simply a good day to be at the front of the field. Virtually all of those people had good to excellent races compared to their personal past performances.

The usual caveats apply here: this suggests there was an effect, but it can’t tease out the magnitude of the effect on a skier-by-skier basis. Different folks were impacted differently based on the specific wax they had, and how they responded in race to having a great (or terrible) day, in addition to the regular “noise” in athletic performances.

Race Snapshot: Oslo 30/50k Classic



Race Snapshot: Drammen Sprints





Noah Hoffman’s Progress

Like everyone else in the world (seemingly) I enjoy Noah Hoffman’s blog. Apparently he gets a little bit of a hard time for how often he posts, but I think it’s pretty remarkable how much he shares about his training and racing. A lot of athlete blogs will, quite understandably, shy away from sharing some of the lower points during their season. So I was struck by Hoffman’s post following the Sochi 50k in which he was quite open about questioning whether he should remain in the sport at all.

Part of the reason I found it interesting was that for a while now I’ve felt like Hoffman hasn’t been making much progress, and based on what I read elsewhere I don’t seem to be in the majority on that front. Obviously, as a fan of US skiing, I root for him, but from my very distant vantage point looking only at race results, I haven’t seen much sign of the dramatic improvement we’d like to see.

For instance:



These are all of his major international results. So you can see why I’d be puzzled by comments suggesting that his results have improved dramatically. Clearly the last two seasons have seen more good results, but his typical race hasn’t improved all that much, if at all.

Hoffman had an excellent race this past fall in what FIS now calls the pursuit at Kuusamo, and that certainly was noteworthy. But I’ve long felt that you can’t really generalize much from those pursuit races since so many skiers alter their pacing in response to the overall standings. Clearly Hoffman had a good day; I really wish he’d had that effort in an interval start race since that would be a much clearer signal of his ability.

Another example was the Sochi 50k, a race that Hoffman professed some disappointment with. He skied with the leaders nearly the whole race and had some bad luck near the end with a broken pole. On the other hand, by his own description he was basically toast with 5k to go and said that nearly everyone was leaving him behind. In the end he finished a little over a minute off the pace, or about 1.07% back. To say that a minute out in a 50k is pretty good is a bit misleading here. The fact is that modern mass start races (for the men at least) are essentially medium intensity cruisers with several kilometers of mad sprinting at the end.

I often joke that we might as well simply put the whole field on stationary bikes for a set period of time and then have them do a mass start 5k. A little bit more than a minute out in a 5k race doesn’t sound quite so good.
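To put rough numbers on that comparison, here is a small sketch of the percent back arithmetic. The winning times are round, made-up figures chosen only for illustration; they are not the actual times from Sochi or any other race.

```python
def percent_back(gap_seconds, winner_seconds):
    """Time behind the winner, as a percentage of the winner's time."""
    return 100.0 * gap_seconds / winner_seconds


# Hypothetical winning times, for illustration only.
long_race_winner = 6000.0   # 100 minutes, roughly 50k territory
short_race_winner = 750.0   # 12.5 minutes, roughly 5k territory

# The same one-minute gap is a very different result in each case.
print(percent_back(60.0, long_race_winner))   # 1.0
print(percent_back(60.0, short_race_winner))  # 8.0
```

The same gap on the clock looks eight times worse once the denominator shrinks to the part of the race where the field is actually racing.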

So you might say that I think that percent back in mass start events isn’t necessarily indicative of much. If I had better split data, a more informative metric would be a skier’s pace over the final 5% of the race, not their overall percent back.

Regardless, let’s try to take stock of where Hoffman is. I collected all top 30 men’s results in major international mass start, skiathlon and interval start races (so, yes, I’m dropping those misleading “pursuits”) over the past four seasons. For each person I calculated their median mass start percent back and their median interval start percent back. Here are the results, separated by whether they have ever achieved a top 5 result (in either type of race):
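For anyone who wants to try this themselves, the summary described above might look something like the sketch below. The record layout, the format labels, and the decision to lump skiathlons in with mass starts are my assumptions for the example, not a description of the code actually used.

```python
from collections import defaultdict
from statistics import median

# Assumption: skiathlons are counted together with mass start races.
MASS_FORMATS = {"mass_start", "skiathlon"}
INTERVAL_FORMATS = {"interval"}


def summarize(results):
    """Median percent back per skier, split by race type.

    `results` is an iterable of (skier, race_format, rank, percent_back).
    Returns {skier: (median_mass, median_interval, ever_top5)}, where a
    median is None if the skier has no top 30 results of that type.
    """
    mass = defaultdict(list)
    interval = defaultdict(list)
    top5 = defaultdict(bool)
    for skier, race_format, rank, pct in results:
        if rank > 30:
            continue  # only top 30 results are kept
        if race_format in MASS_FORMATS:
            mass[skier].append(pct)
        elif race_format in INTERVAL_FORMATS:
            interval[skier].append(pct)
        else:
            continue  # e.g. pursuits are dropped entirely
        top5[skier] = top5[skier] or rank <= 5
    return {
        skier: (
            median(mass[skier]) if mass[skier] else None,
            median(interval[skier]) if interval[skier] else None,
            top5[skier],
        )
        for skier in set(mass) | set(interval)
    }
```

Each skier then becomes one point (mass start median against interval start median), with the top 5 flag deciding which panel they land in.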



One thing this makes clear is that 1.07% back in the Sochi 50k is actually one of his better results, but it’s also clearly not been the rule. His other two strong mass start results were both held in North America.

All this isn’t meant to just rag on Hoffman. I genuinely hope he succeeds (if you don’t already count what he has done as a success). But I was struck by his post-50k blog as being remarkably honest and clear-eyed about his progress, which isn’t something I feel like we see expressed in public very often. I also think his coaches are right: he’s got at least two more Olympic cycles in him, and that’s a lot of time to work with. But I can’t say that we’ve seen a dramatic improvement in his race results... yet.

Race Snapshot: Lahti 10/15k Freestyle





Race Snapshot: Lahti Freestyle Sprint





Race Snapshot: Toblach Freestyle Sprint




