Effect of start order on women’s WSC 10k

This topic has been covered elsewhere but I thought I’d add my two cents, and it turned out to be slightly longer than Twitter could accommodate.

A lot of wacky things went on that day, as you’d expect when the weather and waxing are tricky and change dramatically during the race. I haven’t watched the TV coverage of the race myself, so I’m at a bit of a disadvantage here since I don’t have any sense of how things progressed and how the athletes looked except for what I’ve read online.

Basically, it started snowing shortly after the race started, which changed the conditions dramatically. This made the conditions for later starters inherently more challenging, and on top of that some nations (e.g. Norway) just flat out missed the wax and had terrible, terrible skis.

So naturally we’re interested in whether we can see direct evidence of this start order effect in the results. My approach is actually quite simple (from the perspective of all the machinery I’ve built up over the years in the form of code written to push skiing data around). I’m just going to take the basic data in the graph I Tweeted earlier and rework it a bit.

The idea in that original graph is that I’m just taking each skier’s percent behind the median skier and showing a rough “confidence interval” for perspective (it’s actually just the 25th and 75th percentile of their races over the previous 1-2 years). It already suggests strongly that a lot of the people at the top of the results sheet had “surprisingly good” races, relative to their prior results, as shown by the gap between the red dot and the horizontal bar. We can just take the difference (scaled by the racer’s inherent level of variability, i.e. the width of their bar) and then plot the results relative to start order.

Voila:

wom_10k_fr

On the x axis, positive values are better than expected results, negative values are worse than expected. There were 4-5 athletes (no one notable) that I dropped entirely since they had so few results for meaningful numbers. The red dashed line is my rough guess-timate (again, based only on this graph; I didn’t watch the race) on where things changed. My placement is rather aggressively toward the back of the field; you could arguably say that between starters 25-40 things had stabilized somewhat, and then finally the conditions had really nosedived after that.
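For anyone who wants to reproduce the scaled difference plotted above, here is a minimal sketch. It makes a few assumptions of its own: the "expected" value is taken to be the median of a skier's past percent-back values, the bar width is the 25th-to-75th percentile range, and the cutoff of five prior races (`min_races`) is a made-up threshold for "too few results". The function name is hypothetical too.

```python
from statistics import median, quantiles

def surprise_score(race_pb, past_pbs, min_races=5):
    """Scaled gap between a skier's typical and actual percent behind the median.

    Positive means better than expected, negative means worse.
    Returns None when there are too few past results (assumed cutoff).
    """
    if len(past_pbs) < min_races:
        return None
    # 25th and 75th percentiles of the skier's past races (the "bar")
    q25, _, q75 = quantiles(past_pbs, n=4, method="inclusive")
    width = q75 - q25
    if width == 0:
        return None  # no variability to scale by
    # Lower percent back is faster, so typical minus actual is positive for a good day
    return (median(past_pbs) - race_pb) / width
```

Plotting this score against bib number gives the same kind of picture: a cluster of positive values among the early starters is what a start order effect would look like.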

And of course as you would expect the relationship isn’t perfect. There are certainly folks at the back of the field that had good races, for them. But this seems like very strong evidence to me that it was simply a good day to be at the front of the field. Virtually all of those people had good to excellent races compared to their personal past performances.

The usual caveats apply here: this suggests there was an effect, but it can’t tease out the magnitude of the effect on a skier-by-skier basis. Different folks were impacted differently based on the specific wax they had, and how they responded in race to having a great (or terrible) day, in addition to the regular “noise” in athletic performances.

Noah Hoffman’s Progress

Like everyone else in the world (seemingly) I enjoy Noah Hoffman’s blog. Apparently he gets a little bit of a hard time for how often he posts, but I think it’s pretty remarkable how much he shares about his training and racing. A lot of athlete blogs will, quite understandably, shy away from sharing some of the lower points during their season. So I was struck by Hoffman’s post following the Sochi 50k in which he was quite open about questioning whether he should remain in the sport at all.

Part of the reason I found it interesting was that for a while now I’ve felt like Hoffman hasn’t been making much progress, and based on what I read elsewhere I don’t seem to be in the majority on that front. Obviously, as a fan of US skiing, I root for him, but from my very distant vantage point looking only at race results, I haven’t seen much sign of the dramatic improvement we’d like to see.

For instance:

hoffman2

 

These are all of his major international results. So you can see why I’d be puzzled by comments suggesting that his results have improved dramatically. Clearly the last two seasons have seen more good results, but his typical race hasn’t improved all that much, if at all.

Hoffman had an excellent race this past fall in what FIS now calls the pursuit at Kuusamo, and that certainly was noteworthy. But I’ve long felt that you can’t really generalize much from those pursuit races since so many skiers alter their pacing in response to the overall standings. Clearly Hoffman had a good day; I really wish he’d had that effort in an interval start race since that would be a much clearer signal of his ability.

Another example was the Sochi 50k, a race that Hoffman professed some disappointment with. He skied with the leaders nearly the whole race and had some bad luck near the end with a broken pole. On the other hand, by his own description he was basically toast with 5k to go and said that nearly everyone was leaving him behind. In the end he finished a little over a minute off the pace, or about 1.07% back. To say that a minute out in a 50k is pretty good is a bit misleading here. The fact is that modern mass start races (for the men at least) are essentially medium intensity cruisers with several kilometers of mad sprinting at the end.

I often joke that we might as well simply put the whole field on stationary bikes for a set period of time and then have them do a mass start 5k. A little bit more than a minute out in a 5k race doesn’t sound quite so good.

So you might say that I think that percent back in mass start events isn’t necessarily indicative of much. If I had better split data, a more informative metric would be a skier’s pace over the final 5% of the race, not their overall percent back.

Regardless, let’s try to take stock of where Hoffman is. I collected all top 30 men’s results in major international mass start, skiathlon and interval start races (so, yes, I’m dropping those misleading “pursuits”) over the past four seasons. For each person I calculated their median mass start percent back and their median interval start percent back. Here are the results, separated by whether they have ever achieved a top 5 result (in either type of race):

hoffman1

 

One thing this makes clear is that 1.07% back in the Sochi 50k is actually one of his better results, but it’s also clearly not been the rule. His other two strong mass start results were both held in North America.
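The per-skier medians behind that graph could be computed along these lines. This is only a sketch: the tuple layout and the format labels ("mass", "skiathlon", "interval", "pursuit") are assumptions for illustration, and lumping skiathlons in with mass starts is my reading of the description rather than something stated outright.

```python
from collections import defaultdict
from statistics import median

def median_pb_by_format(results):
    """Median percent back per (skier, race kind).

    results: iterable of (skier, race_format, percent_back) tuples,
    already limited to top 30 finishes (assumed input layout).
    """
    grouped = defaultdict(list)
    for skier, race_format, pb in results:
        if race_format == "pursuit":
            continue  # pursuits are dropped entirely
        # Assumption: skiathlons count as mass start races
        kind = "interval" if race_format == "interval" else "mass"
        grouped[(skier, kind)].append(pb)
    return {key: median(values) for key, values in grouped.items()}
```

Each skier then contributes one point: median interval start percent back on one axis, median mass start percent back on the other.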

All this isn’t meant to just rag on Hoffman. I genuinely hope he succeeds (if you don’t already count what he has done as a success). But I was struck by his post-50k blog as being remarkably honest and clear-eyed about his progress, which isn’t something I feel like we see very often expressed in public. I also think his coaches are right: he’s got at least two more Olympic cycles in him, and that’s a lot of time to work with. But I can’t say that we’ve seen a dramatic improvement in his race results… yet.

US Olympic Assessment

So, how did the US do overall at the Olympics this year?

Well, as usual, I’m going to mostly ignore the team events. As I did before, here’s some historical context for our results this time around:

us_sochi_grade

 

That’s all WSC and OWG results for Americans stretching back to 1992. It’s still kind of hard to swallow the women’s sprint results as a significant improvement, but there you go.

The men’s and women’s distance results both ticked slightly in the wrong direction. However, my suspicions held true and the women continued their steady improvement at the low end. The men are really just in a holding pattern. Basically nothing has changed on that front for about a decade, really.

A friend phrased the question to me in terms of a grade. Personally, if I’m being objective, I’d give the results a B+. Kikkan’s sprint race was a huge disappointment, to be sure, but four women in the top twenty is still quite good and we did put Sophie in the finals. Liz could certainly have had a better 30k, but beyond that I don’t really think anyone significantly under-performed in the distance events compared to what I expected, or thought was reasonable.

On the other hand, (and it’s very hard for me to say this publicly, because Kikkan Randall has been nothing short of revolutionary for the US skiing community), I find it hard not to consider these Games a pretty huge disappointment. But that’s my heart talking, not my head.

Effect of TdS on Olympic Results

Multiple people emailed me asking about whether there was any potential link between whether an athlete participated in the Tour de Ski and their performance at the Sochi Olympics.

Personally, I like dealing with questions like this because they are a great example of how something can be poor statistical reasoning and still be true at the same time. What could I mean by that?

Well, all of the emails I received made some attempt to list people on both sides: those who skipped the TdS (entirely, or in part) and then seemed to ski well at the Olympics, and those who skied the whole Tour but then seemed not to ski so well at Sochi.

The problem with this sort of thinking is that you have to stop and ask “skied well (or badly) relative to what?”. And that is devilishly hard to establish. For instance, Marit Bjoergen skipped much of the Tour, and then she won 2 gold medals.

But of course, we don’t really know how Bjoergen might have skied in Sochi had she finished the Tour. That would require time travel, or Dr. Who style alternate universes or something. Imagine an alternate universe in which she finished the Tour, and then got the exact same results, under the exact same scenarios, at Sochi. Ask yourself if you’d react to that universe with shock and surprise that Bjoergen skied well enough to win two gold medals (and two bad results from a crash and bad wax) at Sochi after doing the Tour in January. No? I thought not.

So with all my statistical hand waving out of the way up front, here’s a crude way to look at this, as best we can. I took all the top 30 finishers in the individual races at Sochi and collected all their results from 2012-2013 forward. I calculated the difference in the average performance of each skier at Sochi and prior to Sochi (measured by looking at the difference in rank or FIS points compared to each of the skiers in the cohort), and then plotted that relative to the number of stages they started at the Tour de Ski:

tds_effects

 

Due to the somewhat unfortunate repeated collapsing and then subtracting, negative values here represent doing better at Sochi against this specific cohort of skiers than they had in the previous year and a half or so. So if more starts at the Tour had an overall negative effect on performance, you’d see things trending generally upward. But mostly people are just all over the map.
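To make the collapsing and subtracting a little more concrete, here is one way it might be done. This is a sketch under assumptions, not the exact calculation behind the graph: each race is a dict mapping cohort skiers to FIS points (lower is better), and the function names are hypothetical.

```python
from statistics import mean

def relative_score(race, skier):
    """Average of (skier's points - rival's points) over cohort rivals in one race."""
    mine = race[skier]
    rivals = [pts for name, pts in race.items() if name != skier]
    return mean(mine - pts for pts in rivals)

def sochi_change(skier, sochi_races, prior_races):
    """Sochi average minus pre-Sochi average of the relative scores.

    Negative means the skier did better against the cohort at Sochi.
    Returns None if the skier has no usable races on either side.
    """
    at_sochi = [relative_score(r, skier) for r in sochi_races
                if skier in r and len(r) > 1]
    before = [relative_score(r, skier) for r in prior_races
              if skier in r and len(r) > 1]
    if not at_sochi or not before:
        return None
    return mean(at_sochi) - mean(before)
```

The value for each skier would then be plotted against the number of Tour stages they started.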

But the key here is that just because we have no evidence of an overall effect, for the whole population on average, that in no way means that specific people weren’t adversely affected by racing deep into the Tour. That’s the key: it’s almost certainly true that some people may have benefited from skipping the Tour, and some people suffered by doing the Tour. But that’s not the same thing as some sort of general, over-arching effect across all people.

I skipped adding any trend lines, because really, the story here is the variation. Sure, you might be able to convince yourself that there’s an effect present for the male sprinters.

Sochi Sprint Qualifying Times

A friend mentioned to me in passing the fairly large gaps in the qualifying times for the Sochi freestyle sprint between certain skiers, so I became a little curious.

The following graph shows the percent back in qualifying time for the Sochi sprint (red) for the top 30 qualifiers, along with the same data for all WC sprint races for the past several years:

sochi_spr_qual

 

The times for Sochi weren’t quite as unusual as I thought they might be. However, the men’s times after 12-13th or so are definitely further back than average. For the women, things look fairly consistent until around 20th or so, then the times drop off pretty rapidly.
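For reference, percent back in qualifying is simple to compute. The sketch below assumes times in seconds and measures everyone against the fastest qualifier; the function name is made up for illustration.

```python
def percent_back(times, top_n=30):
    """Percent behind the fastest time for the top_n fastest qualifiers."""
    ordered = sorted(times)[:top_n]
    best = ordered[0]
    return [100.0 * (t - best) / best for t in ordered]
```

Running this for each World Cup sprint qualifier and overlaying the Sochi values by qualifying rank gives the comparison shown above.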

US Championship Performance Context

On a day like today it’s probably best not to dwell on what could have been, at least for US fans.

Instead, here’s some context for how we’re doing so far, compared to our results in previous Olympics and World Championships back to the early 90’s:

sochi_context

 

It may be tough to swallow today, but look at that women’s sprint panel in the lower right. That is (by far) the single best team sprinting performance at a major championship for the US, men or women. Ever.

It is a huge testament to the accomplishments of Kikkan Randall (and Jessie Diggins, and Ida Sargent, and Sadie Bjornsen, and…) that a day like today could feel like such a big disappointment on many levels.

The other piece that catches my eye is the steady improvement in the worst female distance results (although the current Games are not over yet…). Much attention is given to our skiers’ best results, but this is what “raising the bar” really means. Slowly, steadily over time, finishing in the 70’s isn’t respectable. Then finishing in the 50’s isn’t respectable. And so on.

The Caitlin Gregg Situation

Ah, yes, another Olympic year, another wildly entertaining FasterSkier comment thread regarding team selection.

I find this round interesting because I sort of assumed that most of the heat this year would fall on the men’s selections, but apparently the decision to not select Caitlin Gregg is getting the bulk of the attention.

This is a singularly difficult thing to analyze, because there just isn’t much concrete data to go on. But let’s get a few things straight right up front. The selection criteria were clearly intended to not give special preference to people skiing fast just this fall. You really had to have had good points from last season as well, and some of the best opportunities for those points would have come at the spring races with the US women present (mostly).

There’s no question that reasonable people can disagree on whether this is the best strategy for selection criteria, but I think it’s impossible to argue that it’s a bad idea, or even the worst idea. At best, you’re only going to see 7-8 starts from someone by early January, and when you consider someone like Gregg who you’d be taking primarily to ski a distance skate race, you really are only going to see 2-3 relevant results out of her in that time period. That’s not a lot to go on, really, and I think it’s perfectly reasonable to ask that a major part of putting yourself on an Olympic team be demonstrating that you can ski fast over a longer time period, and against relevant fields.

So let’s acknowledge that Caitlin hasn’t raced against anyone on the US team at all this season (except for the two recent sprints in Europe). What do we know about how she has stacked up against that particular group? If we take each head-to-head matchup between Gregg and one of Randall, Diggins, Bjornsen, Brooks, Caldwell and Sargent, Gregg compiled a 6-25 record in 2011-2012 and a 3-20 record in 2012-2013. Not surprisingly, in sprint races it’s even worse, with a 2-11 record in 2011-2012 and a 0-20 record in 2012-2013. In graphical form, that looks roughly like this:

gregg1

 

Negative values are bad for Gregg here. Over the past two seasons, Holly Brooks has basically dominated Gregg overall. Brooks hasn’t skied all that well herself this year, and I think it’s clear that Gregg is skiing faster. Have they moved enough to swap places? Impossible to say, since they haven’t skied against each other yet.
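The head-to-head records themselves are just a tally over races that both skiers finished. A minimal sketch, assuming each race is a dict mapping skier name to finishing place (the layout and function name are my own for illustration):

```python
def head_to_head(races, skier, rivals):
    """Count (wins, losses) for skier against each rival, race by race.

    races: list of dicts mapping skier name -> finishing place (1 is best).
    Only races where both the skier and the rival appear are counted.
    """
    wins = losses = 0
    for race in races:
        if skier not in race:
            continue
        for rival in rivals:
            if rival not in race:
                continue
            if race[skier] < race[rival]:
                wins += 1
            elif race[skier] > race[rival]:
                losses += 1
    return wins, losses
```

Filtering the race list by season or by sprint versus distance before calling it gives the separate records quoted above.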

Last season, basically all of those match ups (except against Brooks) came during the “Spring Series” (I’m honestly not sure if they still call it that, but I like the name) at which Gregg skied rather poorly. It’s quite possible that she was just unlucky, getting sick near those races, or maybe she just didn’t manage her fitness as well as she could have and went into the series a little run down. Who knows. Regardless, I feel like anyone looking at the selection criteria would have known that those races were going to be very, very important, both for potential points and for demonstrating an ability to ski toe-to-toe with the gals spending all year over in Europe. So that was clearly a missed opportunity.

So the record we do have from Gregg from 2011 through last spring has very little evidence that she was skiing very close to the level of the top US women in Europe. But then she shows up this fall and has clearly improved, winning or finishing on the podium of basically everything she enters. In particular, she crushed the field in both of the freestyle skate races, a 10k and a 20k mass start. Does that tell us anything meaningful?

It’s hard to say. Normally, I’d be very skeptical of reading much into a huge margin in a mass start race, since the in-race dynamics can be so weird. Certainly, a good portion of that margin came from the field simply deciding they weren’t going to catch her and they started racing for second. But she won the 10k individual start in Yellowstone by a huge margin as well. My personal feeling is that while she was clearly the best skier on both days, I have a hard time putting much stock in the margin of victory. The Yellowstone race was an extremely early season event, and the other was a mass start.

And then there’s the problem that you just keep circling back to: the fact that the people she’s beating in those races themselves fare very poorly against the top US women. In those two skate races, the other top women (Patterson, Fitzgerald, Flowers, Brennan, Rorabaugh) all have pretty dismal records against the US Ski Team women themselves. Brennan skied well enough to earn herself a trip to Europe, but even she went 12-40 against that group in 2012-2013 and only 2-18 so far this season.

And finally, two stellar races in the discipline you want her in isn’t much of a trend, really.

I realize as I’m writing this that I probably sound very negative about Gregg’s results this season. Actually, that’s not the case. I think it’s clear that she’s skiing considerably better than last year. I would have loved to have seen her selected and it would have been great to watch her ski in the 30k in Sochi. But the selection criteria made it pretty clear, I think, that relying on skiing fast this fall to get in was going to be long odds. A stronger signal would have been to put up some good results last spring against the other top US women head-to-head and then demonstrate you can sustain it when racing resumes the next fall.

I think it’s perfectly fair for people to react to Gregg’s non-selection with a cry of “What does it take?” The answer to that is complicated by the fact that the Olympics are such an emotionally charged event. The Games have symbolic and cultural importance for many people that really transcends the more mundane aspirations of a national skiing body (winning hardware). It’s frustrating, but I think it’s unfair to expect an organization like USSA, whose mission really ought to be to do everything it can to win, now and in the future, to treat Olympic starts any differently than World Cup or World Championship starts. Still, as a fan, that can be tough to swallow.

Gregg’s been working a long time for this, she’s well known, and folks look up to her. Sending her to Sochi to race the 30k would certainly have been a big payoff for a lot of long, hard work, even if she isn’t likely to finish very high up the results, and even if, at ~34, she’s not likely to remain an elite racer for that much longer. But it would certainly make a lot of folks back home happy and excited about ski racing.

The counter-argument is that I think the folks running the US Ski Team want us all to dream bigger. The days of simply going to the Olympics, or simply getting the chance to race in Europe as a fitting reward to a long career are over. Qualifying for Olympic/World Champ teams, attaining first period WC start rights, are all just steps along the way. Worth celebrating, for sure, but no longer a career capping moment. I think on some level, they actually don’t want us to be aspiring to long, successful domestic racing careers capped off with a trip to a major event with middling results.

And I think that mostly people are ok with that message, and that we really are dreaming bigger. But when it comes to the Olympic Games, the cold hard reality of that message can really sting, since for so many the Olympics are just a different beast altogether.

I don’t really have an ending for all this, except to say that I’m excited that Caitlin Gregg is skiing so well this year, and that as an American I’m thrilled to see what results she can put up across the pond, Olympics or no.
