
Big Championship Winning Margins

Therese Johaug’s margin of victory today was pretty epic. Epic enough that FasterSkier included a list of the winning margins in 10k championship events for the last 20 years or so. I thought it might be interesting to expand on that slightly. The following table lists winning margins for all 10/15km (men’s & women’s) interval start championship events (WSC & OWG) that I have times for. This expansion muddles things a bit, since if you go back far enough you start getting the women doing 15km or the men doing 10km, so you do need to keep that in mind. But I still think it’s interesting to include them all. The table lists the winning margin and the percent difference.

Johaug’s winning margin isn’t the absolute largest on this list, only because of the aforementioned women’s 15km championship events, but if you sort by percent difference, hers is the largest by a fair bit.
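As a rough illustration of why the two columns can rank races differently, here is a minimal sketch. It assumes the percent difference is the margin expressed as a percentage of the winner's time (the post doesn't spell out the exact definition), and the times are invented for the example, not actual results.

```python
def winning_margin(winner_seconds, runner_up_seconds):
    """Return (margin in seconds, margin as a percent of the winner's time).

    Assumption: "percent difference" is measured relative to the winner's time.
    """
    margin = runner_up_seconds - winner_seconds
    percent = 100.0 * margin / winner_seconds
    return margin, percent


# Hypothetical times, not real results:
# a 10 km won in 25:00 by 50 s, and a 15 km won in 40:00 by 60 s.
short_race = winning_margin(1500.0, 1550.0)
long_race = winning_margin(2400.0, 2460.0)

print("10 km:", short_race)
print("15 km:", long_race)
```

The longer race has the bigger margin in seconds but the smaller one in percent, which is the same effect that separates the two sort orders of the table.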

10/15km Winning Margins


Assessing the USA/CAN Performance in the Oberstdorf WSC Classic Sprint

American ski fans have gotten spoiled this season with some outstanding results. Even allowing for the fact that we probably all know that classic sprints aren’t really one of the stronger events for the US, today’s results might have seemed a bit underwhelming.

This sort of situation is what I consider the ideal use of data in skiing. Not some big fancy analysis, just a simple gut check with a quick graph. The metaphor I like to use is that race result data’s best role in the sport (at the moment) is as a sort of “guardrail for the brain”. Our minds are amazingly talented at taking small bits of information and spinning pretty elaborate stories to explain them. Some simple looks at the data may not provide us with any groundbreaking insight, but they can often prevent us from driving wildly off the road.

So here’s the history of major international classic sprint performances for the US and Canadian skiers:

In line with my previous discussion, I’m not claiming this gives us any big insight. But! I can glance at these graphs and quickly calibrate how I feel about how the US & Canada did today, in a broad sense:

  • The US women had a notably off day, maybe not terrible but not great
  • The US men did ok, maybe about as well as we might expect
  • The Canadian men had a pretty good day (coming from a pretty low baseline)
  • The Canadian women had a maybe ok or only slightly disappointing day, again from a pretty low baseline

My one caveat is that this is obviously a “program level” assessment of the day, rather than one at the level of the individual athletes, where their particular histories would potentially lead you to different conclusions about whether their individual results were good or bad today.

WC Race Speeds Over Time

Someone emailed me with this question, which is a common one that I’m sure I’ve written about multiple times before. But it’s an easy graph to make, and just posting it again is easier than finding one of my old posts.

The question is how race speeds have changed over time. As you might expect, we’re going to limit ourselves to a single race format and distance for consistency, in this case 15/10km interval start events for the men & women respectively:

One women’s race from the 1990’s that is a significant outlier has been removed; I’m quite sure it’s been mis-recorded on the FIS website somehow. The trends are just linear fits to the data. If you really want to squint at the scatterplots you could try to fit a smooth curve and try to spot places where it “jumped” but skiing races are just too variable to cleanly identify that level of change, and the linear trends obviously describe the data quite well.
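For anyone who wants to reproduce this kind of trend line, the sketch below fits an ordinary least-squares line to race speed against season. The rows are made up for illustration, and using the winner's time as the speed measure is an assumption; the graph above may be built from a different summary of each race.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented (season, distance in km, winning time in seconds) rows, purely illustrative.
races = [
    (1992, 15.0, 2700.0),
    (2000, 15.0, 2500.0),
    (2010, 15.0, 2400.0),
    (2020, 15.0, 2250.0),
]
years = [year for year, _, _ in races]
speeds = [1000.0 * km / seconds for _, km, seconds in races]  # metres per second

slope, intercept = fit_line(years, speeds)
fitted_start = slope * years[0] + intercept
fitted_end = slope * years[-1] + intercept
change = 100.0 * (fitted_end / fitted_start - 1.0)
print(f"trend: {slope:.4f} m/s per year, {change:.1f}% over the period")
```

Comparing the fitted values at the first and last season, rather than the raw first and last races, is what keeps a single odd race from dominating the percent change.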

Just eyeballing the graph, all four events (men & women, freestyle & classic) have improved by around 15%; some a little more, some a little less. It’s easy to brainstorm all the various causes for this improvement:

  • skiers train better, eat better, have better technique
  • better waxes
  • better skis
  • better, more reliable grooming (this is my favorite explanation and the most commonly overlooked in my opinion)
  • faster courses? (This is kind of an oddball one, but I wonder about it sometimes. Old-time ski courses were narrower, with fewer, longer loops, and maybe more twisty, particularly before we had to accommodate skating and mass starts. Those kinds of courses, with constant transitions and turning, were possibly slower to ski than today’s trails that look like super-highways.)

Every time I make this graph, or something equivalent, I have the same somewhat entertaining thought: what sort of courses would we need to have on the WC circuit to force current athletes back to the average speeds of the 1990s? The 1980s? It’s kind of amusing to think about what the courses would have to look like to add 10, 20 or even 30 seconds per kilometer to today’s top skiers.

Top US Results Have A Freestyle Bias

I don’t think it’s any secret that US skiers tend to have somewhat better results on the World Cup (and OWG/WSC/TdS) in freestyle events. Let’s quantify that a bit, starting with distance races:

I’ve plotted major international results of both techniques (so no skiathlons) for the US men & women for the last decade or so. I’m just using finishing place as the measure, to keep things simple and easily interpretable. The lines represent the 50th & 10th percentiles for each technique. I had to use two sets of lines with the same colors (blue = freestyle, red = classic) so I labelled the sets as 50th and 10th percentile.
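A minimal sketch of the percentile idea, using invented finishing places. It groups results by season and technique and takes the 10th and 50th percentiles within each group; the lines in the graphs may instead come from a smoother or a quantile regression, so treat the per-season grouping as an assumption. Since lower places are better, the 10th percentile describes the best results in each group.

```python
from collections import defaultdict
from statistics import quantiles


def place_percentiles(results):
    """Map (season, technique) -> (10th percentile, 50th percentile) of finishing place."""
    groups = defaultdict(list)
    for season, technique, place in results:
        groups[(season, technique)].append(place)
    summary = {}
    for key, places in groups.items():
        # n=10 returns nine cut points: index 0 is the 10th percentile, index 4 the median.
        deciles = quantiles(places, n=10, method="inclusive")
        summary[key] = (deciles[0], deciles[4])
    return summary


# Invented finishing places for one season, not real results.
results = [(2020, "freestyle", p) for p in (10, 20, 30, 40, 50)]
results += [(2020, "classic", p) for p in (20, 30, 40, 50, 60)]

summary = place_percentiles(results)
for key in sorted(summary):
    print(key, summary[key])
```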

There’s a distinct separation between techniques for the US women that has persisted for quite a while. For the US men, I’d say that the 50th percentile result is about the same for each technique, with a very small edge for freestyle in the 10th percentile, not terribly significant.

Here is the same plot for sprints:

Some interesting differences here. The women still have a consistent bias towards freestyle, but it’s considerably smaller, and the difference for the 10th percentile has narrowed in recent years. The men seemed to have very little difference between techniques until recently when a fairly substantial gap has developed, again favoring freestyle. Part of me wonders how much of that is the result of just one skier, Andrew Newell, declining slightly later in his career and then mostly retiring.

Jessie Diggins’ Best Classic Races

I noticed an interesting Tweet from Chad Salmela regarding Jessie Diggins’ race in today’s Falun 10k classic mass start:

Ultimately, of course, Jessie Diggins is the only person who truly knows what her best classic performance is. But for data people like me it’s always a fun game to try to tease out some objective measure from the peanut gallery. When I read Chad’s tweet my gut reaction was disagreement.

Comparing race performances, even limiting ourselves to just classic distance races, is hard. There are lots of different confounding variables, but the big obvious one is race format. Mass start events are just very different in modern skiing than individual start races, and of course pursuit starts are probably worth excluding altogether.

Let’s look at Diggins’ major international classic results using three measures:

  1. finishing place (rank),
  2. FIS points, and
  3. PBM points (a point system analogous to FIS points based on the median skier and the spread of skiers in an attempt to capture the strength of the whole field rather than just the top finishers)

First, let’s sort the table by finishing place (rank). Even if we exclude the pursuit start race in Canmore, the 7th in today’s (2021-01-30) Falun 10k mass start is bettered by several events including an interval start race also in Falun in 2016. There are also numerous other 7th & 8th place results that should be in the conversation by this measure.

But that’s just one measure, and Chad specifically referenced percent-back, which is what FIS points are based on, so if we instead sort the table on FIS points we see basically what Chad describes as the top three (again we’re ignoring that Canmore pursuit start).
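To make the link between percent-back and FIS points concrete, here is a small sketch. The general shape (a format-dependent factor times the fractional time behind the winner, plus a race penalty) follows my understanding of the FIS points rules, but the specific factor value used in the example should be treated as an assumption and checked against the current rules. The times are hypothetical.

```python
def percent_back(time_seconds, winner_seconds):
    """Percent behind the winner's time."""
    return 100.0 * (time_seconds - winner_seconds) / winner_seconds


def race_points(time_seconds, winner_seconds, factor, penalty=0.0):
    """FIS-style race points: factor * (time / winner - 1) + penalty.

    `factor` depends on the race format and `penalty` on the strength of
    the field; both are inputs here rather than looked up.
    """
    return factor * (time_seconds / winner_seconds - 1.0) + penalty


# Hypothetical skier finishing 30 s behind a 25:00 winning time.
# Assumed factor: 800 for an interval start race.
pb = percent_back(1530.0, 1500.0)
points = race_points(1530.0, 1500.0, factor=800.0)
print(f"{pb:.1f}% back, {points:.1f} race points before any penalty")
```

Because the points are just a rescaled percent-back, anything that compresses the time gaps at the front of a race compresses the points too.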

But do you notice something interesting here? When we focus only on the percent-back from the leaders some strong interval start races suddenly look much, much worse. Sort of disturbingly worse.

Here we start to butt up against the reality that mass start and interval start events may be so different, tactically and physiologically, that we should start to at least consider the possibility that they aren’t even really comparable at all. Not as incomparable as distance and sprint events, maybe, but different enough that what happens at the front of the field may not adequately capture a skier’s efforts between those formats.

As all cross-country ski racers know, today’s mass start races often have relatively slow paces for much of the race, ending with a frantic sprint over the final few kilometers, or perhaps even only the final few hundred meters. Interval start races tend to involve a more consistently high pace (and effort) for the entire race. Old farts like me tend to be biased towards interval start events (it is “real” racing, after all!) but if I’m being honest, it’s just different and requires somewhat different skills.

Circling back to Diggins’ classic results, it’s these sorts of things that make me wary of percent-back based measures when we’re mixing interval and mass start events. It should give us pause that Diggins can finish 5th & 7th in interval start races and yet have as much as ~3x the FIS points (roughly 3x the percent-back) as a 7th place mass start finish. Should we really be discounting those interval start results that severely? Diggins was much closer, in time, to the handful of skiers up the track in the mass start race, but was that because she skied faster, or was it because those skiers in front of her were only really “racing” at a different pace for a short time and so didn’t have the road to put more time on her?

In general, I think percent-back based measures like FIS points tend to pretty severely underplay interval start results. Many of the choices I made in designing a percent-back from the median skier revolved around these sorts of concerns. I’ve found that the middle of WC fields tends to be a bit more stable a benchmark for measuring performance. You avoid situations where a single skier like a Bjoergen or Johaug just outpaces the field by a minute. Additionally, I included an adjustment for how “spread out” overall the results for that day are, to slightly penalize situations where the pace is slower, resulting in a very bunched field.
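The general idea can be sketched in a few lines. To be clear, this is not the actual PBM formula, which isn't given here: the interquartile range as the spread measure, the `reference_spread` value, and the damping rule are all hypothetical choices made only to show the two ingredients, percent-back from the median and a discount when the field is bunched. Negative values mean a skier finished ahead of the median.

```python
from statistics import median, quantiles


def pbm_sketch(times, reference_spread=5.0):
    """Percent back from the median time, damped when the field is bunched.

    Hypothetical illustration only: reference_spread (in percent) and the
    damping rule are invented for this example.
    """
    mid = median(times)
    pct = [100.0 * (t - mid) / mid for t in times]
    q1, _, q3 = quantiles(pct, n=4, method="inclusive")
    spread = q3 - q1  # interquartile range of the percent-back values
    damping = min(1.0, spread / reference_spread)
    return [p * damping for p in pct], spread


# Two invented five-skier fields. In both, the first skier is 60 s ahead of the median.
spread_out = [1500.0, 1530.0, 1560.0, 1590.0, 1620.0]
bunched = [1500.0, 1555.0, 1560.0, 1565.0, 1570.0]

scores_a, spread_a = pbm_sketch(spread_out)
scores_b, spread_b = pbm_sketch(bunched)
print(f"spread-out field: spread {spread_a:.2f}%, first skier {scores_a[0]:.2f}")
print(f"bunched field:    spread {spread_b:.2f}%, first skier {scores_b[0]:.2f}")
```

The same gap to the median earns less credit in the bunched field, which is the behavior described above, whatever the real adjustment looks like.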

So if we finally sort the table by PBM points we get very different results indeed! This measure appears to give much more credence to Diggins’ various interval start results, rather than just putting all the mass start races at the top. But there are some very different, and possibly surprising, races at the top using this measure.

First, it really likes the 2018 TdS 10k classic mass start in Val di Fiemme, for instance. This does make some sense. It was a mass start race that fractured pretty badly (albeit with a small, late-Tour field), so Diggins ended up way ahead of the mid-pack skiers, but she finished 4th, within a respectable distance from the leaders. This measure also really likes her 5th place finish in the 2016 Falun 5k interval start. This also makes sense: Johaug outpaced the field by ~19sec that day in only 5k and that’s the sort of thing that can really distort percent-back from the winner measures.

But I’m burying the lede here because where on this list is today’s 7th in Falun? Holy cow it’s all the way down in 18th! Why so low? Well, in today’s race Diggins wasn’t unusually distant from the middle of the pack (~1min in a 10k) but the field as a whole was very “bunched” suggesting that the overall pace wasn’t doing much to separate the field and so we should put somewhat less stock in being close to the leaders at the finish.

Ultimately, my gut tells me that my PBM points are being perhaps a bit too harsh on today’s race, and the truth is probably somewhere in the middle. Today’s classic performance almost certainly wasn’t Diggins’ best ever, but I’d say top-10 is fairer than ranking it all the way back in 18th.

My thought process when reading results actually does synthesize all three of the measures we just discussed, so in practice I don’t think any one is preferable to the exclusion of the others. My internal monologue today went something like this:

Wow, Jessie finished 7th in a classic race! And very close to the winner! That’s very good for her! But, hmm, it was a mass start, and gee, the top of the field looks like it was very bunched together suggesting either a relatively slow pace, or very fast conditions, or both. Let’s see, how far from the middle of the field was she? Hmm…pretty average actually. Ok, so that was a very solid classic race for her. Not Earth shattering, but very good.

Swedish Men’s Struggles

It’s always fun to take one of Devon Kershaw’s rants and make it concrete. In this case, his apparent despair over the struggles of the Swedish men on the international racing scene. He’s ranted about it more than once, but in this particular case he referred to 2019-2020 as being the worst season for them ever, more or less.

So let’s look at the Swedish men’s distance results in World Cup, Olympic and World Championship races, tally them up in categories and give each season a score. My entirely made-up scoring awards 10 points for a win, 5 for a podium, 2 for a top 10 and 1 for a top 30. (Results are counted in each category they fall in, so a win contributes 10 + 5 + 2 + 1 = 18 points.)
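The scoring rule is simple enough to write down directly. In this sketch the function names are mine; the two season rows are taken from the distance table that follows, and the single-win case reproduces the 18 points mentioned above.

```python
def tally(places):
    """Count results in each (overlapping) category from a list of finishing places."""
    wins = sum(p == 1 for p in places)
    podiums = sum(p <= 3 for p in places)
    top10 = sum(p <= 10 for p in places)
    top30 = sum(p <= 30 for p in places)
    return wins, podiums, top10, top30


def season_score(wins, podiums, top10, top30):
    """10 points per win, 5 per podium, 2 per top 10, 1 per top 30."""
    return 10 * wins + 5 * podiums + 2 * top10 + 1 * top30


single_win = season_score(*tally([1]))
score_2019_2020 = season_score(0, 0, 7, 53)
score_2016_2017 = season_score(2, 5, 23, 81)
print(single_win, score_2019_2020, score_2016_2017)
```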

Swedish men’s distance results. One point for top 30, two points for top 10, 5 for podiums and 10 for wins.
Season | Wins | Podiums | Top 10 | Top 30 | Score
2019-2020 | 0 | 0 | 7 | 53 | 67
2018-2019 | 0 | 1 | 7 | 53 | 72
2017-2018 | 0 | 3 | 12 | 57 | 96
2016-2017 | 2 | 5 | 23 | 81 | 172
2015-2016 | 0 | 2 | 4 | 37 | 55
2014-2015 | 1 | 5 | 22 | 72 | 151
2013-2014 | 1 | 8 | 24 | 59 | 157
2012-2013 | 3 | 6 | 29 | 81 | 199
2011-2012 | 3 | 5 | 25 | 62 | 167
2010-2011 | 3 | 9 | 24 | 67 | 190
2009-2010 | 2 | 9 | 32 | 67 | 196
2008-2009 | 2 | 5 | 15 | 42 | 117
2007-2008 | 1 | 3 | 16 | 43 | 100
2006-2007 | 0 | 2 | 15 | 46 | 86
2005-2006 | 3 | 6 | 22 | 61 | 165
2004-2005 | 0 | 2 | 13 | 34 | 70
2003-2004 | 3 | 7 | 17 | 52 | 151
2002-2003 | 6 | 14 | 37 | 69 | 273
2001-2002 | 4 | 10 | 20 | 47 | 177
2000-2001 | 8 | 10 | 22 | 47 | 221
1999-2000 | 0 | 5 | 18 | 58 | 119
1998-1999 | 2 | 4 | 30 | 67 | 167
1997-1998 | 0 | 2 | 19 | 59 | 107
1996-1997 | 0 | 1 | 22 | 60 | 109
1995-1996 | 0 | 2 | 16 | 57 | 99
1994-1995 | 1 | 3 | 20 | 63 | 128
1993-1994 | 0 | 4 | 17 | 50 | 104
1992-1993 | 2 | 4 | 17 | 46 | 120
1991-1992 | 1 | 3 | 18 | 42 | 103

2019-2020 isn’t technically the worst on this list (2015-2016 scores lower, 55 to 67), but close enough. It was pretty bad. I don’t quite recall what happened in 2015-2016. I know Marcus Hellner was still racing then but he had a pretty terrible season. My guess would be some combination of injuries or illnesses, but I have a terrible memory for that stuff.

For comparison, here’s the same thing for sprinting:

Swedish men’s sprint results. One point for top 30, two points for top 10, 5 for podiums and 10 for wins.
Season | Wins | Podiums | Top 10 | Top 30 | Score
2019-2020 | 0 | 1 | 13 | 51 | 82
2018-2019 | 0 | 0 | 14 | 57 | 85
2017-2018 | 0 | 3 | 14 | 42 | 85
2016-2017 | 1 | 3 | 11 | 41 | 88
2015-2016 | 0 | 0 | 10 | 34 | 54
2014-2015 | 0 | 0 | 7 | 29 | 43
2013-2014 | 2 | 7 | 19 | 43 | 136
2012-2013 | 4 | 6 | 17 | 43 | 147
2011-2012 | 3 | 8 | 20 | 52 | 162
2010-2011 | 7 | 8 | 31 | 51 | 223
2009-2010 | 4 | 8 | 26 | 54 | 186
2008-2009 | 1 | 2 | 11 | 37 | 79
2007-2008 | 1 | 6 | 19 | 41 | 119
2006-2007 | 0 | 5 | 24 | 56 | 129
2005-2006 | 7 | 12 | 30 | 49 | 239
2004-2005 | 1 | 5 | 20 | 50 | 125
2003-2004 | 3 | 10 | 24 | 49 | 177
2002-2003 | 4 | 11 | 30 | 52 | 207

In this case 2014-2015 and 2015-2016 both stand out quite a bit more than any of the more recent seasons, although they all have a distinct lack of wins.

Hopefully they can turn things around!

US Development Graphs

I posted these graphs on Twitter a few days ago with basically zero commentary, so I thought I should at least describe what I did in a tad more detail.

I make these graphs because they get at an aspect of something the US Ski Team leadership talks about fairly often: being “on track” for top international results. They are not meant to represent any actual artifact or analysis that the US Ski Team actually uses for making decisions. They are merely my translation of the general concept. There are lots of different ways you could actually approach this idea and end up with somewhat different conclusions.

First we need a definition of “top international results”. For distance events I took that to be finishing in the top ten in WC/WSC/OWG/TdS events at least three times. For sprints, reaching the semifinals in WC/WSC/OWG/TdS events at least three times.

Then I took those “successful” skiers and plotted a band that represents the 10th-90th percentile of their race FIS points by age. So this band represents what the bulk of these skiers’ race FIS points looked like at different ages.
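The two steps above can be sketched roughly as follows. The distance threshold comes from the definition given earlier (top ten at least three times); binning by whole-year age is my assumption, since the band in the graphs may be computed on a finer or smoothed age scale, and the FIS point values are invented.

```python
from collections import defaultdict
from statistics import quantiles


def is_successful(places, threshold=10, min_count=3):
    """Distance definition: at least `min_count` finishes at or inside `threshold`."""
    return sum(p <= threshold for p in places) >= min_count


def fis_point_band(races):
    """races: iterable of (age, fis_points) -> {age: (10th percentile, 90th percentile)}.

    Assumption: ages are binned to whole years; ages with a single race are skipped.
    """
    by_age = defaultdict(list)
    for age, points in races:
        by_age[int(age)].append(points)
    band = {}
    for age, points in sorted(by_age.items()):
        if len(points) >= 2:
            deciles = quantiles(points, n=10, method="inclusive")
            band[age] = (deciles[0], deciles[8])
    return band


# Invented race FIS points for "successful" skiers at two ages.
races = [(20.4, p) for p in (100.0, 80.0, 60.0, 40.0, 20.0)]
races += [(25.1, p) for p in (50.0, 40.0, 30.0, 20.0, 10.0)]

band = fis_point_band(races)
print(is_successful([4, 9, 10, 25]), band)
```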

Then I plotted on top the actual race FIS points from some of the top US skiers according to the most recent FIS points ranking lists for distance and sprint. I skipped a few people who I believe retired, and generally tried to focus on younger skiers in places. But mostly I was just running down that ranking list.

Finally, the red trend line is tracking the median result for each skier. I was a tad lazy and didn’t tweak the smoother too rigorously for each skier, which results in some poor (overfit) results in a handful of cases. Those are the red lines that are really, really squiggly. If I were taking this more seriously, I’d replace those with a more rigid linear fit, or maybe just omit the trend line for those skiers entirely.