Happy (Almost) New Year! And thanks once again to Skadi Nordic for sponsoring this week in review. As the Tour de Ski gets rolling, here’s what we’ve been up to this week:
- We looked a bit more closely at the link between qualifying in the top ten and going on to do well in the finals in sprint racing.
- At a reader’s request, I followed up my earlier technique preference post for Japan by examining the Italians, with some interesting results, I thought. A similar post for Norway is in the pipeline.
- Before the second bout of World Cup action got under way I snuck in some mid-season recaps for the North American skiers, both sprint and distance.
- As a short preview for the Tour de Ski, I made some bump charts showing how last year’s Tour played out.
- I revisited a post from a while back looking at the “struggles” of the Norwegian men’s distance skiers (which, as an American, I’m required to place in quotations; I understand that they’re struggling compared to what they are used to), updating it to include the results so far this season.
- Finally, since the Tour de Ski officially kicked off today, we had a race snapshot post for the prologue, despite the difficulties of comparing the prologue to other races.
I had planned to post something looking at the North American domestic scene, but some of the last races in December were slow to appear on the FIS site, and then it was Christmas and now US Nationals are right around the corner. So I have some stuff thought out on that subject, but at this point it probably makes more sense to wait until after US Nationals and assess where we are then.
I debated whether these snapshot graphs would be sensible for the prologue (and some other TdS stages as well) since the race formats during the Tour are a bit odd, so comparisons to the past are fairly tenuous. Also, I think the prologue event is kind of dumb, but that’s just my own bias. I decided to try them for now. I’m going to play it by ear with the handicap start stages later on, though, as sometimes FIS scores a “pursuit” stage like a two-day pursuit and sometimes they’ve scored only the times from that day (despite the handicap style start).
I believe tomorrow’s pursuit stage will be treated like a two-day pursuit, so the Stage 2 winner will be the person with the best combined time, but they don’t count the prologue time twice for the TdS overall. I’ll spend some time today seeing if I can come up with a sensible race snapshot graph for that kind of event…
Back in September I wrote a post looking at the supposed woes of Norway’s male distance team. The bottom line was that, yes, by various measures things haven’t been looking as good lately for the Norwegian men’s distance squad. With the first period of racing completed, and more rumblings in Norway about their poor performance, I thought it would be a good idea to revisit and update my previous post.
As before, I’m going to avoid fancier metrics like FIS points and stuff. So here’s an updated graph showing the number of athletes Norway, Sweden and Russia have placed in various groups, per race, over time:
Ok, so this isn’t much of a preview. Rather it’s just a bird’s eye view of what happened last year using some bumps plots. Here’s how things played out for the men:
The median skier can sort of be thought of as the “peloton”. As you can see, not much happened until Stage 5, the first really long distance event of the Tour. That stage clearly split the group into three main packs, with some stragglers in the back. These groups mostly stayed together through Stage 6, but then began to drift apart in Stages 7 and 8. Emil Jönsson bagged his sprint WC points and then packed it in despite leading after four stages.
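For anyone curious about the numbers behind a plot like this, here is a minimal sketch of how overall rank after each stage could be computed from per-stage times. The skier names and times are made up, and the function name `standings_by_stage` is just for illustration. It also assumes the overall standing is simply cumulative time, ignoring bonus seconds and anything else that goes into the official Tour standings.

```python
from itertools import accumulate

def standings_by_stage(stage_times):
    """Map each skier to their overall rank after every stage.

    stage_times: dict of skier -> list of per-stage times in seconds.
    Rank 1 is the smallest cumulative time; ties are broken by name.
    Assumption: no bonus seconds, just summed stage times.
    """
    cumulative = {s: list(accumulate(t)) for s, t in stage_times.items()}
    n_stages = len(next(iter(cumulative.values())))
    ranks = {s: [] for s in stage_times}
    for stage in range(n_stages):
        order = sorted(cumulative, key=lambda s: (cumulative[s][stage], s))
        for position, skier in enumerate(order, start=1):
            ranks[skier].append(position)
    return ranks

# Hypothetical three-stage mini tour (times in seconds).
stage_times = {
    "Skier A": [200, 1500, 900],
    "Skier B": [205, 1480, 930],
    "Skier C": [210, 1495, 880],
}
ranks = standings_by_stage(stage_times)
for skier, positions in ranks.items():
    print(skier, positions)
```

Each skier's list of positions is one line on the bump plot, with stages on the x axis and overall rank on the y axis.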
The women, not surprisingly, were a bit more scattered:
Continuing on from my last post on North American distance skiers, let’s jump right in with the sprinters, starting with the men:
No fancy metrics here, just what place people finished in. Once again, red is this season, blue is last year’s Olympics. Alex Harvey’s sprint race in Kuusamo was unusually good for him. Simi Hamilton had one excellent race and two pretty bad ones, although he’s been struggling with a leg injury for a while. Stefan Kuhn hasn’t really put one together yet, but his two results around 30th aren’t exactly atypical for him so far in his career. Kershaw had a strong sprint result in Kuusamo and then an unusually bad one.
Len Valjas certainly had some promising results, but then, we don’t have much to compare that to in terms of data. Phil Widmer had one pretty bad race and another excellent one.
Andrew Newell is the most consistent of these guys, at least in terms of putting himself in the top 30 or top 20. People keep waiting for Newell to put together a podium finish again, but keep in mind that his sprint results so far this season are right in line with how he normally does.
As for the women:
With the first period of WC racing in the books and the next one just around the corner, I’m going to look back at the results of the North American skiers so far this season. The theme of this post is context and expectations.
One of the reasons that I started this blog was that I felt as though the tools we use to measure performance in skiing don’t always match up with our expectations and this can lead to a lot of needless frustration. The general paradigm for measuring performance in XC skiing is to examine only an athlete’s best few races over some time period. FIS averages your best five races over the previous calendar year for their point lists; EISA picks out each skier’s best two classic and freestyle races (if I recall correctly), and so on.
We may not be explicitly aware of it, but this methodology stems from an important observation: once in a while good skiers can have truly terrible races. Averaging is particularly sensitive to extremely small or large values, so one terrible FIS point race could have a huge effect if we were to look at all of someone’s results. This is an excellent observation, but I don’t think people are necessarily aware of the consequences.
Performance measures that look only at an athlete’s best few races are a terrible indicator of how well they typically ski. Instead, they measure how well the athlete has skied at their very best. They give an estimate of an upper limit, not a “typical value”.
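To make the distinction concrete, here is a small sketch comparing three summaries of the same season. The points below are invented for illustration (lower is better, as with FIS points), and `best_n_average` is a hypothetical helper, not the actual FIS or EISA calculation.

```python
from statistics import mean, median

def best_n_average(points, n=5):
    """Average of the n lowest (best) values; lower points are better."""
    return mean(sorted(points)[:n])

# Hypothetical FIS-style points for one skier's season.
# The 180 stands in for one truly terrible race.
season = [42, 45, 47, 50, 52, 70, 75, 80, 85, 180]

best5 = best_n_average(season)   # how well they skied at their best
typical = median(season)         # closer to a "typical" race
overall = mean(season)           # dragged upward by the one bad race

print(best5, typical, overall)
```

With these made-up numbers the best-five average is 47.2, the median is 61 and the plain average is 72.6. The single bad race inflates the plain average, while the best-five figure sits well below what this skier usually produces.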
This, combined with short memories, leads to some strange tendencies, I think. When we think about how good a skier is, we tend to think about their best results: Kris Freeman’s 4th place finishes, Kikkan Randall’s sprint podiums, etc. But those aren’t necessarily indicative of how well they ski from week to week.
With all that in mind, let’s look at some graphs.
Note the slightly unusual scale on the y-axis. I’ve been tinkering with this for a while now, and I’ve grown to like using percent back from the median skier. Negative values are good (ahead of the median skier); positive values are bad (behind the median skier). It turns out that it mostly (but not completely!) eliminates discrepancies between mass start and interval start races, it’s easy to interpret, and there are some technical mathematical reasons why I like it as well.
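As a rough sketch of the idea, percent back from the median could be computed like this. The exact definition is an assumption on my part here: the time difference from the median finishing time, divided by that median time, times 100. The function name and the finishing times are made up.

```python
from statistics import median

def percent_back_from_median(times):
    """Percent difference of each finishing time from the median time.

    Assumed formula: 100 * (time - median_time) / median_time.
    Negative means faster than the median skier, positive means slower.
    """
    mid = median(times)
    return [100.0 * (t - mid) / mid for t in times]

# Hypothetical finishing times in seconds for a five-skier race.
times = [1500.0, 1530.0, 1560.0, 1600.0, 1716.0]
pb = percent_back_from_median(times)
print([round(x, 2) for x in pb])
```

Here the median skier (1560 seconds) sits at 0%, the winner is a bit under 4% ahead, and the last finisher is 10% back.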
The red dots are this season’s results. I’ve highlighted last year’s Olympic results in blue, which I’ll return to in a moment. I will also point out the one instance here where percent back from the median doesn’t capture everything as we’d like: Devon Kershaw’s 5th place in the Olympic 50k last year. His percent back from the median skier isn’t all that low, for a 5th place finish. This suggests that (a) the pace might have been fairly slow and (b) as with all 50k’s, the field was likely somewhat reduced.
My post on the differing performance of Japanese skiers by technique (classic vs. freestyle) got a lot of positive responses and a few requests that I use the same methods on some other countries. First up is Italy.
I’ve tweaked and refined my model a fair bit, hopefully for the better. The basic idea is the same: using a hierarchical linear model to estimate differences in performance in skating and classic races (I’m omitting pursuits of all varieties). There are some technical things I’ve changed to be able to accommodate changes over time. Mostly this means making some adjustments for the occasional small sample sizes you find from season to season. This allows me to provide an estimate even in seasons where a skier did races of only one technique, although naturally those estimates come with a bit of a grain of salt.
In the results by athlete for their entire career, I’m going to display information on only those athletes that did a minimum number of races of each technique (2) for space and clarity reasons.
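To give a flavor of how a small-sample adjustment can work, here is a toy sketch of shrinkage toward a group mean, which is the basic idea behind partial pooling in hierarchical models. This is not the actual model used for these posts. The function name, the constant `k` and all of the numbers are hypothetical.

```python
from statistics import mean

def shrunken_mean(values, group_mean, k=4.0):
    """Pull a small-sample mean toward the group mean.

    The weight n / (n + k) grows with the number of races n; k is a
    hypothetical tuning constant, not a value from the real model.
    With no races at all, the estimate is just the group mean.
    """
    n = len(values)
    if n == 0:
        return group_mean
    weight = n / (n + k)
    return weight * mean(values) + (1.0 - weight) * group_mean

# Hypothetical percent-back values for one skier in one season.
skate = [-2.0, -1.0, -3.0, -2.0]
classic = []                      # no classic races this season

# Hypothetical team-wide averages for each technique.
team_skate, team_classic = -1.0, 1.0

skate_est = shrunken_mean(skate, team_skate)
classic_est = shrunken_mean(classic, team_classic)
difference = classic_est - skate_est   # positive: slower in classic

print(skate_est, classic_est, difference)
```

The skier with no classic races still gets a classic estimate, but it is entirely borrowed from the team average, which is why estimates like that deserve the grain of salt.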
The final big change is in the distance category. There are some technical reasons why FIS points are somewhat of a nuisance to use as a response variable in models like these, so I’m using something else: percent back from the median skier. I’ll save a more detailed description for why I’m doing this and how this measure is useful for another post. Here all we need to know is that 0% back represents the median (or middle) WC skier. Negative values mean you’re faster and positive values mean you’re slower.
Going into this, our intuitive notion is that the Italians have been generally better at skating. And that does turn out to be the case. But some other fascinating stuff pops up as well.