I love the annual series FasterSkier does profiling each nation that scored Nations Cup points in the days before the World Cup begins. Usually by the time they reach Norway and Russia, I’m actually less interested, but the first week or so is a fun ride through nations with tiny, little-known programs.
First up was Spain, featuring Laura Orgue. They correctly note that she’s basically the only real prospect they have, and thankfully she’s not terribly old (~25). What can happen a lot with these tiny ski programs (or even medium-sized ones, really, like the US) is they latch onto a decent skier and basically ride them into their 40’s. However, looking at Orgue’s last few seasons, I’m a little concerned:
No one’s pretending that Orgue has been anything other than a middle of the pack type of distance skier, but I’m worried that her results have plateaued a bit, perhaps even tailing off last season. In my experience looking at skiers’ results, after the early rapid improvement, they often hit a plateau. But each year they spend stuck at that level decreases the chances that they’ll start improving again. The clock is ticking, so to speak.
As for the men, the FasterSkier profile if anything underplays the comical contrast between the team with and without Johann Mühlegg:
I’ll be the first to admit that sprinting doesn’t get as much love here at Statistical Skier. To be honest, that’s probably my subconscious at work, as I have a stronger connection to distance events I suppose. But it’s not terribly fair, so here is some World Cup sprint qualification analysis for all you sprint lovers out there.
The first step to success in sprinting is qualifying for the elimination rounds.[1] So I thought it might be fun to look at what sorts of efforts it takes to qualify in a World Cup sprint race (and any trends or patterns that might arise).
We’re going to do this using two measures: percent back and pace (seconds/km). Using pace means we’re implicitly assuming that the courses are measured accurately, which probably has not always been the case. Just for starters, skiers’ times are measured to the tenth of a second, but the length of the courses is typically only reported to the nearest 100m. And it takes a bit more than a few tenths of a second to ski 100m. Also, we should keep in mind that there can be pretty extreme variations in course design and weather/snow conditions. So keeping all that in mind, let’s dive in…
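For the curious, here is a minimal sketch of the two measures in Python, along with a rough look at how much the course length rounding alone could move the pace number. The times and course length are made up for illustration, the function names are my own, and I’m assuming that “nearest 100m” means the true length is within 50m of the reported one.

```python
def percent_back(time_s, winner_time_s):
    """Percent behind the fastest time (0.0 for the fastest skier)."""
    return 100.0 * (time_s - winner_time_s) / winner_time_s


def pace(time_s, course_km):
    """Pace in seconds per kilometer."""
    return time_s / course_km


def pace_range(time_s, reported_km, rounding_km=0.05):
    """Range of possible paces if the true course length is within
    rounding_km of the reported length (an assumption, see above)."""
    fastest = time_s / (reported_km + rounding_km)
    slowest = time_s / (reported_km - rounding_km)
    return fastest, slowest


# Hypothetical qualification round: not real race data.
winner_s = 180.0      # fastest qualifier, in seconds
qualifier_s = 189.0   # some other skier, in seconds
reported_km = 1.2     # course length as reported

pb = percent_back(qualifier_s, winner_s)
winner_pace = pace(winner_s, reported_km)
low_pace, high_pace = pace_range(winner_s, reported_km)

print(f"percent back: {pb:.2f}%")
print(f"winner pace: {winner_pace:.1f} s/km")
print(f"pace could be anywhere from {low_pace:.1f} to {high_pace:.1f} s/km")
```

With these made-up numbers, the rounding in course length alone allows a spread of several seconds per kilometer, which is a lot bigger than the tenths of a second the times are measured to.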
[1] Everyone skis the course one at a time, i.e. the qualification round, and then the top 30 move on to elimination heats of 6 at a time or so.
Continuing on from last time, we take a look at male long distance specialists. Recall I’m being all “fancy” here and using a fairly sophisticated modeling technique to identify skiers who tended to have better major international results at longer distances.
If you’re familiar with any statistics, you can think of it as an extension of a linear regression model, where we’re estimating a coefficient for each skier that (hopefully) captures the effect of race length on their individual results. As before, we’ll focus in on those for whom the model identified at least a nominally “significant” effect:
In alpine racing, that is! Fooled you, didn’t I?
Like I said before, I’m not going to make a habit of posting about alpine racing, but since the season opener went fairly well for the US, I thought I might as well. So here are two race snapshot graphs, similar to those I’ve been making for cross country races, although a little less tricked out. The bars represent the middle 50% of that skier’s results over the past three seasons, in this event. (Giant slalom, in this case.) The red dot is their FIS points for the race this weekend. As always, the aim is to give a quick visual representation of who skied better or worse than usual.
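If you want to see the idea behind the bars without the graph, here is a rough sketch in Python. The past results are invented, the helper names are mine, and I’m using the quartiles from the standard library’s `statistics.quantiles` as the “middle 50%”; the actual graphs may compute the range a little differently. Remember that lower FIS points are better.

```python
import statistics


def middle_half(points):
    """Return the 25th and 75th percentiles: the ends of the bar."""
    q1, _median, q3 = statistics.quantiles(points, n=4)
    return q1, q3


def compare_to_usual(race_points, past_points):
    """Label a race relative to the skier's middle 50% of past results.
    Lower FIS points are better."""
    q1, q3 = middle_half(past_points)
    if race_points < q1:
        return "better than usual"
    if race_points > q3:
        return "worse than usual"
    return "typical"


# Hypothetical giant slalom FIS points over three seasons (not real data).
past = [20, 25, 30, 35, 40, 45, 50]

q1, q3 = middle_half(past)
print(f"bar runs from {q1} to {q3}")
print(compare_to_usual(15, past))
print(compare_to_usual(35, past))
print(compare_to_usual(60, past))
```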
And the men:
A commenter noted that it was interesting that Petra Majdic showed up as being “statistically significantly” better at longer distance races (as opposed to sprint races), although just barely.
It turns out this is a good example of the statistical concept of leverage. Check out the following graph that compares skiers at each end of the statistically significant group I highlighted in my last post:
Johaug was near the top of the list and Majdic just barely snuck in at the bottom. What’s going on here is that the “standard” distances, 10-15km, don’t have much of an effect on the model. But really short distances (5k or Prologues) and really long distances (30km and up) exert more leverage on the regression line.
So Johaug hasn’t done terribly well in Prologues, but has done well in 30k’s. Conversely, Majdic doesn’t have quite as extreme a split between those two race lengths. It is true that basically all of Majdic’s results at 30km+ are classic races (big surprise!) and she is quite the classic specialist. So it could be that despite attempting to control for technique in my model I haven’t quite entirely removed technique as a confounding factor in Majdic’s case.
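To make the leverage idea concrete, here is a small Python sketch using the textbook formula for a simple one-variable regression, where the leverage of a point is 1/n + (x - mean)^2 / sum of (x - mean)^2. This is a simplification: my actual model has more going on than a single regression line, and the race distances below are an invented schedule, not anyone’s real race history.

```python
def leverages(xs):
    """Leverage of each point in a simple linear regression on xs:
    h_i = 1/n + (x_i - mean)^2 / sum_j (x_j - mean)^2
    """
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1.0 / n + (x - mean_x) ** 2 / sxx for x in xs]


# Hypothetical race lengths in km: one short race, several "standard"
# distances, and one long race.
distances_km = [5, 10, 10, 15, 15, 30]
lev = leverages(distances_km)

for km, h in zip(distances_km, lev):
    print(f"{km:>2} km  leverage = {h:.3f}")
```

The races near the middle of the distance range get small leverage values, while the 5km and especially the 30km race get much larger ones, so a skier’s results at those extremes do most of the work in tilting the line.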
Since we supposedly do statistics around here, the goal here will be to ‘statistically’ identify folks who specialize in longer distance races. By that I mean people who tend to ski better (relative to their own performance) in longer races. The fun part is to see whether the folks we single out match up with the people we would have thought of anyway. We’ll begin with the women.
The mechanics of this will largely mirror how I’ve identified technique specialists in the past, using mixed effects models. These are extremely handy modeling tools when you want to compare athlete-specific effects. If you’re familiar with linear regression models, you can imagine doing a regression analysis on all skiers that investigates the effect of race length on performance. What we’d end up measuring is the average effect across all skiers. That’s not terribly interesting, or meaningful. Mixed effects models allow us to estimate a separate parameter for each skier that (hopefully) captures the effect of race length on performance for each individual skier.
Beyond that, I’ll spare you the nitty gritty, except to say that I included a handful of other variables in the model to attempt to control for some other differences in race types (technique, mass vs. interval, etc.).
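For a feel of why a per-skier effect matters, here is a deliberately crude Python sketch. It is not a mixed effects model: it just fits an ordinary least squares line to everyone at once, and then a separate line to each skier. A real mixed effects model would also pull each skier’s estimate toward the overall average and would include the other control variables. The skiers and results are invented, with results expressed as percent back so that lower is better.

```python
from collections import defaultdict


def ols_slope(xs, ys):
    """Slope of the least squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical (skier, race length in km, percent back) rows; not real results.
races = [
    ("Skier A", 5, 6.0), ("Skier A", 10, 5.0),
    ("Skier A", 15, 4.0), ("Skier A", 30, 1.0),
    ("Skier B", 5, 2.0), ("Skier B", 10, 3.0),
    ("Skier B", 15, 4.0), ("Skier B", 30, 7.0),
]

# One slope for everybody: the "average effect across all skiers".
pooled_slope = ols_slope([km for _, km, _ in races],
                         [pb for _, _, pb in races])

# One slope per skier: a stand-in for the skier-specific effect.
by_skier = defaultdict(list)
for skier, km, pb in races:
    by_skier[skier].append((km, pb))

skier_slopes = {
    skier: ols_slope([km for km, _ in rows], [pb for _, pb in rows])
    for skier, rows in by_skier.items()
}

print(f"pooled slope: {pooled_slope:+.2f}")
for skier, slope in skier_slopes.items():
    print(f"{skier}: {slope:+.2f}")
```

In this toy data the pooled slope comes out flat because the two skiers cancel each other out, while the per-skier slopes show one skier improving with distance (negative slope) and the other getting worse.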
Out of all the women with a reasonable number of major international distance races (at least 10), these are the ones who have performed ‘significantly’ better in longer races:
The more negative the value, the stronger the preference for long races. One of the reasons I like this kind of analysis is that it doesn’t just pick out people who are good at long races; it picks out people who are better at long races than short races. This means you aren’t just picking out the fast folks over and over again. There are plenty of big names here, but also some folks who are not dominant WC skiers. I’ve never heard of Annmari Viljanmaa, for example.
Therese Johaug floating to the top here probably isn’t a surprise, although it’s not like I’d count her out completely in a 5k. And I’m also not surprised to see three Italians here; they seem to excel at the longer stuff a lot of the time.
Who are some folks at the opposite end of this list? Along with a lot of people I’ve never heard of, you’d find the likes of Vesna Fabjan, Ida Ingemarsdotter, Astrid Oeyre Slind, Katrin Zeller, Wendy Wagner and Natalia Korosteleva, who would all apparently tend to do worse in longer races.
I haven’t done one of these in a while, so just for fun…
This time let’s compare two great female skiers from the 90’s (and early 00’s), Norwegian Bente Skari and Italian Stefania Belmondo. Unlike with Daehlie and Alsgaard, things are a little more balanced for these two: