Since we in the US now have a relay team that’s doing quite well, that also means we as fans have something new (and fun!) to argue about. Namely, what is the best team we can put out there?
Let’s assume that there are seven women who could potentially be placed on the US women’s relay team in Sochi: Kikkan Randall, Liz Stephen, Jessie Diggins, Sadie Bjornsen, Holly Brooks, Ida Sargent and Sophie Caldwell. I suppose Caitlin Gregg is another possibility, but let’s keep it at these seven for now.
Some very simple math tells us that there are actually not that many relay teams you could construct from seven people. 840, to be exact. That includes every possible combination of four people, and every possible ordering of those four.
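For anyone who wants to check that count, a minimal Python sketch follows. It is only an illustration, not the code behind this post: it enumerates every ordered selection of four skiers from the seven names above.

```python
from itertools import permutations

skiers = ["Randall", "Stephen", "Diggins", "Bjornsen",
          "Brooks", "Sargent", "Caldwell"]

# Each tuple is one relay team in leg order: (leg 1, leg 2, leg 3, leg 4).
teams = list(permutations(skiers, 4))

print(len(teams))  # 7 * 6 * 5 * 4 = 840
```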
That got me wondering whether it was at all possible to somehow assess the quality of each relay team versus the others. I don’t track relay leg results, so I only have individual distance results to work with. But if we limit ourselves to a pool of seven athletes, then for a given relay team, we really only need to know how each skier performs against the three folks left off, in whatever technique their leg is.
For example, let’s imagine a relay team of Diggins, Caldwell, Stephen and Brooks, in that order. Then we’d look at how Diggins and Caldwell have performed against the remaining three in classic races, and how Stephen and Brooks have performed against the remaining three in freestyle races. What I settled on was taking the weighted average of the difference in percent back between each pair of skiers, with races weighted based on how recent they are.
So using the above example, we’d take Jessie Diggins and look at the difference in percent back between her and Randall, Sargent and Bjornsen in classic distance races, and then take the weighted average of those values, weighting recent events more heavily. Repeat for each of Caldwell, Stephen and Brooks, and then add up those four numbers. Voila! One way to think about this is that it is similar in spirit (though not in the technical details) to VORP in baseball.
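The scoring idea can be sketched roughly as follows. This is a minimal illustration under several assumptions that the description above does not pin down: the recency weight is taken to be `decay ** seasons_ago` (the actual weighting scheme is not specified), all of a skier's pairwise differences are pooled into one weighted average, a positive difference means the skier beat the person left off, and a skier with no usable races scores zero. The data layout, the function names and the skiers "A" through "G" are hypothetical, and the numbers are invented.

```python
# Hypothetical layout: one dict per distance race.
# "pb" maps skier -> percent back. All names and numbers are invented.
RACES = [
    {"technique": "classic", "seasons_ago": 0,
     "pb": {"A": 2.0, "E": 5.0, "F": 4.0}},
    {"technique": "classic", "seasons_ago": 1,
     "pb": {"A": 3.0, "E": 4.0}},
    {"technique": "freestyle", "seasons_ago": 0,
     "pb": {"C": 1.0, "E": 2.0, "G": 6.0}},
]
POOL = ["A", "B", "C", "D", "E", "F", "G"]

# As in the example above: the first two legs are classic, the last two freestyle.
LEG_TECHNIQUES = ["classic", "classic", "freestyle", "freestyle"]


def leg_score(skier, bench, technique, races, decay=0.5):
    """Weighted mean of (bench skier's percent back - this skier's percent back).

    Assumption: weight = decay ** seasons_ago, so recent races count more.
    Returns 0.0 when the skier has never met anyone on the bench.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for race in races:
        if race["technique"] != technique or skier not in race["pb"]:
            continue
        weight = decay ** race["seasons_ago"]
        for other in bench:
            if other in race["pb"]:
                weighted_sum += weight * (race["pb"][other] - race["pb"][skier])
                total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0


def team_score(team, pool, races, decay=0.5):
    """Sum of the four leg scores, each measured against the skiers left off."""
    bench = [s for s in pool if s not in team]
    return sum(leg_score(skier, bench, technique, races, decay)
               for skier, technique in zip(team, LEG_TECHNIQUES))


print(leg_score("A", ["E", "F", "G"], "classic", RACES))
print(team_score(("A", "B", "C", "D"), POOL, RACES))
```

With these made-up numbers, skier A's classic leg works out to (3 + 2 + 0.5 × 1) / 2.5 = 2.2 percentage points better than the bench, with last season's race counting half as much as this season's.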
Some obvious caveats: this method cannot distinguish one classic leg from the other, or one freestyle leg from the other. So you don’t get any special consideration for skills at scrambling or anchoring. In that sense, order is only very loosely evaluated, amounting to just comparing techniques. In fact, once you decide who is doing the classic legs and who is doing the skating legs, my method will give you the same “score” for all of the four different orderings you could use. But it will help sort out issues of whether someone like Randall is more valuable skiing a classic leg or a skate leg.
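The point about orderings can be checked by brute force. The sketch below (an illustration, not the code behind this post) groups the ordered teams by who skis classic and who skates, assuming the first two legs are classic and the last two freestyle. Because the score ignores order within a technique, only the number of groups matters for ranking.

```python
from itertools import permutations

skiers = ["Randall", "Stephen", "Diggins", "Bjornsen",
          "Brooks", "Sargent", "Caldwell"]

# Key each ordered team by (set of classic skiers, set of freestyle skiers).
groups = {}
for team in permutations(skiers, 4):
    key = (frozenset(team[:2]), frozenset(team[2:]))
    groups.setdefault(key, []).append(team)

print(len(groups))                               # 210 distinct assignments
print({len(orders) for orders in groups.values()})  # {4}: four orderings each
```

So the 840 ordered teams collapse to 210 distinct classic/freestyle assignments, each shared by four orderings.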
Still, it’s fun to play with, and now that I’ve built it I can start using it on skiers from other countries…
The results are pretty unsurprising. The best team is basically what we saw last weekend: Randall and Bjornsen on the classic legs and Stephen and Diggins on the freestyle legs. The next best team simply swapped Randall and Diggins, having Randall skate and Diggins do a classic leg. The third best team starts to get kind of interesting. It has Stephen and Bjornsen doing the classic legs and Randall and Diggins skating.
You can also ask fairly fine-grained questions, like “What’s the best team with Ida Sargent on it?” The answer in this case would be the team with Bjornsen and Sargent taking the classic legs and Randall and Stephen skating. Similarly, if you require that Holly Brooks be on the team, then once again you have to remove Diggins, but this time, unsurprisingly, you keep Randall on a classic leg and Brooks gets the freestyle leg vacated by Diggins.
Finally, one slightly surprising tidbit that fell out of this was that of the three (Brooks, Sargent and Caldwell), if you have to sub one of them in at the moment, the best option is Brooks.
What do I notice about the men’s podium from Saturday’s 15k classic race:
Excepting Poltoranin, not much racing at this level between them. In fact, it’s among the youngest men’s distance podiums I have on record (basically since the early 90s):
Saturday’s race is that unusually low value in the lower right corner. As you can see, there’s nothing to suggest that this is the start of a dramatic trend, as the other ages this year have been all over the map.
I was chatting with some friends today about the difference in depth between the US men’s and women’s squads, particularly in distance events, so when I sat down to noodle around with some data I ended up making this graph:
This is all US men’s distance results for the past five seasons or so, but with some notable exceptions. First, I’ve removed all results from Kris Freeman and Noah Hoffman. Second, I’ve removed all prologues from stage races (this excluded 4-5 top 30 results, mostly by Andy Newell). Outside of Freeman and Hoffman, the US men average around 45th-55th place.
More strong early season performances from some Americans in Norway this weekend. How strong? Let’s take a look.
First up is Sadie Bjornsen who had a very strong 5th in the 10k classic.
Values above zero are good. The grey shaded region and red trend line represent how she has performed against these specific skiers in the past. She had already made a big jump last season, and this race was very strong even compared to that. Next up Noah Hoffman:
This is the better of his two races, the 15k freestyle. This result was considerably better than he normally did against this crowd last season. His classic race (graph omitted) was somewhat worse than usual, but not dramatically so. Lastly, Liz Stephen had a strong result in the 10k freestyle:
She’s been on an upward trend for several seasons now, and this would suggest that might continue.
The first batch of races for 2013-2014 is in from Muonio, Finland. As always, it’s difficult to read much into a single race, particularly early season races like these. You never quite know who’s still in the midst of a big training block and how seriously people are approaching them.
Still, I thought it was interesting that Russian Petr Sedov won the last race of the weekend, the 15km freestyle. He’s still fairly young, and after a promising introduction to the World Cup he kind of slipped back a little last year, I think:
He had a handful of strong races last year, but was much less consistent in general. If we look at this Muonio 15km race in particular, we can get a better sense of the quality of his race by comparing how he did against each of the top 30 skiers at Muonio with how he has fared against those specific people in the past:
Values greater than zero are better for Sedov in this graph. The blue dots are the differences in percent back between Sedov and the other skiers at Muonio (he won, so they are all above zero). The shaded region with the red trend line summarizes how he’s fared against this specific group of skiers in the past.
As you can see, he generally dominated them, and then last season struggled considerably, losing to this group almost as much as he beat them. If this is a sign of things to come, we could see more of the Petr Sedov from two years ago, or perhaps an improved version.
A little later than I’d intended, but here it is. Instead of just the previous season, this is a view of the difference in median FIS points in Europe versus in North America for all North American athletes, in all events. Since we’re looking at multiple years, the graph is a bit different. For each season, we have the center of the distribution of median differences (the median of the medians, if you will) and the bars represent the middle 50%.
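To make the "median of the medians" concrete, here is a small Python sketch of the summary for a single season. It is an illustration only: the sign convention (Europe minus North America, so negative means better points in Europe), the function names and the numbers are all assumptions, and the quartile method is Python's "inclusive" rule rather than whatever the plotting software used.

```python
from statistics import median, quantiles


def athlete_difference(europe_points, na_points):
    """One athlete, one season: median FIS points in Europe minus at home."""
    return median(europe_points) - median(na_points)


def season_summary(differences):
    """Center (median of the per-athlete medians) and the middle 50%."""
    q1, _, q3 = quantiles(differences, n=4, method="inclusive")
    return {"center": median(differences), "low": q1, "high": q3}


# Invented example: one athlete's races, then five athletes' differences.
print(athlete_difference([80, 90, 100], [95, 105]))   # -10.0
print(season_summary([-10.0, -5.0, 0.0, 5.0, 20.0]))
# {'center': 0.0, 'low': -5.0, 'high': 5.0}
```

Each point on the graph corresponds to "center" and each bar runs from "low" to "high".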
This is interesting on a few levels. First, if you looked at just last season, and saw that the North American female sprinters were getting better points in Europe than in North America, there would be a temptation to attribute that to the increased success by folks like Kikkan, Jessie Diggins, Ida Sargent, etc. But here we can see that the points for North American female sprinters have pretty much always been better in Europe!
The other three panels all to some degree show a shift away from good points in Europe towards better points in North America. It’s a weak trend for the male sprinters, but much stronger for both the male and female distance events. The other interesting difference is that there seemed to be a sharp jump for the men’s distance skiers in 2010-2011. Normally that would make me wonder about changes in the population (i.e. a sudden jump in the number of non-USST folks racing in Europe), but the fact that the other panels look so different makes me skeptical.
Last week I wrote about how US and Canadian skiers fared last season, in terms of FIS points, when racing in North America versus in Europe. That included an almost too perfect looking symmetric, normal distribution for the men’s sprinters. On its face this suggests that the difference in median points earned by North American sprinters in Europe versus at home, while possessing a fair bit of variation, is basically centered perfectly at “no difference”. A reader complained about this apparent statistical anomaly, so I offer the following:
This is the exact same plot, only instead of a density estimate I’ve used a simple histogram (yes, yes, I know, a histogram is a sort of density estimate). I suppose if you wanted to get all shamanic and read the tea leaves on this, you could argue that the four short bars on the extreme left of the men’s sprint panel argue for a less symmetric distribution than the density estimate showed. But I think we’re splitting hairs at that point.
The sample size here is what I’d call medium-ish, at around 75 individuals for the men’s sprint panel. I think the best argument against what I posted last week is not that the distribution appears remarkably symmetric, but that my choice of smoothing parameter for the density estimate (in truth, I simply used the defaults for my software) was perhaps a bit... aggressive for 76 data points.
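For readers wondering how much the smoothing parameter can matter, here is a toy Python sketch. It is not the software or the defaults used for the plots, and the data are invented: a hand-rolled Gaussian kernel density estimate is evaluated on a grid, and the number of bumps is counted for a small and a large bandwidth.

```python
import math


def gaussian_kde(x, data, bandwidth):
    """Average of Gaussian bumps of width `bandwidth`, one per data point."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data) / norm


def count_modes(data, bandwidth, lo=-10.0, hi=10.0, steps=201):
    """Count strict local maxima of the estimate on an evenly spaced grid."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    dens = [gaussian_kde(x, data, bandwidth) for x in grid]
    return sum(1 for i in range(1, steps - 1)
               if dens[i - 1] < dens[i] > dens[i + 1])


# Invented data: two tight clusters, around -3 and around +3.
data = [-3.1, -3.0, -2.9, 2.9, 3.0, 3.1]

print(count_modes(data, bandwidth=0.5))  # 2: both clusters show up
print(count_modes(data, bandwidth=5.0))  # 1: smoothed into a single hump
```

With the wide bandwidth, two clearly separate clusters get blended into one smooth, symmetric hump, which is the kind of thing an aggressive default could do to a modest sample.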
Later in the week I’ll update this with a look at how these values have changed from season to season.