A little later than I’d intended, but here it is. Instead of just the previous season, this is a view of the difference in median FIS points in Europe versus in North America for all North American athletes, in all events. Since we’re looking at multiple years, the graph is a bit different. For each season, we have the center of the distribution of median differences (the median of the medians, if you will) and the bars represent the middle 50%.
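For anyone who wants to reproduce this kind of summary, here is a minimal Python sketch of the "median of the medians" and middle 50% calculation. The function name and the numbers are made up for illustration, and the "inclusive" quartile method is an assumption; the software behind the graph may compute quartiles slightly differently.

```python
from statistics import median, quantiles

def summarize_by_season(diffs_by_season):
    """Map each season to (lower quartile, median, upper quartile).

    diffs_by_season: {season: [one median difference per skier, ...]}
    The quartiles bound the middle 50% shown as bars in the graph.
    """
    summary = {}
    for season, diffs in diffs_by_season.items():
        # quantiles(n=4) returns the three quartile cut points.
        q1, _, q3 = quantiles(diffs, n=4, method="inclusive")
        summary[season] = (q1, median(diffs), q3)
    return summary

# Hypothetical per-skier differences, for illustration only.
example = {
    "2010-2011": [-30.0, -10.0, 0.0, 10.0, 30.0],
    "2011-2012": [-20.0, -5.0, 5.0, 15.0, 25.0],
}
summary = summarize_by_season(example)
for season, (q1, mid, q3) in summary.items():
    print(season, q1, mid, q3)
```

Each skier contributes exactly one value per season, so a single athlete with many races cannot dominate the summary.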
This is interesting on a few levels. First, if you looked at just last season, and saw that the North American female sprinters were getting better points in Europe than in North America, there would be a temptation to attribute that to the increased success by folks like Kikkan, Jessie Diggins, Ida Sargent, etc. But here we can see that the points for North American female sprinters have pretty much always been better in Europe!
The other three panels all to some degree show a shift away from good points in Europe towards better points in North America. It’s a weak trend for the male sprinters, but much stronger for both the male and female distance events. The other interesting difference is that there seemed to be a sharp jump for the men’s distance skiers in 2010-2011. Normally that would make me wonder about changes in the population (i.e. a sudden jump in the number of non-USST folks racing in Europe), but the fact that the other panels look so different makes me skeptical.
Last week I wrote about how US and Canadian skiers fared last season, in terms of FIS points, when racing in North America versus in Europe. That included an almost too perfect looking symmetric, normal distribution for the men’s sprinters. On its face this suggests that the difference in median points earned by North American sprinters in Europe versus at home, while possessing a fair bit of variation, is basically centered perfectly at “no difference”. A reader complained about this apparent statistical anomaly, so I offer the following:
This is the exact same plot, only instead of a density estimate I’ve used a simple histogram (yes, yes, I know, a histogram is a sort of density estimate). I suppose if you wanted to get all shamanic and read the tea leaves on this, you could argue that the four short bars on the extreme left of the men’s sprint panel argue for a less symmetric distribution than the density estimate showed. But I think we’re splitting hairs at that point.
The sample size here is what I’d call medium-ish, at around 75 individuals for the men’s sprint panel. I think the best argument against what I posted last week is not that the distribution appears remarkably symmetric, but that my choice of smoothing parameter for the density estimate (in truth, I simply used the defaults for my software) was perhaps a bit… aggressive for 76 data points.
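To make the smoothing point concrete, here is a small sketch in plain Python that compares a histogram with two Gaussian kernel density estimates of the same data. Everything in it is an assumption for illustration: the data are invented, and the "default" bandwidth is the common 1.06 · sd · n^(-1/5) rule of thumb, which may or may not match what the plotting software behind the graphs uses.

```python
import math
from statistics import stdev

def gaussian_kde(data, bandwidth):
    """Return a function x -> estimated density, using Gaussian kernels."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)
    return density

def rule_of_thumb_bandwidth(data):
    """A common default bandwidth: 1.06 * sd * n^(-1/5) (assumed, see lead-in)."""
    return 1.06 * stdev(data) * len(data) ** (-0.2)

def histogram(data, lo, hi, bins):
    """Count values falling in `bins` equal-width bins on [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for d in data:
        if lo <= d < hi:
            counts[min(int((d - lo) // width), bins - 1)] += 1
    return counts

# Invented differences: a main cluster near zero plus a few values far left.
data = [-80, -75, -70, -65, -20, -10, -5, 0, 0, 5, 10, 20]

default_bw = rule_of_thumb_bandwidth(data)
smooth = gaussian_kde(data, default_bw)        # heavier smoothing
rough = gaussian_kde(data, default_bw / 3.0)   # lighter smoothing

print(histogram(data, -100, 100, 10))
for x in (-70, -45, 0):
    print(x, round(smooth(x), 4), round(rough(x), 4))
```

With the narrower bandwidth the left-hand cluster shows up as its own bump, much like the short bars in the histogram; with the wider one it gets blended into the tail of the main peak.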
Later in the week I’ll update this with a look at how these values have changed from season to season.
I thought I’d check in on the most recent season and compare the point availability for North American skiers in Europe versus at home. This was a pretty simple approach: I just took all US and Canadian skiers who raced in both North America and Europe last season (any race type) and calculated the difference in their median FIS points for the two locations. Finally, I did a simple density estimate on the difference to get this:
The x axis here is relative, not absolute. So -100 means a person’s median points in North America last season were 100% better than their median points in Europe. I’ve omitted the y axis labels entirely, since the idea here is to simply look at the shape of the curve, and where most of the area is located.
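As a rough illustration of the per-skier calculation, here is a Python sketch. The exact scaling of the relative difference is not spelled out above, so the formula used here (difference divided by the European median) is only one plausible choice and is an assumption, as are the function name, the region labels and the data.

```python
from collections import defaultdict
from statistics import median

def relative_median_diffs(results):
    """results: iterable of (skier, region, fis_points), region is "NA" or "EU".

    Returns {skier: percent difference} for skiers who raced in both regions.
    Negative values mean lower (better) median points in North America.
    """
    points = defaultdict(lambda: {"NA": [], "EU": []})
    for skier, region, pts in results:
        points[skier][region].append(pts)

    diffs = {}
    for skier, by_region in points.items():
        if not (by_region["NA"] and by_region["EU"]):
            continue  # skier only raced on one continent
        na = median(by_region["NA"])
        eu = median(by_region["EU"])
        if eu == 0:
            continue  # avoid dividing by zero
        # ASSUMED scaling: relative to the European median.
        diffs[skier] = 100.0 * (na - eu) / eu
    return diffs

# Invented results for three hypothetical skiers.
results = [
    ("A", "NA", 50), ("A", "NA", 60), ("A", "NA", 70),
    ("A", "EU", 100), ("A", "EU", 120),
    ("B", "NA", 80), ("B", "EU", 40), ("B", "EU", 60),
    ("C", "NA", 90),
]
diffs = relative_median_diffs(results)
print(diffs)
```

Each skier ends up with a single number, which is what feeds the density estimate.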
The men’s sprint panel sticks out like a sore thumb for being so damn perfect looking. This strongly suggests that as a group, North American male sprinters weren’t really any more likely to score better FIS points in Europe than “at home”. As the distribution makes clear, though, there is plenty of variation between individuals and how their particular races went at home and abroad.
While we’re on sprinting, my next biggest surprise was the women, who appear to have had a moderate trend towards scoring better sprint points in Europe. Before you start saying that Kikkan Randall is driving this, keep in mind that Kikkan only contributed a single value to that density estimate. Each individual skier only counts once in each graph. So maybe between Kikkan, Jessie Diggins, Ida Sargent, etc., one might have expected some good points in Europe, but I wasn’t quite expecting it to be that clear.
The distance panels aren’t too surprising, I think. Even with the recent improvements for the US women in distance events, US and Canadian skiers are still much more likely to score lower points in North America than in Europe.
I was recapping WJC results last week by comparing them to each nation’s historical performance. Let’s do the same thing this week, but with U23s. Starting with the Americans:
The trend isn’t spectacular, but each group managed at least one or two decent results (compared to previous years). As for the Canadians:
The men’s distance panel is a tad unfair, since the Canadian men really did have one year of an unusually strong group of Alex Harvey, Len Valja, Frederic Touchette and Brent McMurtry.
As we approach the Tour de Ski, this is a short assessment of how the US and Canadian teams are doing so far compared to previous seasons. As I typically do, I’m going to include all World Cup results, rather than just the best results. Clearly, there have been some strong results for both the US men and women thus far, but I’m frequently more interested in how we’re doing as a group, top to bottom.
Starting with the distance events:
For the math-challenged out there, this is every US and Canadian distance result over the past several seasons. The lines represent the middle (Median) result and the top 20% (Quintile 1) and the bottom 20% (Quintile 5). The Canadians have just generally struggled all around so far.
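If it helps, the three lines can be computed along these lines in Python. This is a sketch under assumptions: the results are invented, lower values are taken to be better (so the "top 20%" line is the 20th percentile), and the "inclusive" quantile method is just one reasonable choice.

```python
from collections import defaultdict
from statistics import median, quantiles

def quintile_lines(rows):
    """rows: iterable of (season, result).

    Returns {season: (top 20% cutoff, median, bottom 20% cutoff)}.
    """
    by_season = defaultdict(list)
    for season, result in rows:
        by_season[season].append(result)

    lines = {}
    for season, results in sorted(by_season.items()):
        # n=5 gives the four cut points between the five quintiles.
        cuts = quantiles(results, n=5, method="inclusive")
        lines[season] = (cuts[0], median(results), cuts[-1])
    return lines

# Hypothetical season with eleven results valued 1 through 11.
rows = [("2012-2013", r) for r in range(1, 12)]
lines = quintile_lines(rows)
print(lines)
```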
Comparing the US men and women reveals the now standard difference. The US women have been improving steadily over several seasons across the board. Their worst results have improved about as much as their best ones. For the men, their best results have been flat, or possibly improved slightly, depending on the time frame you want to look at. Up to this season, their worst results had also been improving. One mitigating factor here is that a fair number of the period one races were in Canada, which allowed for a deeper than usual American field. This means it’s more likely you’re going to see “marginal” starters being somewhat over-represented here. Of course, the counter-point is that the US women had the same race schedule…
As for sprinting:
This is essentially the same post, only we’re plotting the final finishing place. It might be mildly surprising to see the US women’s trend lines running basically flat or even sloping up slightly, but if you think clearly about just the results we’ve had so far this season it makes sense. The women have put together some good sprint races, but Kikkan had one “off” sprint race, and Jessie Diggins has only made it past the qualification round once. So if Ida Sargent continues to have a strong season, and Diggins comes along later on, I think we’ll see those numbers improve quite a bit relative to where they are now.
Once again, the less said about the Canadians, the better, except that I’m betting the rest of the season won’t be this bad. Maybe not awesome, but certainly not this bad.
Anytime the World Cup hits North America, the issue of weaker fields always comes up. Frequently, even when American or Canadian skiers do seemingly very well here, there’s always that nagging feeling about how that result would translate to a more “complete” World Cup field in Europe.
Overall, I think the US results in particular were strong enough that we should be comfortable with them on their own. But it’s still an interesting question, so let’s see what we can cook up.
My general approach for this kind of problem is to compare people to a specific collection of other skiers. So let’s walk through how this will work using Ida Sargent’s 14th on Sunday as an example.
Consider all the skiers in Sunday’s pursuit who have done at least 3 mass start or pursuit style races over the past year or so. Suppose we calculate Sargent’s percent back relative to these athletes. Now, armed with those percent back values, we can go back and look at where that would have placed Sargent in each of the similar World Cup races the other competitors had done.
Then we can take the median as a sort of prediction for how Sargent’s effort would be expected to play out in a regular field.
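Here is one way the steps above could be sketched in Python. It is an interpretation, not necessarily the exact procedure behind the graphs: the target's time ratio to each qualifying rival in the current race is applied to that rival's time in each of their past races, the resulting hypothetical time is ranked within that past field, and the median of all those hypothetical places is the projection. The function name, the skier labels and all times are invented.

```python
from statistics import median

def project_place(target, race_times, past_races, min_races=3):
    """Project `target`'s finishing place in a "full" field.

    race_times: {skier: time} for the race being projected.
    past_races: list of {skier: time} dicts, one per similar past race.
    Only rivals with at least `min_races` past races are used.
    Returns the median hypothetical place, or None if no rival qualifies.
    """
    target_time = race_times[target]
    places = []
    for rival, rival_time in race_times.items():
        if rival == target:
            continue
        rival_races = [field for field in past_races if rival in field]
        if len(rival_races) < min_races:
            continue
        # How far back (as a ratio) the target was from this rival today.
        ratio = target_time / rival_time
        for field in rival_races:
            hypothetical = field[rival] * ratio
            ahead = sum(1 for name, t in field.items()
                        if name != target and t < hypothetical)
            places.append(ahead + 1)
    return median(places) if places else None

# Invented example: a small race and three deeper past fields.
today = {"Target": 103.0, "A": 100.0, "B": 105.0}
past = [
    {"A": 200.0, "B": 212.0, "X": 202.0, "Y": 204.0, "Z": 210.0},
    {"A": 300.0, "B": 312.0, "X": 303.0, "Y": 330.0},
    {"A": 150.0, "B": 156.0, "X": 152.0, "Y": 153.0, "Z": 160.0},
]
projected = project_place("Target", today, past)
print(projected)
```

In this toy example the target was second in a three-person race but projects to around fourth once skiers X, Y and Z from the deeper past fields are accounted for. The spread of the individual hypothetical places gives a sense of how variable the projection is.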
So let’s take a look at the results for the Canmore distance races:
These are all the North American top 30 results from the two distance races. Each skier’s actual result is in red, and their projected result in a “full” field is the black dot. The ranges plotted along with the projected results are provided to give you a sense of how variable even “full” World Cup fields can be, i.e. considerably.
Note that all of the projected results are worse than the actual results, which is what we’d expect from a weaker field. But in many cases, they are not worse by very much at all.
For instance, the women’s results from Thursday were probably right on the nose of where they would have landed in a fuller field. The rest would probably only have been moderately to slightly worse, with the possible exception of Elliott’s race on Sunday, probably because the 20th-40th range is the area most likely to get more competitive in a stronger field.
I did the same thing with the sprint race, treating the qualification round the same way I would a distance race:
A commenter on my last post correctly pointed out that it might be interesting to look at performance using a percent back based measure on the qualifying times, since the number of racers present can greatly affect finishing place.
I didn’t quite have time to pull together the standardization piece that I typically use with distance races (I will definitely do that soon, though…) so this will simply use the percent behind the 30th qualifier. I’m fairly confident that the general picture won’t change much once you standardize those values, but I’ll check once I get that piece organized.
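For reference, the unstandardized measure is simple enough to sketch in a few lines of Python. The function name and the qualifying times are made up; negative values mean the skier qualified faster than 30th place.

```python
def percent_behind_30th(qual_times, skier, cutoff_rank=30):
    """Percent behind the time of the 30th fastest qualifier.

    qual_times: {skier: qualification time in seconds}.
    """
    ranked = sorted(qual_times.values())
    if len(ranked) < cutoff_rank:
        raise ValueError("fewer qualifiers than the cutoff rank")
    cutoff = ranked[cutoff_rank - 1]
    return 100.0 * (qual_times[skier] - cutoff) / cutoff

# Hypothetical field of 40, one second apart, starting at 180 seconds.
times = {f"S{i + 1}": 180.0 + i for i in range(40)}

print(round(percent_behind_30th(times, "S10"), 2))  # faster than 30th
print(round(percent_behind_30th(times, "S35"), 2))  # slower than 30th
```

Unlike finishing place, this value does not depend directly on how many racers showed up, only on the gap to the last skier who advanced.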
Anyhow, here you go:
So these are the same data, US and Canadian sprinters in WCs held in North America only. For comparison, the graph from last time using finishing place: