How Useful Is A Single Good Result?

Following up from last time, we’re still discussing the recent USSA rule changes for qualifying for WJC/U23s. Now, it’s not like USSA has a ton of options here, but I would like to point out a few difficulties with pre-qualifying an athlete using only a single result.

Being a stats guy, I always think about things in terms of variability. So when I think about a skier’s performance, I visualize it as having a distribution. An athlete’s best races come from one extreme end of that distribution, but they are necessarily fairly rare. So to my way of thinking, a skier’s best race isn’t a good estimate of how well they’re likely to ski. Rather, it’s a good (-ish) estimate of how well they might ski, if we’re really lucky.

To give you some context on this, I took all the FIS point results for Americans since 2006. Then I counted up how often each skier matched (or bettered) their best FIS point race from the early season (11-01 to 12-31) during the rest of the season. I counted this separately for men/women and sprint/distance and only kept folks who had at least 4 early season results, and at least one result from the remainder of the season. Here’s what we have:
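For the curious, the counting can be sketched roughly like this. This is a minimal sketch, not the actual code behind the graphs: the function name `count_matches`, the `(skier, race_date, fis_points)` tuple layout and the sample results are all made up for illustration, and it assumes the input covers a single season for a single gender/discipline group.

```python
from collections import defaultdict
from datetime import date


def count_matches(results, min_early=4):
    """Count later races at or below each skier's best early-season FIS points.

    results: iterable of (skier, race_date, fis_points) tuples for one season
    and one gender/discipline group (a hypothetical layout).
    Lower FIS points mean a better race.
    """
    early = defaultdict(list)
    late = defaultdict(list)
    for skier, race_date, points in results:
        # Early season is Nov 1 through Dec 31; any other date is "the rest".
        if race_date.month in (11, 12):
            early[skier].append(points)
        else:
            late[skier].append(points)

    counts = {}
    for skier, early_points in early.items():
        # Keep only skiers with enough early races and at least one later race.
        if len(early_points) < min_early or not late[skier]:
            continue
        best = min(early_points)
        counts[skier] = sum(p <= best for p in late[skier])
    return counts


# Made-up results: skier "A" qualifies, skier "B" has too few early races.
sample = [
    ("A", date(2009, 11, 20), 80.0),
    ("A", date(2009, 11, 28), 75.0),
    ("A", date(2009, 12, 5), 60.0),
    ("A", date(2009, 12, 19), 90.0),
    ("A", date(2010, 1, 10), 58.0),
    ("A", date(2010, 2, 1), 70.0),
    ("B", date(2009, 12, 5), 50.0),
    ("B", date(2010, 1, 10), 45.0),
]
print(count_matches(sample))
```

Tallying how many skiers end up with a count of 0, 1, 2 and so on gives the kind of summary discussed here.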

I’m not quite so interested in the sprint numbers, since FIS points are such a dicey way to measure performance for sprinting. But if you look at the men’s distance panel, what this is saying is that around 60% of the time, they ski as well as (or better than) their best early season result at most once more that season. The drop-off is a bit less dramatic for the women, but it is still much more likely that you aren’t going to see a race with FIS points that low for the rest of the season.

And this trend is even more stark when we focus in on just the group of skiers who’ve managed a sub-50 point race during the early season:

Granted, this simple method of counting the number of results at a certain level may obscure some things. For instance, maybe folks don’t race quite that fast ever again, but they come pretty close fairly often. That will require a different sort of analysis that we’ll delve into next week.

WJC/U23 Assessment: Finland, Germany and Sweden

Ok, this will be my last post on WJC/U23s, I promise! As we did before with the US and Canada, and then Norway and Russia, these graphs are simply displaying each WJC or U23 result for each nation (in finishing place, not FIS points) with the median tracked in red. As usual, I’ll keep my commentary to a minimum, except to remind you to be aware of situations where there isn’t much data (e.g. U23 sprinting events).

Here I notice the general improvement by the men's distance racers over the last two seasons and that the women's sprint results have recently included one quite fast skier and then several others who aren't quite so fast.

WJC/U23 Assessment: Norway and Russia

Continuing on in the vein of my post yesterday recapping the WJC and U23 results for the last several years for the US and Canada, this post will do the same for Norway and Russia. As before, I’m plotting the finishing rank, or place, of each skier and then tracking the median result in red. So “how far behind the leader” information is being lost here, but I feel like a lot of the discussion that happens surrounding WJC/U23 results tends to revolve around what place people finished.

Not surprisingly, pretty darn good, at least to my jaded American eyes. Some picky Norwegians may note that while the median men's distance performance remained roughly the same as last year, they saw fewer top fives. As I noted before the competitions started, the Norwegian junior women have been doing very well in the distance events, with only occasional results outside the top 15 or 20. Interestingly, both the junior men and women from Norway had a bit of an off year (for them) in sprinting in 2007-2008.

Animated WJC Results History

I had fun making those animated charts for World Cup points the other day, so I thought I’d try using the same charts to look at World Juniors. All I’ve done is tally up “World Cup” points for each nation and each year, scoring each (individual) race using the traditional WC point scale. Then I divided each tally by the total number of such points awarded that year (since the number and type of races has changed over the years). So each point represents a single nation, with the x and y coordinates being the proportion of total “points” earned by that nation, that year.
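Here is a rough sketch of that tally, in case it helps. The names `wc_points` and `points_share`, the `(year, nation, place)` layout and the sample results are hypothetical, and the points table is my understanding of the traditional World Cup scale (100, 80, 60, ... down to 1 point for 30th), so treat it as an assumption rather than an official table.

```python
from collections import defaultdict

# Assumed traditional World Cup scale: places 1-30 score, everyone else gets 0.
WC_POINTS = [100, 80, 60, 50, 45, 40, 36, 32, 29, 26, 24, 22, 20, 18, 16] + list(
    range(15, 0, -1)
)


def wc_points(place):
    """Points for a finishing place under the assumed scale."""
    return WC_POINTS[place - 1] if 1 <= place <= len(WC_POINTS) else 0


def points_share(results):
    """Return {(year, nation): share of all points awarded that year}.

    results: iterable of (year, nation, place) tuples, one per individual result.
    """
    tally = defaultdict(int)
    total = defaultdict(int)
    for year, nation, place in results:
        pts = wc_points(place)
        tally[(year, nation)] += pts
        total[year] += pts
    # Skip years where no points were awarded to avoid dividing by zero.
    return {
        key: pts / total[key[0]] for key, pts in tally.items() if total[key[0]] > 0
    }


# Made-up results for a single year.
sample = [(2010, "NOR", 1), (2010, "SWE", 2), (2010, "NOR", 3), (2010, "USA", 40)]
shares = points_share(sample)
print(shares)
```

Dividing by the yearly total is what lets years with different numbers of races sit on the same scale.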

As before, there’s really only a single data point for each nation for each year, but this Google chart API moves the points smoothly between them. And the charts require Flash. Men first and the women below. Important: I’ve been noticing that the “Trails” option grinds the whole animation to a halt for me, so I’d recommend unchecking that option before you start playing around with these.

WJC/U23 Assessment: USA and CAN

This week is going to be pretty heavy on the WJC/U23 graphs, I suppose, as we wait for the World Cup racing to start up again.

One of the things that has struck me while writing this blog is how much importance and meaning is placed on WJC/U23s compared with how little data they provide in a given year. What I mean is that if we think of a ski race as a measurement of performance, WJCs provide only 2-4 “measurements” for each athlete. That doesn’t add up to a ton of data, given how variable skiers can be, even when they are racing well. That said, they are what they are: a World Championships, so assessments are inevitable. Let’s see what we can see.

Each graph below summarizes the performance of either the USA or CAN in either WJCs or U23s over the past six seasons. I feel like the quality of the WJC (and to a lesser degree the U23) fields is at least reasonably stable over this time period, so I’m only going to look at finishing rank. The graphs show each individual result along with the median result for each year in red. We’ll get our feet wet with the WJC graph for the US:

This should give you a good sense for how, with so little data within each year, simple summaries like the median don't always reflect everything we'd like them to about the data. The two clearest trends here are in men's distance and women's sprint, with some steady drop-offs in the median over the past three seasons. The US women did have some strong sprint results this year compared to the recent past; they just also had some bad ones as well.
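To make the point about medians concrete, here is a small sketch of the per-year median of finishing places. The function name `median_by_year` and the results are made up for illustration; they are not actual US results.

```python
from collections import defaultdict
from statistics import median


def median_by_year(results):
    """Return {year: median finishing place} from (year, place) pairs."""
    by_year = defaultdict(list)
    for year, place in results:
        by_year[year].append(place)
    return {year: median(places) for year, places in sorted(by_year.items())}


# Hypothetical results: the second year has one strong race and several weak ones.
sample = [
    (2009, 12), (2009, 30), (2009, 45),
    (2010, 5), (2010, 50), (2010, 55), (2010, 60),
]
medians = median_by_year(sample)
print(medians)
```

In this made-up example the median gets worse in the second year even though that year contains the best single result, which is exactly the sort of thing a lone summary number can hide.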

What Happens To Successful World Junior Racers?

This post appeared on FasterSkier several months ago, but I never posted it here, I don’t think. With World Juniors and U23s wrapping up, it seemed relevant. I edited the text slightly so it will differ somewhat from the original version. These graphs do not include this year’s WJC results or any results from the 2010-2011 season.

Identifying talent in endurance athletics at a young age can be a challenging task.  My goal here is to give you a sense of what different levels of success at WJC/U23s might indicate and how much variability there is between athletes in this respect.

Our basic tool will be a graph of FIS points versus age for distance results, broken down by their best result (finishing place) at WJC/U23s.  This gives us a total of eight panels representing the men and women whose best results were between 1-5, 6-10, etc.

If you’re in the 1-5 group, that means your best result at any WJC/U23s was between 1st and 5th.  For each group, I’ve plotted their FIS points (in distance races) versus age.  Each dot represents one race by one athlete.  The blue trend lines are for everyone in that panel.  The red lines are for only the Americans in that panel.

If an athlete’s name appears next to one of the red lines, that means they are the only American appearing in that panel.  Otherwise, the trend lines are averaging over multiple athletes.  These graphs are fairly large, so you might want to click on them for a larger version.
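The grouping itself is simple enough to sketch in a few lines. The function name `best_result_group` is hypothetical, and I'm assuming here that athletes whose best result is outside the top 20 are simply left out, since the panels only cover 1-5 through 16-20.

```python
def best_result_group(places, width=5, max_place=20):
    """Label an athlete by their best (lowest) WJC/U23 finishing place.

    Returns a label such as "1-5" or "16-20", or None when the best place
    is worse than max_place (assumed here to mean: not plotted).
    """
    best = min(places)
    if best > max_place:
        return None
    low = ((best - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


# Made-up careers: each list holds one athlete's WJC/U23 finishing places.
print(best_result_group([23, 4, 17]))
print(best_result_group([14, 11, 30]))
print(best_result_group([45, 38]))
```

Note that one great race is enough to land an athlete in the 1-5 group, no matter how the rest of their WJC/U23 races went.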

There’s a lot of interesting stuff going on here.  Let’s ignore the Americans for a moment and just look at the blue trend lines.  We’d expect the 1-5 group to have the most future success overall, and that is the case.  (More dots near the bottom.)  As we move right to the 6-10, 11-15 and 16-20 groups the cloud of points mostly creeps upwards, but less so by the time we’re comparing the 11-15 and 16-20 groups, suggesting that there’s a bigger difference between a top 5 and a top 10 than between a top 15 and a top 20 result.

We might also expect the trend lines in the groups to flatten sooner and at a higher level as we move rightward.  This would correspond to skiers who see less success at WJC/U23s not improving as much or leaving the sport earlier, or both.  This kind of happens, but with some weird caveats.  The men seem to follow this pattern until we get to the 16-20 group.  Why do the 16-20 athletes continue to improve past age 23 where the 11-15 athletes don’t?  The answer is likely our old friend, selection bias.

Success in ski racing serves as kind of a signaling mechanism as to whether you should continue racing.  Skiers whose top WJC/U23 result is between 16-20 are more likely to stop pursuing an international racing career as their results have signaled that they won’t be successful.  The only ones who do continue are the ones who beat the odds and see some measure of success.

Another interesting aspect of these data is the evident plateau effect between ages 20-23.  Notice how many of the blue trend lines flatten out suddenly at these ages.  More so with the women, but we see it in the 11-15 and 16-20 groups for the men as well.  What might be causing this?  My best guess is that it’s another artifact of selection bias.

My guess would be that the ages 20-23 are crucial for deciding whether you’re going to be a successful international ski racer.  Athletes steadily improve up to that point and then have a decision to make.  Do I continue racing or hang up my skis and go to school, get a job, get married, have kids, etc.?  The plateau in the trend lines may come from those athletes that stop improving between those ages.  These athletes are likely to stop pursuing a serious racing career (or at least many of them do).  The ones who continue are, by necessity, the skiers who end up achieving some measure of success.

That’s why we often see this dramatic improvement from ages 16-20, a plateau from 20-23, and then continued improvement from 24 on.  It’s important to realize that this doesn’t mean that if you somehow push through ages 20-23 regardless of how much success you’re having, that somehow you’ll magically get faster simply by passing the age of 23.  The trend lines simply reflect the individual decisions different athletes are making about their likely future success.  I’d guess that a similar plateau effect is happening in the men’s 1-5 and 6-10 panels, but we don’t see it because it’s happening to a much smaller fraction of the athletes, so the trend line isn’t picking it up.
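A toy example may make the selection effect easier to see. Everything here is invented for illustration: five imaginary athletes whose FIS points never change, and a hypothetical cutoff above which athletes stop racing. It is not a model fitted to the real data.

```python
from statistics import mean


def mean_of_active(levels, cutoff=None):
    """Average FIS points of athletes still racing.

    levels: each athlete's (unchanging) FIS point level; lower is better.
    cutoff: athletes with points above this value are assumed to have quit.
    """
    active = [p for p in levels if cutoff is None or p <= cutoff]
    return mean(active)


# Five imaginary athletes, none of whom ever gets faster or slower.
levels = [40, 60, 80, 100, 120]

before = mean_of_active(levels)            # everyone is still racing
after = mean_of_active(levels, cutoff=60)  # only the two fastest continue
print(before, after)
```

The average drops from 80 to 50 even though no individual improved at all, which is the same mechanism that can make a trend line bend downward after the plateau.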

Now let’s look at what’s going on with the Americans.  Kris Freeman, Rob Whitney and Noah Hoffman are the only Americans in the men’s 1-5, 6-10 and 11-15 panels, respectively.  Each of the other red trend lines is based upon multiple athletes.  For example, the red trend line in the women’s 1-5 panel is based upon Liz Stephen and Morgan Arritola.  The truly bizarre looking red trend line in the women’s 6-10 panel is based upon Kikkan Randall, Nicole DeYong, Taz Mannix and Kristina Trygstad-Saari.  Both Randall and DeYong have significantly improved their distance results of late, which accounts for the line bending down suddenly past age 25.

Kris Freeman has been more or less “typical” for those athletes achieving a top result at WJC/U23s.  Rob Whitney had a very promising beginning, relative to his peers, but encountered some serious difficulties around age 22-23.  Noah Hoffman seems to be tracking the trend for his peers, or perhaps a bit better.

The most important thing I want to emphasize in these data is how variable they are.  Generally speaking, of course, it’s better to have good results (not just at WJCs, all the time!).  But if anything, this graph indicates just how crude a predictor WJC/U23s can be for future success on the World Cup.  There are plenty of athletes who’ve landed top results at WJCs but haven’t gone on to do much else.  Conversely, there are plenty of skiers who never cracked the top 15 at WJCs but ended up having a very long and successful career.  There are, as in the rest of life, many different paths to success.

WJC/U23 Preview: International (cont’d)

As before, we can also look at how the results at WJCs and U23s for different nations match up with each athlete’s future performance in major international competitions. These graphs are constructed in the same manner. For each athlete in the graph I scored their future results in all OWG, WSC, WC or TDS races using standard WC points (100, 80, etc.) and then made the size of the dots correspond to the square root of the value. Taking the square root helps make the scale more compact, otherwise you have a few athletes (e.g. Björgen, Elofsson) with huge values and it becomes hard to distinguish everyone else.
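To see what the square root buys us, here is a tiny sketch. The function name `dot_size`, the `scale` parameter and the point totals are hypothetical; the real graphs may scale the dots differently.

```python
import math


def dot_size(total_points, scale=1.0):
    """Dot size proportional to the square root of an athlete's point total."""
    return scale * math.sqrt(total_points)


# Made-up career totals: one dominant athlete and one with a handful of points.
star, journeyman = 10000, 100
print(star / journeyman)                      # raw totals differ by 100x
print(dot_size(star) / dot_size(journeyman))  # dot sizes differ by only 10x
```

A hundredfold gap in points becomes a tenfold gap in dot size, so the biggest names still stand out without flattening everyone else into specks.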

These are the fancy SVGs again, with mouse over text, which may not work in all browsers. Try switching to Chrome, Firefox or Safari if you have problems. Also, be aware that with a lot more points, many of them overlapping, not all of the mouse over texts work well. This is a work in progress.

First up are the WJCs distance and sprint results (with vanilla png versions here and here):

I think what’s interesting to look at with these graphs is the variation in when strong results at WJCs or U23s translate into occasional or regular top thirty performances in major international competitions.

And finally the same for U23s (and the plain png versions here and here):


