A Big Data Approach to Predicting Your Marathon Pace
A new study uses the training data you upload to sites like Strava to estimate the "critical speed" that determines your race performance
This article is about a new study that uses accumulated training data from Strava to predict your marathon time. That’s the payoff. But to get there, we need to start by digging into a concept called critical speed, which is a hot research topic in physiology these days. It’s a really neat concept, so I promise the digression is worthwhile.
Let’s say you take your best possible performances over a range of at least three distances lasting between about two and 20 minutes, such as a mile, 3K, and 5K. Plot them on a graph showing your speed on the vertical axis and your finishing time on the horizontal axis, as I’ve done below with my best 1,500, 3,000, and 5,000-meter times. What you find is that the dots fall along a curved line called a hyperbola, which is another way of saying that the extra speed you can sustain above a certain baseline is inversely proportional to the elapsed time:
This has been known for a long time. One of the first to explore these relationships, back in the 1920s, was A.V. Hill, the guy who discovered the concept of VO2 max. What’s interesting about hyperbolic curves is that they approach—but never reach—an asymptote. No matter how far out to the right we extend that curve, it will never drop below the dotted line, which for my particular three data points corresponds to 4:41 per mile pace. That’s my critical speed (or at least it was about two decades ago).
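If you want to try this with your own times, here is a minimal Python sketch of one common way to find that asymptote. It assumes the standard two-parameter form of the hyperbola, speed = CS + D'/time, where D' (a label I'm borrowing from the physiology literature, not from anything above) is a fixed reserve of distance you can cover above critical speed. Multiply through by time and you get distance = CS × time + D', a straight line whose slope is the critical speed. The three efforts in the example are made-up numbers, not my actual PRs.

```python
def fit_critical_speed(efforts):
    """Fit distance = d_prime + cs * time by ordinary least squares.

    efforts: list of (distance_m, time_s) pairs from all-out performances.
    Returns (cs in meters per second, d_prime in meters).
    """
    n = len(efforts)
    if n < 2:
        raise ValueError("need at least two efforts")
    mean_t = sum(t for _, t in efforts) / n
    mean_d = sum(d for d, _ in efforts) / n
    sxx = sum((t - mean_t) ** 2 for _, t in efforts)
    sxy = sum((t - mean_t) * (d - mean_d) for d, t in efforts)
    cs = sxy / sxx                   # slope of the line: critical speed
    d_prime = mean_d - cs * mean_t   # intercept: distance reserve above CS
    return cs, d_prime


def pace_per_mile(speed_m_per_s):
    """Format a speed in m/s as a min:sec per mile pace string."""
    seconds = round(1609.344 / speed_m_per_s)
    return f"{seconds // 60}:{seconds % 60:02d}"


# Hypothetical efforts for illustration: (distance in meters, time in seconds).
efforts = [(1500, 240), (3000, 510), (5000, 880)]
cs, d_prime = fit_critical_speed(efforts)
print(f"critical speed {cs:.2f} m/s, about {pace_per_mile(cs)} per mile")
```

With only three points the fit is sensitive to any one soft performance, so treat the result as a rough estimate rather than a lab measurement.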
In theory, what this graph suggests is that, at paces slower than 4:41 per mile, I can run forever. In practice, that’s unfortunately not true. I wrote an article last summer that explores why we eventually run out of gas even when we stay below critical speed. Some of the potential issues include fuel depletion and accumulated muscle damage. Still, critical speed represents an important physiological threshold. Below critical speed, you can cruise along in a “steady state” in which your heart rate, lactate levels, and other physiological parameters stay roughly constant. Above critical speed, these parameters keep drifting up until you’re forced to stop. In practice, you can generally sustain critical speed for about an hour.
In a study a few years ago, Andrew Jones and Anni Vanhatalo of the University of Exeter used race PRs from distances between 1,500 meters and 15K to calculate the critical speed of a bunch of elite runners, and then compared that critical speed to their marathon pace. On average, the runners raced their marathons at 96 percent of critical speed, which fits with the idea that you have to stay just below that threshold in order to sustain a pace for more than an hour.
That’s a pretty useful thing to know if you’re planning to race a marathon. But there are two questions to consider. One is whether less elite runners can also sustain 96 percent of their critical speed for a marathon. Given that they’re out there for much longer, it seems unlikely. The other question is whether there’s a more convenient way of estimating critical speed for the majority of runners who don’t frequently race at short distances like the mile.
Those are two of the questions the new study, published in Medicine & Science in Sports & Exercise, sets out to tackle. Barry Smyth of University College Dublin and Daniel Muniz-Pumares of the University of Hertfordshire in Britain analyzed data from more than 25,000 runners (6,500 women, 18,700 men) uploaded to Strava. All the runners competed in either the Dublin, London, or New York marathons, and logged their training for at least 16 weeks prior to the race.
The basic assumption was that hard training efforts would provide a reasonable approximation of the speed-duration hyperbolic curve. For each runner, they scanned the training data and extracted the fastest 400, 800, 1,000, 1,500, 3,000, and 5,000-meter segment over the entire training block. They used this data to plot the hyperbolic curve and calculate critical speed. After a bunch of experimentation, they determined that they could get the best results by using just the fastest 400, 800, and 5,000-meter splits, perhaps because those are distances commonly hammered by runners in interval workouts and tune-up races.
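To give a feel for what "scanning the training data" might involve, here is a rough Python sketch. It is my own guess at the general approach, not the researchers' code. It assumes each run is a list of (elapsed seconds, cumulative meters) samples in time order, which is roughly what a GPS watch records, and it slides a window along the run to find the shortest time in which the runner covered at least the target distance. The function names and data layout are hypothetical.

```python
def fastest_segment_time(samples, segment_m):
    """Return the shortest time (s) in which the runner covered segment_m.

    samples: list of (elapsed_s, cumulative_distance_m), sorted by time.
    Returns None if the run never covers segment_m.
    """
    best = None
    start = 0
    for end in range(len(samples)):
        t_end, d_end = samples[end]
        # Move the window start as late as possible while still covering
        # the full segment distance.
        while start + 1 <= end and d_end - samples[start + 1][1] >= segment_m:
            start += 1
        t_start, d_start = samples[start]
        if d_end - d_start >= segment_m:
            elapsed = t_end - t_start
            if best is None or elapsed < best:
                best = elapsed
    return best


def best_efforts(runs, distances_m=(400, 800, 5000)):
    """Collect the fastest (distance_m, time_s) pair per distance across runs."""
    efforts = []
    for dist in distances_m:
        times = [fastest_segment_time(run, dist) for run in runs]
        times = [t for t in times if t is not None]
        if times:
            efforts.append((dist, min(times)))
    return efforts
```

Because the window has to cover at least the full distance between recorded samples, the times come out slightly slow rather than slightly fast. The resulting distance-and-time pairs are the points the hyperbolic curve would then be fitted to.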
Using this model, they were able to predict marathon times to within an average of 7.7 percent. On one hand, that’s pretty good for an automatic model that blindly looks at nothing but your fastest 400, 800, and 5,000-meter splits. On the other hand, 7.7 percent for a three-hour marathoner is almost 14 minutes, which is a pretty big deal if you’re trying to base your pacing off the prediction. So at first glance, this looks a bit like BMI: very useful for population-level trends, not so good for making individual decisions.
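To see what that error means on the clock, here is a small Python sketch of the arithmetic. The only number taken from the study is the 7.7 percent figure; the helper names are mine. Keep in mind that 7.7 percent is an average miss, so plenty of individual runners would land outside the band it produces.

```python
def error_band(predicted_min, mean_abs_error=0.077):
    """Return (low, high) finish times in minutes for a given relative error."""
    margin = predicted_min * mean_abs_error
    return predicted_min - margin, predicted_min + margin


def as_clock(minutes):
    """Format minutes as h:mm, rounded to the nearest minute."""
    total = round(minutes)
    return f"{total // 60}:{total % 60:02d}"


low, high = error_band(180)  # a three-hour prediction
print(as_clock(low), as_clock(high))  # prints "2:46 3:14"
```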
But there are some further nuances to consider. On average, the runners in the study sustained about 85 percent of their estimated critical speed during their marathons. That’s considerably lower than the 96 percent managed by the elites, which isn’t surprising since the recreational runners in the study had to sustain their pace for a lot longer.
In fact, there’s a clear trend showing that runners with slower finishing times were able to sustain lower percentages of their critical speed. Runners finishing around 2:30 averaged 93.0 percent of critical speed, while those finishing slower than 5:00 averaged 78.9 percent, and there was a pretty straight line in between. In the graph below, that percentage of critical speed is shown on the vertical axis (Rel MS) as a number between 0 and 1: runners who finished in 150 minutes (i.e. 2:30), for example, have a Rel MS of about 0.93.
That doesn’t mean that the slower runners weren’t trying as hard. You simply can’t stay as close to your personal critical speed for four hours as you can for three hours. Physiologically, it’s a different challenge. But the key point is that, with that graph, you can make a more accurate prediction of how fast you’ll run your marathon. If you’re a three-hour marathoner, you should probably aim for about 90 percent of critical speed, rather than 85 percent (like the average result in this study) or 96 percent (like the elite marathoners in the earlier study).
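There's a circularity here: you need your finishing time to look up the right percentage, and you need the percentage to predict your finishing time. One way to untangle it is to guess, look up, and repeat until the answer stops changing. The Python sketch below does that under a big assumption of my own: that the trend is a straight line between the two figures quoted above, 93.0 percent at 2:30 and 78.9 percent at 5:00 (the second is really an average for everyone slower than 5:00, so that anchor is approximate). This is an illustration, not the study's model.

```python
MARATHON_M = 42195.0

# Two anchor points from the figures quoted above:
# (finish time in minutes, fraction of critical speed sustained).
FAST = (150.0, 0.930)
SLOW = (300.0, 0.789)


def fraction_of_cs(finish_min):
    """Linearly interpolate the sustainable fraction of critical speed."""
    t = min(max(finish_min, FAST[0]), SLOW[0])  # clamp outside the quoted range
    slope = (SLOW[1] - FAST[1]) / (SLOW[0] - FAST[0])
    return FAST[1] + slope * (t - FAST[0])


def predict_marathon_minutes(cs_m_per_s, iterations=50):
    """Guess a finish time, look up the fraction, recompute, and repeat."""
    # Start from the 85 percent average reported for the whole group.
    finish = MARATHON_M / (0.85 * cs_m_per_s) / 60.0
    for _ in range(iterations):
        speed = fraction_of_cs(finish) * cs_m_per_s
        finish = MARATHON_M / speed / 60.0
    return finish


print(f"{fraction_of_cs(180):.3f}")  # fraction for a three-hour finish
```

For a three-hour finish, the interpolated line gives just over 90 percent, which matches the rule of thumb above. Feed in a critical speed in meters per second and `predict_marathon_minutes` returns a finishing time that is consistent with its own percentage.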
Another interesting pattern that shows up in the graph above is that women seem to sustain a slightly higher percentage of their critical speed than men. It’s probably not worth thinking too hard about this for now, because of the sheer number of possible explanations, including physiological differences, training differences (which would affect the calculation of critical speed), and pacing differences in the race itself. But file it away for future exploration.
The researchers also analyze pace in the initial 10 miles of the race, and conclude that your risk of a late-race blow-up increases substantially if you start at greater than 94 percent of your critical speed. The basic takeaway, that starting too fast relative to your fitness will be punished by the marathon gods, is undoubtedly true, but I’m not convinced the 94-percent threshold has any particular significance. It’s probably safer, and definitely simpler, to just start the marathon at whatever pace you think you can sustain to the finish.
There are already various tools on the market that use a similar process to what’s described here to estimate your critical speed (or, analogously, critical power), including Stryd’s running power meter and GoldenCheetah cycling software. What’s needed, in my view, is more big-data validation of how well these models work in the real world, published openly so that we can decide for ourselves how much to trust the algorithms with our race plans. This study is a pretty good start, but I wouldn’t bet my marathon on it quite yet.