FTP test vs power curve: when a self-coached cyclist actually needs to test
I have done dozens of FTP tests across my racing career, and the dirty truth is most of them never changed a training decision. The self-coached rider sitting at their desk on Tuesday with a modeled FTP on screen is making a decision, not a measurement: trust the curve, or burn Saturday on a 20-minute effort? The math behind tests versus power-curve modeling is covered in our ftp-without-a-test pillar. This spoke is the decision framework — when the model is enough, when a test is non-negotiable, and what testing actually costs.
By Jim Camut · Former pro & ex-Bruyneel Academy racer
Updated May 5, 2026 · 4 chapters · 6 citations
When the modeled FTP is enough on its own
If you have 90+ days of varied power data, at least one near-maximal effort longer than eight minutes, and the modeled estimate has not jumped more than three percent in the last month, the curve is doing its job. What a fresh 20-minute test would add at that point sits inside the test's own noise band.
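The three conditions above can be written down as a quick check. This is a minimal sketch, not any app's logic: the `ModelState` fields and the `model_is_enough` name are made up for illustration, and "varied" data is something you still have to judge by eye.

```python
from dataclasses import dataclass


@dataclass
class ModelState:
    """Hypothetical snapshot of what you know about your modeled FTP."""
    days_of_power_data: int
    longest_near_max_effort_min: float  # longest near-maximal effort on file
    modeled_ftp_now_w: float
    modeled_ftp_30_days_ago_w: float


def model_is_enough(state: ModelState) -> bool:
    """True when all three conditions from the text hold."""
    drift = abs(state.modeled_ftp_now_w - state.modeled_ftp_30_days_ago_w)
    drift /= state.modeled_ftp_30_days_ago_w
    return (
        state.days_of_power_data >= 90
        and state.longest_near_max_effort_min > 8
        and drift <= 0.03  # estimate has not jumped more than 3% in a month
    )


rider = ModelState(120, 12.0, 262.0, 258.0)
print(model_is_enough(rider))  # drift is about 1.6%, so the curve is trusted
```

If any one condition fails, the function returns False, which is a prompt to look closer rather than an automatic order to test.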
The honest reason most self-coached riders test is not that the plan needs a better number — it is that Zwift, TrainerRoad, or Wahoo SYSTM scheduled a test and the calendar nudged them. A modeled FTP from a fitted Critical Power curve sits within roughly three to five percent of a clean field test once you have 90 days of mixed riding to fit against [Allen et al. 2019, MacInnis et al. 2021]. The 20-minute protocol you would replace it with carries its own measurement noise: Borszcz and colleagues showed warm-up structure alone shifts the resulting FTP estimate by a clinically meaningful amount on the same rider on the same day [Borszcz et al. 2022]. Trading three percent of model uncertainty for five percent of pacing and warm-up uncertainty is not progress.
The other under-appreciated point is that the model updates continuously. Intervals.icu re-fits eFTP every time you upload. AdaptCycling re-fits on every Strava webhook. Xert updates on breakthrough efforts. A field test gives you one data point on one Saturday in one set of conditions; the model gives you a rolling estimate that already incorporates dozens of efforts at varying duration. Inside the broader self-coached cyclist playbook — where the rider is making the periodization, intensity, and recovery decisions a coach used to make — chasing the test is one of the cheapest places to stop spending decision energy.
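For readers who want to see what "fitting a Critical Power curve" means in practice, here is a sketch of the textbook two-parameter model, where work done equals CP times duration plus a fixed reserve W'. It is an assumption-laden illustration: the function name and the sample efforts are invented, and the apps named above do not publish their fitting code here, so do not read this as how Intervals.icu, AdaptCycling, or Xert compute their numbers. CP and FTP are also related but not identical quantities.

```python
def fit_critical_power(efforts):
    """Fit work = CP * t + W' by ordinary least squares.

    efforts: list of (duration_s, mean_power_w) best efforts,
             typically maximal efforts of a few minutes up to ~20 minutes.
    Returns (cp_watts, w_prime_joules).
    """
    durations = [t for t, _ in efforts]
    work = [t * p for t, p in efforts]  # joules = watts * seconds
    n = len(efforts)
    t_mean = sum(durations) / n
    w_mean = sum(work) / n
    sxx = sum((t - t_mean) ** 2 for t in durations)
    sxy = sum((t - t_mean) * (w - w_mean) for t, w in zip(durations, work))
    cp = sxy / sxx                  # slope: sustainable power in watts
    w_prime = w_mean - cp * t_mean  # intercept: finite reserve in joules
    return cp, w_prime


# Illustrative best efforts: 3, 8, and 20 minutes (made-up numbers).
best_efforts = [(180, 361.0), (480, 292.0), (1200, 267.0)]
cp, w_prime = fit_critical_power(best_efforts)
print(round(cp), round(w_prime))
```

Every new maximal effort changes the inputs, which is why a re-fit on each upload gives a rolling estimate instead of a single snapshot.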
The three situations where you actually owe yourself a test
Three cases break the model and require a real effort: a brand-new power meter with under 30 days of data, a return from 4+ weeks fully off the bike, and the two weeks before a goal event where being 15 watts wrong costs the season. Outside those, the model is doing its job.
The first case is the cleanest. A rider who just bought their first power meter has nothing for the curve to fit against. Two weeks of zone 2 followed by a ramp test or a 20-minute effort gives the algorithm an anchor while the power-duration curve fills in. A new rider who skips this accepts a 5-to-10 percent margin of error for the first month of structured training — the difference between sweet-spot intervals that feel hard and sweet-spot intervals that drift into threshold. TrainerRoad, Zwift, and Wahoo SYSTM all default to a forced test on signup for exactly this reason.
The second is the long-layoff case. Mujika and Padilla showed VO2max and threshold-related markers begin meaningful decline after roughly 10 days of complete rest, with the rate accelerating past three weeks [Mujika & Padilla 2000]. After 4-to-6 weeks fully off, the power-duration curve is a record of a cyclist who no longer exists. Coming back, the model needs a fresh anchor before the first build week or the rider trains the first month at intensities calibrated to pre-layoff fitness. For shorter layoffs the graded ramp covered in our spoke on restarting after two weeks off applies; past four weeks, retest first.
The third is the pre-goal sanity check, and it is the one self-coached riders skip most often. Two weeks before a target event, run a clean 20-minute effort against the modeled FTP. Agreement within 5% means taper with confidence. Disagreement of 10% or more means investigate before race day rather than discovering at the start line that the plan was built against a number 15 watts off. A gap between 5% and 10% is a gray zone; work through the disagreement bands later in this guide before changing anything. The asymmetry matters: one tired Saturday is the cost of the test; the cost of being wrong about FTP at a goal event is the goal event.
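The three cases reduce to a short checklist. The sketch below is one way to encode it; the function name, argument names, and return strings are hypothetical, and the goal-event window is read as "inside the last 14 days", which is an assumption about how to interpret "two weeks before".

```python
from typing import Optional


def reason_to_test(power_meter_days: int,
                   weeks_fully_off: float,
                   days_to_goal_event: Optional[int]) -> Optional[str]:
    """Return why a field test is warranted, or None if the model can stand."""
    if power_meter_days < 30:
        return "new power meter: the curve has nothing to fit against yet"
    if weeks_fully_off >= 4:
        return "long layoff: the curve describes pre-layoff fitness"
    if days_to_goal_event is not None and 0 < days_to_goal_event <= 14:
        return "goal event: cross-check the modeled FTP before the taper"
    return None


print(reason_to_test(400, 0, 12))   # goal-event sanity check applies
print(reason_to_test(400, 1, 60))   # None: the model is doing its job
```

Shorter layoffs deliberately fall through to None here, since the text sends those to a graded restart rather than a retest.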
What testing actually costs you
A formal FTP test is not a free measurement. It costs a hard ride that needs 48-to-72 hours of recovery, displaces a quality session in the build, introduces false precision around a single number, and primes the rider to chase the result rather than train against it. Six tests a year is a real chunk of your annual TSS budget burned on a number you mostly already had.
The training-load cost is easy to underestimate. A 20-minute test ridden honestly is roughly equivalent to a hard threshold session — 80 to 100 TSS once you include warm-up and the protocol's anaerobic finish. A ramp test is shorter but maximal, and the recovery cost is similar. Testing every six weeks the way Zwift Academy or some TrainerRoad plans suggest is six to eight quality sessions per year displaced by a measurement workout. Halson's training-load review is explicit that load monitoring exists to tell you whether you are adapting [Halson 2014]; trading adaptation work for measurement work to feed that monitoring is the wrong direction of the trade.
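To put a number on that cost, the standard TSS formula is duration in seconds times normalized power times intensity factor, divided by FTP times 3600, times 100. The sketch below applies it to an illustrative test day; the 75-minute duration, the 0.85 whole-ride intensity factor, and the 250-watt FTP are assumptions chosen for the example, not measurements.

```python
def training_stress_score(duration_s: float,
                          normalized_power_w: float,
                          ftp_w: float) -> float:
    """TSS = (seconds * NP * IF) / (FTP * 3600) * 100, with IF = NP / FTP."""
    intensity_factor = normalized_power_w / ftp_w
    return (duration_s * normalized_power_w * intensity_factor) / (ftp_w * 3600) * 100


def annual_test_load(tests_per_year: int, tss_per_test: float) -> float:
    """Total TSS spent on measurement rides over a year."""
    return tests_per_year * tss_per_test


# Assumed test day: 75 minutes door to door, whole-ride NP at 85% of a 250 W FTP.
test_day_tss = training_stress_score(75 * 60, 0.85 * 250, 250)
print(round(test_day_tss))                       # about 90 TSS
print(round(annual_test_load(6, test_day_tss)))  # about 540 TSS across six tests
```

One hour ridden exactly at FTP scores 100 by definition, which is a handy sanity check on the formula.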
The second cost is false precision. A single 20-minute test produces a number that looks authoritative but sits inside a wide individual confidence interval. Borszcz and colleagues found the limits of agreement between a 20-minute estimate and a true 60-minute effort spanned 40-to-60 watts on individuals, even when the group-level bias was small [Borszcz et al. 2018]. The rider who walks away treating 268 watts as exact to the watt is fooling themselves. A modeled estimate drawn from dozens of efforts has narrower individual error bars than any single test does.
The third cost is psychological, and self-coached riders are uniquely vulnerable. A test result becomes a target to defend rather than a parameter to train against. Riders who test every six weeks tend to optimize their training for the test rather than the goal event. The quiet virtue of a continuously-modeled FTP is the rider never has a single sacred number — the curve drifts upward as fitness improves and there is no test day to peak for or to fail.
How to read disagreement between the model and a recent test
When you do test and the result disagrees with the modeled FTP, the size of the gap tells you what to do. Within 3%, treat them as identical. Between 3% and 7%, take the lower number for the next block of intervals and use one workout to confirm it. Past 7%, something is wrong with one of the inputs and the answer is investigation, not averaging.
A gap under 3% is below the test's own measurement noise [Borszcz et al. 2022], so average them, or stay with the modeled number, and move on. A 3-to-7% gap is the most common case and the one that matters most. Take the lower of the two for the next block of work, run a sweet-spot interval session at the new number on day three or four, and ask whether the prescribed wattage felt right. If it did, the lower number was correct; if it felt easy, the higher one was. One session, dispute resolved.
Past 7% is a flag, not a result. Common explanations: a power meter calibration shift, a test ridden fatigued or tapered, a power curve dominated by stale efforts that no longer reflect current fitness, or a model missing a long-duration anchor. Before changing FTP by 15 watts on one test, check the power meter zero-offset, audit the last two weeks for outlier fatigue, and look at the MMP curve to see whether the long-duration points are populated by recent rides. Big gaps almost always have an explainable cause, and the explanation usually invalidates one of the two inputs, which is why the fix is to correct that input rather than split the difference.
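The three bands can be captured in a few lines. This is a sketch under stated assumptions: the function name and return strings are invented, and the gap is measured relative to the modeled FTP, which the text does not specify.

```python
def read_disagreement(modeled_ftp_w: float, tested_ftp_w: float):
    """Return (action, working_ftp_w) for a model-versus-test gap.

    working_ftp_w is None when the gap is too large to pick a number.
    """
    gap = abs(modeled_ftp_w - tested_ftp_w) / modeled_ftp_w
    if gap <= 0.03:
        # Inside test noise: keep the modeled number and move on.
        return "identical", modeled_ftp_w
    if gap <= 0.07:
        # Train on the lower number, then confirm with one sweet-spot session.
        return "use lower, confirm with one session", min(modeled_ftp_w, tested_ftp_w)
    # Past 7%: an input is suspect, so do not average.
    return "investigate inputs", None


print(read_disagreement(260, 265))  # ('identical', 260)
print(read_disagreement(260, 248))  # ('use lower, confirm with one session', 248)
print(read_disagreement(260, 235))  # ('investigate inputs', None)
```

Returning no working number in the last band is deliberate: it forces the zero-offset check and the fatigue audit before any zones change.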
Quick answers
Do I need to do an FTP test if I'm using AdaptCycling, Intervals.icu, or Xert?
How often should I test if I am going to test?
My modeled FTP jumped 8 watts last week without a test — should I trust it?
Ramp test or 20-minute test for cross-checking modeled FTP?
Sources cited in this guide
- 01
- 02 Borszcz et al. 2018. Functional Threshold Power in Cyclists: Validity of the Concept and Physiological Responses. International Journal of Sports Medicine.
- 03 Borszcz et al. 2022. Functional Threshold Power Estimated from a 20-minute Time-trial Test is Warm-up-dependent. International Journal of Sports Physiology and Performance.
- 04 MacInnis et al. 2021. Do Critical and Functional Threshold Powers Equate in Highly-Trained Athletes? International Journal of Sports Physiology and Performance.
- 05 Mujika & Padilla 2000. Detraining: Loss of Training-Induced Physiological and Performance Adaptations. Part I: Short Term Insufficient Training Stimulus. Sports Medicine.
- 06