One of the most often-asked questions about climate data is, “How long a time period do we need to establish a statistically significant trend?”
Statistical significance of trends is an issue which has been dealt with often at this blog and many others. Yet previous posts, at least here, have made general statements and/or illustrated by example (often with simulated data) how “statistical significance” can come within your grasp or slip through your fingers. But the issue is so important — and so often abused by fake skeptics — that I think it’s worthwhile to give it a close look.
In general terms, the real question we’re considering is: Do these data show a trend? In practical terms the question becomes more specific, usually amounting to: Do these data show a linear trend? In other words, are they reasonably approximated by a pattern over time which follows a straight line, one which is rising (upward trend) or falling (downward trend) but not flat (no trend)?
This is hardly the only trend pattern which can exist; it's certainly possible for data to follow trends which are not linear. In fact it happens all the time. But establishing the existence of a nonlinear trend is usually harder than showing that a linear trend is present, and a linear trend test will often detect the existence of trends even when they're highly nonlinear. So, the simple fact is that when scientists study data to determine whether or not a trend is present, the "default" first analysis is to perform linear regression, which is the basic test for the existence of a linear trend.
There are many varieties of linear regression, but by far the most common is least-squares regression. It has some distinct advantages over other forms (and others have their advantages too), but it's not our purpose to muse about the virtues and vices of different types of regression. We'll focus our discussion on the circumstances under which data might or might not reveal the existence of a trend, when we test for a linear trend using least-squares regression.
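As a minimal sketch of that "default" first analysis (in Python with NumPy; the data here are simulated, not actual GISS values):

```python
import numpy as np

# Simulate 10 years of monthly "temperature" data: a linear trend plus white noise.
rng = np.random.default_rng(42)
t = np.arange(120) / 12.0            # time in years, monthly spacing
true_slope = 0.017                   # deg.C/yr, roughly the modern warming rate
y = true_slope * t + rng.normal(0.0, 0.15, size=t.size)

# Least-squares regression: a degree-1 polynomial fit gives the trend line.
slope, intercept = np.polyfit(t, y, 1)
print(f"estimated slope: {slope:.4f} deg.C/yr")
```

With only 10 noisy years the estimated slope scatters noticeably around the true value, which is exactly the problem the rest of this post quantifies.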
We’ll suppose that we have n data points which represent measurements or estimates of the data values at times which are evenly spaced, e.g., monthly or annual data. The entire time span covered by the data we’ll call T. We recognize that in addition to the underlying pattern (which we’re assuming is linear, but of unknown slope), there’s also noise added into the mix. We’ll say that σ² is the variance of the noise, so that σ is its standard deviation.
And as I’ve often emphasized, the noise values may not be independent of each other. In particular they may show autocorrelation, meaning that nearby (in time) noise values are correlated with each other. We’ll characterize the impact of autocorrelation by estimating a quantity ν, which we can call the number of data points per effective degree of freedom. For noise without autocorrelation, this quantity is equal to 1 — there’s 1 data point per degree of freedom. For noise with positive autocorrelation (it’s almost never negative) it will be greater than 1, which means that we need multiple data points to get a single “degree of freedom.”
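How might one estimate that quantity? One common approach (a sketch, not necessarily the method used for the numbers below) assumes the noise is roughly AR(1), for which the number of points per effective degree of freedom is (1 + ρ)/(1 − ρ), with ρ the lag-1 autocorrelation:

```python
import numpy as np

def points_per_dof(noise):
    """Data points per effective degree of freedom, assuming AR(1) noise:
    nu = (1 + rho) / (1 - rho), where rho is the lag-1 autocorrelation."""
    x = noise - noise.mean()
    rho = np.sum(x[:-1] * x[1:]) / np.sum(x * x)
    return (1.0 + rho) / (1.0 - rho)

# Check on simulated AR(1) noise with coefficient 0.8 (theoretical nu = 9).
rng = np.random.default_rng(0)
e = rng.normal(size=5000)
noise = np.empty_like(e)
noise[0] = e[0]
for i in range(1, e.size):
    noise[i] = 0.8 * noise[i - 1] + e[i]
print(f"estimated nu: {points_per_dof(noise):.1f}")
```

Real climate noise is often better described by models more complex than AR(1), in which case this simple correction understates the effect; treat it as illustrative.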
We’ll let m represent the slope of the trend line we estimate using linear regression. What’s the uncertainty in that slope estimate? When the number of data values is not too small, a very good approximate formula for the square of the uncertainty (the square of the “standard error”) of the slope is

σ_m² = 12 ν σ² / (n T²).
Note the subscript on σ_m², to distinguish it from the variance of the noise, which we’ve just called σ². The standard deviation of the slope, a.k.a. the standard error of our estimate, is the square root of that:

σ_m = σ √(12 ν / n) / T.
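As a sanity check on that formula (a sketch using simulated white noise, so ν = 1), we can compare it against the scatter of slopes fitted to many simulated noise series:

```python
import numpy as np

def slope_stderr(sigma, nu, n, T):
    # sigma_m = sigma * sqrt(12 * nu / n) / T   (approximate, evenly spaced data)
    return sigma * np.sqrt(12.0 * nu / n) / T

# Monte Carlo: fit trends to pure white noise and measure the spread of slopes.
rng = np.random.default_rng(1)
n, T, sigma = 120, 10.0, 0.15
t = np.linspace(0.0, T, n, endpoint=False)
slopes = [np.polyfit(t, rng.normal(0.0, sigma, n), 1)[0] for _ in range(2000)]

print(f"formula:     {slope_stderr(sigma, 1.0, n, T):.5f}")
print(f"monte carlo: {np.std(slopes):.5f}")
```

The two numbers agree to within a few percent, which is all we need for the rough accounting below.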
Great! There’s a general formula for you, but what does it mean for real-world data, in particular for global temperature data?
Let’s take monthly average global temperature data from NASA GISS to estimate the parameters. The noise variance σ² is approximately 0.0214 deg.C², so the noise standard deviation σ is about 0.146 deg.C. The “number of data points per effective degree of freedom” ν turns out to be about 10.6, with its square root √ν about 3.25. Note these are only estimates!
Using monthly data, the number n of data points in a time span of T years is n = 12T. Putting it all together we have

σ_m = σ √ν / T^(3/2) ≈ 0.475 / T^(3/2) deg.C/yr.
In order to be conservative, I’ll use 0.5 as an approximation for the numerator instead of 0.475, yielding a useful approximate formula for the standard error of the warming rate in NASA GISS monthly global temperature data:

σ_m ≈ 0.5 / T^(3/2) deg.C/yr (with T in years).
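With that conservative approximation in hand, finding where twice the standard error drops to the modern warming rate is one line of algebra: 2 × 0.5/T^(3/2) = 0.017 gives T = (1/0.017)^(2/3). A quick check:

```python
# Solve 2 * (0.5 / T**1.5) = 0.017 for T, using the approximate formula above.
rate = 0.017                               # deg.C/yr, roughly the modern warming rate
T_cross = (2.0 * 0.5 / rate) ** (2.0 / 3.0)
print(f"T_cross = {T_cross:.1f} years")    # a bit over 15
```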
That’s the standard error we can expect — but is a slope significant? It will be so at the usual “95% confidence” level if the slope estimate is at least as big as 2 standard errors. Here’s a plot of twice the standard error as a function of the time span T:
I’ve also placed a horizontal, thick-dashed line at the value 0.017 deg.C/yr, which is just about the modern rate of global warming. It intercepts our 2-standard-error curve when the time span T is a smidgen over 15 years (also indicated by a dashed line).
That means that if we have 15 years of data, we can confirm a trend at the present rate of global warming, right? Not necessarily! When we estimate the slope, our estimate is a random variable. It will approximately follow the normal distribution, with mean value equal to the true slope and standard deviation equal to the standard error. That means that the quantity “estimated slope minus two standard errors” will follow the normal distribution with mean value equal to the true slope minus two standard errors, and standard deviation equal to the standard error. For a warming trend, if that quantity, “estimated slope minus two standard errors,” is positive, then we achieve statistical significance for a warming trend. If not, then we don’t.
For T just a hair above 15 years, the quantity in question roughly follows the normal distribution with mean value zero, because the trend is twice as big as the standard error. So there’s a 50/50 chance of its being above zero and permitting us to declare “statistical significance.” There’s also a 50/50 chance of its not being so. Therefore, for the parameters estimated from NASA GISS data, a 15-year time span (actually a wee bit more) gives us about a 50/50 chance to detect a trend with statistical significance. It also gives a 50/50 chance for the significance test to fail — which does not mean there’s no warming (another very common misconception pushed by fake skeptics), just that the given data don’t show it with statistical significance.
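That 50/50 reasoning can be written as a normal-approximation “detection probability” (a sketch, using the approximate σ_m ≈ 0.5/T^(3/2) derived above for GISS monthly data):

```python
import math

def detect_prob(trend, T):
    """Probability the estimated slope exceeds 2 standard errors, with the
    estimate ~ Normal(trend, se) and se approximated as 0.5 / T**1.5."""
    se = 0.5 / T ** 1.5
    z = trend / se - 2.0          # mean of (estimate - 2*se), in units of se
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(f"P(detect) at T = 15.1 yr: {detect_prob(0.017, 15.1):.2f}")
```

By this approximation the probability passes 50% just beyond 15 years, is well below that for shorter spans, and keeps climbing as the span grows.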
How long would we need to have a really good chance — say, a 95% chance — of detecting the trend with statistical significance? For that to happen, the trend has to be about four times as large as the standard error (with the trend at four standard errors, the chance is actually a bit better than 95%). That happens, with the given parameters, when the time span T is 24 years, not 15. Here’s an expanded plot with yet another dashed line indicating a 24-year time span:
So, 15 years of global temperature data from NASA GISS has about a 50/50 chance to show the trend with statistical significance. But for a 95% chance to achieve that threshold, you need about 24 years. All of this is approximate, but it does give a good perspective on the quantity of data needed. It also shows how easy it is for fake skeptics to crow about the lack of statistical significance, even when the trend is present and is real. Would they have the audacity to be so misleading? I’d say that’s something we can expect with 100% confidence.