Any given time series, say July average temperature in the Moscow region since 1881, might exhibit both short-term (more brief) and long-term (more lasting) patterns of change. The longer-term pattern is certainly something worth knowing about. It might be increasing, or decreasing, or it might not be changing at all. It might have wiggled around a lot but not really gone anywhere until some new factor came into play. But whatever its pattern, we usually identify the longer-term pattern of change with the trend.
What we’re really after is the background level against which temperature variations have their sway. By “trend value” I mean exactly that: the background level at a given moment. If it changes while the nature of the fluctuations remains the same, the probability of record-setting extremes will of course change. When the background level is colder we’re more likely to get cold extremes, and when it’s hotter we’ll get more extreme heat. Pretty simple.
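To make that concrete, here is a minimal sketch of how much a shift in the background level changes the chance of exceeding a fixed threshold. It assumes normally distributed fluctuations purely for convenience, and every number in it (the spread, the threshold, the two background levels) is hypothetical rather than taken from the Moscow data.

```python
import math

def normal_exceedance(threshold, mean, sd):
    """P(X > threshold) for X ~ Normal(mean, sd)."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Hypothetical values, chosen only for illustration.
sd = 1.6                # spread of the fluctuations (deg.C), held fixed
threshold = 23.0        # a would-be record July mean (deg.C)
cool_background = 18.0  # earlier background level (deg.C)
warm_background = 20.0  # later, warmer background level (deg.C)

p_cool = normal_exceedance(threshold, cool_background, sd)
p_warm = normal_exceedance(threshold, warm_background, sd)

print(f"cool background: P(exceed) = {p_cool:.2e}")
print(f"warm background: P(exceed) = {p_warm:.2e}")
print(f"ratio: {p_warm / p_cool:.1f}")
```

The fluctuations are identical in both cases; only the background level moves, and the tail probability moves with it by a large factor.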
Some reserve the word “trend” for the linear trend. There is certainly value in knowing the linear trend; one can’t deny its utility, because it tells us about the longest-term behavior. If we write the book of a time series in polynomials, it is the first chapter, and it is most responsive to the longest time scale behavior. But it is hardly the whole story. Sometimes it’s most of the story, and often it’s the most obvious part, yet there can be more, or less, than meets the eye.
Here for instance is that temperature data for Moscow:
It’s the data used by NASA GISS but “as is,” without the homogeneity adjustment which GISS estimates.
A linear trend gives a small but significant slope of 0.012 deg.C/yr:
If we restrict ourselves to a linear trend, then the July Moscow temperature has gone up by about 1.5 deg.C over the given time span. But — more relevant to the 2010 Moscow heat wave, if the linear trend represents the background level then the observed 2010 value is fully 3.667 standard deviations above the background. That’s a very big fluctuation indeed — for a normal distribution [note: it doesn’t follow the normal distribution] something that large or larger would only happen once every eight thousand years — so it’s the kind that is bound to be exceedingly rare. In fact it’s such an extreme fluctuation that it makes me doubt the “linear trend represents the background level” theory.
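The "once every eight thousand years" figure can be checked with a few lines. This is a sketch under the normal-distribution assumption that the text itself flags as not strictly valid, with one July value per year; the function names are mine.

```python
import math

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def return_period_years(z):
    """Average number of years between exceedances, one value per year."""
    return 1.0 / normal_upper_tail(z)

z_2010 = 3.667  # 2010 departure from the linear trend, in standard deviations

print(f"P(Z >= {z_2010}) = {normal_upper_tail(z_2010):.2e}")
print(f"return period: about {return_period_years(z_2010):,.0f} years")
```

The same function can be applied to any of the other standard-deviation counts quoted in this post to see how quickly the return period shrinks as the departure gets smaller.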
Statistics also tells us that the linear trend is not the whole story. If we add a quadratic term to our model trend then we find it too is statistically significant:
This quadratic fit might or might not be a good characterization of the “trend,” but apart from that it has revealed the presence of non-linear trend in the data with statistical significance. It also suggests that most of the warming of the trend has happened recently. Because of that, the 2010 value is not quite so extreme, “only” 3.33 standard deviations above the background level. That’s still exceedingly rare, but not so much so as the linear model.
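For readers who want the mechanics of "statistically significant" here, the sketch below fits a quadratic by ordinary least squares and reports a t statistic for the quadratic coefficient. It assumes white (uncorrelated, constant-variance) noise, and it runs on synthetic data with deliberately exaggerated curvature, not on the Moscow record; the helper name `ols_with_t` is hypothetical.

```python
import numpy as np

def ols_with_t(X, y):
    """Ordinary least squares. Returns coefficients, t statistics and residuals,
    assuming uncorrelated, constant-variance (white) noise."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]                 # residual degrees of freedom
    sigma2 = resid @ resid / dof              # estimated noise variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # covariance of the coefficients
    return beta, beta / np.sqrt(np.diag(cov)), resid

# Synthetic stand-in for the July series (NOT the real data).
rng = np.random.default_rng(0)
years = np.arange(1881, 2014)
t = (years - years.mean()) / 100.0            # centred, in centuries
temps = 18.0 + 1.0 * t + 8.0 * t**2 + rng.normal(0.0, 1.6, years.size)

X = np.column_stack([np.ones_like(t), t, t**2])
beta, tstat, resid = ols_with_t(X, temps)

print(f"quadratic coefficient: {beta[2]:.2f} (t = {tstat[2]:.2f})")
```

A t statistic well beyond about 2 in magnitude is the usual rough threshold for significance at the 5% level with this many data points; if the noise were autocorrelated, the effective threshold would be higher.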
Of course we could try 3rd-degree (cubic) or 4th-degree (quartic) polynomials, or even higher. Let’s try all degrees from 1st (linear) through 10th, comparing models by AIC (Akaike Information Criterion). Here are the AIC values as a function of polynomial degree:
The winner is the model with lowest AIC, which is the 4th-degree (quartic) polynomial. It looks like this:
This too may or may not be such a good representation of the “trend,” but it is better (with statistical significance) than the linear and quadratic models. It suggests even more strongly that most of the increase has been recent. And, it reaffirms that the trend is not linear. It just isn’t.
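Here is a sketch of that degree-by-degree comparison. It uses the common least-squares form AIC = n ln(RSS/n) + 2k (dropping an additive constant), counts the noise variance as one of the k parameters, and runs on synthetic data, so the degree it picks says nothing about Moscow. The exact AIC variant used for the figures above is not stated, so treat this as one reasonable choice.

```python
import numpy as np
from numpy.polynomial import Polynomial

def aic_for_degree(x, y, degree):
    """AIC (up to an additive constant) of a least-squares polynomial fit."""
    fit = Polynomial.fit(x, y, degree)   # rescales x internally for stability
    rss = float(np.sum((y - fit(x)) ** 2))
    n = y.size
    k = degree + 2                       # polynomial coefficients + noise variance
    return n * np.log(rss / n) + 2 * k

# Synthetic stand-in for the July series (NOT the real data).
rng = np.random.default_rng(1)
years = np.arange(1881, 2014)
u = (years - years[0]) / (years[-1] - years[0])
temps = 18.0 + 3.0 * u**4 + rng.normal(0.0, 1.6, years.size)

aics = {d: aic_for_degree(years, temps, d) for d in range(1, 11)}
best = min(aics, key=aics.get)

for d, value in aics.items():
    marker = "  <- lowest" if d == best else ""
    print(f"degree {d:2d}: AIC = {value:8.2f}{marker}")
```

The 2k term is what keeps the highest degree from winning automatically: the residual sum of squares can only go down as the degree goes up, so each extra coefficient has to buy enough improvement to pay its penalty.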
Because of that non-linearity, the 2010 heat wave value isn’t so extremely different from the background level at the given time as it would have been if the climate hadn’t warmed. Under the quartic model, the 2010 value is “only” 2.89 standard deviations above the background level. That’s quite rare, but perhaps not so rare as to earn the adjective “exceedingly.” When we consider that the distribution of fluctuations is not normal, 2.89 standard deviations might be a notable but not implausible extreme — given the warmer background level observed in the 2000s.
There are other ways to estimate a nonlinear trend; just about any smoothing method can be used. My favorite is a modified lowess smooth, but if I do that some folks might accuse me of using some trick. They might call it some type of “highfalutin smoothing procedure” that “makes history irrelevant.”
So, I’ll also compute 30-year running means. That’s as basic as it gets; it sure can’t be called “highfalutin” (not with a straight face anyway). We’ll call it the “low-falutin'” smooth. Heck, it ain’t even that smooth. Here it is:
Note that the most recent 30-year moving average is 0.56 deg.C warmer than any other before 1980. The present value (of the “trend,” i.e. the background level) is likely even higher now, because the moving averages don’t go past the year 1999.
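The running mean really is as simple as it sounds. A minimal sketch, on synthetic data and assuming the record runs from 1881 through 2013 (an assumption on my part, based on the roughly 133 data points mentioned in the comments):

```python
import numpy as np

def running_mean(values, window):
    """Unweighted moving average over every full window."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

# Synthetic stand-in for the July series (NOT the real data).
rng = np.random.default_rng(2)
years = np.arange(1881, 2014)
temps = 18.0 + rng.normal(0.0, 1.6, years.size)

window = 30
means = running_mean(temps, window)
# Label each mean by the centre of the 30 years it averages.
centres = years[: means.size] + (window - 1) / 2.0

print(f"{means.size} running means, centred {centres[0]} to {centres[-1]}")
```

Under that assumption the last full window covers 1984 through 2013 and is centred at 1998.5, which is why the running means cannot say anything directly about the background level after 1999.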
Here’s the high-falutin’ smooth (in red), also on a 30-year time scale, which I’ll have the audacity to call the “non-linear trend” and which covers the entire time span, together with the low-falutin’ running means (in blue):
It suggests even more strongly that the trend is non-linear, and that the value over the last decade or so — the background level — has been notably higher than before. The non-linear trend estimate is also similar to the quartic polynomial model which was selected by AIC:
Both models agree that the value now is notably higher than it was before.
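For a feel of what a lowess-style smooth does, here is a bare-bones local linear regression with tricube weights. To be clear, this is not the modified lowess used for the red curve: it has a fixed bandwidth, no robustness iterations, and the 30-year half-width is just an illustrative choice. The data are synthetic.

```python
import numpy as np

def local_linear_smooth(x, y, bandwidth):
    """Tricube-weighted local linear regression, evaluated at each x."""
    smoothed = np.empty_like(y, dtype=float)
    for i, x0 in enumerate(x):
        d = np.abs(x - x0) / bandwidth
        w = np.where(d < 1.0, (1.0 - d**3) ** 3, 0.0)   # tricube weights
        X = np.column_stack([np.ones_like(x, dtype=float), x - x0])
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ y)      # weighted least squares
        smoothed[i] = beta[0]                           # fitted value at x0
    return smoothed

# Synthetic stand-in for the July series (NOT the real data).
rng = np.random.default_rng(3)
years = np.arange(1881, 2014)
temps = 18.0 + 2.0 * (years >= 1999) + rng.normal(0.0, 1.6, years.size)

trend = local_linear_smooth(years, temps, bandwidth=30.0)
print(f"smoothed value in {years[0]}: {trend[0]:.2f}")
print(f"smoothed value in {years[-1]}: {trend[-1]:.2f}")
```

Unlike the running mean, a local fit like this produces an estimate all the way to both ends of the record, because near an endpoint it simply fits a line to whatever points fall inside the window.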
In fact “notably higher recently than it was before” seems to be the common thread of non-linear trend models for these data. Let’s try the ultimate in such models, a simple step function. The model which fits best is that with a change starting in 1999, thus:
It too is statistically significant, and estimates the recent increase in temperature to be 2.38 deg.C. This can be considered statistical confirmation that the recent decade-and-a-half really has been warmer than those which came before it, and that the difference is enough to cause a notable increase in the chance of such an extreme as was seen in 2010.
For the coup de grace, this statistical confirmation of a higher recent background level doesn’t depend on the extreme record-breaking 2010 value — even if we omit that value from the analysis, the best-fit step-function model still takes its step in 1999, is still statistically significant (strongly), and indicates a background level 2.0 deg.C warmer than before.
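A best-fit step function can be found by brute force: try every candidate year, fit a separate mean on each side, and keep the year with the smallest residual sum of squares. The sketch below does that on synthetic data; the function name and the minimum segment length are my own choices, not details from the analysis above.

```python
import numpy as np

def best_step(years, temps, min_segment=5):
    """Return (first year of the new level, size of the jump) for the
    single-step model with the smallest residual sum of squares."""
    best = None
    for i in range(min_segment, len(years) - min_segment + 1):
        before, after = temps[:i], temps[i:]
        rss = np.sum((before - before.mean()) ** 2) + np.sum((after - after.mean()) ** 2)
        if best is None or rss < best[0]:
            best = (rss, years[i], after.mean() - before.mean())
    _, step_year, jump = best
    return int(step_year), float(jump)

# Synthetic stand-in for the July series (NOT the real data).
rng = np.random.default_rng(4)
years = np.arange(1881, 2014)
temps = 18.0 + 2.4 * (years >= 1999) + rng.normal(0.0, 1.6, years.size)

step_year, jump = best_step(years, temps)
print(f"best step starts in {step_year}, jump = {jump:.2f} deg.C")

# Same search with the 2010 value left out.
keep = years != 2010
print(best_step(years[keep], temps[keep]))
```

Because the step year is chosen by searching over many candidates, a plain two-sample test on the winning split overstates the significance somewhat; any significance claim has to allow for that search.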
Just for perspective, here are three nonlinear models to describe the trend in Moscow July temperature, the non-linear trend (in red), the quartic polynomial (in blue), and the step function (black):
All the non-linear models show strong recent warming. Furthermore, it’s not just inevitable wiggles from a smoothing method; it’s a statistically significant pattern according to the quartic polynomial and step-function models. I suspect some mixture of the nonlinear trend and the step-function model is likely a better estimate of the genuine trend, i.e. the background level.
The recent warming makes quite a difference in the extremity of the 2010 heat wave. What was a 3.667-standard deviation extremity under the linear model is only 3.06 standard deviations above the mean in the non-linear model, only 2.89 according to the quartic model, and only 3.09 by the step-function model. In all three cases, the 2010 heat wave is seen to be very unlikely, but far less unlikely than if there had not been recent warming of the background level. Note that this conclusion holds for the non-linear trends; the linear trend model makes 2010 a much more unlikely extreme. If we add the linear model to the previous graph (as a green line), and add error bars to the step-function model, we get this:
All non-linear models agree on higher post-1999 temperatures, but the linear model estimates a value for that time span which is clearly too low (note that it does the same thing for the pre-1890 temperatures).
One could argue that for all the nonlinear models, what happened early in the observed time span (say, prior to 1940) makes almost no difference at all to the estimated trend value in 2010, so for that estimation using a non-linear model “makes history irrelevant.” One would be right. And that is exactly as it should be. To get it right, one cannot ignore the non-linearity of the trend — by which I mean, the very background level we’re seeking to understand. As for history, the temperature in Moscow one million years ago just doesn’t help us estimate the background level in 2010, and frankly, neither does the temperature in 1940.
Given the statistical soundness of the nonlinear trends, and the failure of the linear model to capture the recent warming, I would say that if one wishes to know the background level of temperature in the Moscow region in July, in order to estimate how likely or unlikely the 2010 extreme was, it would behoove one to use a non-linear model.
Very nice! Glad to see you back.
Will you eventually be commenting on Lovejoy’s “Scaling fluctuation analysis and statistical hypothesis testing of anthropogenic warming” ?
A quick note: the Y axes of graphs #2, 3, 6 and 7 should be labeled Temperature, not Temperature Anomaly.
As a detail, on the first linear trend, even if no distributional assumptions are made and just the existence of a mean and standard deviation is accepted, the one-sided form of Chebyshev’s Inequality says the probability of a value 3.667 or more standard deviations above the mean is less than 7%. That is a very weak bound, however.
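The distribution-free bounds are one-liners. This sketch shows both the familiar two-sided Chebyshev bound, 1/k², and the one-sided version (often called Cantelli's inequality), 1/(1 + k²), for k = 3.667; the function names are mine.

```python
def chebyshev_two_sided(k):
    """Upper bound on P(|X - mean| >= k * sd) for any distribution."""
    return min(1.0, 1.0 / k**2)

def cantelli_one_sided(k):
    """Upper bound on P(X - mean >= k * sd) for any distribution."""
    return 1.0 / (1.0 + k**2)

k = 3.667
print(f"two-sided bound: {chebyshev_two_sided(k):.4f}")
print(f"one-sided bound: {cantelli_one_sided(k):.4f}")
```

Either way the bound is several hundred times larger than the normal-distribution tail probability for the same k, which is what "very weak" means in practice.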
Very nice demonstration. Fortunately, there are blogs to distribute this type of analysis. They are a nice alternative to opinion.
I’m sorry to ask you again, but what would be your recommendation for a book on time series analysis? I have lost my note about the answer you gave me last time.
Quoting from there:
Response: I’m hardly current with textbooks, but Shumway & Stoffer’s “Time Series Analysis and its Applications” is pretty good.
After I finish my set on Fourier analysis, I’m planning to finish my book on “Time Series Analysis for Physical Scientists.” But it’ll be a while.
Thanks, Geoff! Now Tamino has only to finish his book, as I have mine on scientific thinking.
Also, anticipating criticism, I calculated the autocorrelation using BEST data for Moscow. (Easier to get to that data.) Used July temperatures only (to eliminate seasonal effects) and anomalies. ACF dropped below 1.5 after 2 lags, so observations look essentially independent. Lag plot showed circular patterns for first 10 lags as well.
Also fit a linear model using same. 2010’s discrepancy was very close to GISS using a linear fit, 3.67 standard deviations. Got a slope of 0.0068439157595 and residual standard error of 1.638. (Intercept, if anyone cares, was -12.9993018067557.)
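For anyone wanting to repeat the autocorrelation check, here is a minimal sample-ACF sketch. It runs on synthetic white noise of the same length as the July series rather than on the BEST or GISS data, and the function name is hypothetical.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation at lags 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = x @ x
    return np.array(
        [1.0 if k == 0 else (x[:-k] @ x[k:]) / denom for k in range(max_lag + 1)]
    )

# Synthetic white noise standing in for detrended July anomalies (NOT real data).
rng = np.random.default_rng(5)
anomalies = rng.normal(0.0, 1.6, 133)

acf = sample_acf(anomalies, max_lag=10)
band = 1.96 / np.sqrt(anomalies.size)   # approximate 95% band for white noise

for lag, value in enumerate(acf):
    flag = "" if lag == 0 or abs(value) <= band else "  <- outside band"
    print(f"lag {lag:2d}: {value:+.3f}{flag}")
```

With 133 points the white-noise band is about ±0.17, so sample autocorrelations inside that range are consistent with independent year-to-year values.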
Oh we are most definitely living in the land of accelerated warming. Perhaps also worth noting is that Russian wheat production fell from 61 million tons in 2009 to 38 million tons in 2012, primarily due to heat and fire related impacts.
That’s just two data points in a noisy series, so I’m not sure it’s worth noting much of it. In the long term we expect yields to fall a lot. When do we expect to be able to see it in the data though?
Each year post 2009 is lower than 61 mt. I’d hardly call that noisy. 2013 will probably be higher than in 2012, but lower than 2009. Drought/heat/fire for 2014 looks rather bad for the region, so it may challenge the 2012 low.
I foresee a new form of denial — crop loss due to climate change denial.
As shown in the graph below, every year pre 2008 is lower than 61 mt as well, according to the USDA. If you want to claim Russian warming is having an impact that shows in the statistics of crop yields in Russia, you’ll need to do some more work than picking two or five data points.
Drought and flood kill crops. Longer summers benefit some crops and cause trouble with others. New technologies and practices increase yield. Farmers pick different crops and will cultivate or leave fallow certain fields based on the price. And I’m 100% certain there are other major confounding factors I didn’t think about on the spot.
IPCC reports indicate we shouldn’t expect worldwide total crop yields to be reduced by warming yet — it’ll be another few years (but in my cats’ lifetime) before the net effect of warming is negative.
r-scrib, be careful—this is what happens when you jump to conclusions using a very limited set of data.
A technical question – why AIC and not BIC? The latter is more conservative AFAIK – hence results are more robust.
And – how many degrees of freedom are in the system – any autocorrelations?
[Response: The autocorrelation is too weak to confirm that it’s non-zero. But I did confirm that it doesn’t negate the stated cases of statistical significance (by estimating an upper bound for its influence). As for AIC vs BIC … one could write a book about that.]
The implications of a quartic trend in the future are Spencerian, which is one strong reason to throw the damn thing in the circular file.
Er yes, but note that Tamino didn’t say anything about extrapolating these trends into the future. That’s a whole different can of worms. Allowing higher-order polynomials lets you fit the data better over the period for which you have data. But it would be foolhardy to assume that on their own they provide a better basis for forecasting.
Think about it this way, what does an excellent fit mean if there is no theoretical underpinning, and it cannot be extrapolated. It is sausage, statistically significant sausage but sausage none the less.
Eli, you might as well ask why anybody bothers graphing anything.
To learn about underlying causal drivers and to forecast the future (occasionally also the past). In both cases this sort of stuff fails.
How “Spencerian”? Speaking of Roy Spencer?
=:) Easter greetings
It is possible for the impact of multiple effects of physical processes operating at different times with varying “weights” (magnitudes) to add up to “look” like a 4th degree polynomial.
But I don’t think it is an exaggeration to say that in the case of climate, pretty much any non-linear fit over a 100 year period is going to be essentially useless for prediction purposes, but as pointed out by Mark, that’s a different ball of wax — and not what Tamino claimed:
Unfortunately, when the usual suspects go into fits, they act like the “fit IS it”: that the precise combination of physical processes that produced the fit will continue ad fititum.
“The fit IS it”
— by Horatio Algeranon
The fit IS it
I am QUITE sure
The trend IS tenth
Degree (or more)
I have a question for anyone who cares to give me an answer. I am math illiterate and I’m trying to understand all the graphs and smoothing but I just don’t have the background. Could anyone point me to a good source for learning how to interpret the mathematical aspects of climate science? I am a musician and I did take a class in Acoustics which helps me to understand things like amplitude, waves, frequencies, oscillations, feedback loops etc. But I really want to understand the math behind the discussions and I don’t know where to start. Oddly enough I understand the different waves and wave forms because I can relate it all back to music and frequencies but that’s about it.
Sorry for being off topic. Tamino, I think you’re a genius on this stuff and a true hero. I would love to get your take on the possibility of the next El Nino if you have the time. Or maybe you’ve already discussed it somewhere and can provide me with a link. Does your name have anything to do with a Mozart opera? Just curious. There is a “Tamino” in the Magic Flute.
“Could anyone point me to a good source for learning how to interpret the mathematical aspects of climate science?”
*Umm, general physics and statistics might go well here. There are no specifically climate-related mathematical methods that I know of. Also, some geographical equations are frequently used in full physical models, I think.
“I have a question for anyone who cares to give me an answer. I am math illiterate and I’m trying to understand all the graphs and smoothing but I just don’t have the background.”
*As you say ‘I am a musician’, the metaphor of climate as a symphony, used e.g. by Professor Maureen Raymo, might do well. The players in the orchestra might be seen as various weather phenomena, and each player’s skill contributes to the overall performance. This might be seen as a metaphor for individual specialists contributing to climate science. If one plays badly then the whole suffers, but the listener can still recognize the piece played if it’s not the lead theme. You’ve possibly heard that ‘climate is noisy’ somewhere and a statistician can help here, so he would be a music teacher or a conductor, to extend the metaphor. ‘Smoothing’ might be as bad as using an autotune or as good as getting a proper singing instructor. Man, I know I’ve made enough errors in stats.
“and I did take a class in Acoustics which helps me to understand things like amplitude, waves, frequencies, oscillations, feedback loops etc.”
*The word ‘frequency’ has a different meaning in statistics than in music, so that might be one source of confusion.
*oscillations work almost the same, but are not necessarily all regular in science,
*feedback loop is about the same, but the amplifiers change their intake power…
*waves… are waves (The Wavewatchers Companion)
Not sure if this is any answer. Maths is needed to get the mathematical aspects of maths and science.
“The word ‘frequency’ has a different meaning in statistics than in music, so that might be one source of confusion.”
It’s a quibble, but no, not really. “Frequency” only came into music via acoustics (probably mid to late 19th century), hence its musical meaning is pretty much congruent with normal physics/math usage. Yeah, that’s somewhat distinct from stats per se:
(Musicians used terms like the English ‘pitch,’ which though well-specified by frequency, didn’t require referencing back to it from a practical standpoint. I do think the conceptual core of ‘musical frequency’ was understood all the way back to Pythagoras–even though the analytic emphasis tended to be on frequency ratios and the mystical significance thereof, rather than on the nitty-gritty physical details of what might be actually vibrating, and how. Cf., the ‘music of the spheres,’ which referred to the ‘harmonious’ ratios of (soundless) planetary motions.)
I thought that in the musical context, frequency referred to how often Britney Spears did it again (oops!) (undoubtedly “very”) which is basically the same meaning as in statistics.
Sufficiently agreeable pop songs aside, Heaven forfend that Britney become the definition of anything musical…
She is already the definition of “randumb number generator” for music videos (which is yet another connection with statistics)
Doc Snow.. your graph has to have a zero *somewhere*.
Zero point energy?
Not sure if this is what you’re looking for, but there are some online courses that might be of interest. One that I took last fall was produced at the University of Chicago (https://www.coursera.org/course/globalwarming). I found it well worth while. Another that I am currently auditing is produced at MIT (https://www.edx.org/course/mitx/mitx-12-340x-global-warming-science-1244#.U1wfzsYSspM). The MIT course is somewhat more mathematical. Involves some calculus and differential equations which I am no longer competent in, but the videos are informative. These will probably be offered again, so check the course catalogs for these (and others that might be related). Also available are courses in general statistical methods.
I believe the University of Chicago course is also available at http://forecast.uchicago.edu/, although it warns something about the site shutting down. Does not seem to be shut down right now. I have done most of it and have the textbook. I also recommend Professor Pierrehumbert’s PRINCIPLES OF PLANETARY CLIMATE and associated problems and Python code at his Web site. That’s truly great stuff and is intended to support self-study. I also recommend Petty’s book A FIRST COURSE ON ATMOSPHERIC RADIATION. There is also Carson’s detailed and technical blog, “The Science of Doom”, at http://scienceofdoom.com/about/, although, for me, the style of Archer and Pierrehumbert are better. I have not looked at the MIT course at all.
I think it would be interesting to do separate analyses on the high peaks and then the negative peaks (valleys), since differing weather processes (and the climate physics behind them) drive those differing parts of the temperature record. That is, what happens if you evaluate those two separate trends?
You can, but if the fact that they are correlated is not taken into account, not all the information is being used. One reason for picking out a single month to examine is that then there’s no seasonality in the signal.
I see hockey sticks! (and dead people?)
Lots to think about here. Including how much using a single location and a single month might affect the expected annual values and variation about them in the underlying distribution. In a way this analysis brings up some of the same issues as analyzing ice extent minimums in the Arctic including the fact of a recent very extreme value.
“What A Carbonful World”
— Horatio’s version of What a Wonderful World (written by Bob Thiele and George Weiss and made famous by Louis Armstrong)
I see trees of brown,
Hockey sticks too.
I see dim gloom
for me and you.
And I think to myself,
what a carbonful world.
I see Hadley CRU,
And sea-ice flight.
The blistering day,
The hot muggy night.
And I think to myself,
What a carbonful world.
The ppm’s of carbon,
Increasing in the sky.
Are warming all the faces,
Of people who will die,
I see storms shaking hands.
Saying, “How do you do?”
They’re really saying,
“I’ll get you”.
I hear Stevies cry,
I watch them blow,
They’ll learn much less,
Than I already know.
And I think to myself,
What a carbonful world.
Yes, I think to myself,
What a carbonful world.
We need Weird Al Yankovic to record that.
It’s not up to Weird Al’s high standard, but here’s my low ($0) budget attempt
Thanks for yet another illuminating piece.
I’m a bit curious about one detail: why use the raw–as opposed to adjusted–data? Anticipating objections from tin-hatters, or something more technical (and perhaps fundamental?)
[Response: Because that’s what was used by Rahmstorf and Coumou, and criticized (in my opinion, not just unfairly but in very nasty fashion) by Roger Pielke Jr. There’s more to come.]
One reason for using AIC over BIC is that AIC asymptotically approaches the Kullback-Leibler divergence, while BIC is more related to Bayesian analyses. Since Tamino is comparing the difference between two distributions, K-L is probably the more relevant quantity.
Somewhat ignorant questions here: Do these results strongly depend upon the assumption that variance around the trend is stable? And then, as a potential follow up, is the variance consistent over time? Or does such a question make much sense with non-linear trends?
[Response: The data don’t demonstrate non-constancy of variance (heteroskedasticity), but it may be present. With only about 133 data points it’s not easy to show, yet there’s no obvious or extreme change of variance. Yes the question makes sense.]
First, a question:
How does one define the statistical significance of a trend in a time series without making assumptions about the spectrum of the process? I have never understood how such a claim is justified and not for want of looking.
[Response: I don’t see what the spectrum has to do with anything. The assumption used for statistical significance was that the noise is white — that’s a statement about its autocorrelation structure, not its spectrum. You could argue that saying “the spectrum is flat” is equivalent to saying “white noise,” but I’d say that’s just a word game.]
Second, some vague discomforts:
Obviously the best fit to a 130 year record is a 130th order polynomial. Your AIC seems to be an attempt to address the obvious problem there. But as the rest of your piece illustrates, there is a plethora of possible models with a low number of parameters, and as Eli points out, the polynomial fit is clearly non-physical.
[Response: So too is the step-function model (in my opinion). But the *question* is: how does the background level in 2010 compare to that in, say, 1900 or 1940? For that, as George Box might say, all the given models are *wrong* but they’re also *useful*. And they all agree (except the linear model, which gives an answer I consider demonstrably wrong).]
And of course, the Moscow time series itself is sort of a cherry pick.
[Response: For evaluating the extremity of the 2010 Moscow heat wave, choosing data from Moscow is a “cherry pick”?]
I think there must be a way to rigorously argue that things started to get wonky around 1998 – that’s my impression too. But I don’t think you’ve gotten there with this yet. And I doubt you’ll get there with a small set of data – this will require big iron.
[Response: I already got there. The fact that it was possible with such a small data set, shows the extremity of the change.]
“The fact that it was possible with such a small data set, shows the extremity of the change.” I agree that something odd happened in Moscow. But it still could be a cherry pick with regard to global change in the sense that Moscow could be an outlier entirely randomly.
More to my real question, dividing it into an underlying trend and an uncorrelated stationary residual is a model that builds certain assumptions in. All of your cases make that presumption.
I can see how this model is useful, but even without questioning it, I still don’t understand how to obtain the statistical significance of the trend.
I guess the null hypothesis is zero-trend (or zero-step) and you are trying to find out how likely that is given the observations and an assumption of no correlation? Leaving aside that this seems wrong to me, how does the quadratic trend get a different answer than the linear trend?
I’d like to know the mechanics of the calculation. As I say, I’ve been trying to wrap my head around it ever since Phil Jones said “not statistically significant warming”.
I have a lot of statistics behind me from communication engineering. My efforts to understand climate statistics have been somewhat stymied by the essentially Bayesian structure of data communication devices. Significance tests always strike me as a bit odd. So far I can find no reasonable explanation of how significance tests apply to trends in time series. For you to say that the spectrum of the signal doesn’t matter is totally alien to the way of thinking I was trained in.
If you could provide me with chapter and verse of what you did or a reasonable exposition I’d be appreciative.
Related, any comment on Lovejoy’s paper “Scaling fluctuation analysis and statistical hypothesis testing of anthropogenic warming”?
I have put the autocorrelation plot, the partial autocorrelation plot, and the lag plots I did for Moscow, referenced above, at:
respectively. For comparison I used BEST, not GISS data. The extreme nature of July 2010 in Moscow remains unchallenged. I don’t see any reason to not treat these data as i.i.d. Sure, there may be short-term correlations over one to three years.
I don’t want to speak for Tamino in any way, but I think his post shows how slippery notions of “trend” can be. For instance, Fyfe, Gillett, and Zwiers, in their 2013 “Overestimated global warming in the past 20 years” use linear trends estimated from HadCRUT4 data to do a two-sample bootstrap comparison of trends against trends from an ensemble of climate models. Why linear? Why especially when HadCRUT4 has many instances of “NA” values in its data? Why not, to throw another possibility into the Tamino mix, a smoothing spline which is then used to calculate point first derivatives? Such could be used, for instance to interpolate over gaps?
There is a systematic way of making these decisions, one which AIC and BIC barely touch, and that is either frequentist model comparison or Bayesian model comparison, like Bayes factors. But there are constraints and rules for each. Burnham and Anderson (2002), for instance, criticize use of significance tests after AIC has been used to choose among models, and also criticize, as they should, reporting of p-values without simultaneously reporting effect size and precision. Bayesian model comparison is another approach altogether. Some, like Burnham and Anderson, feel Bayesian model comparison is more complicated. I respectfully disagree, and refer the student to Kruschke’s 2011 text, Chapter 10.
I should have also said that the Bayesian way of doing this kind of analysis these days is to build into the model all the pieces the student has doubts about, and put priors on their parameters. So, for instance, if there’s a suspicion that a time series might be correlated but the student really doesn’t know, build the model supporting a correlation, and put a prior on the strength of correlation. Then, given the data, find the posteriors for the parameters and that will tell you how much correlation, if any, there is in the data. If the high probability interval for correlation contains zero, there are worthy doubts about it being important.
I also think it’s unfair to claim things like ‘Moscow being a cherry-picked random outlier’. The assessment never claimed any relationship between Moscow and the rest of the world. To bring that in now is, to my mind, going against the groundrules. To bring Moscow in relation to its surround, whether regional or continental, or the world means having a quantification of that relationship. Produce a covariance matrix for the world with Moscow as one row and column, and you have what you ask. But I don’t think it is at all Tamino’s job to produce such a matrix. You asked the question, it’s your job.
Interesting post, thanks Tamino, so, basically you’re saying the amplitude of extreme values may well be used to determine whether a trend is linear or something more, given enough data? Scratch the other comment, if it doesn’t make sense, please.
[Response: No. I’m saying that a change in the background level changes the extremity of a given value *relative to the background level*, therefore changing its likelihood. I’m also saying that there is a demonstrable change in the background level at Moscow, one which can only be estimated realistically by using a nonlinear model … it simply doesn’t follow a straight line.]
thank you for the response *starts reading the blog again*
I would think that if there is any place in the world where you might expect an enhanced heat island effect over the last 30 years it would be Moscow. The end of communism with a corresponding increase in living standards, the development of natural gas and the fact that the place is extremely cold in winter and pretty big would all contribute to increased heating.
So it might be argued that you have indeed detected a non linear trend but it may have other origins than CO2.
That is why using the homogenised values or using a more rural weather series would be more convincing. The Moscow heat wave was not of course restricted to Moscow.
But don’t get me wrong. I am a big fan of the work you do – yours is one of the few blogs that I tend to look at most days and every time I see an update I eagerly read it.
Berkeley Earth sees EXACTLY the same effect Tamino discovered with the GISS data. GISS data may be corrected for heat island effects, but I do not know. BEST certainly is. So, the UHI thing is a red herring.
Wouldn’t it be more appropriate to try to determine trend relative to CO2 (e.g. using CO2 as a proxy for GHGs) rather than to determine the trend in the time domain?
On a separate note, any idea what caused the cooling in the early 1900s? (1900-1915ish)
[Response: The question at hand is, how has a change in the background level affected the likelihood of temperature extremes? For that question, the *cause* of the change in background level isn’t meaningful. I guess this is a question of detection, but not one of attribution.]
Could you give us your opinion on how this might apply to this recent paper?
Timescales for detecting a significant acceleration in sea level rise
Tamino may comment. Dunno. But I’ve put my comment, after reading the paper, here: http://hypergeometric.wordpress.com/2014/04/21/comment-on-timescales-for-detecting-a-significant-acceleration-in-sea-level-rise-by-haigh-et-al/
“I’m saying that a change in the background level changes the extremity of a given value *relative to the background level*, therefore changing its likelihood.”
Ice-out dates in lakes are a very accessible example of this point. Maine lakes show a very defined trend toward earlier ice-out today as compared to the 1800s. As such the ‘background’ for anomalously early or late ice-out dates in the past 30 years (in Maine) is quite different than the background for a 30-year slice during the 1800s. Or to put it another way, the earliest ice-out dates we’ve seen since the 1980s would be ridiculously ‘early’ if they had occurred in the 1880s. And the very ‘late’ ice-out we’re having this year in Maine would not even register as much of an anomaly during the 1800s (the new ‘late’ was the 1850s ‘normal’). I’m curious if the Maine and New England ice-out data set shows a similar non-linear behavior as the Moscow dataset.
You might find this from Gavin over at RC interesting reading: http://www.realclimate.org/index.php/archives/2014/03/the-nenana-ice-classic-and-climate/
Eli Rabbett said
“Think about it this way, what does an excellent fit mean if there is no theoretical underpinning, and it cannot be extrapolated.”
It means there is probably a theoretical underpinning that hasn’t been found.
I have a theory that can be falsified (cf. Popper) “The July temperatures in Moscow can be described by a quadratic model plus noise”. Perfectly good theory. Can be falsified.
Kepler’s laws of planetary motion didn’t have any theoretical underpinning until Newton, and Newton doubted whether his “action at a distance” was a theoretical underpinning. As he wrote:
“Gravity must be caused by an Agent acting constantly according to certain laws; but whether this Agent be material or immaterial, I have left to the Consideration of my readers.”
— by Horatio Algeranon
If temperature change was linear
We’d know what was behind it
The fact that it is sinewier
Requires that we unwind it
Try extrapolating that model a couple of hundred years on either side, which is the hole that Roy Spencer fell into.
Funny you should say this. I just came across a statement in Cowpertwait and Metcalfe’s INTRODUCTORY TIME SERIES WITH R, page 17. I quote: “In the previous section, we discussed a potential pitfall of inappropriate extrapolation. In climate change studies, a vital question is whether rising temperatures are a consequence of human activity, specifically the burning of fossil fuels and increased greenhouse gas emissions, or are a natural trend, perhaps part of a longer cycle, that may decrease in the future without needing a global reduction in the use of fossil fuels. We cannot attribute the increase in global temperature to the increasing use of fossil fuels without invoking some physical explanation because, as we noted in [section 1.4.3], two unrelated time series will be correlated if they both contain a trend.” While a certain amount of naivete and failure to properly recognize climate science as one of those non-falsifiable systems sciences is amusing here, still, it should be noted that the same argument suggests “proofs” that increases in global temperatures due to anticorrelation with increasing use of fossil fuels are equally foolhardy.
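The quoted point about trends and correlation is easy to check numerically. The sketch below is only an illustration with synthetic data: the slope, noise level, series length, and seed are all arbitrary assumptions, and the two series are generated independently of each other.

```python
import math
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 200                   # ASSUMED series length
slope = 0.05              # ASSUMED common trend per time step

# Two independent white-noise series: no relationship by construction.
noise_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
noise_b = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Add the same kind of linear trend to each.
series_a = [slope * t + e for t, e in enumerate(noise_a)]
series_b = [slope * t + e for t, e in enumerate(noise_b)]

r_noise = pearson_r(noise_a, noise_b)
r_trended = pearson_r(series_a, series_b)
print(f"noise only:  r = {r_noise:+.2f}")
print(f"with trends: r = {r_trended:+.2f}")
```

The trended pair correlates strongly even though nothing links the two series, which is why correlation alone settles nothing in either direction and a physical explanation has to do the work.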
“We cannot attribute the increase in global temperature to the increasing use of fossil fuels without invoking some physical explanation” .
You may not. I can.
Before I knew any physics, I found that if I let go of my rattle, it fell straight to the ground and a perfectly good theory was created. Later I discovered it didn’t work the same on a garden swing. The theory needed modification (or was it falsified and replaced?)
“non-falsifiable systems sciences”. Are you referring to something like the conservation of energy? When it’s falsified, it is resurrected by the invention of another form of energy.
I suppose climate models are a bit similar, we believe they should work and take their results as guides even though they are wrong (i.e. the models are falsified?).
The effort does go on to make them give better results. We know the CMIP5 models have missing feedbacks (e.g. the effects of wildfires) so I guess/judge they are really quite optimistic. Tamino’s post here reinforces my judgement: things are worse than IPCC AR5 says.
Perhaps there is a “physical basis” here: The models are missing feedbacks so climate change is greater than they predict. I guess/judge this post supports this view.
When you dig down science still relies on guesses and judgement. (cf. Feyerabend?) The trick is to make that judgement good judgement.
That “climate science is systems science” is a characterization by many, but most notably the late Professor Stephen Schneider. You’ll find it in many of his writings and talks. Falsifiability is a property of components of such comprehensive explanations. It’s not realistic to expect there’ll be sufficient statistical power (in the formal sense) to falsify the entire edifice in a single experiment or related series of experiments.
Moreover, as Kharin noted in 2008, there is “but one observational record in climate research”. (See http://www.atmosp.physics.utoronto.ca/C-SPARC/ss08/lectures/Kharin-lecture1.pdf.) What this means, if readers will allow the thought experiment, is that if we could freeze the state of the entire Earth climate system in an initialization at some time in the past, restore it, and then run it forward until now, the resulting state would be different in each run. Yet the observational record corresponds to the snapshots of states from one such run. This means a couple of things. One is that a frequentist, significance test-oriented way of distinguishing plausible explanations from implausible is on really weak footing here, since there is no “repeated sampling in the limit” to go on. That means Bayesian approaches are preferable. Secondly, the observational record needs to be seen for what it is, a sequence of rolls of a particular set of weighted dice, not the inexorable working out of only deterministic laws. It is the inexorable working out of stochastic laws but, then, observations of such walks need to be seen for what they are.
OK, first, Feyerabend was an idiot. Second, nobody has ever said that the primary evidence for climate change was the rising temperature. The prediction of warming predates its observation by nearly a century. Third, we do not expect the rise in temperature to overwhelm natural variability–if your authors are so dim that they can only look at a single forcing, I’d trash the book. Fourth, rising temperature is only one of many predictions of global climate models as a result of rising CO2. Fifth, nonfalsifiable, my rosy red ass. I really wish folks would realize that falsifiability is only one tenet of science and not the whole ball of yarn.
Snarkrates ” I really wish folks would realize that falsifiability is only one tenet of science and not the whole ball of yarn.”
I agree with the idiot, Feyerabend. If you wrote down the “tenets of science”, it wouldn’t be long before you changed your mind about them.
Hypergeometric “climate science is systems science”
I have never liked “systems science”. I think it’s mostly academic waffle intended to carve out a new “discipline” to advance careers. That doesn’t mean to say I sneer at the good work in those areas of research “systems science” tries to capture.
“falsifiability” OK, I’ll come clean. I don’t like Popper’s “Scientific Method”. OK on first read but falls apart on the second. And of course there is “The only statements that are meaningful are those that are falsifiable” which falls at its own hurdle. But it is a convenient short form.
My advice (OK that’s pompous): forget philosophy and try and think straight.
Popper’s approach is correct as far as it goes. It just applies to relatively simple hypotheses and theories, not complex theoretical frameworks for complicated systems where there are already mountains of evidence supporting the tenets of the theory. It is exceptionally unlikely that one will falsify evolution. Rather, our understanding of evolution will modify as we accumulate evidence. This drives the creationists crazy. They’d rather believe that all they have to do is find one more gap in the fossil record and the whole edifice of the theory will come tumbling down. So Popper was not wrong, just incomplete.
Feyerabend was “not even wrong”. Science is not purely a social construct. There are aspects of it that are simply part of our psychology as human beings. There are aspects of it that are simply mathematically and logically correct. Finally, it works. It has changed human society beyond recognition since it was introduced by Francis Bacon. To dismiss this as a social construct equivalent to any other is to deny the reality that science is the most revolutionary methodology humans have ever devised.
I agree with most of that but
“To dismiss [science] as a social construct equivalent to any other”
What “other” social constructs were you thinking about?
“science is the most revolutionary methodology humans have ever devised.”
Now all we have to do is decide What is this thing called science?
P.S. Earlier, I of course meant Popper’s “The Logic of Scientific Discovery”.
While I agree that there are some subtle aspects to defining what is and is not science, I think most folks who do science, are at least familiar with several branches of science, and have given the matter a degree of consideration would agree that scientific method requires certain characteristics–an initial empirical investigation, preferably guided by relevant theory, development of a model/theory, nontrivial predictions by that theory or those theories, verification of those predictions, refinement of the theory, and so on. There must also be a way of estimating and controlling the sources of experimental error. Repeatability is one way of doing this, although it is not always possible for observation of complex systems. Science isn’t as mysterious or as arbitrary as charlatans like Feyerabend contend, nor is it as inflexible as some closed-minded physicists (e.g. Sheldon Cooper shouting out that geology isn’t a real science) seem to believe.
That’s not a theory.
It’s not even a hypothesis.
It’s nothing more than wishful thinking.
— by Horatio Algeranon
The temperature extremes
Are really not that bad
They’re just a little higher
Than Moscow’s lately had
Kepler’s motion notion, as extrapolated, continued to fit fairly well with observations — it persisted — unlike the usual wiggle-matching.
While I was reading this story,
it occurred to me that it should be possible to tease out and trend the signal of advancing springtime from the seasonal oscillations of Mauna Loa Observatory CO2 data.
Does anyone know of a published attempt to do this?
So the linear model is most likely wrong because the 2010 outlier is too far out. Alternative models don’t suffer from this fault, but offer no predictive power.
I wonder if the same analysis applies to Perth rainfall? It has been proposed that our continuing rainfall decline has occurred as a series of step-changes since the 1970s.
We average climate over 30 year periods. Given the rate of climate change, is a 30 year climate baseline still workable? Do we need to develop nonlinear measures of climate to provide a basis of engineering design and public policy?
Since the 30-year rule is based on noise levels for temperature time series, as climate change accelerates, we will likely start seeing its signature on shorter timescales. I think this is a problem that will take care of itself as long as people understand the data.
It seems to me that in attempting to discern “climate” within a temperature (or rainfall) series, a series of step changes is as good as anything else. The climate “normals” used by meteorologists are 30-year averages that are allowed to change only once a decade: a peculiarly constrained form of step function.
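A small sketch makes the step-function character of the normals concrete. The updating convention coded here (a 30-year window ending in the most recent year ending in 0, held fixed for the following decade) is my reading of common practice and should be taken as an assumption, and the data are a made-up noiseless ramp.

```python
def normal_period(year):
    """Return (start, end) of the 30-year window in force for `year`.

    ASSUMED convention: the window ends at the most recent year ending in 0
    that is strictly before `year`, so 1991-2020 applies to 2021-2030.
    """
    end = ((year - 1) // 10) * 10
    return end - 29, end

def climate_normal(series, year):
    """Mean of `series` (a dict of year -> value) over the window for `year`."""
    start, end = normal_period(year)
    window = [series[y] for y in range(start, end + 1)]
    return sum(window) / len(window)

# HYPOTHETICAL data: a steady rise of 0.02 units per year, no noise.
series = {y: 0.02 * (y - 1950) for y in range(1950, 2031)}

for year in (2010, 2011, 2020, 2021, 2030):
    start, end = normal_period(year)
    print(f"{year}: normal from {start}-{end} = {climate_normal(series, year):.2f}")
```

Even with a perfectly smooth underlying rise, the "normal" sits flat for ten years and then jumps, and it always lags the current background because the window is centred some 15 to 25 years in the past.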
The more I keep thinking about this blog entry, the more I come to a couple of conclusions:
1. Describing Moscow in the context of the whole Earth has some of the elements of describing a single string of, say, 20 coin flips from a pool of many thousands of strings of 20 coin flips. What suffices to describe–not model–the former simply does not suffice to describe–or model–the latter ***even if there is a very slow bias occurring over time in all the coins***. Distributions such as even 20 heads are totally expectable given enough distributions to examine.
2. Obviously the observations in Moscow correlate well with some larger region, but that correlation does not extend to the entire planet.
3. Here I’m less sure but I suspect that the results of modeling all of the strings of observation for all cities which have significant step functions would likely show a trend if one plotted all the midpoints of all the identified steps. This would seem to make that approach an especially complex way of doing standard regression.
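For point 1, the arithmetic behind "expectable given enough strings" can be sketched as follows. It assumes fair, independent coins, which is a simplification of the slow-bias case described above, and the pool sizes are arbitrary examples.

```python
import math

def prob_all_heads(flips=20):
    """Probability that one string of fair coin flips is all heads."""
    return 0.5 ** flips

def prob_at_least_one(strings, flips=20):
    """Probability that at least one of `strings` independent strings is all heads.

    Computes 1 - (1 - p)**strings using log1p/expm1 to keep precision
    when p is tiny.
    """
    p = prob_all_heads(flips)
    return -math.expm1(strings * math.log1p(-p))

def strings_needed(target, flips=20):
    """Smallest number of strings giving at least `target` chance of one all-heads run."""
    p = prob_all_heads(flips)
    return math.ceil(math.log1p(-target) / math.log1p(-p))

for pool in (10_000, 100_000, 1_000_000):   # ARBITRARY example pool sizes
    print(f"{pool:>9,} strings: P(at least one all-heads) = {prob_at_least_one(pool):.4f}")

print(f"strings needed for a 50% chance: {strings_needed(0.5):,}")
```

Under these assumptions, how surprising a single extreme string is depends heavily on how many strings were available to choose from, which is the distinction between describing one station and describing the whole pool.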
for skylantec: your question in Google Scholar:
There’s a new report in SCIENCE http://dx.doi.org/10.1126/science.1249534
Faster Decomposition Under Increased Atmospheric CO2 Limits Soil Carbon Storage
Kees Jan van Groenigen, Xuan Qi, Craig W. Osenberg, Yiqi Luo,
Bruce A. Hungate
Soils contain the largest pool of terrestrial organic carbon (C) and are a major source of atmospheric carbon dioxide (CO2). Thus, they may play a key role in modulating climate change. Rising atmospheric CO2 is expected to stimulate plant growth and soil C input but may also alter microbial decomposition. The combined effect of these responses on long-term C storage is unclear. Combining meta-analysis with data assimilation, we show that atmospheric CO2 enrichment stimulates both the input (+19.8%) and the turnover of C in soil (+16.5%). The increase in soil C turnover with rising CO2 leads to lower equilibrium soil C stocks than expected from the rise in soil C input alone, indicating that it is a general mechanism limiting C accumulation in soil.
I think people are missing the point of this post. I think what Tamino is trying to get at is what we can discern from extreme events. Extreme events are by definition rare. They are a statistical gift that gives us rare information. We need to find ways to incorporate that information into our models. I don’t think we should get hung up on whether the series is really quartic, or for that matter exponential. I don’t even think that it matters all that much whether the dataset is homoscedastic–there could be forcings active at one point of the data series that are not active throughout (e.g. volcanic eruptions, fossil-fuel aerosols from 1945-80 or so), but as those forcings are negative, that won’t change the character of the maxima. I think that what the analysis is telling us is that unless you can find a forcing active in the early 21st century that is strongly positive (other than ghgs), it is exceedingly unlikely that the underlying trend is linear.
Attempting to comment from my handheld.
Congrats, as always, to Horatio.
In the last 10 years we have seen loss of Arctic sea ice that has affected the behavior of the jet stream. In less than 10 years, we have seen changes in the jet stream that should affect our expectations of weather, and that is climate.
Loss of ice in the Arctic is not noise – that ice is not coming back until we have global cooling – and that was a sea change in expectations that happened in only 5 years. In 2007, the ice melted – and all the climate folks said, “Look at the models, it will be back!” Five years later they were saying, “Well, maybe not.” In 5 years the expectations, and hence the climate, changed.
Now that the jet stream is moving, what climate does an engineer designing durable infrastructure use as a basis of engineering?
Regardless of what you think of fourth order polynomials at Moscow, it’s interesting that if you fit one to the global temperature series, it neatly bisects RCP4.5 and 8.5 at 2050:
Speaking of trends Forbes and Heartland have found a couple of linear ones they like. Nineteen years is good. And 83 years! I’m sure they just picked those start dates randomly.
Don’t want to know… or rather, pretty much already do.
Hi everyone, I’m sorry for my misplaced intervention. I’m a reader of the blog and I’m studying “Understanding Statistics” (something I’d really recommend to anyone looking for a well-written and clear introduction to statistics!) and I’d really like to contact the author about a passage in which I feel like I’m reading something different from what I usually find in the scientific literature (and I feel like the author is right, but I couldn’t explain exactly why). Is this the right place to ask? (I don’t think so.) Is there an email address for the author somewhere around here?
Sorry again and thank you all.
[Response: Ask away.]