# Questions for Cliff Mass

In response to my post, you said my analysis is “very weak” and that I used “an inappropriate statistical approach that did not consider the variability of the time series.”

The Theil-Sen trend estimate is robust; in fact, it’s what statisticians call “non-parametric” (although I prefer the name “distribution-free”). Such methods are specifically designed to account for the variability of the time series. What is your basis for stating otherwise?

What method would you prefer? What method did you use?
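For readers who haven't met it, the Theil-Sen slope is the median of the slopes between all pairs of points, which is why a single wild value barely moves it. Here is a minimal sketch in plain Python. The numbers are made up for illustration (a straight line with one outlier), not the wildfire data, and the function names are my own.

```python
from itertools import combinations
from statistics import median


def theil_sen_slope(x, y):
    """Median of the slopes over all pairs of points with distinct x."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    return median(slopes)


def ols_slope(x, y):
    """Ordinary least-squares slope, for comparison only."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    return sxy / sxx


# Illustrative data: y = 2x + 1, except the last value is a gross outlier.
x = list(range(10))
y = [2 * v + 1 for v in x]
y[-1] = 100

robust = theil_sen_slope(x, y)   # stays at the underlying slope of 2
least_squares = ols_slope(x, y)  # pulled upward by the outlier
print(robust, least_squares)
```

Most of the 45 pairwise slopes in this toy series are exactly 2, so the median ignores the outlier while the least-squares slope does not.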

### 12 responses to “Questions for Cliff Mass”

1. Mass defines himself as a meteorologist, not a climatologist, and so presents the confabulated notion that no fewer than 30 years of weather can be allowed to define climate. Tenaciously clinging to this oversimplification, he seems to think climate studies should never process weather data any sooner than 3 decades ago. Ask him.

2. The other non-parametric way of seeing that the trend is significant is by considering the area under the cited curve.

Suppose this is a two-dimensional cross-section of a homogeneous solid resting upon a thin, flexible plate. The abscissa is the projection of that plate onto the plane of the figure.

With sufficiently small stiffness in the plate (sufficiently low Young's modulus, or sufficiently high density of the material in the solid), Kirchhoff-Love plate bending theory for a non-uniform transverse load applies. The resulting magnitude of displacement is an indicator of significance.

In particular, the corresponding p-value can be obtained by subdividing the abscissa into, say, 100 small uniform intervals and doing the plate displacement calculation per the above, with the mass above each interval taken from the figure. Call the maximum displacement seen on the right side of the figure $D_{\text{ref}}$. Then take the total amount of mass and reallocate it at random many times, each time noting the maximum displacement $D_{i}$ on the right side of the figure compared to the left. The p-value is the fraction of those reallocations for which $D_{i}$ is at least $D_{\text{ref}}$.
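The reallocation scheme above is a permutation test, and its skeleton can be sketched without the plate mechanics. In the sketch below the Kirchhoff-Love displacement is replaced by a much cruder stand-in statistic, the first moment of the load about the midpoint, which is positive when mass sits to the right. That substitution, the choice to "reallocate" by shuffling the per-interval masses, and all function names are assumptions made for illustration only.

```python
import random


def right_moment(masses):
    """First moment of the load about the midpoint (stand-in for displacement)."""
    center = (len(masses) - 1) / 2
    return sum(m * (i - center) for i, m in enumerate(masses))


def permutation_p_value(masses, n_perm=5000, seed=0):
    """Fraction of random reallocations whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = right_moment(masses)
    shuffled = list(masses)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reallocate the same total mass at random
        if right_moment(shuffled) >= observed:
            hits += 1
    return hits / n_perm


# Illustrative loads only: one steadily rising, one perfectly flat.
rising = list(range(1, 21))
flat = [5] * 20
print(permutation_p_value(rising), permutation_p_value(flat))
```

Swapping `right_moment` for a real plate-bending solver would leave the rest of the procedure unchanged.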

3. David Graves

The comments on his post are largely thoughtful and measured, unlike his comment on your post. So far nothing on the post-2016 area burned and its impact on his hypothesis, and of course nothing to back up the “weak statistics” rejoinder. And my comment to that effect hasn’t been posted (as yet?).

4. Alex C

Any variety of suitable GLMs show significant trend terms too. Poisson regression, gamma, negative-binomial, doesn’t matter. They’re all significant without the past 2 years’ worth of preliminary data (2017’s total number is at 1,248,606 acres, 2018’s total number is at 749,770 acres; both, if anything, will increase as more fires happen/are counted), and they’re of course more significant with them included. The trends are significant despite 1987 being a locally high value at 873,000 acres. The years prior are:

1986 – 119,000
1985 – 595,000
1984 – 251,000
1983 – 128,000
1982 – 160,000
1981 – 322,000
1980 – 403,000

These are all quite considerably less than the most recent 7 years’ worth of data.

Also the statement:

“an inappropriate statistical approach that did not consider the variability of the time series”

is pure nonsense for the way Mass obviously intends it. His allegation is that the time series is so variable that the significance is not reliable. This is a fundamental misunderstanding of null hypothesis significance testing: no matter how variable the data are, no matter how short the time series is, the sampling distribution f(x) of the test statistic x can be explicitly defined ahead of time. For any “significance level”, you can precisely define the two regions, which represent “significant” and “non-significant”, and whose bounds {b} in the test statistic domain are chosen such that f(b1) = f(b2); and the integral of f(x) over the “significant” region is whatever your significance level was, say 0.05.

If the model and null hypothesis are correct, then there is ALWAYS a 5% probability of the test statistic falling within that region of significance corresponding to the 5% significance level, by definition. It is never more likely to occur, nor less likely. When we have a result that is significant at the 5% level, that is a precise statement without any uncertainty. P-values do not have any uncertainty.

The ONLY thing that needs to be watched for is not the size of Var(y|X), but whether or not the actual assumptions of the model hold. If the assumptions of the model structure are unreasonable, then a different model should be chosen. For instance, OLS is not entirely appropriate because heteroskedasticity is violated, and also strictly because the data are non-negative.

Mass may be concerned about an OLS estimator not being very precise (it is still unbiased), but actually that leads to no problems with the interpretation of the significance level. The high variance in the data will be reflected both in the parameter estimate itself and also its standard error.
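The point about variability can be checked by simulation. The sketch below generates trend-free series of independent Gaussian noise at two very different noise levels, applies the usual t-test to the OLS slope, and counts how often the test rejects at the 5% level. The 30-year length, the noise levels, the seed, and the function names are all assumptions chosen for illustration; the critical value 2.048 is the two-sided 5% point of Student's t with 28 degrees of freedom, so it only matches series of length 30.

```python
import math
import random

T_CRIT_28 = 2.048  # two-sided 5% critical value, Student t, 28 degrees of freedom


def slope_t_statistic(y):
    """t statistic of the OLS slope of y regressed on time index 0..n-1."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    rss = sum((v - intercept - slope * i) ** 2 for i, v in enumerate(y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err


def false_positive_rate(sigma, n_sims=4000, seed=1):
    """Share of trend-free 30-point series declared significant at the 5% level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        y = [rng.gauss(0.0, sigma) for _ in range(30)]  # null is true: no trend
        if abs(slope_t_statistic(y)) > T_CRIT_28:
            rejections += 1
    return rejections / n_sims


rates = {sigma: false_positive_rate(sigma) for sigma in (1.0, 1000.0)}
print(rates)
```

Both rates should land near 0.05, because the t statistic does not change when the noise is rescaled. This only illustrates the case where the model assumptions hold, which is exactly the caveat made above.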

• Alex C

To clarify:

“and whose bounds {b} in the test statistic domain are chosen such that f(b1) = f(b2), for all b1, b2 in {b}”

• Alex C

Just adding on to the edits: *homoskedasticity* is violated in this data. The data are heteroskedastic.

• @Alex C,

I’m a Bayesian, but, nevertheless, I say APPLAUSE! to your comment.

Thanks!

• rhymeswithgoalie

I always say Bayesians are OK, as long as you don’t let them feed directly from your hand.

5. Better than my suggestion above: see “Wildfires are burning longer and hotter each year”, by Erin Ross. Also “Impact of anthropogenic climate change on wildfire across western US forests”, by Abatzoglou and Williams.

6. Cliff Mass replied to someone on his blog. Apparently, he’s now done the analysis and found the trend in area burned to be not significant at the 95% confidence level. He claims that a trend is not significant if there is a 5% or higher chance of its occurring naturally.

• I put in the following comment at Mass’ blog:

@Cliff Mass, re Casola, et al, 2009, or https://journals.ametsoc.org/doi/pdf/10.1175/2008JCLI2612.1,

One doesn’t need to assume a t distribution to measure variability intrinsic to a series, nor make an assumption based upon some estimated appropriate window size. The Politis and Romano stationary bootstrap permits estimation of such variability in the series directly, and it operates at varying window sizes.

Moreover, positing the linear trend as a competing model is setting it up as a strawman. Who would claim, for instance, that the year-to-year effect of warming upon any phenomenon would be linear? Even temporal acceleration is likely to vary. That the portion of variance explained by a linear model is small says less about warming than about the specification error of such a model. To be convincing, the argument should bound such specification error.

Finally, the null used by Casola, et al 2009, or even Stoelinga, et al 2010, which is, per your claim, the basis of your analysis, is not in itself an uncorrelated random process. If it is not, the framework of significance testing is wildly inappropriate, even assuming it is ever appropriate. Bayesian methods are known to be superior to any application of the Student t.
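For anyone curious about the stationary bootstrap mentioned in the comment above, here is a minimal plain-Python sketch. It resamples the series in blocks whose lengths are geometrically distributed, wrapping around the end of the series, and uses the resamples to estimate the standard error of the mean. The mean block length, the toy series, the seed, and the function names are assumptions for illustration, not anything taken from Casola et al. or from Mass's analysis.

```python
import random
import statistics


def stationary_bootstrap(series, mean_block_len, rng):
    """One stationary-bootstrap resample with geometric block lengths."""
    n = len(series)
    p_new_block = 1.0 / mean_block_len
    idx = rng.randrange(n)
    resample = []
    for _ in range(n):
        resample.append(series[idx])
        if rng.random() < p_new_block:
            idx = rng.randrange(n)  # start a new block at a random position
        else:
            idx = (idx + 1) % n     # continue the block, wrapping circularly
    return resample


def bootstrap_se_of_mean(series, mean_block_len=4, n_boot=2000, seed=0):
    """Standard error of the sample mean estimated from bootstrap resamples."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(stationary_bootstrap(series, mean_block_len, rng))
        for _ in range(n_boot)
    ]
    return statistics.stdev(means)


# Toy series with visible serial dependence (illustrative values only).
toy = [1, 2, 4, 5, 4, 2, 1, 2, 4, 6, 5, 3, 1, 1, 3, 5, 6, 4, 2, 1]
print(bootstrap_se_of_mean(toy))
```

Setting `mean_block_len` to 1 reduces this to the ordinary independent bootstrap, and larger values preserve more of the serial dependence within each block.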

7. Philippe Chantreau

I note that Cliff Mass is not providing an answer, yet it is a really good question. What statistical approach would satisfy Mr Mass? Readers versed in stats perhaps can provide suggestions and we’ll see how they pan out. I’m thinking that all will, once again, converge in the same direction where the rest of the evidence is exerting a black hole type of attraction.