In a comment on the last post, it was mentioned that the frequency of earthquakes of any given magnitude **or greater** will be given by the Gutenberg-Richter law. It states that the expected number of earthquakes in a given region over a given span of time, of a given earthquake magnitude or greater, will be

log₁₀ N = a − bM,

where M is the quake magnitude, a and b are constants, and N is the expected number. For active regions, the constant b usually has a value near 1.

We can use the data referred to in the previous post to investigate whether or not the number-magnitude relationship follows the Gutenberg-Richter law. The most straightforward way to fit this equation to the data is to compute the number of quakes with magnitude greater than or equal to each possible value (magnitudes are given to the nearest 0.1), log-transform those counts, then fit a line by least-squares regression. I split the data into two segments: quakes prior to 2009 and those after. Although the second time span is much shorter (only about 3 years compared to 33), it has so many more quakes *per year* that the numbers aren’t nearly so skewed. The first time span has a total of 779 quakes, the second has 340. Here are the least-squares fit lines to the log-transformed data:
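A minimal sketch of this least-squares procedure, using hypothetical synthetic magnitudes in place of the actual catalog (which isn't reproduced here), with a true b of 1:

```python
import numpy as np

# Hypothetical synthetic catalog: magnitudes above 3.0 drawn from the
# exponential distribution implied by Gutenberg-Richter with b = 1,
# rounded to the nearest 0.1 as in the real data.
rng = np.random.default_rng(0)
b_true = 1.0
mags = np.round(3.0 + rng.exponential(scale=1 / (b_true * np.log(10)), size=779), 1)

# Cumulative counts: number of quakes with magnitude >= each observed level.
levels = np.unique(mags)
counts = np.array([(mags >= m).sum() for m in levels])

# Least-squares line through the log-transformed counts: log10 N = a - b M,
# so the (negated) slope estimates b and the intercept estimates a.
slope, intercept = np.polyfit(levels, np.log10(counts), 1)
b_est, a_est = -slope, intercept
print(round(b_est, 3))
```

As the post goes on to argue, this estimate is statistically dubious (autocorrelation and heteroskedasticity), even though the recipe itself is simple.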

There seems (visibly) to be some misfit to the data, especially that before 2009. This is even more visible if we compare the fit curve to the actual data prior to 2009:

We can do the same for the data after 2009, for which the fit looks better:

The estimated parameter b is 1.015 for the pre-2009 data and 1.018 for the post-2009 data, and the estimated uncertainties for those figures don’t indicate any statistically significant difference in its value before and after 2009.

Nonetheless, for both time spans the expected number at low quake magnitudes is a bit higher than the observed number, and the fit is poorest for low magnitude values. Does that mean the Gutenberg-Richter law isn’t working?

Not necessarily. This is one of those cases for which least-squares regression isn’t such a good choice. For one thing, the data are sums of all counts for magnitude greater than or equal to the given value, so they’ll exhibit very strong *autocorrelation*. For another, they’ll show extremely strong *heteroskedasticity*, i.e., the variances of the different values will be dramatically different. If, for instance, the data were governed by the Poisson distribution (they aren’t, but at least that’s “in the ballpark”), then the standard deviation when the count is as large as 779 would be about 28 times greater than the standard deviation when the count is only 1.
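That factor of 28 follows directly from the Poisson property that the standard deviation is the square root of the mean:

```python
import math

# For a Poisson variable, sd = sqrt(mean): a count of 779 has
# sd = sqrt(779) ~ 27.9, versus sd = 1 for a count of 1.
ratio = math.sqrt(779) / math.sqrt(1)
print(round(ratio, 1))  # → 27.9
```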

Log-transforming the data doesn’t help. It does reverse the “variance problem” — now it’s the low values that have much greater variance than the high values. Hence in least-squares regression, the low counts (for high-magnitude quakes) get too much statistical weight and the high counts get too little. The fit is therefore too much dominated by low counts and that’s why it’s poorest at low magnitude levels.

So I tried a different approach. I modelled the data as a Poisson process with expected value given by the Gutenberg-Richter law. It isn’t — the variance is too high — but at least that’s a better approximation than a model where the observed number is given by the Gutenberg-Richter law plus noise which follows the normal distribution. Then I fit this model to the data by maximum-likelihood. That gives this fit for data prior to 2009:

And this fit for data after 2009:

These fits are *much* better. Now the estimated b values are 0.892 for the pre-2009 data and 0.998 for the post-2009 data. Furthermore, the indicated uncertainties suggest that the difference is statistically significant, strongly so.
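The post says the maximum-likelihood fit was done in R with `optim`; here is a hypothetical Python sketch of the same idea on synthetic data, treating each cumulative count as Poisson with mean 10^(a − bM) and maximizing the Poisson likelihood numerically:

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic illustration (not the actual catalog): magnitude levels in
# 0.1 steps, with counts drawn as independent Poisson variables whose
# means follow the Gutenberg-Richter law with a = 5, b = 1.
mags = np.arange(3.0, 6.1, 0.1)
a_true, b_true = 5.0, 1.0
rng = np.random.default_rng(1)
counts = rng.poisson(10 ** (a_true - b_true * mags))

def neg_log_lik(params):
    a, b = params
    mu = 10 ** (a - b * mags)
    # Poisson log-likelihood up to an additive constant: sum(y*log(mu) - mu)
    return -(counts * np.log(mu) - mu).sum()

fit = minimize(neg_log_lik, x0=[4.0, 0.8], method="Nelder-Mead")
a_hat, b_hat = fit.x
print(round(b_hat, 3))
```

Note that in this sketch the counts at different levels are independent, whereas real cumulative counts are not, which is exactly the autocorrelation caveat raised above.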

In my opinion, an even *better* idea is to fit a similar model to the actual counts *at* a given magnitude, not to the cumulative counts for quakes of that magnitude or greater. This eliminates most of the autocorrelation problem. It gives estimates for b of 0.812 for the pre-2009 data and 0.952 for the post-2009 data, and again indicates that the difference before and after 2009 is statistically significant.
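The per-magnitude variant changes only the mean function: the expected count in each 0.1-magnitude bin is the difference of adjacent cumulative expectations. A hypothetical sketch on synthetic data:

```python
import numpy as np
from scipy.optimize import minimize

# Bin edges in 0.1-magnitude steps; the expected count *within* a bin
# under Gutenberg-Richter is the difference of the cumulative
# expectations at its two edges.
edges = np.arange(3.0, 6.1, 0.1)

def bin_means(a, b):
    cum = 10 ** (a - b * edges)   # expected count >= each edge
    return cum[:-1] - cum[1:]     # expected count within each bin

# Synthetic bin counts with a = 5, b = 1 (illustrative, not the catalog).
rng = np.random.default_rng(2)
counts = rng.poisson(bin_means(5.0, 1.0))

def neg_log_lik(params):
    mu = bin_means(*params)
    return -(counts * np.log(mu) - mu).sum()

b_hat = minimize(neg_log_lik, x0=[4.0, 0.8], method="Nelder-Mead").x[1]
print(round(b_hat, 3))
```

Because the bins don't overlap, the counts are genuinely independent under the model, unlike the cumulative version.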

I don’t see sufficient visual evidence to suggest that these numbers are deviating from the Gutenberg-Richter law. If I were doing this for publication, I’d do a direct test of the observed numbers against the theoretical distribution, probably using the Kolmogorov-Smirnov test. But at this point I’m feeling lazy, so I’ll just say “not sufficient visual evidence” and leave it at that.

Bottom line: it seems the data follow the Gutenberg-Richter law (at least approximately) both before and after 2009, but the parameter b is significantly different between these two time spans.

It might be worth noting that the overdispersion (too-high variance) won’t affect the fit of this model. Confidence intervals will be too narrow, but the ML estimate for the mean will be the same for a Poisson model and a quasi-Poisson model.

ps: I normally wouldn’t comment on spelling, but since it’s a technical term: heteroskedasticity is missing the “o”.

[Response: Oops. Fixed.]

The first one is a log-log plot, which is notorious for always being linear and is why physicists like scaling laws (guilty!)

A larger value for “b” means that the number of high-intensity quakes decreases faster. Since your post-2009 “b” (0.952) is significantly higher than the pre-2009 value (0.812), this would suggest that the difference is probably man-made, but that the effect is biased towards creating more low-intensity quakes.

It would be interesting to know what the implications are for 7.0, 8.0, and 9.0 quakes based on your best fit values of “a” and “b”.

Moment magnitude 7.0 and up earthquakes are said to be major. Moment magnitude 8.0 and up earthquakes are said to be great. There is no accepted term for moment magnitude 9.0 and up earthquakes, but I suggest gargantuan or stupendous.

There are relatively few major earthquakes worldwide and even fewer great earthquakes. When we consider stupendous earthquakes, there have been but 5 since 1900, and one of those was maybe just 8.8, not 9.0. Plotting the counts of measured major earthquakes on a semilog plot together with the smaller ones, one discovers that right around moment magnitude 7.0 there is a definite break in the slope. Major earthquakes are more frequent than the Gutenberg-Richter law for the smaller ones would predict.

Stated another way, the distribution has a heavy tail up to the largest earthquake physically possible, seemingly moment magnitude 9.6 without assistance from a gigantic bolide strike.

I realize that, but it’s the 6+ quakes, especially the 7+ quakes that cause the most damage. I don’t think I’ve ever heard of a quake under 6.3 or so that ever caused noteworthy damage. So it would appear to stand to reason that it’s the 7+ (or so) range that we should be most concerned about.

Based on the data provided by tamino, 779 3+ quakes in 33 years with b=0.812 implies about 779/33*10^(-0.812*4) = 0.0133 7+ quakes per year based on pre-2009 data. With 340 quakes in 3 years and b=0.952 this becomes 340/3*10^(-0.952*4) = 0.0176 7+ quakes per year, an approximately 33% increase. If you extend this to 8+ quakes then you get about 0.002 quakes per year for both cases.
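Those back-of-the-envelope figures check out. Each one scales the observed yearly rate of 3+ quakes down by 10^(−b·4), i.e. extrapolating four magnitude units up from M 3 to M 7:

```python
# Yearly rate of M >= 7 quakes implied by each fit: observed yearly
# rate of M >= 3 quakes, scaled down by 10^(-b * 4).
rate_pre = 779 / 33 * 10 ** (-0.812 * 4)   # pre-2009: b = 0.812
rate_post = 340 / 3 * 10 ** (-0.952 * 4)   # post-2009: b = 0.952
print(round(rate_pre, 4), round(rate_post, 4))  # → 0.0133 0.0176
```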

So it would appear, based on these admittedly back of the envelope calculations, that there would be a significant, but I think not overwhelming, increase in 7.0 to 8.0 earthquakes, a very marginal decrease in extremely rare “great” earthquakes, but a massive increase in small 3.0 to 5.0 quakes and a near doubling of 6.0 to 7.0 quakes.

This is important because if the pre-2009 “b” is equal to the post-2009 “b” then an increase from 24 to 113 3+ quakes per year would imply an almost 5-fold increase in earthquakes at all intensities. This is certainly much worse than the case where “b” has increased.

If this is done in R with glm, you can replace ‘poisson’ with ‘quasipoisson’ to get a more reliable inference. Data are almost never going to be Poisson, and using a Poisson model makes Type I errors almost inevitable.

[Response: I did it in R but used the “optim” function (which I regard as “kick-ass”).]

I guess it depends on how careful you are. I’d rather reuse code than make a whole lot of mistakes replicating something someone else has already done!

I can never remember if you multiply the uncertainties by the estimated overdispersion, or by its square root. But I figure that glm already does it correctly, so I don’t have to worry about that.
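For the record, it’s the square root: quasi-Poisson assumes the variance is the dispersion φ times the mean, so estimated variances pick up a factor of φ and standard errors a factor of √φ. A tiny illustration with hypothetical numbers (the 0.05 standard error and φ = 4 are made up):

```python
import math

# Quasi-Poisson: var = phi * mean, so standard errors scale by sqrt(phi).
phi = 4.0                  # hypothetical estimated dispersion
se_poisson = 0.05          # hypothetical Poisson-model standard error for b
se_quasi = se_poisson * math.sqrt(phi)
print(se_quasi)  # → 0.1
```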

To me, EK’s point about the expected number of strong (say, 6+) quakes is the interesting bit. When I first heard this story, my assumption was that the number of weak quakes increased massively but the number of noticeable ones didn’t. That isn’t quite true, it seems.

> If you extend this to 8+ quakes then you get about 0.002 quakes per year for both cases

and presumably if you extend it to 9+, then you get fewer in the new case. Which either means, physically, that the stress release from the weaker quakes prevents stronger ones; or it’s trying to tell you that the fit doesn’t work / is meaningless that far out.

ps: OT, but since I’m here: someone needs to take Spencer apart: http://www.drroyspencer.com/2012/04/new-u-s-population-adjusted-temperature-dataset-pdat-1973-2012/

There is a very strong correlation between moment magnitude and depth. That means material properties change; these USGS measurements cannot be extrapolated past moment magnitude 7.0 or so.

Strikes Eli that there may be a physical reason for the scaling because stress can be released by smaller quakes, so it takes unusual circumstances for enough stress to build up for a Big One

There is a physical reason but it is not found in the “because” clause given. As I replied just now to WC it is the stronger material properties at depth.

Don’t you have the red and blue lines reversed, label-wise? Looks like more quakes before 2009 the way you have it. Or are you graphing absolute numbers of quakes?

[Response: They’re absolute numbers. Fewer quakes per year pre-2009, but far more years in the sample.]

What do we know of the systematic errors?

e.g.

– Are measurements affected by the use of GPS? (is the before-after year of greatest difference 2009 or closer to when GPS became usable by science?)

– how about reporting & recording the long tail … Always a problem in power laws?

etc.

Many thanks for this post, tamino – I guess I am highly guilty of laziness!

I was a bit unclear yesterday (I spent the whole day running after my children, so I couldn’t get further than “gut feeling”), but you did far better than I ever could.

I have to digest this post, and try to bring constructive criticisms/precisions later. But my “gut feeling” is reinforced by the b-values obtained: I have never seen such an evolution in “natural” cases.

I have the impression that your fitting method is more sensitive to low-magnitude earthquakes than the traditional linear fit to the log figure, is that right?

Ernst K — Unfortunately this USGS data cannot be used to estimate higher on the moment magnitude scale than at most 7.0 (and even that extrapolation might not be justified).

Could you provide a link/reference? Everything I can find on this suggests that the relationship can stay linear out to M = 9 quakes (best quick example: http://pasadena.wr.usgs.gov/office/kfelzer/AGU2006Talk.pdf)

Two (or more) Russians at an institute for earthquake prediction (in Moscow?) wrote a book on extreme event distributions using actual data for a variety of phenomena. That is where I found this (seemingly little-known) facet of earthquakes. I won’t vouch for their curve fitting, but they did use some worldwide earthquake moment magnitude data, possibly USGS or possibly Russian.

Sorry I don’t remember more.

I suggest you try redoing the b-value estimate with magnitude cut-offs of 3.5 and 4.0. Basically, there is a recording problem as you go back in time; i.e., fewer seismometers = fewer small earthquakes recorded. I suspect the b-values before/after 2009 will not be that different.

There is an undercount of weak quakes. Maybe less of an undercount after 2009.