In a comment on the last post, it was mentioned that the frequency of earthquakes of any given magnitude or greater is given by the Gutenberg-Richter law. It states that the expected number of earthquakes in a given region over a given span of time, of a given magnitude or greater, will be

$\log_{10} N = a - bM$

where $M$ is the quake magnitude, $a$ and $b$ are constants, and $N$ is the expected number. For active regions, the constant $b$ usually has a value near 1.
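For example, with hypothetical values $a = 4$ and $b = 1$, we'd expect $10^{4-3} = 10$ quakes of magnitude 3 or greater but only $10^{4-4} = 1$ of magnitude 4 or greater; each unit increase in magnitude cuts the expected number tenfold when $b = 1$.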
We can use the data referred to in the previous post to investigate whether or not the number-magnitude relationship follows the Gutenberg-Richter law. The most straightforward way to fit this equation to the data is to compute the number of quakes with magnitude greater than or equal to each possible value (magnitudes are given to the nearest 0.1), log-transform those counts, then fit a line by least-squares regression (a code sketch of this procedure appears below). I split the data into two segments: quakes prior to 2009 and those after. Although the second time span is much shorter (only about 3 years compared to 33), it has so many more quakes per year that the counts aren't nearly so skewed. The first time span has a total of 779 quakes, the second has 340. Here are the least-squares fit lines to the log-transformed data:
There is some visible misfit to the data, especially before 2009. The misfit is even more apparent if we compare the fitted curve to the actual data prior to 2009:
We can do the same for the data after 2009, for which the fit looks better:
The estimated parameter $b$ is 1.015 for pre-2009 data and 1.018 for post-2009 data, and the estimated uncertainties for those figures don't indicate any statistically significant difference in its value before and after 2009.
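Here's a minimal sketch, in Python, of the cumulative-count regression described above; the language and the details (function name, bin handling) are illustrative, not a record of the actual computation:

```python
import numpy as np

def gr_leastsquares(mags):
    """Fit log10(N >= M) = a - b*M by ordinary least squares.
    mags: array of quake magnitudes, given to the nearest 0.1.
    Returns the estimated (a, b)."""
    # magnitude levels at which to take cumulative counts
    levels = np.round(np.arange(mags.min(), mags.max() + 0.05, 0.1), 1)
    # number of quakes with magnitude >= each level
    counts = np.array([(mags >= m - 1e-9).sum() for m in levels])
    keep = counts > 0                 # log10 is undefined at zero counts
    slope, intercept = np.polyfit(levels[keep], np.log10(counts[keep]), 1)
    return intercept, -slope          # a is the intercept, b the negated slope
```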
Nonetheless, for both time spans the expected number at low quake magnitudes is a bit higher than the observed number, and the fit is poorest for low magnitude values. Does that mean the Gutenberg-Richter law isn’t working?
Not necessarily. This is one of those cases for which least-squares regression isn't such a good choice. For one thing, the data are sums of all counts for magnitude greater than or equal to the given value, so they'll exhibit very strong autocorrelation. For another, they'll show extremely strong heteroskedasticity, i.e., the variance of the different values will be dramatically different. If, for instance, the data were governed by the Poisson distribution (they aren't, but at least that's "in the ballpark"), then because a Poisson variable's standard deviation is the square root of its mean, the standard deviation when the count is as large as 779 would be about 28 times greater ($\sqrt{779} \approx 27.9$) than when the count is only 1.
Log-transforming the data doesn't help; it merely reverses the "variance problem," so that now it's the low values that have much greater variance than the high values. Hence in least-squares regression the low counts (for high-magnitude quakes) get too much statistical weight and the high counts get too little. The fit is therefore dominated too much by the low counts, and that's why it's poorest at low magnitude levels.
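To put a number on that: by the delta method, if a count $N$ has Poisson-like variance equal to its mean $\mu$, then $\mathrm{var}(\log_{10} N) \approx 1/(\mu \ln^2 10)$, so a count of 1 carries roughly 779 times the log-scale variance of a count of 779.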
So I tried a different approach: I modelled the data as a Poisson process with expected value given by the Gutenberg-Richter law. The counts aren't truly Poisson (their variance is too high), but that's a better approximation than a model in which the observed number equals the Gutenberg-Richter expectation plus noise following the normal distribution. Then I fit this model to the data by maximum likelihood. That gives this fit for data prior to 2009:
And this fit for data after 2009:
These fits are much better. Now the estimated values of $b$ are 0.892 for pre-2009 data and 0.998 for post-2009 data. Furthermore, the indicated uncertainties suggest that the difference is statistically significant, strongly so.
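Here's a sketch of how such a Poisson maximum-likelihood fit can be coded, again with illustrative names; it takes the magnitude levels and the cumulative counts at each level, and maximizes the Poisson log-likelihood numerically:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def fit_gr_poisson(levels, counts):
    """Maximum-likelihood fit of a Poisson model whose mean follows the
    Gutenberg-Richter law: expected count at level M is 10**(a - b*M).
    levels: magnitude thresholds; counts: cumulative counts at each level.
    Returns the estimated (a, b)."""
    def negloglik(params):
        a, b = params
        mu = 10.0 ** (a - b * levels)
        # negative Poisson log-likelihood; the gammaln term is constant in
        # (a, b) but keeps the value equal to the true log-likelihood
        return -np.sum(counts * np.log(mu) - mu - gammaln(counts + 1))
    fit = minimize(negloglik, x0=[3.0, 1.0], method="Nelder-Mead")
    return fit.x
```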
In my opinion, an even better idea is to fit a similar model to the actual counts at a given magnitude, not to the cumulative counts for quakes of that magnitude or greater. This eliminates most of the autocorrelation problem. It gives estimates for the parameter $b$ of 0.812 for pre-2009 data and 0.952 for post-2009 data, and again indicates that the difference before and after 2009 is statistically significant.
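A sketch of that variant: under the Gutenberg-Richter law the expected count in a magnitude bin $[M, M+0.1)$ is the difference of the cumulative expectations at its two edges, and the rest of the fit is unchanged (again, illustrative Python):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def fit_gr_poisson_binned(levels, bin_counts, width=0.1):
    """Poisson ML fit to per-bin counts: the expected count in the bin
    [M, M + width) is 10**(a - b*M) - 10**(a - b*(M + width)).
    Returns the estimated (a, b)."""
    def negloglik(params):
        a, b = params
        if b <= 0:
            return np.inf     # keep the search where the bin means are positive
        mu = 10.0 ** (a - b * levels) - 10.0 ** (a - b * (levels + width))
        return -np.sum(bin_counts * np.log(mu) - mu - gammaln(bin_counts + 1))
    fit = minimize(negloglik, x0=[3.0, 1.0], method="Nelder-Mead")
    return fit.x
```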
I don’t see sufficient visual evidence to suggest that these numbers are deviating from the Gutenberg-Richter law. If I were doing this for publication, I’d do a direct test of the observed numbers against the theoretical distribution, probably using the Kolmogorov-Smirnov test. But at this point I’m feeling lazy, so I’ll just say “not sufficient visual evidence” and leave it at that.
Bottom line: it seems the data follow the Gutenberg-Richter law (at least approximately) both before and after 2009, but the parameter $b$ is significantly different between these two time spans.