Exploring the Effect of Minimum Magnitude on California Seismic Hazard Models

A recent topic of interest is the performance of probabilistic seismic hazard maps relative to historically observed shaking. Past studies in several areas have documented an apparent discrepancy between seismic hazard map predictions and historically observed shaking. To explore whether this discrepancy arises because of incompleteness in historical intensity catalogs, we consider maps and historical intensities from California. Current probabilistic seismic hazard maps for California appear to predict stronger short-period shaking than historical maxima captured by the California Historical Intensity Mapping Project (CHIMP) dataset between 1857 and 2019. We estimate that CHIMP has a magnitude of completeness between M6 and 6.6, whereas California hazard maps assume a minimum magnitude (M_Min) of 5. Disaggregating the maps shows that earthquakes smaller than M6 and 6.6 respectively contribute about 25% and 45% of hazard across California. Increasing the hazard map's M_Min to 6 and 6.6 respectively reduces the discrepancy between predicted and observed shaking by approximately 10–20% and 30–35%. These reductions are not enough to bring the maps and data into alignment. Thus, M_Min inconsistencies contribute to, but are not the sole cause of, the discrepancy between predicted and historically observed shaking. These results may be generally applicable to maps elsewhere, although M_Min will vary for different historical datasets.


Introduction
Probabilistic seismic hazard analyses (PSHA) forecast the probability, p, that during t years of observation shaking at a site will exceed the value shown in a model with a τ-year return period, assumed to be described by a negative exponential distribution (Cornell, 1968; Field, 2008). Assuming the same behavior over time as over space (the ergodic assumption), p also gives the fraction of sites where observed shaking is expected to exceed the mapped value for any map return period and duration of observations. As the ratio of the observation time to the return period increases, p should also increase because an increasing fraction of the area will have experienced larger earthquakes and thus higher shaking. This approach, introduced by Ward (1995) and used in subsequent analyses (e.g., Albarello and D'Amico, 2008; Fujiwara et al., 2009; Miyazawa and Mori, 2009; Stirling and Gerstenberger, 2010; Nekrasova et al., 2014; Tasan et al., 2014; Mak and Schorlemmer, 2016a), considers many sites to avoid the constraint that large ground motions at any given site are rare. At any one site, the annual rate of exceedance is 1/τ. PSHA models and corresponding maps are based on assumptions about the rates, spatial distribution, and sizes of potential future earthquakes, and on ground-motion models (GMMs) that predict the resulting shaking.

* Corresponding author: mollygallahue2023@u.northwestern.edu
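The relation referred to below as Equation 1 is lost in this copy; given the stated exponential model, it is presumably the standard exceedance probability for an observation window of t years and a τ-year return period:

```latex
p \;=\; 1 - e^{-t/\tau} \qquad (1)
```

For t = 162 yr this gives p ≈ 0.289 for τ = 475 yr and p ≈ 0.063 for τ = 2475 yr, matching the values quoted later in the text.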
Because PSHA maps are critical for earthquake hazard assessment and risk mitigation, assessing how well PSHA models and maps forecast observed shaking is important. The problem is challenging both because of limitations in the available data and because of conceptual issues regarding the assessment of probabilistic forecast performance (Marzocchi and Jordan, 2014; Stein et al., 2015; Wang, 2015; Vanneste et al., 2018; Brooks et al., 2019; Gneiting and Katzfuss, 2014). In particular, shaking data recorded after a PSHA model was made typically span a time period (t) that is extremely short compared to the return period of the hazard map (τ) (e.g., 475 or 2475 years). Hence most post-PSHA model shaking datasets do not fully include the moderately large and large earthquakes that often control hazard for an area. One approach to address this problem is retrospective assessment, or hindcasting, which uses compilations of historical shaking data spanning hundreds of years (Stirling and Petersen, 2006; Nekrasova et al., 2014; Mak et al., 2014; Stein et al., 2015; Brooks et al., 2016, 2017, 2018), or in some cases, smaller datasets of instrumental ground-motion data (Stirling and Gerstenberger, 2010; Mak and Schorlemmer, 2016b). The method is similar to that used by meteorologists to improve weather forecasts (Stein et al., 2015).
PSHA maps have recently been compared to historical datasets for several countries. Comparisons of mean hazard maps with catalogs of historical data for Italy, Japan, and California (USA) have all shown that maps appear to overpredict shaking relative to historical datasets (Stein et al., 2015; Brooks et al., 2016; Salditch et al., 2020). Comparisons for France and Nepal yield similar results (Drouet et al., 2020; Salditch, 2021). Similar studies using ShakeMap data for Australia also demonstrate that the models appear to predict higher shaking than observed in the last 50 years (Allen et al., 2023). Thus, a similar discrepancy arises in different regions for shaking datasets and hazard maps developed by different groups and approaches. Although some studies have found that hazard maps are consistent with, or even underpredict relative to, historical datasets (e.g., Rey et al., 2018; Griffin et al., 2019), these studies consider only the largest intensities, above which observed data are considered complete. In doing so, these studies exclude regions where observed intensities are significantly less than those predicted, which is where the discrepancy arises. Although a compilation of historically observed shaking may not be complete at the MMI 3 or 4 (say) level, a fair overall comparison of a hazard map should consider all grid cells for which one has historically observed maximum intensities. A potential incompleteness bias in historical intensity catalogs is countered by a general reporting bias, whereby severe effects are more likely to be documented than mild or moderate effects (e.g., Hough, 2013). This reporting bias persists with the DYFI system: Mak and Schorlemmer (2016a) showed that, to first order, population density and shaking severity control the likelihood that a DYFI response will be received. Additionally, these studies examine individual sites, which differs from our approach of examining the entire map.
This discrepancy might have several causes, as discussed by Salditch et al. (2020): the data may be biased low, the maps may be biased high, or the misfit may arise purely by chance. One possible cause of the data being biased low is that databases of historically observed shaking may be incomplete for magnitudes that contribute to hazard. To explore this issue, we consider U.S. Geological Survey (USGS) mean probabilistic seismic hazard maps for California and historically observed shaking captured in the California Historical Intensity Mapping Project (CHIMP). The CHIMP catalog is publicly available, and the USGS provides a publicly available tool for disaggregation (https://earthquake.usgs.gov/hazards/interactive/), making this study possible.
CHIMP is the first comprehensive compilation of observed seismic intensities from 62 of the largest earthquakes (moment magnitude M4.7–7.9; most with M>6) in California over a 162-yr period between 1857 and 2019 (Fig. 1b) (Salditch et al., 2020). Intensity values, measured on the Modified Mercalli Intensity (MMI) scale (Wood and Neumann, 1931), were collected from sources including historical newspaper accounts, USGS postcard questionnaires, USGS "Did You Feel It?" (DYFI) program responses, and oral history accounts (Salditch et al., 2020). Assigned intensity values were then aggregated into 10 × 10 km regions, giving the maximum observed intensity in each (Fig. 1a). The nominal uncertainty of each observation is ± one intensity unit (Salditch et al., 2020), a standard level of variation between researchers (Hough and Page, 2011; Salditch et al., 2018).

Figure 2 Schematic illustrating three possible adjustments to the hazard model that could make the predicted hazard more consistent with the observed shaking data at a site. Ideally, curves should pass through the yellow diamonds. Curves to the right of the yellow diamonds predict higher shaking than observed in the historical catalog. Arrows show in which direction adjustments change the original hazard curve. 475-yr and 2475-yr return periods (RP) shown.

This process is likely to miss shaking due to some smaller historical events for which accounts are limited, and possibly moderate earthquakes outside of California's borders that were felt within the state. Salditch et al. (2020) explored CHIMP incompleteness from historical population distributions through interpolated intensities for the three largest earthquakes in California (1857 Fort Tejon, 1872 Owens Valley, and 1906 San Francisco). They showed that results did not change appreciably when more spatially dense intensity values for past large earthquakes were included. Incompleteness of intensity observations at individual points is thus unlikely to account for the discrepancy between maps and observed intensities. To compare hazard maps with CHIMP, Salditch et al. (2020) converted the 2018 USGS seismic hazard map (Petersen et al., 2020) from peak ground acceleration (PGA) to MMI via the Worden et al. (2012) magnitude- and distance-independent ground-motion intensity conversion equation (GMICE). Here, we use PGA converted to MMI hazard maps for consistency with the Salditch et al. (2020) study.
Equation 1, for t = 162 years, predicts that the fraction of sites where observed shaking exceeds the mapped hazard value should be p = 0.289 and 0.063 for 475- and 2475-yr return period maps, respectively. However, the observed fraction of site exceedances, f, calculated by dividing the number of sites where CHIMP data is higher than a 475- or 2475-yr hazard map's prediction by the total number of CHIMP sites, is considerably lower than either p. Salditch et al. (2020) found f = 0.064 and 0.006 for the 475- and 2475-yr maps, respectively.
The performance of hazard maps can be quantified using M0, the fractional exceedance metric, which is defined as |f -p| (Stein et al., 2015). Ideally this metric would equal 0, and larger M0s indicate a greater discrepancy between predictions and observations. We also consider the ratio f/p, which can be thought of as a scale factor for the number of earthquakes contributing to hazard that would be needed to align the map predictions and observations. Map "overprediction" arises when p > f. Salditch et al. (2020) found that p is greater than f for the maps with both a 10% probability of exceedance in 50 years (475-yr return period) and a 2% probability of exceedance in 50 years (2475-yr return period), so both overpredict shaking relative to the CHIMP data. This pervasive discrepancy has important implications and calls for detailed investigation.
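These metrics can be sketched in a few lines (a minimal illustration; the published f values are rounded, so the derived M0 and f/p differ slightly from the tabulated 0.226 and 0.218 computed from unrounded data):

```python
import math

def exceedance_stats(t, tau, f):
    """Predicted exceedance fraction p for t years of observation of a
    tau-yr return period map, the M0 metric |f - p|, and the f/p ratio,
    where f is the observed fraction of site exceedances."""
    p = 1.0 - math.exp(-t / tau)
    return p, abs(f - p), f / p

# Values from the text: t = 162 yr; f = 0.064 (475-yr map), f = 0.006 (2475-yr map).
p475, m0_475, ratio475 = exceedance_stats(162, 475, 0.064)
p2475, m0_2475, ratio2475 = exceedance_stats(162, 2475, 0.006)
print(round(p475, 3), round(p2475, 3))  # 0.289 0.063
```

Both M0 values are far from the ideal of 0, and both f/p ratios are well below 1, quantifying the overprediction.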
The overprediction of the shaking data could be influenced by three different factors, shown schematically via the hazard curve for an individual site (Fig. 2) in a hazard map. If the median ground motions predicted by GMMs were smaller, the hazard curve would shift horizontally, bringing it into closer alignment with observed data. Alternatively, if the hazard model's assumed M_Min, the minimum magnitude of earthquakes included in the hazard calculation, were higher, the hazard curve would shift vertically because fewer contributing earthquakes reduce the hazard, again bringing it into closer alignment with observed intensities. Lastly, if the aleatory, or random, variability of the ground-motion models were smaller, the hazard distribution would be narrowed, thus steepening the hazard curve. Other effects can also change the hazard curves, but we consider these to be the simplest and thus most likely.
There are several assumptions made when building and testing a hazard map. The comparisons by Salditch et al. (2020) used PSHA maps for a reference site condition, leaving open the possibility that the results might be significantly different if site response were included. Gallahue et al. (2022) showed, however, that incorporating site effects (as predicted by NGA-West2) did not appreciably change high-frequency PSHA maps, and thus that site effects are not a cause of the discrepancy. The weak predicted influence of site response on high-frequency ground motion is due to nonlinear damping at strong shaking levels. Furthermore, comparison of hazard maps (traditionally given in PGA) to observed shaking records (in seismic intensity) requires the use of a GMICE. Salditch et al. (2020) converted PGA hazard maps to intensity using the Worden et al. (2012) GMICE. However, a recent study by Gallahue and Abrahamson (2023) found that the methodology used in the Worden GMICE (and others) leads to overestimated intensities at PGAs relevant to hazard maps. These biased GMICEs will contribute to the observed discrepancy. USGS hazard maps also rely on the mean hazard, but median hazard curves may better align with observed data. Other factors, such as nonergodic ground-motion models used to develop the maps or spatial correlation effects, could also contribute to the discrepancy.
In this paper, we consider whether removing the contribution to seismic hazard from smaller earthquakes (increasing M_Min in the hazard maps) to adjust for incompleteness of the CHIMP dataset could account for part of the observed overprediction (Fig. 2). This evaluates the assumption that the minimum magnitudes between maps and data are consistent, or that the differences are irrelevant. USGS mean PSHA maps, which we refer to as the reference maps, use a minimum magnitude M_Min of 5 in hazard calculations for California (Petersen et al., 2015, 2020). The PSHA maps were also generated using earthquakes within a 100-km zone beyond state borders. However, the magnitude of completeness is larger for CHIMP. CHIMP considers all earthquakes greater than M6, and a few smaller ones. Thus, the lowest possible magnitude of completeness of CHIMP is M6. Using the maximum curvature method (Mignan and Woessner, 2012) to calculate the CHIMP catalog's completeness, we find that the maximum value of the first derivative of the cumulative frequency-magnitude distribution occurs at M6.6 (Fig. 3). Hence CHIMP is complete for M6.6+ but includes some smaller events if they were suspected to control the maximum observed intensity at some sites. We further note that evaluations of PSHA maps using CHIMP rely on locations for which historical intensities are available. Catalog completeness will be lowest in sparsely populated regions; i.e., intensity values will tend to be missing in regions where the catalog is least complete. We thus expect that missing moderate earthquakes in sparsely populated regions, including in the 100-km zone outside of California's borders, will tend not to contribute to assessed map performance. We therefore assume that the actual magnitude of completeness for CHIMP is somewhere between M6 and 6.6 and evaluate M_Min at these end members. If M_Min = 6, 51 earthquakes remain in the catalog. If M_Min = 6.6, 28 earthquakes remain.
For completeness, because M_Min can be difficult to resolve, we also tested M_Min = 6.8. These results did not differ from those for M_Min = 6.6.
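The maximum curvature estimate amounts to finding the peak of the non-cumulative frequency-magnitude distribution, which equals the maximum first derivative of the cumulative distribution. A minimal sketch (the `maxc_completeness` helper and the synthetic magnitudes are ours, not the study's code; practical implementations following Mignan and Woessner (2012) often add a correction term to the raw estimate):

```python
from collections import Counter

def maxc_completeness(mags, bin_width=0.2):
    """Maximum-curvature estimate of catalog completeness: the center of
    the magnitude bin with the most events, i.e. the peak of the
    non-cumulative frequency-magnitude distribution."""
    bins = Counter(round(m / bin_width) for m in mags)
    peak = max(bins, key=bins.get)
    return round(peak * bin_width, 2)

# Synthetic illustration: counts fall off below the completeness level
# (incomplete reporting) and follow Gutenberg-Richter above it.
mags = [6.0]*3 + [6.2]*5 + [6.4]*8 + [6.6]*12 + [6.8]*7 + [7.0]*4
print(maxc_completeness(mags))  # 6.6
```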

Disaggregation of hazard for M>6 and M>6.6
We explore the effect of the minimum magnitude using disaggregation, which gives the fractional contribution to the total predicted hazard for a range of magnitudes and distances. Disaggregation results allow the contributions to hazard from varying magnitudes, distances, and epsilons (number of standard deviations above the median ground motion) to be understood (Bazzurro and Cornell, 1999). Effectively, this breaks the hazard map into its input scenarios and determines the effect of each. Mathematically, the hazard from events with M ≥ M_Min is a fraction of the hazard from all events with M ≥ 5:

Haz_PGA(z | M ≥ M_Min) = Disagg(M ≥ M_Min | PGA > z) × Haz_PGA(z | M ≥ 5),   (2)

where z is the predicted ground motion (in PGA), Disagg(M ≥ M_Min | PGA > z) is the disaggregated fractional contribution to hazard from events with magnitude greater than M_Min and PGA greater than z, and Haz_PGA(z | M ≥ 5) is the hazard at PGA level z assuming M ≥ 5. Figure 4, a disaggregation result for one site in California, demonstrates how the removal of magnitudes less than a certain value would reduce the total hazard (rate of exceedance) due to using a larger M_Min in the hazard calculations. Excluding M5–6 reduces the hazard less than excluding M5–6.6 because fewer events are excluded.
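The disaggregation fraction amounts to summing the contributions above M_Min; a toy illustration (the bin values are invented for demonstration, not the study's disaggregation output):

```python
def fraction_from_mmin(disagg, m_min):
    """Fraction of total hazard contributed by events with M >= m_min,
    given disaggregation results as {bin_lower_magnitude: percent_contribution}."""
    total = sum(disagg.values())
    kept = sum(pct for m, pct in disagg.items() if m >= m_min)
    return kept / total

# Hypothetical percent contributions by magnitude bin for one site.
disagg = {5.0: 10.0, 5.5: 15.0, 6.0: 20.0, 6.5: 25.0, 7.0: 30.0}
print(fraction_from_mmin(disagg, 6.0))  # 0.75
```

Multiplying the reference hazard Haz_PGA(z | M ≥ 5) by this fraction gives the reduced hazard Haz_PGA(z | M ≥ M_Min) of Equation 2.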
We produce approximate hazard maps for M_Min = 6 and M_Min = 6.6 using disaggregation results for a grid of locations in California using the USGS Unified Hazard Tool for return periods of 475 years and 2475 years, corresponding to the two maps examined by Salditch et al. (2020). Because disaggregation reports for the 2018 mean hazard model and map were not available in the USGS tool at the time of this analysis, we used the 2014 mean values. Petersen et al. (2020) found that the mean hazard in California stayed consistent between the 2014 model and the 2018 update, so results for 2014 are expected to be comparable to the more recent 2018 models. Additionally, the earthquake catalog and source models for California in both the 2014 and 2018 maps are from the Uniform California Earthquake Rupture Forecast model (UCERF3). Thus, no earthquakes were added or removed from 2014 to 2018, and no substantial changes occurred in California hazard during this time (Petersen et al., 2020, 2021).
We compute the fraction of the hazard due to larger events at each grid site. The disaggregation shows that the average contribution to hazard of M≥6 events across all sites is ~75% for the 475-yr return period map and 76% for the 2475-yr return period map. Similarly, the average contribution to hazard of M≥6.6 events across all sites is smaller, ~53% for the 475-yr return period map and ~55% for the 2475-yr return period map. At each grid site, we reduce the hazard to that excluding M<6 or 6.6 by scaling the hazard curve (Rukstales and Petersen, 2019) by the calculated percent contribution to hazard. Via log-interpolation on the hazard curves, we derive the corresponding decrease in PGA. To scale the hazard maps for the hazard from only M≥6 or 6.6 events, we adjust the reference hazard map PGA by the PGA percent decrease at the closest grid location (Fig. 5).

Figure 4 Disaggregation results for a site outside San Francisco. Most of the hazard comes from events with magnitudes greater than ~7 and distances < ~50 km. If the contribution from smaller-magnitude events were removed, the total hazard would decrease.
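The scaling-plus-log-interpolation step can be sketched as follows. This uses a synthetic power-law hazard curve, not USGS data; for a curve with log-log slope -2, scaling all rates to 75% should lower the PGA at a fixed return period by a factor of sqrt(0.75), i.e. about 13%:

```python
import math

def pga_at_rate(pgas, rates, target_rate):
    """Log-log interpolate a hazard curve (PGA vs. annual exceedance rate,
    rates decreasing with PGA) to find the PGA at a target rate."""
    for i in range(len(rates) - 1):
        if rates[i] >= target_rate >= rates[i + 1]:
            frac = (math.log(target_rate) - math.log(rates[i])) / (
                math.log(rates[i + 1]) - math.log(rates[i]))
            return math.exp(math.log(pgas[i]) +
                            frac * (math.log(pgas[i + 1]) - math.log(pgas[i])))
    raise ValueError("target rate outside curve")

# Synthetic hazard curve: rate = 0.01 * (pga / 0.1) ** -2.
pgas = [0.05 * 1.3 ** i for i in range(20)]
rates = [0.01 * (g / 0.1) ** -2 for g in pgas]

target = 1.0 / 475.0
ref_pga = pga_at_rate(pgas, rates, target)
# Excluding M < M_Min scales the whole curve, e.g. to 75% of the rates.
scaled_pga = pga_at_rate(pgas, [0.75 * r for r in rates], target)
print(round(100 * (1 - scaled_pga / ref_pga)))  # 13 (percent decrease in PGA)
```

Real hazard curves are not simple power laws, so the actual PGA decrease varies from site to site, as shown in Fig. 6.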
The percent decreases in PGA from the reference map to the maps scaled to exclude M<6 or 6.6 are shown in Fig. 6. Excluding the contribution from smaller earthquakes reduces the hazard and thus the corresponding predicted PGA. For 475-yr and 2475-yr return periods, the reduction in the PGA from excluding M<6 events ranges from less than 10% near the major coastal faults to up to 35% in the Sierra foothills. Excluding the contribution from M<6.6 events reduces PGA by up to 65% in the Sierra foothills.
These results show that M<6 events contribute substantially to predicted hazard (~25%), and M<6.6 events contribute even more (~45%). This result is consistent with that of Minson et al. (2009): due to the aleatory variability of ground motions and the Gutenberg-Richter distribution of magnitudes (Gutenberg and Richter, 1944), strong short-period shaking is more likely to be generated by more frequent moderate-magnitude earthquakes than by rarer large-magnitude earthquakes. Fig. 6 shows that M<6 and 6.6 events strongly contribute to hazard in some areas. Conceptually, we expect that hazard near major, active faults will be controlled by large earthquakes on those faults. However, away from major faults, hazard will be controlled more by distributed background seismicity, so relatively frequent moderate-magnitude events will contribute more to overall hazard. Figure 7 shows the decrease of the scaled map relative to the reference for all cases, highlighting the MMI reduction from removing the contribution to hazard from smaller earthquakes.
It is worth noting that our described approach produces approximate maps for the given M_Min. The approach used here assumes that the relative contributions of the sources to the target exceedance rate do not appreciably change between the original and scaled maps. To completely alter M_Min would require changing the minimum magnitude during the calculation of the hazard curves that underlie the maps. In doing so, the median and variance of ground motion and the activity rate, all of which influence predictions, would change. As described in Figure 2, changes to the median ground motion shift the hazard curve horizontally, whereas changes to variance would affect the slope of the hazard curve. Changes to the activity rate would shift the hazard curve vertically (Figure 2). However, developing hazard maps involves extensive computations, so in practice it is often not justifiable to compute hazard models for multiple M_Min values, as evaluated in this paper.

Comparison of M≥6 Or 6.6 Maps with Historical Intensity Data
Using the metrics from Salditch et al. (2020), we assessed how the hazard maps scaled for M≥6 or 6.6 perform relative to CHIMP historical intensity data. For this comparison, we edited the CHIMP dataset to include data from only M≥6 or 6.6 earthquakes, which reduced the number of sites from that shown in Fig. 1a. We only assess hazard map values at the locations where CHIMP data exist in the M≥6 or 6.6 sets; thus, the total number of sites evaluated varies with the M_Min considered. The M≥6 or 6.6 scaled PGA map values were converted to MMI via the Worden et al. (2012) magnitude- and distance-independent equation that does not include aleatory variability. Comparison of exceedance rates (Figs. 8, 9) shows that the scaled hazard maps excluding M<6 or 6.6 perform better than the reference maps because the lower predicted hazard yields more observed exceedances (Tables 1, 2). However, the fractions, f, of sites where the largest shaking exceeds that predicted from either map remain much lower than the predicted fraction, p. In one case, the number of observed exceedances increased by 178%, or 31 sites. M0 values for the scaled maps, 0.15 and 0.038 for the 10% and 2% probability of exceedance in 50 years maps, were lower (showing better fit) than those for the reference maps, 0.226 and 0.054. Similarly, f/p ratios increased from 0.218 and 0.143 for the 475-yr and 2475-yr reference maps to 0.481 and 0.397 for the scaled maps, respectively, indicating improvement in map performance.
Despite the improvements, the scaled maps still overpredict shaking relative to CHIMP data. We construct 5–95% intervals for the predicted fraction of exceedance to explore whether this could be due to chance. We assume the uncertainty in the number of observed exceedances, m, in 162 years follows a Poisson probability model, such that

P(m) = (t/τ)^m e^(-t/τ) / m!,

where t is the length of the interval (162 years), τ is approximately the average number of years between each exceedance (equal to 162/N), and N is the predicted number of exceedances, equal to 588 for the 475-yr maps and 128 for the 2475-yr return-period maps. We compute 5% and 95% uncertainty ranges for N by finding N_0.05 and N_0.95 such that P(m ≤ N_0.05) = 0.05 and P(m ≤ N_0.95) = 0.95.

Table 2 Statistics of comparison of CHIMP data for M≥6.6 to both reference and scaled (M≥6.6) maps.
Then, N_0.05/2186 and N_0.95/2186 are the lower and upper bounds of p, respectively, for the 2186 total sites for M≥6 comparisons (Fig. 8). Likewise, N_0.05/2033 and N_0.95/2033 are the lower and upper bounds of p, respectively, for the 2033 total sites for M≥6.6 (Fig. 9). For M≥6 maps with 2186 sites and predicted exceedances at 632 and 155 sites, the 5–95% uncertainty ranges on p are [0.270, 0.308] and [0.054, 0.072] for the 475-yr and 2475-yr return period maps, respectively. For M≥6.6 maps with 2033 sites and predicted exceedances at 588 and 128 sites, the 5–95% uncertainty ranges on p are [0.270, 0.309] and [0.054, 0.072] for the 475-yr and 2475-yr return period maps. In all cases, the observed fractions of exceedances, f, for both the reference and scaled maps fall outside the uncertainty ranges of the predicted fraction, p. Thus, we conclude that the remaining observed discrepancy is not due to chance.
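The interval construction can be checked with a direct Poisson quantile computation (a sketch, using the paper's N = 588 predicted exceedances over 2033 sites for the M≥6.6, 475-yr comparison; the `poisson_quantile` helper is ours):

```python
import math

def poisson_quantile(lam, q):
    """Smallest k with P(X <= k) >= q for X ~ Poisson(lam). The pmf is
    built iteratively from pmf(0) = exp(-lam), so no factorials overflow."""
    pmf = math.exp(-lam)
    cdf = pmf
    k = 0
    while cdf < q:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k

# 5-95% range on p for the M>=6.6, 475-yr case: N = 588 over 2033 sites.
lo = poisson_quantile(588, 0.05) / 2033
hi = poisson_quantile(588, 0.95) / 2033
print(round(lo, 3), round(hi, 3))  # 0.27 0.309
```

This reproduces the quoted [0.270, 0.309] range; repeating with N = 128 recovers the [0.054, 0.072] range for the 2475-yr map.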

Discussion
Our results show that the apparent overprediction of the USGS mean hazard maps relative to the CHIMP dataset cannot be fully explained by increasing M_Min to 6 or 6.6 to account for incompleteness of CHIMP. Although removing the contribution to hazard from M<6 or 6.6 events reduces the overprediction, it does not reduce it enough to fall within the 5–95% confidence intervals on the predicted fraction of exceedance for either return period.
It remains possible that the discrepancies between predicted and historically observed ground motions are due to long-term incompleteness of the CHIMP catalog arising from the short length of the historical record. If, however, the inconsistency between the historical magnitude of completeness and the hazard map M_Min for California were the only cause of the discrepancy, the number of earthquakes contributing to hazard would need to be reduced by approximately a factor of 4 to bring the reference map's (including hazard from M<6.6) observed fraction of exceedance within the 5–95% confidence interval of the predicted fraction for the 475-yr map (the 5% bound of f/p for the reference maps). Similarly, approximately a factor of 6 reduction would be needed for the 2475-yr map. For the scaled maps excluding the hazard from M<6.6 events, reductions of about a factor of 2 would be needed for both the 475-yr and 2475-yr return period maps to bring the metrics into alignment. Even greater factors would be needed to bring the scaled maps excluding hazard from only M<6 events into alignment. Although Page and Felzer (2015) show that seismicity rate tends to be underestimated from short catalogs, their results indicate it is unlikely that the rate has been underestimated by such large factors. Recent results compiled by the U.S. Geological Survey "Did You Feel It?" system reveal that shaking from M6 earthquakes is widely felt. For example, the 8 July 2021 M6 Antelope Valley earthquake, which struck a remote area near the Nevada border, was widely felt throughout north-central California (see Data & Code Availability and Reproducibility).

Conclusion
We have considered one possible factor that might explain the discrepancy between predicted shaking in California and the much lower historically observed shaking revealed by the CHIMP dataset (Salditch et al., 2020). Our analysis shows that M5–6 earthquakes contribute approximately 25% of the predicted hazard in California and M5–6.6 earthquakes contribute about 45%, consistent with the implications of Minson et al. (2009). Hence, when the hazard maps are scaled to include only the effects of M≥6 or 6.6 events, the observed overpredictions lessen. However, this decrease is not large enough that the observed fractional exceedance falls within the 5–95% confidence interval of that predicted. As such, the discrepancy between predicted and historically observed shaking in California is not solely due to incompleteness of the CHIMP dataset. Inconsistencies in minimum magnitude contribute to the discrepancy but account for no more than about 30–35% of it. Analogous results are expected to arise for discrepancies between maps and historical datasets in other areas when their corresponding M_Min values are considered. Future work will focus on other possible effects contributing to the overprediction, including improved conversion equations between PGA and MMI (Gallahue and Abrahamson, 2023), which could lead to improved seismic hazard maps for California and worldwide.