Curated Pacific Northwest AI-ready Seismic Dataset
DOI: https://doi.org/10.26443/seismica.v2i1.368
Abstract
The curation of seismic datasets is the cornerstone of seismological research and the starting point of machine-learning applications in seismology. We present a 21-year-long AI-ready dataset of diverse seismic event parameters, instrumentation metadata, and waveforms, curated by the Pacific Northwest Seismic Network and ourselves. The dataset contains about 190,000 three-component (3C) waveform traces from more than 65,000 earthquake and explosion events, and about 9,200 waveforms from 5,600 exotic events. Event magnitudes range from 0 to 6.4; the largest event is the 20 December 2022 M6.4 Ferndale earthquake. We include waveforms from high-gain (EH, BH, and HH channels) and strong-motion (EN channels) seismometers and resample them to 100 Hz. We describe the earthquake catalog and the temporal evolution of the data attributes (e.g., event magnitude type, channel type, waveform polarity, signal-to-noise ratio, and phase picks) as the network earthquake monitoring system evolved through time. We propose this AI-ready dataset as a new open-source benchmark dataset.
License
Copyright (c) 2023 Yiyu Ni, Alexander Hutko, Francesca Skene, Marine Denolle, Stephen Malone, Paul Bodin, Renate Hartog, Amy Wright

This work is licensed under a Creative Commons Attribution 4.0 International License.
Funding data
- U.S. Geological Survey, Grant numbers G20AC00035
- National Science Foundation, Grant numbers EAR2103701
- David and Lucile Packard Foundation