A Global-scale Database of Seismic Phases from Cloud-based Picking at Petabyte Scale
DOI:
https://doi.org/10.26443/seismica.v4i2.1738
Abstract
We present the first global-scale database of 4.3 billion P- and S-wave picks, extracted from 1.3 PB of continuous seismic data via a cloud-native workflow. Using cloud computing services on Amazon Web Services, we launched ~145,000 containerized jobs on continuous records from 47,354 stations spanning 2002-2025; the jobs completed in under three days. Phase arrivals were identified with the deep learning model PhaseNet, run through SeisBench, an open-source Python ecosystem for deep learning in seismology. To visualize the picks and build a global understanding of them, we present preliminary results from pick time series, which reveal Omori-law aftershock decay, seasonal variations linked to noise levels, and dense regional coverage that will enhance earthquake catalogs and machine-learning datasets. All picks are available in a publicly queryable database, a powerful resource for researchers studying seismicity around the world. This report describes the database and the underlying workflow, and demonstrates that it is feasible both to mine petabyte-scale seismic data on the cloud and to deliver intelligent data products to the community in an automated manner.
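As a rough illustration of the Omori-law decay mentioned above, the sketch below evaluates the standard modified Omori (Omori-Utsu) rate n(t) = K / (t + c)^p and recovers p from a log-log least-squares fit. The function names, the parameter values, and the synthetic daily counts are assumptions made for illustration only; they are not taken from the database, and the report does not state that this is how the decay was measured.

```python
import math


def omori_rate(t_days, K, c, p):
    """Modified Omori (Omori-Utsu) rate: n(t) = K / (t + c)**p."""
    return K / (t_days + c) ** p


def fit_p_loglog(times, rates, c):
    """Fit log(rate) = log(K) - p * log(t + c) by ordinary least squares.

    The offset c is assumed known here; returns the estimates (p, K).
    """
    xs = [math.log(t + c) for t in times]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


# Synthetic, noise-free daily pick counts (hypothetical values, not real data).
days = list(range(1, 31))
counts = [omori_rate(t, K=5000.0, c=0.5, p=1.1) for t in days]

p_est, K_est = fit_p_loglog(days, counts, c=0.5)
print(f"estimated p = {p_est:.3f}, estimated K = {K_est:.1f}")
```

With real pick counts, the background rate and the offset c would also need to be estimated, so a nonlinear or maximum-likelihood fit would likely be more appropriate than this log-log regression.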
License
Copyright (c) 2025 Yiyu Ni, Marine A. Denolle, Amanda M. Thomas, Alex Hamilton, Jannes Münchmeyer, Yinzhi Wang, Loïc Bachelot, Chad Trabant, David Mencin

This work is licensed under a Creative Commons Attribution 4.0 International License.