How reproducible and reliable is geophysical research?

A review of the availability and accessibility of data and software for research published in journals

Authors

  • Mark Ireland Newcastle University https://orcid.org/0000-0001-9777-0447
  • Guillermo Algarabel Department of Physics, Durham University
  • Michael Steventon Shell Research Ltd
  • Marcus Munafò School of Psychological Science, University of Bristol

DOI:

https://doi.org/10.26443/seismica.v2i1.278

Keywords:

reproducibility, data availability, FAIR

Abstract

Geophysical research frequently makes use of agreed-upon methodologies, formally published software, and bespoke code to process and analyse data. The reliability and repeatability of these methods are vital to maintaining the integrity of research findings and avoiding the dissemination of unreliable results. In recent years there has been increased attention across scientific disciplines to aspects of reproducibility, including data availability. This review considers aspects of the reproducibility of geophysical studies relating to their publication in peer-reviewed journals. For 100 geophysics journals, it considers the extent to which reproducibility in geophysics is the focus of published literature. For 20 geophysical journals, it considers a) journal policies on the requirements for providing code, software, and data on submission; and b) the availability of data and software associated with 200 published journal articles. The findings show that: 1) between 1991 and 2021 there were 72 articles with reproducibility in the title and 417 with reliability, with an overall increase in the number of articles with reproducibility or reliability as the subject over the same period; 2) while 60% of journals have a definition of research data, only 20% of journals require a data availability statement; and 3) despite ~86% of sampled journal articles including a data availability statement, only 54% of articles have the original data accessible via data repositories or web servers, and only 49% of articles name the software used. It is suggested that, although journals and authors are working towards improving the availability of data and software, these are frequently not identified or not easily accessible, which limits the possibility of reproducing studies.

Published

2023-05-30

How to Cite

Ireland, M., Algarabel, G., Steventon, M., & Munafò, M. (2023). How reproducible and reliable is geophysical research? A review of the availability and accessibility of data and software for research published in journals. Seismica, 2(1). https://doi.org/10.26443/seismica.v2i1.278

Section

Articles