Seismic interferometry in the presence of an isolated noise source

Seismic interferometry gives rise to a correlation waveﬁeld that is closely re-lated to the Green’s function under the condition of uniformly distributed noise sources. In the presence of an additional isolated noise source, a second contribution to this waveﬁeld is introduced that emerges from the isolated source location at negative lapse time. These two contributions interfere, which may bias surface wave dispersion measurements signiﬁcantly. To avoid bias, the causal and acausal parts of correlation functions need to be treated separately. We illustrate this by applying seismic interferometry to ﬁeld data from a large-N array where a wind farm is present within the array


Introduction
Seismic interferometry is a well-established technique to estimate wavefields propagating between pairs of stations from recordings of ambient seismic noise (Nakata et al., 2019, and references therein). These wavefields are commonly used to image (e.g., Lin et al., 2008;de Ridder and Biondi, 2015;Schippkus et al., 2018) and monitor (e.g., Wegler and Sens-Schönfelder, 2007;Brenguier et al., 2008;Steinmann et al., 2021) Earth's structure. For a uniform distribution of uncorrelated noise sources, the wavefield that emerges from crosscorrelation of seismic records between two stations is closely related to the Green's function between them. This relation may be derived by assuming a diffuse wavefield (Lobkis and Weaver, 2001), equipartion of energy across seismic wave modes (Weaver, 2010), or sources on a boundary surrounding the two stations (Wapenaar et al., 2005).
In this study, we investigate the case where the presence of an additional isolated source violates these assumptions and introduces an additional contribution to cross-correlation functions. We consider vertical component recordings of surface waves. In the following, we demonstrate the isolated-source contribution using data from a large-N deployment in the Vienna basin, Austria, derive the expected behaviour of this contribution, compare our predictions with observations from the large array, and explore what impact the second contribution may have in practice. * Corresponding author: sven.schippkus@uni-hamburg.de

Cross-correlation of the recorded wavefield
We use data from 4907 seismic stations with ∼200m inter-station spacing in the Vienna basin, Austria ( Fig.  1). Stations were deployed in March 2019 as part of a seismic exploration survey by OMV E&P GmbH and recorded data continuously over four weeks. This deployment is similar to the one described in Schippkus et al. (2020), with comparable instruments -several co-located 10 Hz geophones (Sercel JF-20DX) per station, stacked and recorded with AutoSeis High Definition Recorders -and in an area that is partly overlapping with the previous deployment towards the Southeast. Therefore, the same sources of seismic noise characterised by Schippkus et al. (2020) are also present in this data: wind farms, railway tracks, roads, and oil pumpjacks, among others. Schippkus et al. (2020) already hinted at the potential impact of strong isolated sources on correlation functions in this region. To investigate the impact of these sources, we compute cross correlations between all stations and a master station at location r M in the center of the deployment (Fig. 2a). The seismograms were spectrally whitened and cut into 1 hr-long windows before cross-correlation, stacked linearly after crosscorrelation, and bandpass-filtered from 0.5 to 1.0 Hz. There were no significant earthquakes globally or regionally during the recorded timeframe. A movie of correlation function amplitudes over time is provided in the electronic supplementary material. tinct contributions to the correlation wavefield. First, there is a contribution converging onto the master station r M at acausal lapse times τ < 0 and diverging at causal lapse times τ > 0 (Fig. 2a). This is the expected behaviour that commonly arises from seismic interferometry under the assumptions described above (e.g., Lin et al., 2008). In addition, there is a second contribution emerging from a location r N in the Northeast of the deployment at τ ≈ −12 sec that propagates only outwards from r N . The center of this wavefield contribution coincides with the location of the wind farm Prottes-Ollersdorf (red crosses in Fig. 1), the strongest and most consistent source of anthropogenic noise at these frequencies in the region (Schippkus et al., 2020). There are other wind farms in the region (blue crosses in Fig. 1), which do not appear to excite a significant contribution to the correlation wavefield. This is consistent with previous observations that these wind turbines produce much lower seismic energy (Schippkus et al., 2020). Similarly, the other anthropogenic noise sources in the region also appear to be negligible. Wind turbine towers excite seismic energy at frequencies related to the eigenfrequencies of the towers and passing frequency of the rotor blades, including the in the range of 0.5 to 1.0 Hz (Neuffer et al., 2021). In the next section, we derive the behaviour of the contribution by the wind turbines to the cross correlations.

Cross-correlation in the presence of an isolated noise source
We consider a wavefield that is generated by a combination of noise sources on a closed boundary S surrounding the array and an isolated noise source within the boundary with noise spectrum N I (ω) at location r N (Fig. 2). The treatment of this section is formulated in the frequency domain. We assume that the noise sources on the boundary have equal power spectrum |N B | 2 , and that the noise generated at different locations are uncorrelated. This means that where · · · denotes the expected value. We also assume that the noise on the boundary and the noise from the isolated noise source with spectrum N I is uncorrelated, hence The wavefield is excited by the superposition of noise sources at the boundary S and the isolated noise source, hence The cross correlation of the wavefield at location r with the wavefield at the master station at r M is given by Because of expression (1) the double integral in the first term reduces to a single integral, and because of equation (2) the second and the third term in equation (4) vanish, hence Note the symmetry between the contribution of the surface sources and the contribution of the isolated source.
The surface integral in the first term can be rewritten using equation (11) of Wapenaar et al. (2005), which in the notation of this paper is given by G(r, r M ) + G * (r, r M ) = (2/ρc) S G(r, r )G * (r M , r )d 2 r , hence equation (5) can be written as The first term on the right hand denotes the superposition of the Green's function and its time-reversed counterpart. These terms usually arise in seismic interferometry. The second term on the right hand side describes an additional contribution to the crosscorrelation of the wavefield that is caused by the isolated noise source. We analyse the kinematics of this term in the next section.

Figure 2
Snapshots of cross-correlation function amplitudes in the presence of an isolated source at different lapse times τ = [−5, 0, 5] sec. The white triangle marks the master station r M , the red cross marks the approximate location of the isolated source r N . a) Correlation functions from four weeks of data, bandpass-filtered from 0.5 to 1.0Hz. The isolated source induces a contribution centered on r N . b) Modelled correlation functions for the two contributions by sources on a boundary and by the isolated source (eq. 9) predict the observations.

Kinematics of the isolated noise source contribution
The surface wave Green's function is, in the far field, proportional to with wavenumber k (Aki and Richards, 2009). Thus, in the far field the last term in expression (6) satisfies which gives an arrival at lapse time for a homogeneous medium with velocity c. Note that for a given master station |r M −r N |/c = constant. Equation (9) shows that all locations r with the same distance to r N have the same arrival time τ (r); the travel time of the contribution to the correlation wavefield induced by the isolated source is constant on a circle centered on r N . This contribution emerges from r N at and reaches the master station at τ (r = r M ) = 0.
To understand the relation between the waveforms described by the first term of equation (6) and the additional term, we analyze the arrival time of these waves on on a line from r M to r N . Take the x-axis to point in the positive direction from r M to r N and consider points on the line between these locations, hence x M < x < x N . For a given location x, the acausal wave described by the term G * (r, r M ) gives an arrival at time t = −|r − r M |/c = −(x − x M )/c. This means that for a given time t, the acausal direct wave is located at (Note that since t < 0, x > x M .) The additional arrival due to the isolated noise source gives for a location x according to expression (9) an arrival at t = ( This means that for a time t the wavefronts from the acausal direct wave and the contribution from the isolated noise source are at the same location at the line from r M to r N . Geometrically speaking, the incoming wave to r M and the outgoing wave from r N touch at the line from r M to r N . Similarly, the contribution by the isolated noise source touches the causal wave described by the term G(r, r M ) for locations x < x M . This behavior is confirmed by the touching wavefronts in Figure  2 and the supplemental movie. Note that there is no acausal contribution in the second term of expression (6), because the original wavefield induced at r N only propagates in one direction (away from r N ), in contrast to the wavefield emitted at the boundary, which propagates in all directions. Therefore, the contribution to the correlation wavefield by the isolated source has no energy at τ (r) < −|r M − r N |/c.
We model the described kinematics and compare against our observations (Fig. 2). We approximate the wind farm Prottes-Ollersdorf as a single source and assume that both the boundary sources and the isolated source emit the same Ricker wavelets. For demonstration purposes, we assume a constant medium velocity c ≈ 550 m/s, estimated from the time the isolatedsource contribution emerges τ (r = r N ) and the distance |r M − r N |. We do not consider amplitude effects. Our model explains the observed contributions to the correlation wavefield.

Velocity measurement errors due to interference
Because the wavefronts from the two contributions touch and have the same wavelengths, they interfere. Along the line connecting r N and r M they are exactly in phase, and show varying degrees of constructive and destructive interference away from this line (Fig. 2). This behaviour implies that measurements on crosscorrelation functions may be adversely affected in the presence of an isolated source for station pairs not on this line. In a standard ambient noise tomography application, travel times of seismic waves are measured between all station pairs from correlation functions and inverted for maps of seismic wave speed. We demonstrate the impact the isolated source has on such measurements by measuring group travel times from the modelled correlation functions (Fig. 2b). From these measurements we compute relative groupvelocity measurement errors (Fig. 3). Two cases are investigated: one where the isolated source induces a contribution in the correlation wavefield with 25% higher amplitudes than the contribution due to sources on the boundary (Fig. 3a), and one where the boundary sources produce the stronger contribution (also 25%, Fig. 3b).
In the first case, the measurement errors vanish only along the line connecting r N and r M where the two contributions are in phase (Fig. 3a). Away from this line, measurement errors increase to infinity (apparent travel times of 0) for stations r with |r − r N |= |r M − r N |. In practice, velocity measurements deviating significantly from expected values are commonly classified as outliers or attributed to spurious arrivals and discarded. Our results show that measurement errors of at least 10% occur for the majority of station pairs in the case of a stronger isolated source.
In the case of a weaker isolated source, we find a distinct pattern of measurement errors of several percent (Fig. 3b). Such measurement errors would likely not be identified as clear outliers or spurious arrivals and could bias results. To illustrate why this pattern occurs, we show the group travel time measurements at five stations ( Fig. 3e-g, red circles in Fig. 3b). Starting at the line connecting r N and r M , we find that both contributions are in phase, resulting in no error (Fig. 3c).

Figure 3
Group-velocity measurement errors due to the interference between the two contributions to the correlation wavefield. We cap the colormap at 10% error for illustration purposes. a) Errors if the correlation wavefield induced by the isolated source r N has higher amplitudes. Interference of the two contributions results in significant measurement errors away from the line connecting r N and r M . b) Errors if the correlation wavefield induced by the sources on the boundary has higher amplitudes. Significant errors due to interference. c-g) Picked group arrivals on correlation functions for b). These show correlation function contributions by the two types of sources (dashed grey lines), sum of the two contributions (thick grey line), the sum's envelope (blue line), theoretical arrival time (red dashed line), and picked arrival time (red dot). Note the wider time window in e) and its zoom-in e').
As we increase distance to this line, a slight shift between the two contributions shifts the envelope's peak towards lower lapse time, resulting in a higher-velocity estimate (Fig. 3d). This error increases until another band with zero error (Fig. 3e). This band exists because destructive interference decreases amplitudes to values lower than the acausal part of the correlation function, which is caused only by the boundary sources at this location. The travel time is then automatically picked on the acausal side where no interference occurs (Fig.  3e). If the travel time was picked in the causal part instead, interference would result in negative velocity errors (zoom-in Fig. 3e'). At a certain distance, the two contributions interfere constructively again (Fig. 3f), resulting in a bias similar to the case in Figure 3d. Finally, as the two wavefields separate, no interference occurs and the envelope of the stronger contribution to the correlation wavefield is picked; in this case the contribution of the boundary sources (Fig. 3g). This also explains the behaviour in the first case, where the isolated source dominates the measurement away from the line simply due to higher amplitudes.
The distribution of errors for both cases depends on relative amplitudes of the two contributions, source terms, frequency range, and the locations of r M and r N . With knowledge of these factors, measurement errors can be avoided. One straightfoward strategy is to avoid measuring where interference occurs by selecting which side of the correlation functions to measure ondepending on the geometry of r, r N and r M -in com-bination with a windowing function around expected arrival times. In the case of a stronger contribution by the boundary sources (Fig. 3b) selecting the side of the correlation function without interference is sufficient (Suppl. Fig. S1b). In the case of a strong isolated source, an additional windowing function is necessary (Suppl. Figs. S2, S3). See supplementary material for more details.

Discussion
We describe the contribution of an isolated noise source to the cross-correlation wavefield in seismic interferometry and how it relates to the contribution by boundary sources. Our derivation predicts the observed correlation wavefield (Fig. 2). In the following, we discuss the implications our results have for studies based on seismic interferometry and how this work may be expanded upon in the future.
The dataset in this study is not the first to record isolated noise sources that are used in the context of seismic interferometry. Zeng and Ni (2010) located an isolated source at primary microseism frequencies near Kyushu Island, Japan. Droznin et al. (2015) used crosscorrelation of continuous recordings of volcanic tremor to estimate their location. Retailleau et al. (2017) investigated spurious arrivals in correlation functions to locate noise sources near Iceland at ∼ 20 sec. Dales et al. (2017) exploited the correlation wavefield contribution from continuously operating ore crushers for monitor-ing of an underground mine. Brenguier et al. (2019) proposed to use body waves from train signals excited in the stationary phase of two arrays for structural monitoring of a fault between the two arrays. The crucial feature these sources have in common is that they are fairly localised and excite seismic energy repeatedly, similar to the wind farm in our dataset. In previous studies that use such sources, the correlation wavefield has often been dominated by the isolated source contribution, masking the contribution by boundary sources (Droznin et al., 2015;Dales et al., 2017;Brenguier et al., 2019). In other cases, both contributions have comparable amplitudes and the isolated source contribution arrives earlier than the expected direct wave (Zeng and Ni, 2010;Retailleau et al., 2017).
Signals that arrive before the expected direct wave in correlation functions are often called "spurious" arrivals (Snieder et al., 2006(Snieder et al., , 2008. The ambient seismic noise community often distinguishes two kinds of spurious arrivals: those that are induced directly by isolated noise sources (this study; Zeng and Ni, 2010;Retailleau et al., 2017), and those that emerge from uncancelled cross terms in correlation functions (Snieder et al., 2006;Colombi et al., 2014;Li et al., 2020). So far, cross terms in correlation functions have been observed between direct and reflected body waves (Li et al., 2020;Colombi et al., 2014). In principle, there may also be uncancelled cross terms between boundary sources (in violation of equation 1) but we are not aware of any field data example of this. In any case, understanding the cause of spurious arrivals (be they from isolated noise sources or direct-wave to reflected-wave cross terms) is necessary to exploit them for information. In this study, we investigate the behaviour of surface wave contributions induced directly by isolated noise sources and see no evidence for contributions due to uncancelled cross terms. Isolated source contributions to the correlation wavefield always emerge at negative lapse time for any chosen master station and propagate only outwards (Eq. 10, Fig. 2). These additional arrivals often manifest in distance-vs-lapse-time plots of correlation functions as nearly parallel (depending on velocity structure and exact geometry) to the causal direct arrivals emitted from the master station (see e.g., Zeng and Ni, 2010). The spurious arrivals exploited by Retailleau et al. (2017) show the same behaviour but reversed in time due to a different convention during processing, i.e., taking the timereversed signals of the receiver stations instead of the master station for cross-correlation.
In this study, we also recover the two different contributions to the correlation wavefield, by the isolated noise source and by the boundary sources, simultaneously. Expression (6) shows that for both contributions to the correlation wavefield to have comparable amplitudes, the source terms must have the "right" ratio of energy. For our data, both contributions emerge clearly only with spectral whitening applied, i.e., normalisation of energy across frequencies. Without spectral whitening, the correlation wavefield is dominated by the contribution of the wind farm Prottes-Ollersdorf, similar to how the 26s microseism biases correlation functions in Bensen et al. (2007). It is likely that whiten-ing is successful on our data because wind turbines excite seismic energy most effectively at specific frequencies related to the eigenmodes of the wind turbine towers (Neuffer et al., 2021), whereas other sources of ambient noise in the region excite energy over a wider frequency range at lower energy levels (Schippkus et al., 2020). Normalising the energy levels across frequencies changes their relative strength to be comparable in the wideband correlation functions we investigate here. Early tests have shown that using only time windows with wind speeds below the minimum operation specifications of the wind turbines in the wind farm Prottes-Ollersdorf, cross-correlations show a reduced but not eliminated wind farm contribution. Additional contributions to the correlation wavefield may also occur at lower frequencies where the presence of isolated sources is usually not considered, e.g., near the secondary microseism band (Zeng and Ni, 2010;Retailleau et al., 2017). Our analysis demonstrates the contribution of an isolated noise source can have significant impact on travel-time measurements (Fig. 3), which may be missed if one is unaware of the presence of an isolated source. This applies in a similar manner to measurements of amplitudes or phase velocities, as can be seen from Figure 3c-g.
The basic approach we propose to avoid travel-time measurement errors requires a nearly symmetric contribution to the correlation wavefield by the boundary sources, i.e., an even distribution of boundary sources (Snieder et al., 2008). In real-world applications, strongly asymmetric correlation functions with sufficient signal-to-noise ratio on only one side are common (e.g., Brenguier et al., 2008;Retailleau et al., 2017;Schippkus et al., 2018). If that side also is the side that contains the contribution by the isolated source, our proposed strategy is not applicable. In the context of tomography, one may still achieve sufficient coverage of measurements when applying a windowing function. Related to this, the causal and acausal parts of correlation functions are often stacked ("folded") to increase signal-to-noise ratio (e.g., Lin et al., 2008;de Ridder and Biondi, 2015;Schippkus et al., 2018). In the presence of an isolated noise source, folding correlations effectively forces the asymmetric contribution of the isolated source to become symmetric. In the case of a stronger boundary source contribution, this can result in too slow group velocity measurements for some station pairs (negative errors in Fig. S4b), an effect that was entirely avoided by not folding (Fig. 3). While a windowing function may still be applied, such a function is not sufficient to eliminate all errors (Fig. S2). Folding prevents selection of the appropriate side of the correlation function for measurement and results in irreconcilable errors. A related approach for stabilising velocity measurements is to measure on both sides of the correlation function and compute the mean, often combined with a quality criterion based on consistency (e.g., Stehly et al., 2009;Boué et al., 2014;Zigone et al., 2015). This approach is similarly adversely affected in the presence of an isolated noise source. The considerations above are also instructive for deployments where receiver stations are only available on the side of the master station away from the isolated noise source, e.g., in a scenario where ocean noise acts as an isolated source with the master station near the coast and all receiver stations further inland. This resembles the geometry in Figure 3 for stations to the Southwest of r M , i.e., for station pairs where r N is located within the Fresnel zone (Wapenaar et al., 2010). In such cases, folding of correlations or measuring on both sides of correlation functions may be used to increase signal-to-noise ratio without introducing additional measurement errors but this depends on the exact geometry of r, r M , and r N . Without detailed knowledge of r N , we advise against folding correlations or similar post-processing.
Isolated noise sources may also have significant implications for monitoring applications that exploit the coda of correlation functions. While the direct waves of both contributions to the correlation wavefield only interfere for certain station pair geometries (Fig. 2), coda waves of both contributions can overlap and interfere for a larger range of geometries. Because correlation functions contain the sum of multiple contributions to the correlation wavefield (Eq. 6), changes in the strength of the isolated source over time could induce apparent velocity changes due to interference, similar to how velocity measurement errors on the direct wave depend on relative amplitude (Fig. 3). However, this would likely be accompanied by a drop in correlation coefficient, which can indicate a change in source distribution and is often used as a quality criterion (e.g., Wegler and Sens-Schönfelder, 2007). Additionally, the origin of the coda wavefield dictates its spatial sensitivity (Margerin et al., 2016). If the origin is misattributed, it may lead to misinterpretation of results. Isolated noise sources that move over time can also lead to bias in measurements of velocity variations and their spatial interpretation and should be considered carefully (Hadziioannou et al., 2009). Similar to the strategy for traveltime measurements described above, careful coda window selection, based on the asymmetry of the isolated source contribution, may help avoid these effects also for monitoring applications.
We treat the wind farm Prottes-Ollersdorf as a single source in our derivation and modelling (Fig. 2). When considering multiple isolated noise sources, the derivation straightforwardly gives rise to a single contribution for each of those sources, assuming they are uncorrelated. Indeed, when we consider each wind turbine in the wind farm separately, the fit with observed correlation functions appears to improve (Fig. S5). This suggests that knowledge about the presence and characteristics of isolated sources may be used to remove their contributions and achieve correlation functions that are less impacted by local sources. Multiple isolated noise sources complicate the estimation of velocity measurement errors due to further interference between each individual source contribution. Above, we investigate the edge case of a single source, i.e., the worst-case scenario. The other edge case of isolated noise sources at every possible location approaches the condition of sources on a closed boundary, which would eliminate any isolated source contributions and reduce errors to zero. In practice, the real impact most likely lies some-where in between.
In our analysis, we have only considered vertical components, because only vertical component recordings are available in our dataset. Because the two contributions to the correlation wavefield propagate in different directions for some station pairs, questions arise about the interaction between differently polarised wave types with different velocities when analysing horizontal component recordings, i.e., potential interference of Love and Rayleigh waves. They may not be wellseparated and may interfere to affect measurements, similar to the above. Defining an appropriate windowing function may prove more difficult in that case. The case of horizontal components is a potential target for future works.
We demonstrate that different contributions to the correlation wavefield can carry similar energy and interfere. For certain station geometries this leads to significant travel-time measurement errors, if not properly accounted for. Ideally, studies that rely on seismic interferometry should always consider the possibility of isolated noise sources in their data and how such sources may impact results, especially at frequencies where anthropogenic sources dominate.

Data Availability and Resources
Seismograms used in this study were collected using an array for industrial exploration by OMV E&P GmbH. Due to a non-disclosure agreement with OMV E&P GmbH, the authors cannot make this data publicly available. The supplemental material includes a movie of cross-correlation function amplitudes over time, more details on the proposed strategy to avoid measurement errors, and the case when mulitple isolated noise sources are considered. Colormaps used for illustrations are perceptually uniform (Crameri, 2021).

S1 Avoiding erroneous group-velocity measurements
One strategy to avoid erroneous group-velocity measurements (Fig. 3) is to carefully select which parts of the correlation functions to measure velocities on. The goal is to avoid all cases where interference occurs and may bias measurements. We propose to make the selection in two steps: first a rough causal/acausal selection based on geometry (this is already sufficient in the case of a stronger boundary source contribution), and second a windowing function around expected arrival times, which is necessary if the isolated source causes the stronger contribution.
In a first step, we select the line perpendicular to the line connection r M and r N and going through r M (dashed grey line in Fig. S1). West of this line, we measure group travel times on the acausal part of the correlation functions, and East of the line on the causal part. Because the correlation wavefield contribution emerging from r N emerges at negative lapse time τ (r = r N ) = −|r M − r N |/c, there can be no interference to the West of the defined line.
To the East of the line, where we measure on the causal part of the correlation function, the resulting errors depend on which contribution has higher amplitude. In the case of a higher contribution by the boundary sources (Fig. S1b), we have avoided all measurement errors except for stations very close to r M . These remaining errors occur for stations where |r − r M |≤ w, with w the width of the wavelet, due to interference of causal and acausal parts of the correlation functions. Station pairs with distances shorter than a few wavelengths are commonly excluded in studies of seismic interferometry for this exact reason.
For the case of a stronger isolated source contribution (Fig. S1a), a circle of correct velocity measurements emerges to the East of the line, because this contribution propagates through the circle at negative lapse times. Because we pick at positive lapse times on this side, we pick the undisturbed contribution by the boundary sources. Outside of this circle and up to the defined line, measurements are affected by the contribution of the isolated source, because it has higher amplitudes.
A second criterion helps avoid those remaining measurement errors. We define a symmetric windowing function around the master station's location r M of expected arrival time windows and pick only within this windowing function (Fig. S2). We choose the half-width of the Ricker wavelet as the window width. In practice, due to unknown velocity structure, a wider windowing function would be needed. We show the the impact of the narrow windowing function to illustrate the best-case scenario one can reach with only a windowing function. The case of a stronger isolated noise source (Fig. S2) approaches the measurement errors one finds for a weaker isolated noise source (Figs. 3b and S2b).
Finally, if we combine the two criteria, we avoid velocity measurement errors for all station pairs except the stations near r M , as described above (Fig. S3).
A different strategy may be to define the windowing function around the isolated noise source instead of the master station. Still, one would need a two-step approach and this would require more precise knowledge of the isolated source location. The strategy proposed above relies on the fact that the isolated source contribution is asymmetric, whereas the boundary source contribution is symmetric. If this is violated, a different strategy is necessary.