Local station correlation: large-N arrays and DAS

The use of cross-correlation between seismic stations has had widespread applications particularly in the exploitation of ambient seismic noise. We here show how the effects of a non-ideal noise distribution can be understood by looking directly at correlation properties and show how the behaviour can be readily visualised for both seismometer and DAS configurations, taking into account directivity effects. For sources lying in a relatively narrow cone around the extension of the inter-station path, the dispersion properties of the correlation relate directly to the zone between the stations. We illustrate the successful use of correlation analysis for both a large-N array perpendicular to a major highway and a DAS cable along a busy road. When considering cross-correlations, the co-array consisting of the ensemble of inter-station vectors provides an effective means of assessing the behaviour of array layouts, supplementing the standard plane-wave array response. When combined with knowledge of the suitable correlation zones for noise sources, the co-array concept provides a useful way to design array configurations for both seismometer arrays and DAS.


Introduction
The use of inter-station correlations to extract a surface wave component from the ambient noise field has been widely applied, and successful results have been achieved even when the conditions do not meet theoretical expectations (see, e.g., Nakata et al., 2019). In the ideal conditions of a uniform distribution of uncorrelated noise sources, the cross-correlation of seismic records between two stations is closely related to the Green's function for the path between them. A number of different derivations have been made with different assumptions, such as a diffuse wavefield (Lobkis and Weaver, 2001), energy equipartitioned among surface wave modes (Weaver, 2010), or sources on a boundary surrounding the two stations (Wapenaar, 2004; Wapenaar and Fokkema, 2006). A summary of this theoretical background is provided by Fichtner and Tsai (2019).
Here we approach the situation by looking directly at the cross-correlation of seismic records between two stations and examining how closely the result can approximate the Green's function with a non-ideal noise distribution. Our objective is to explore how best to exploit available noise sources when using high-density observations. For large-N arrays the distribution of stations is, in principle, under user control, although large-scale deployments are often made in the context of exploration or production (Chmiel et al., 2019), or seismicity monitoring (Dougherty et al., 2019), where the typical pattern is a regular rectangular grid. Where likely noise sources are well-characterised in advance, the array design can be adapted to their configuration and exploit their correlation properties.
* Corresponding author: brian.kennett@anu.edu.au
However, with Distributed Acoustic Sensing (DAS) the array of sensor points is confined to the line of the fibre-optic cable and there are strong directivity effects. When a cable is specifically deployed for an experiment, the array's configuration can be optimised with knowledge of likely noise sources, but often 'dark fibre' is used, exploiting existing telecommunication channels, and then the orientation of the cable can be important. Nevertheless, useful results can be achieved in circumstances that may appear unpropitious, such as a DAS cable running along a major highway (Yang et al., 2022).
We show how the properties of local correlation can be understood directly from the interaction of the influence of distributed sources, by analysing the nature of the cross-correlation between the seismograms at two separated stations. We then illustrate the application of such local station correlations to a large-N nodal experiment in southeastern Australia adjacent to a major highway and a wind farm, and to a DAS recording in an urban environment in Bern, Switzerland.

Inter-station correlations from distributed sources
We provide an outline of the theoretical development for inter-station correlation based on Chapter 6 of Kennett and Fichtner (2020), using a local coordinate system rather than spherical coordinates. We consider a situation with structure that depends solely on depth and represent the seismograms at each location in terms of a synthesis in frequency-slowness space. For simplicity we concentrate on a single frequency ω and consider the vertical component from an isotropic source with source spectrum M(ω). Then for a station at a distance X from a source we can represent the resulting seismogram in the frequency domain as an integral over slowness p of the response of the local stratified medium G_z(p, ω) multiplied by a horizontal phase term (equation 1). Here we have assumed that we can use the high-frequency asymptotic form for the horizontal phase dependence. For the same source recorded at stations 1 and 2 at distances X_1, X_2, the cross-correlation U_12(ω) is represented by a multiplication in the frequency domain (equation 2). A natural consequence of this relation is that the cross-correlation involves the difference in the phase of the contributions from the two stations, and it is this property that allows the emergence of path-related effects in the presence of many sources. Using equation 1, the cross-correlation can be written as a double integral over slowness (equation 3). We now introduce the distance between the two stations X_12 = |X_1 − X_2| − δX_12 and recast the second slowness integration in terms of the difference in slowness ζ = p − q, where q is the slowness variable of the second integral; this leads to equation 4. In this form for the cross-correlation between the two stations from a single source we are able to identify a phase component relating directly to propagation between the stations, exp[iωpX_12], which is modulated by a further slowness integral.
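The displayed forms of equations 1 to 4 are not reproduced in this copy. As a reading aid, the following is a schematic sketch of their structure, assembled only from the verbal description above. The amplitude factor a(p, X, ω), which stands in for the asymptotic spreading term and any frequency scaling, the choice of which station carries the complex conjugate, the omission of integration limits, and the assumption X_1 > X_2 are all assumptions made here and may differ from the original expressions.

```latex
\begin{align}
% (1) slowness synthesis with an asymptotic horizontal phase term
u(X,\omega) &\approx M(\omega) \int \mathrm{d}p \; a(p,X,\omega)\,
               G_z(p,\omega)\, e^{i\omega p X}, \\
% (2) correlation as a multiplication in the frequency domain
U_{12}(\omega) &= u(X_1,\omega)\, u^{*}(X_2,\omega), \\
% (3) double integral over the slownesses p and q
U_{12}(\omega) &\approx |M(\omega)|^{2} \int \mathrm{d}p \int \mathrm{d}q \;
               a(p,X_1,\omega)\, a^{*}(q,X_2,\omega)\,
               G_z(p,\omega)\, G_z^{*}(q,\omega)\,
               e^{i\omega (p X_1 - q X_2)}, \\
% (4) with q = p - \zeta and X_1 - X_2 = X_{12} + \delta X_{12}:
%     p X_1 - q X_2 = p X_{12} + p\,\delta X_{12} + \zeta X_2
U_{12}(\omega) &\approx |M(\omega)|^{2} \int \mathrm{d}p \;
               a(p,X_1,\omega)\, G_z(p,\omega)\,
               e^{i\omega p X_{12}}\, e^{i\omega p\,\delta X_{12}}
               \int \mathrm{d}\zeta \; a^{*}(p-\zeta,X_2,\omega)\,
               G_z^{*}(p-\zeta,\omega)\, e^{i\omega \zeta X_2}.
\end{align}
```

In this sketch the factor exp[iωpX_12] is the inter-station propagation term, exp[iωpδX_12] is the oscillatory term that depends on the source position, and the inner integral over ζ is the further slowness integral referred to in the text.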
Simplification occurs when we have contributions from a distribution of sources, because only the coherent part corresponding to the direct propagation path survives, and the remainder is eliminated by destructive interference. A broad distribution of sources is needed to achieve the suppression. The application of a stationary phase treatment to the integral over differential slowness, as in Snieder (2004), extracts the neighbourhood of ζ = 0. In consequence, the slowness of the arrivals that contribute to the net cross-correlation is the same at both stations and equation 4 reduces to a single integral over slowness. Full suppression of slowness contamination requires a good distribution of sources relative to the inter-station path (Halliday and Curtis, 2008). However, when the wavefield is dominated by fundamental mode surface waves, well separated in slowness from the other contributions, the requirements are less stringent.
The first integral in equation 4 includes a term exp[iωpδX_12] that depends on δX_12, the extent to which the inter-station distance X_12 deviates from the difference between the distances from each source to the two stations, |X_1 − X_2|. This oscillatory term is again suppressed by destructive interference, leaving just contributions where δX_12 ∼ 0, so that the paths from the effective sources to the two stations are approximately aligned with the inter-station path. Two such zones are present, stretching out from the two stations along the continuation of the inter-station path.
The summed cross-correlation over many sources reduces to a form representing a virtual source-receiver pair at the two stations with contributions from propagation in each direction (equation 5). Equation 5 has a similar form to the time derivative of the Green's function between the two stations, but the combination G_z(p, ω)G_z*(p, ω) replaces G_z(p, ω). The geometrical spreading term F(X) will not have a simple relation to the path, but will tend to be dominated by source contributions from near the two stations, and hence F(X) ∼ X_12. Surface wave contributions come from the poles of the integrand in equation 5. Their position in slowness, which controls dispersion, is unchanged from the Green's function, but the pole is now second order (rather than first order for the Green's function) and so the amplitude factor is modified (Kennett and Fichtner, 2020). Typically the dominant contribution comes from the fundamental mode and so, provided there are sources with a broad range of azimuths to the inter-station path to contribute to the net cross-correlation, the contribution from between the stations is emphasised and the dispersion for the fundamental mode can readily be extracted.

Geometric
We can examine the way that the cross-correlation field is built up by looking at the various contributions at a single frequency (Figure 1). We show two stations separated by 350 m in a simulation of local conditions. The effect of a source drops off quite rapidly with distance. So, if we consider the net geometrical spreading effects to the two stations, those sources close to the two stations dominate, even when we use a better approximation to the spreading function than the asymptotic form used in the theory above (Figure 1a). We have noted that the constructive interference condition emphasises those source locations for which the difference between the distances from the source to the two stations is close to the inter-station distance. In Figure 1(b) we represent this effect by plotting the ratio of the difference in distance to the path length, |X_1 − X_2|/X_12, from zero (black) to unity (white).
With the specification of frequency ω and slowness p_f we can display the phase effects through the function cos[ωp_f(|X_1 − X_2| − X_12)], as in Figure 1(c). We here use a frequency of 2 Hz and phase speed of 500 m/s (slowness = 0.002 s/m), typical of situations at local arrays. As would be expected, the zones approximately in line with the stations show slow variation in phase, but as the inclination to the path increases, variation is rapid and increasingly so at higher frequencies. It is the superposition of these rapidly varying phases that leads to destructive interference and the concentration of the cross-correlation on the inter-station path.
When all the contributions to the correlation are combined for a pair of seismometer stations, the total effect is as in Figure 1(d).The distance-match term has been applied here as a multiplier to the product of the geometric spreading and the phase variation.The dominant component comes from beyond the stations, but some contamination is possible from sources lying between the stations.
When considering the correlation of stations in a DAS array, additional factors have to be taken into consideration, because the strain-rate along the cable assigned to a reference point is averaged over a gauge length g around the point. For a Rayleigh wave with slowness p arriving with an inclination ψ relative to the cable, the gauge-length effect enters through a sine function whose argument depends on ω, p, g and ψ (e.g., Kennett, 2022). For a typical gauge length of 10 m and a phase speed of 500 m/s, the sine function does not impose much distortion for frequencies less than 10 Hz.
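The size of the gauge-length effect for the quoted values can be checked with a short calculation. The sketch below assumes the averaging factor has the common sinc form sin(x)/x with x = ωpg cos ψ / 2, which is what results from averaging a harmonic plane wave over a length g; the expression in Kennett (2022) may be normalised differently, and the function name `gauge_factor` is introduced here purely for illustration.

```python
import numpy as np

def gauge_factor(freq_hz, speed, gauge_len, psi=0.0):
    """Assumed gauge-length averaging factor sin(x)/x.

    x = omega * p * g * cos(psi) / 2 is half the phase change of the wave
    across the gauge length, measured along the cable.
    """
    x = np.pi * freq_hz * gauge_len * np.cos(psi) / speed
    # np.sinc(t) is sin(pi t)/(pi t), so the argument is rescaled by pi
    return np.sinc(x / np.pi)

# Values quoted in the text: 10 m gauge length, 500 m/s phase speed
freqs = np.array([1.0, 5.0, 10.0, 20.0])
factors = gauge_factor(freqs, speed=500.0, gauge_len=10.0)
for f, a in zip(freqs, factors):
    print(f"{f:5.1f} Hz  factor = {a:.3f}")
```

Under this assumed form the factor stays above 0.9 up to 10 Hz for in-line propagation, consistent with the statement that the distortion is small in that band.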
The most common local ambient noise is Rayleigh waves from anthropogenic activities such as traffic, and Love waves are less frequently encountered since they have a less favourable orientation effect. Because strain is a tensorial quantity, the effect of inclination depends on 2ψ and, to a good approximation for Rayleigh waves along a cable with uniform orientation, the response is proportional to cos²ψ. In consequence there is a strong dependence on the position of any source. The effect of the orientation factor cos²ψ for Rayleigh waves is displayed in Figure 1(e) and has a strong suppression effect for sources broadside to either of the stations. This factor modulates the response for the seismometer to give a total contribution shown in Figure 1(f), where there is a strong emphasis on sources nearly in-line with the cable.
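The individual factors described above can be evaluated on a grid of candidate source positions to give maps in the spirit of Figure 1. The following is a minimal sketch rather than the code used for the figure. It assumes a simple 1/sqrt(X_1 X_2) spreading proxy softened within `eps` metres of a station, a straight cable along the inter-station direction, and a DAS orientation weight equal to the product of cos²ψ at the two stations; the function name and these choices are illustrative assumptions.

```python
import numpy as np

def correlation_weights(src, sta1, sta2, freq_hz=2.0, speed=500.0,
                        das=False, eps=1.0):
    """Proxy contributions of sources at `src` (n, 2) to the correlation
    between stations `sta1` and `sta2` (coordinates in metres)."""
    src = np.atleast_2d(np.asarray(src, dtype=float))
    sta1 = np.asarray(sta1, dtype=float)
    sta2 = np.asarray(sta2, dtype=float)
    v1, v2 = src - sta1, src - sta2
    d1 = np.linalg.norm(v1, axis=1)          # X_1
    d2 = np.linalg.norm(v2, axis=1)          # X_2
    x12 = np.linalg.norm(sta2 - sta1)        # X_12
    omega_p = 2.0 * np.pi * freq_hz / speed  # omega * p_f

    # Assumed asymptotic spreading proxy, softened close to the stations
    spreading = 1.0 / np.sqrt(np.maximum(d1, eps) * np.maximum(d2, eps))
    # Distance-match ratio |X_1 - X_2| / X_12, between 0 and 1
    match = np.abs(d1 - d2) / x12
    # Phase term cos[omega p_f (|X_1 - X_2| - X_12)]
    phase = np.cos(omega_p * (np.abs(d1 - d2) - x12))
    total = match * spreading * phase

    if das:
        # Assumed: cable runs along the inter-station direction, and the
        # Rayleigh-wave orientation factor cos^2(psi) applies at each station
        axis = (sta2 - sta1) / x12
        cos1 = (v1 @ axis) / np.maximum(d1, eps)
        cos2 = (v2 @ axis) / np.maximum(d2, eps)
        total = total * cos1**2 * cos2**2

    return {"spreading": spreading, "match": match,
            "phase": phase, "total": total}

# Example: 350 m station separation, 2 Hz, 500 m/s, DAS configuration
x = np.linspace(-1000.0, 1000.0, 201)
xx, yy = np.meshgrid(x, x)
grid = np.column_stack([xx.ravel(), yy.ravel()])
maps = correlation_weights(grid, (-175.0, 0.0), (175.0, 0.0), das=True)
total_map = maps["total"].reshape(xx.shape)
```

The array `total_map` can be passed to any image-plotting routine; switching `das=False` gives the seismometer case without the orientation weighting.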
It is interesting to note that the application of the distance-match mask produces a net effect that has a strong resemblance to the source kernels derived by Sager et al. (2017). In a similar way we can simulate the structural kernel. Figure 2(a) displays the phase factor cos[ωp_f(X_1 + X_2 − X_12)], representing the difference between the phase accumulated in passage from the source to the two stations compared with that for the inter-station path. When the pattern in Figure 2 […] (Sager et al., 2017). It is thus possible to achieve an effective visualisation of the effects of local correlation with a simple implementation that can be adapted to the configuration of a distributed array or DAS cable.

Illustrations of inter-station correlations
In Figure 1 we see significant differences between the net effect of sources for the seismometer and DAS configurations.For the correlation of a pair of seismometers, the dominant contribution lies in a cone behind the stations with a significant width of potential useful zone.Contributions will be muted by the effect of geometrical spreading, but if sufficient noise sources are present over time, stacking will readily enhance the correlation functions.For correlations of channels along a DAS cable the strong orientation effects limit the zone of most effective sources.In this case, noise sources travelling beside the cable can be exploited to create stacked correlation functions.
We here illustrate the exploitation of the correlation properties of traffic dominated noise fields in two different configurations associated with independent experiments.
The first case is a nodal experiment adjacent to a major highway in southeastern Australia, where the array stretches perpendicular to the highway on a dry lake bed.The dispersion of Rayleigh waves across the nodes allows the delineation of the thickening sediments.The second case shows how a DAS cable running along a street in Bern, Switzerland can be used to extract correlation functions that provide insight into the nature of the noise field with secondary sources linked to road conditions and also to characterise the near-surface structure from Rayleigh wave dispersion.

Large-N array: Lake George experiment near Canberra, Australia
The Lake George nodal array was a short-term seismic experiment conducted on a then dry lake bed located ∼35 km northeast of the Australian capital city, Canberra (Figure 3). This experiment was conceived and led by Meghan Miller from the Australian National University. The nodal array included 97 three-component SmartSolo sensors recording continuously at a 250 Hz sampling rate, and was operated between December 2020 and January 2021 with an average inter-station spacing of 30-40 m. The array is mainly composed of five lines immediately adjacent to and perpendicular to the Federal Highway connecting Canberra to Sydney, and so recorded large amounts of traffic noise. Apart from the five lines, the array also included three nodes as a separate group about 500 m to the west of the nearest stations, to increase the array aperture. To the southeast, the array lies about 15 km away from the Capital Wind Farm, with a series of wind turbines operating continuously during the deployment.

Cross-correlations across the array
To process the traffic noise data, we take advantage of the open-source Python package NOISEPY (Jiang and Denolle, 2020), which is a high-performance tool designed specifically for large-N ambient noise seismology. In NOISEPY, the main noise data processing procedures generally follow the conventional workflow of Bensen et al. (2007) and are briefly described below. First, continuous noise data are down-sampled to a 60 Hz sampling rate, before they are cut into 4-hour-long traces. Each trace is further divided into 15-min segments with a 75% overlap between adjacent segments to increase the signal-to-noise ratio of the stacked cross-correlation functions at a later stage. Any 15-min segments with maximum amplitudes over 10 times the standard deviation of the amplitude within each 4-hour window are removed to reduce contamination from large transient signals (such as earthquakes). Second, the mean and trend of the remaining time series are removed before a taper and a 4-pole 2-pass Butterworth filter with corners at 0.05-28 Hz are applied. To further reduce the effects of large transient signals, each time series is normalized by the corresponding smoothed version produced using a moving average over a window length of 500 samples. The cross-correlation is then calculated in the frequency domain and a moving average with a window length of 20 samples is used to smooth the source and receiver spectra. Finally, the cross-correlations of the short time windows are linearly stacked for each station pair, generating about 4650 stacked cross-correlation functions across the array.
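The processing chain described above can be sketched with NumPy and SciPy alone. This is a simplified stand-in and does not call NOISEPY's own functions. The taper shape, the use of the running mean of the absolute amplitude for the temporal normalisation, and the division of the cross-spectrum by the smoothed amplitude spectra of both stations are assumptions about details the text does not fix, and all function names are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter1d
from scipy.signal import butter, detrend, sosfiltfilt
from scipy.signal.windows import tukey

FS = 60.0  # sampling rate after down-sampling (Hz)

def preprocess(seg, fs=FS, band=(0.05, 28.0), norm_win=500):
    """Detrend, taper, band-pass and running-mean normalise one segment."""
    seg = detrend(np.asarray(seg, dtype=float), type="linear")
    seg = seg * tukey(seg.size, alpha=0.05)        # assumed taper shape
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    seg = sosfiltfilt(sos, seg)                    # two-pass (zero-phase)
    envelope = uniform_filter1d(np.abs(seg), size=norm_win, mode="nearest")
    return seg / (envelope + 1e-20)

def cross_correlate(src, rec, fs=FS, smooth_win=20):
    """Frequency-domain correlation with smoothed spectral normalisation."""
    nfft = 2 * src.size                            # zero-pad against wrap-around
    s_spec = np.fft.rfft(src, nfft)
    r_spec = np.fft.rfft(rec, nfft)
    s_amp = uniform_filter1d(np.abs(s_spec), size=smooth_win, mode="nearest")
    r_amp = uniform_filter1d(np.abs(r_spec), size=smooth_win, mode="nearest")
    spec = np.conj(s_spec) * r_spec / (s_amp * r_amp + 1e-20)
    cc = np.fft.fftshift(np.fft.irfft(spec, nfft)) # zero lag at index nfft // 2
    lags = (np.arange(nfft) - nfft // 2) / fs
    return lags, cc

def stacked_correlation(trace1, trace2, seg_len, overlap=0.75, max_ratio=10.0):
    """Linear stack over overlapping segments, skipping large transients."""
    step = int(seg_len * (1.0 - overlap))
    std1, std2 = np.std(trace1), np.std(trace2)
    lags, stack, count = None, None, 0
    for i0 in range(0, len(trace1) - seg_len + 1, step):
        a, b = trace1[i0:i0 + seg_len], trace2[i0:i0 + seg_len]
        if np.abs(a).max() > max_ratio * std1 or np.abs(b).max() > max_ratio * std2:
            continue                               # reject transient-dominated segment
        lags, cc = cross_correlate(preprocess(a), preprocess(b))
        stack = cc if stack is None else stack + cc
        count += 1
    if count == 0:
        raise ValueError("all segments were rejected")
    return lags, stack / count
```

For the configuration in the text, `seg_len` would be 15 min × 60 Hz = 54000 samples applied to each 4-hour trace. With this sign convention a positive lag corresponds to energy arriving at the second station after the first.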
Since the above procedures are applied to all three components of each station, each station pair has nine components of cross-correlation functions in the R-T-Z system, i.e., RR, RT, RZ, TR, TT, TZ, ZR, ZT, and ZZ (with the first letter denoting the component of the source station and the second letter the receiver station), forming a complete correlation tensor.
We focus on the frequency band of 1-10 Hz to take advantage […] Strong asymmetric features can be observed, with the negative lag displaying generally higher-frequency energy than the positive lag. This is due to the dominant origin of traffic noise from the west. Though strong coherency exists in the correlations through time, some variations can also be observed, possibly due to the changing traffic conditions on the highway. To quantify the similarities, we compute the correlation coefficients of each trace relative to the final stacked (i.e., mean) cross-correlation function (Figure 4b). When the traffic is active, the resulting correlations are almost the same as the final stack, with the associated correlation coefficients mostly larger than 0.9. During quiet times, particularly the Christmas and New Year holidays, the 4-hour cross-correlation functions are significantly different from the average, with correlation coefficients as low as 0.5. A comparable analysis for a pair of stations on opposite sides of the highway (LG069-LG094) is shown in Figure S3 of the Supplementary Material. The temporal pattern is in phase with that in Figure 4, indicating the greater importance of traffic conditions than station location.
To further demonstrate the time dependence of the cross-correlation functions, we stack the cross-correlations using different time periods and metrics and summarize the resulting waveforms in Figure 5. As can be seen from the figure, the stack over the 4-hour time window with busy traffic conditions (between 11 am and 3 pm each day) is almost the same as the final stack, as well as the stack using waveforms with high correlation coefficients relative to the final stack, while it is distinct from the stack using a 4-hour time window of the same length but crossing midnight (between 11 pm and 3 am). Such behaviour shows little frequency dependence within the 1-10 Hz band investigated here (see Supplementary Material Section S1.2). This further indicates that the coherent contributions from the traffic noise dominate the final stacked cross-correlation.
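The similarity measure and the selective stack can be sketched in a few lines. The layout of the input (one 4-hour cross-correlation function per row) and the function names are assumptions for illustration. The coefficient is the standard Pearson correlation between each window and the mean over all windows, and the default threshold of 0.7 is the value quoted for the S2 stack in Figure 5.

```python
import numpy as np

def corrcoef_to_stack(ccfs):
    """Correlation coefficient of each window against the mean stack.

    ccfs has shape (n_windows, n_lags).
    """
    ccfs = np.asarray(ccfs, dtype=float)
    ref = ccfs.mean(axis=0)                       # final (mean) stack
    a = ccfs - ccfs.mean(axis=1, keepdims=True)   # demeaned windows
    b = ref - ref.mean()                          # demeaned reference
    coeff = (a @ b) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b))
    return ref, coeff

def selective_stack(ccfs, coeff, threshold=0.7):
    """Stack only those windows whose coefficient exceeds the threshold."""
    keep = coeff > threshold
    return np.asarray(ccfs)[keep].mean(axis=0), keep
```

Stacks over particular hours of the day follow the same pattern, with `keep` replaced by a boolean mask built from the window start times.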

Enhancing Rayleigh wave signals
Due to the complex waveforms of the cross-correlation functions, we enhance the Rayleigh wave signals assuming retrograde elliptical particle motion by manipulating the cross-components of the correlation tensor. This is achieved by following equation 3 of Nayak and Thurber (2020). We refer to the resulting correlation function as the M0 component. The general idea behind this process is to correct the different initial phases of the fundamental-mode Rayleigh wave (assumed to have retrograde motion) on different cross-components and stack them to boost the signal. A similar approach has also been applied in van Wijk et al. (2011), Takagi et al. (2014), and Gribler and Mikesell (2019). We also performed an equivalent procedure to enhance the prograde motion using the cross-components of the correlation tensor but found generally weak coherent energy. This suggests that surface wave energy is dominated by retrograde motion in the correlation functions from the traffic noise.
Figure 5: Comparison of stacked cross-correlation functions using different time periods and metrics. S1 is the stack of all cross-correlation functions (same as the black line in Figure 4a). S2 is the stack of all cross-correlation functions with a correlation coefficient larger than 0.7 relative to S1. S3 is the stack of correlation functions over the time window 11 am-3 pm each day. S4 is the stack of correlation functions over the time window 11 pm-3 am each day.

Dispersion extraction
To extract the dispersion information, we apply slant-stacking in the c − ω domain to the M0 correlation functions,
$$F(c, \omega) = \sum_{r=1}^{N} e^{i\phi_r}\, e^{i\omega |x_r - x_s| / c(\omega)}, \tag{8}$$
where φ_r denotes the phase of the cross-correlation function between source station s and receiver station r at an angular frequency ω, c is the phase velocity, x is the station location, and |x_r − x_s| is the distance between the station pair r and s. F is the sum of the phase-shifted cross-correlation functions over a total of N receiver stations in the neighbouring region of each source station (x_s). In spite of its simplicity, this method has been demonstrated to be effective for extracting short-period dispersion data (< 5 s) from dense arrays to characterize sedimentary structures (e.g., Nayak and Thurber, 2020; Jiang and Denolle, 2022).
We adopt a two-step approach to construct a phase diagram for a single site via the slant-stacking method represented by equation 8. Firstly, we define a receiver bin with a radius of 150 m around each station and pair each station within the receiver bin with a virtual source station that is at least 300 m away from the bin centre, to respect the plane-wave assumption underlying equation 8. The slant-stacking using these cross-correlations generates one phase diagram for this receiver bin. Secondly, we linearly stack the phase diagrams from all virtual sources satisfying the above distance criteria to form the final image for that receiver bin. Figure 6(a) shows one example of the final phase diagram for the receiver bin centred around LG015, and clear, coherent and relatively simple dispersion energy can be observed over the 0.1-0.4 s period range. We then extract the dispersion data by tracking the maximum envelope of the stacked data and quantify the uncertainty using the band of 90% of the maximum energy at each period. We conduct the above procedure for each receiver bin, and the 30-40 m inter-station spacing allows us to extract high-quality dispersion data in the 0.1-0.4 s period range across most of the array (except the western edge, with sparse stations as well as a topographic change). Figure 6(b) shows the period-dependent phase velocity variations across the five lines of the array, with the major feature being the increasing period range for low velocities when moving to the east. This reflects the gradual thickening of a slow and weak regolith layer in the region from west to east, as the stations move out onto the dry lake bed.
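The slant-stack for one virtual source can be sketched as follows. This is an illustration under stated assumptions, not the processing code used for Figure 6. It takes correlation functions whose time axis starts at zero lag, uses NumPy's FFT sign convention (a delay τ gives a phase of −ωτ, so the compensating factor exp(iωd/c) adds coherently at the true phase velocity), and normalises by the number of receivers; the synthetic test data, the trial velocity grid and all names are hypothetical.

```python
import numpy as np

def slant_stack(ccfs, dists, fs, speeds, freqs):
    """Phase diagram |F(c, omega)| / N for one virtual source.

    ccfs   : (n_rec, n_t) correlation functions, zero lag at index 0
    dists  : (n_rec,) source-receiver distances |x_r - x_s| in metres
    speeds : trial phase velocities c (m/s); freqs : frequencies (Hz)
    """
    dists = np.asarray(dists, dtype=float)
    speeds = np.asarray(speeds, dtype=float)
    spec = np.fft.rfft(ccfs, axis=1)
    f_axis = np.fft.rfftfreq(ccfs.shape[1], d=1.0 / fs)
    image = np.zeros((speeds.size, len(freqs)))
    for j, f in enumerate(freqs):
        k = np.argmin(np.abs(f_axis - f))          # nearest frequency bin
        omega = 2.0 * np.pi * f_axis[k]
        unit_phase = np.exp(1j * np.angle(spec[:, k]))        # e^{i phi_r}
        shifts = np.exp(1j * omega * dists[None, :] / speeds[:, None])
        image[:, j] = np.abs(shifts @ unit_phase) / dists.size
    return image

def ricker(t, f0):
    """Ricker wavelet with peak frequency f0, used for the synthetic test."""
    a = (np.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

# Synthetic check: non-dispersive arrival at 400 m/s across ten receivers
fs, c0 = 60.0, 400.0
t = np.arange(240) / fs
dists = np.arange(300.0, 600.0, 30.0)
ccfs = np.array([ricker(t - 1.0 - d / c0, 5.0) for d in dists])
speeds = np.arange(200.0, 801.0, 10.0)
freqs = np.arange(3.0, 8.5, 1.0)
image = slant_stack(ccfs, dists, fs, speeds, freqs)
picked = speeds[np.argmax(image, axis=0)]          # velocity pick per frequency
```

The second step of the procedure in the text corresponds to averaging such images over all virtual sources that satisfy the distance criterion for a given receiver bin.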

DAS correlation along Bern Street, Switzerland
This DAS deployment was a pilot experiment conducted in Bern, Switzerland in November 2019 by a group from ETH Zürich, under the direction of Andreas Fichtner. The experiment ran for 2 weeks and utilised 'dark' telecommunication fibre, that is, currently unused fibres within telecommunication fibre cables (access provided by the SWITCH foundation). The fibre-optic cables are believed to be housed in a plastic conduit, buried at a depth of ∼0.7 m beneath the surface of the road. During construction, this conduit was covered with sand before the road surface was laid on top, and is likely to have been cemented in places (for example, near manholes).
The DAS layout consisted of ∼3 km of cable in a T-configuration, with the signal reflected at the far end, resulting in signal measured over ∼6 km of fibre, with repeating sections. Data were collected using a Silixa […]
For the production of cross-correlations in this study, we chose to use the ∼1 km section of fibre running along Länggassstrasse (Figure 7), to allow us to assume that we are only seeing Rayleigh surface waves, without components of Love waves. The road was treated as a separate top and bottom section, to avoid any complications arising due to the slight bend in the road. The southern section lies on glacial gravels, whilst at the northern end of the street there is a transition to moraine material (Ketterhals et al., 2000).
The anthropogenic noise sources in this experiment are primarily cars and buses travelling along the road, parallel to the fibre (as illustrated in Supplementary Material S2.1).The main train station for the city is also situated at the end of the fibre, resulting in more diffuse train noise (as the trains do not pass directly on top of the fibre).
For much of the length of the Bern street, the road is bordered by substantial concrete basements.Such barriers in the near surface tend to channel surface waves along the road conduit.At the northern end of the road near point a the situation is more open, and there are fewer concrete structures below ground close to the road.The road also has regular manholes marking access points to the subsurface and these structures also act as scatterers to produce a more complex wavefield.

Computation of cross-correlations
Mean and linear trends were first removed from the raw data. We then computed cross-correlations using 1-hour windows of night-time data (spanning 11 pm-5 am, local time), as we found that this time period contained fewer noise sources directly on top of the fibre, so that the noise field was more diffuse and homogeneous. We also applied spectral whitening to suppress the most dominant peaks in the frequency spectrum (Bensen et al., 2007).
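Spectral whitening can be sketched as a division of the spectrum by a smoothed version of its own amplitude. The text does not state the whitening parameters used for the Bern data, so the smoothing window and pass-band below are illustrative assumptions, as is the function name.

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

def whiten(trace, fs, band, smooth_win=20):
    """Flatten the amplitude spectrum inside `band` (Hz), keeping the phase."""
    trace = np.asarray(trace, dtype=float)
    n = trace.size
    spec = np.fft.rfft(trace)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Running average of the amplitude spectrum (window length is assumed)
    amp = uniform_filter1d(np.abs(spec), size=smooth_win, mode="nearest")
    white = spec / (amp + 1e-20)
    # Zero everything outside the band of interest
    white[(freqs < band[0]) | (freqs > band[1])] = 0.0
    return np.fft.irfft(white, n)
```

Because only the amplitude is altered, the phase information that carries the inter-channel travel times is left unchanged, while narrow-band peaks are reduced relative to the rest of the spectrum.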
Cross-correlations were computed for 100 m sections of the fibre. The northernmost channel along the straight section of the fibre was used as a virtual source, and a 1-hour window of data was cross-correlated with the same hour for all other channels within this 100 m section (at distances of 2 m, 4 m, etc.). All defined night-time hours were then stacked (6 hours per night for 12 days, totalling 72 1-hour windows), from which we kept the central 4 seconds of each stack to reduce the final data volume. This process was then repeated for each channel along the fibre, using each channel as a virtual source and producing a cross-correlation record section covering 100 m. We were limited to just 100 m distance due to the presence of significant secondary sources along the fibre, resulting in non-ideal cross-correlations, with many additional signals present (see Supplementary Material S2.2).
An example of a cross-correlation section using channel a as the virtual source is displayed in Figure 8, showing the complex nature of the observed signals, largely due to the presence of secondary sources.
F-k filtering was applied to all the cross-correlation record sections, in order to remove signals propagating in the opposite direction to the desired signal, and this largely eliminates the extraneous effects.
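A directional f-k filter of this kind can be sketched with a two-dimensional FFT and a quadrant mask. The sharp mask, the absence of tapering and the function name are simplifying assumptions; the filter design used for the Bern data is not specified in the text. With NumPy's FFT sign convention, a wave moving towards increasing channel distance has wavenumber and frequency of opposite sign, which is the basis of the mask.

```python
import numpy as np

def fk_directional_filter(section, keep="positive"):
    """Keep waves travelling in one direction along the channel axis.

    section : (n_chan, n_t) record section, channels ordered by distance
    keep    : "positive" retains arrivals moving to larger distance with
              time, "negative" retains the opposite direction
    """
    section = np.asarray(section, dtype=float)
    n_x, n_t = section.shape
    spec = np.fft.fft2(section)
    k = np.fft.fftfreq(n_x)[:, None]   # normalised wavenumber
    f = np.fft.fftfreq(n_t)[None, :]   # normalised frequency
    sign = k * f
    # For d(x, t) = g(t - x/c) the energy lies where k * f < 0
    mask = sign <= 0 if keep == "positive" else sign >= 0
    return np.real(np.fft.ifft2(spec * mask))
```

In practice the edges of the mask are often smoothed to limit ringing, and a velocity fan can be added to reject energy outside a plausible range of apparent velocities.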

Production of dispersion curves
In order to produce dispersion curves from our cross-correlation record sections, we apply the MASW (Multichannel Analysis of Surface Waves) method outlined in Park et al. (1998, 1999). MASW has already been successfully applied to DAS data, for example in Lancelle et al. (2021). This method is almost identical to the slant-stack described in section 3.1; however, following Park et al. (1999), each spectrum is normalised by its own absolute value, to ensure equal weighting of each trace. Additionally, we use a phase-weighted stack to help the dispersion curve to converge more quickly (e.g., Cheng et al., 2021). Examples of the resulting dispersion curves are shown in Figure 9.
In spite of the complex nature of the data and the short inter-channel distances over which the cross-correlations are computed, we are still able to produce reasonable dispersion curves, particularly for frequencies between 10 and 21 Hz. The subsurface phase velocity along the DAS array is expected to be low for higher frequencies, as the street is built upon unconsolidated sediments, particularly late glacial retreat gravels and alluvial sands (Ketterhals et al., 2000). While the dispersion curves show some variability and local oscillations, this is not unexpected given the complex geological and anthropogenic structure along the street (concrete infrastructure built on top of soft sediments, with bedrock beneath).
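The idea behind phase weighting can be illustrated with a generic time-domain phase-weighted stack, in which the linear stack is multiplied by a measure of the coherence of the instantaneous phase across the windows. This sketch is not necessarily the variant applied in the dispersion processing described above; the exponent `power` and the function name are assumptions.

```python
import numpy as np
from scipy.signal import hilbert

def phase_weighted_stack(traces, power=2.0):
    """Return (phase-weighted stack, linear stack) of traces (n_win, n_t)."""
    traces = np.asarray(traces, dtype=float)
    linear = traces.mean(axis=0)
    analytic = hilbert(traces, axis=1)
    # Unit phasors carrying only the instantaneous phase of each window
    unit = analytic / np.maximum(np.abs(analytic), 1e-20)
    # Coherence is close to 1 where phases agree and small where they scatter
    coherence = np.abs(unit.mean(axis=0))
    return linear * coherence ** power, linear
```

Since the coherence never exceeds one, the weighted stack is nowhere larger in magnitude than the linear stack; incoherent noise is suppressed strongly while arrivals with a consistent phase are largely retained.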
The dispersion behaviour for the southern segment (Figure 9b) is more coherent. This portion of the road has consistent geology and a similar building style with basements directly lining the street. At the northern end there is more variation in surface geology and some buildings lie further away from the road. The net result is a lower-quality dispersion estimate (Figure 9a). The higher phase velocities seen for lower frequencies (< 10 Hz) correspond to the presence of bedrock at depths of ∼40 m.
In Bern, Switzerland, the road is bounded by deep concrete structures and multiple access points along the road that act as secondary sources and need to be treated with careful processing.These structures lie just in the zone most strongly sampled by surface waves.There is a major contrast with a similar experiment in Athens, Greece where only minimal processing was needed, with good coherent signal for hundreds of metres.The built environment in Athens does not have such consistent deep structures and so there are fewer impediments for surface wave propagation.

Discussion
For situations where noise sources are well characterised, such as traffic noise, it is frequently possible to adapt experimental layouts to make good use of the source and its directionality. Where space allows, it can be feasible to lay out lines of recorders just beside a highway, mimicking the way that active source seismology is conducted using seismic vibrators to generate strong seismic energy along the road. Several seismic experiments have demonstrated the feasibility of such array-source configurations to generate reasonable dispersion results. For example, Zhang et al. (2020) conducted a seismic survey of 352 geophones along a country road in the North China Plain and managed to extract dispersion curves up to the 18-20 Hz range using 80-minute-long segments of continuous traffic noise. The resulting dispersion has also been benchmarked against that from an active source survey. Quiros et al. (2016) deployed about 100 geophones along a railway within the Rio Grande rift, New Mexico, and used about 120 hours of continuous train noise to extract Rayleigh wave dispersion data up to 12 Hz. They also managed to reveal clear direct and reflected P-wave signals. However, the proximity of the seismic arrays to a highway or a railway means such environments can have a modified structure. As we have seen for the Lake George experiment, an alternative is to work with an array perpendicular to the road, so that only a small portion of the highway acts as a persistent source as traffic passes through the zone. By getting away from the immediate vicinity of the road the siting is improved and extraneous noise reduced. However, we note that such configurations can sometimes be limited by space and environmental concerns.
Figure 8: An example of a 100 m cross-correlation record section, produced using channel a as the virtual source, and cross-correlating with each of the other channels within a distance of 100 m to the south.
Figure 9: A comparison between the dispersion curves produced at channel a at the northern end of the fibre (a) and channel b at the southern end of the fibre (b). We observe a significant decrease in the quality of dispersion curves towards the northern end of the fibre. The circles indicate picked phase velocities with frequency, in increments of 0.5 Hz, where the dispersion curves were deemed to be reliable.
Many DAS cables are laid under roads or just beside them, as in the Bern example, and then the traffic noise can be exploited directly with signals aligning with the axis of the cable as detected by the modification of laser scattering in the DAS system. However, a cable laid parallel to the road but at some distance from the road itself largely picks up broadside signals and the axial component is weak (Dou et al., 2017). In these circumstances, a perpendicular DAS cable can be used with the effective source being the passage of vehicles through a rather narrow part of the road, thanks to the DAS inclination factors. Dou et al. (2017) have demonstrated that such a perpendicular cable can be used for time-lapse analysis. Often the layout of DAS cables, as in dark fibre, is determined by the most convenient geometries for telecommunication purposes and so the orientation may not be ideal for seismic applications. It may be possible to compensate to some extent with directional corrections, but it is probably preferable to choose portions of the DAS cable for analysis that have the best orientation (e.g., Fang et al., 2022).
When the source of noise is not known or there are many different forms of noise, such as traffic noise from many directions, there is a different set of challenges. Conventional analysis of array behaviour is based on the response to a plane wave with a specified slowness. For a set of $N$ sensors at positions $\mathbf{x}_j$ relative to a reference site at a suitable origin, the linear array sum for frequency $\omega$ as a function of slowness $\mathbf{s}$ takes the form
$$ S(\mathbf{s}, \omega) = \sum_{j=1}^{N} w_j \exp(i\omega\, \mathbf{s}\cdot\mathbf{x}_j), \qquad (9) $$
where the terms $w_j$ allow for signal weighting by sensor. The array response $S(\mathbf{s}, \omega)$ is a scaled version of the Fourier transform with respect to the wavenumber of a set of weighted delta-functions placed at the array positions. The same functional form is derived irrespective of the slowness of an incoming wave $\mathbf{p}$, with the pattern shifted to be centred on $\mathbf{p}$ and characterised by the differential slowness $\Delta\mathbf{s} = \mathbf{p} - \mathbf{s}$. The function $S(\Delta\mathbf{s}, \omega)$ can be characterised by calculating the response for a vertically incident wave for which $s_1 = s_2 = 0$. Good array designs display a strong central lobe in the array response, with weak secondary peaks well removed from the origin.
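The plane-wave array sum described above can be evaluated directly on a slowness grid. The following is a minimal sketch only: the function name `array_response`, the 1/N normalisation of uniform weights, the sign of the phase term, and the example grid with 1 km spacing are all assumptions made for illustration, not details taken from the processing used here.

```python
import numpy as np

def array_response(xy, freq_hz, s_max, n_grid=101, weights=None):
    """Power of the plane-wave array sum on a square slowness grid.

    xy      : (N, 2) sensor positions (east, north) in km
    freq_hz : frequency in Hz
    s_max   : half-width of the slowness grid in s/km
    weights : optional (N,) sensor weights; uniform 1/N if omitted (assumption)
    """
    xy = np.asarray(xy, dtype=float)
    n = len(xy)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=float)
    omega = 2.0 * np.pi * freq_hz
    s = np.linspace(-s_max, s_max, n_grid)
    sx, sy = np.meshgrid(s, s, indexing="xy")
    # Phase omega * s . x_j for every grid node and sensor: shape (n_grid, n_grid, N)
    phase = omega * (sx[..., None] * xy[:, 0] + sy[..., None] * xy[:, 1])
    total = np.sum(w * np.exp(1j * phase), axis=-1)
    return s, np.abs(total) ** 2

# Hypothetical example: regular 6 x 6 grid with 1 km spacing, 1 Hz signal
gx, gy = np.meshgrid(np.arange(6.0), np.arange(6.0))
grid = np.column_stack([gx.ravel(), gy.ravel()])
s, power = array_response(grid, freq_hz=1.0, s_max=1.0)
print(power[50, 50], power[50, 100])  # main lobe and first grating lobe along the east axis
```

With uniform weights that sum to one, the main lobe at zero slowness has unit power. For a perfectly regular grid with spacing d, grating lobes of the same strength recur at slowness intervals of 1/(f d), which is why regular layouts show strong side lobes.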
For DAS systems, the strain-rate response is modulated by the slowness p and the directional factors at each segment of the cable depend on the specific incoming wave. As demonstrated by Näsholm et al. (2022) and Kennett (2022), the result is that the actual response is distorted from the ideal expressed by equation 9, with a bias towards larger slowness. Nevertheless, the array response (equation 9) remains a useful comparator.
For each array configuration specified by the set of points $\{\mathbf{x}_j\}$, there is an associated "co-array" (Haubrich, 1968) comprising the vectors
$$ \mathbf{X}_{ij} = \mathbf{x}_i - \mathbf{x}_j, \quad i, j = 1, 2, \ldots, N. \qquad (10) $$
In the context of inter-station correlation, the pattern of the co-array specifies the sampling achievable. For optimum performance when using correlations, we desire both a reasonable array response function and thorough sampling by the co-array vectors. Maranò et al. (2014) have presented an optimisation scheme for the design of small arrays with an objective function based on the character of the array response. For the co-array, it is not obvious what would be the most suitable criterion for any optimisation. Haubrich (1968) used a space-filling approach based on direct search for a small number of sensors, but this is not easy to generalise. Here we use the visual properties of the co-array as a guide to its behaviour.
We consider arrays with around 36 elements, sufficiently large to show complex character but small enough that the nature of the response can be readily appreciated. We first look at designs that can be readily implemented with a suite of seismometers, and then transfer attention to the case for DAS where the 'sensors' are required to be directly connected.
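The co-array is simple to construct from a list of station coordinates. The sketch below is illustrative only: the function names and the convention of east and north coordinates, with azimuth measured clockwise from north, are assumptions. It returns the inter-station vectors together with their lengths and azimuths, which are the quantities inspected visually in the co-array plots.

```python
import numpy as np

def co_array(xy):
    """All inter-station vectors X_ij = x_i - x_j, returned with shape (N*N, 2)."""
    xy = np.asarray(xy, dtype=float)
    diff = xy[:, None, :] - xy[None, :, :]   # diff[i, j] = x_i - x_j
    return diff.reshape(-1, 2)

def co_array_polar(xy):
    """Length and azimuth (degrees clockwise from north) of the non-zero vectors."""
    v = co_array(xy)
    length = np.hypot(v[:, 0], v[:, 1])
    keep = length > 0                        # drop the N zero vectors with i == j
    azimuth = np.degrees(np.arctan2(v[keep, 0], v[keep, 1])) % 360.0
    return length[keep], azimuth

# Hypothetical three-station example (coordinates in km: east, north)
stations = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
vectors = co_array(stations)
lengths, azimuths = co_array_polar(stations)
print(len(vectors), len(lengths))
```

Every station pair contributes two vectors of opposite sign, so N stations give N(N - 1) non-zero co-array vectors, and a histogram of the lengths and azimuths gives a quick check on distance and azimuthal coverage.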

Co-Array for large-N array deployments
In Figure 10 we compare the behaviour of three designs using 36 elements and uniform weighting. The first is based on a rectangular 6×6 configuration, to which mild dithering has been applied to provide some distortion of the regularity. Such configurations are often used for large-N array deployments. The second uses a random configuration that ends up with rather variable spacing of stations. The third uses a 6-arm spiral array (Kennett et al., 2015). Despite the effort to reduce the regularity of the near-rectangular array, very strong side lobes appear in the slowness response and the co-array shows concentrations of vectors. The side lobes of the array function are not too close to the main lobe, so that suitable windowing can be found, but the co-array behaviour is restrictive. In contrast, a random array achieves a good co-array pattern and the side lobes of the array function are much suppressed. It is unlikely that such a pattern would be chosen for field implementation, but it demonstrates the merits of breaking regularity. The array designs of Haubrich (1968) based on co-array properties also show a mixture of concentration and sparseness. The third array with spiral arms achieves a good compromise in array behaviour. The local side lobes are suppressed near the main lobe and there is good azimuthal and distance coverage in the co-array. Such arrays also have the merit that their properties are resilient to distortions introduced in layout and even missing stations (Kennett et al., 2015), whilst achieving good areal coverage.
From Figure 10 we can see that it is desirable to minimise regularity in the layout of a large-N array where the primary aim is the exploitation of inter-station correlations, and there is no dominant noise source. A set of patches exploiting simple spiral-arm layouts can achieve comparable areal coverage and improved array behaviour without too great experimental complexity.
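The concentration of co-array vectors for near-regular layouts can be quantified in a simple way by counting how many cells of a coarse grid are occupied by co-array vectors. The sketch below is illustrative only and does not reproduce the layouts of Figure 10: the function name `occupied_cells`, the unit station spacing, the dither amplitude of 0.1, the cell size of 0.5 and the random seed are all assumed values.

```python
import numpy as np

def occupied_cells(xy, cell):
    """Number of distinct square cells of side `cell` hit by the co-array vectors."""
    xy = np.asarray(xy, dtype=float)
    v = (xy[:, None, :] - xy[None, :, :]).reshape(-1, 2)   # all x_i - x_j
    idx = np.round(v / cell).astype(int)                    # cell index of each vector
    return len(np.unique(idx, axis=0))

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
gx, gy = np.meshgrid(np.arange(6.0), np.arange(6.0))
regular = np.column_stack([gx.ravel(), gy.ravel()])           # 6 x 6 grid, unit spacing
dithered = regular + rng.uniform(-0.1, 0.1, regular.shape)    # mild dithering (assumed)
random_xy = rng.uniform(0.0, 5.0, (36, 2))                    # random sites, same aperture

counts = {name: occupied_cells(xy, cell=0.5)
          for name, xy in [("regular", regular), ("dithered", dithered), ("random", random_xy)]}
print(counts)
```

With these choices the regular and the mildly dithered grids both occupy only the 121 cells centred on the lattice differences, because the dithering is too small to move any vector into a neighbouring cell, whereas the random layout spreads its vectors over many more cells.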

Co-Array with orientation factors for DAS
When we turn to the design of DAS configurations with a broad range of noise sources, we are faced with the topological necessity that all sensor locations can be connected by a single cable. To allow direct comparison with the arrays in Figure 10 we have scaled DAS designs up to comparable size and selected only about 36 elements on each cable. The regular co-array is very helpful for assessing the potential of an array of seismometers for cross-correlation analysis, but when we consider an application to DAS arrays we also need to bear in mind the influence of cable orientation relative to the path between the stations (cf. Martin et al., 2021). At the two stations being correlated, with angles $\psi_1$, $\psi_2$ between the local cable configuration and the path between the stations, the scaling factor for Rayleigh waves due to the relative orientation takes the form
$$ R \propto \cos^2\psi_1 \, \cos^2\psi_2 , \qquad (11) $$
and the equivalent factor for Love waves is
$$ L \propto \sin 2\psi_1 \, \sin 2\psi_2 . \qquad (12) $$
For each vector in the co-array we can associate these orientation factors and so assess the effectiveness of the
In Figure 11 we show two possible DAS configurations with reasonable properties for both slowness response and co-array properties. The first is the Archimedean spiral considered by both Näsholm et al. (2022) and Kennett (2022). Even where the cable does not complete a full loop, the geometry of the co-array including the orientation factors gives good azimuthal coverage for Rayleigh waves, though there is some clumping in distance.
The second design, a fan array with 36 elements, is aimed to exploit the roughly 60° span around the vector between 'sensors' that will make an effective contribution to the correlation. With 6 such cable segments, and their external coupling, surprisingly good properties are achieved. As might be expected, the strongest Rayleigh factors are associated with direct propagation along the arms of the fan, but reasonable sensitivity for Rayleigh waves is achieved across a wide range of directions. For such an array configuration, some mild irregularity in layout could also be beneficial by spreading the range of azimuths. A similar 'umbrella' design for a DAS layout has been suggested by van den Ende and Ampuero (2021), but they do not provide any analysis of its performance.
In general, we see that the Rayleigh wave response for the arrays dominates that for Love waves, even when the DAS cable has a significant curvature. For both DAS designs, the L factors remain quite small for most station pairs (Figure 11), so that Love wave contamination of Rayleigh wave results will only become an issue if the local ambient noise has much stronger Love wave content. It is possible to weight array contributions to enhance Love waves and suppress Rayleigh contributions (e.g., Kennett, 2022), and such schemes are likely to be needed to extract and identify Love waves.
With a DAS cable it is possible to use a much larger number of recording positions than illustrated in Figure 11, so that the discrete spots will spread into diffuse patches. It is also possible to select the portions of a DAS cable to be used for correlation, so that poorly oriented segments or linking loops can be excluded.
Future DAS interrogators may prove capable of handling multiple cables simultaneously, and then a wider range of designs will become feasible. Already, some experiments use two separate interrogators, which allows more complex geometrical configurations to be exploited if a common time base is available.
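The orientation factors for each co-array vector can be computed once the position and local cable direction of every channel are known. The sketch below is a hedged illustration rather than the procedure used for Figure 11: the function name is hypothetical, azimuths are assumed to be measured clockwise from north, and the normalisation (axial strain scaling as cos²ψ for Rayleigh waves and sin ψ cos ψ for Love waves at each channel) is an assumption that fixes the constants of proportionality.

```python
import numpy as np

def orientation_factors(xy, cable_az_deg):
    """Rayleigh (R) and Love (L) orientation factors for every channel pair.

    xy           : (N, 2) channel positions (east, north)
    cable_az_deg : (N,) local cable azimuth, degrees clockwise from north
    Assumed normalisation: R = cos^2(psi1) cos^2(psi2),
                           L = sin(psi1) cos(psi1) sin(psi2) cos(psi2).
    """
    xy = np.asarray(xy, dtype=float)
    d = xy[None, :, :] - xy[:, None, :]            # d[i, j] = x_j - x_i
    path_az = np.arctan2(d[..., 0], d[..., 1])     # azimuth of the path i -> j
    cable = np.radians(np.asarray(cable_az_deg, dtype=float))
    psi1 = cable[:, None] - path_az                # cable-to-path angle at channel i
    psi2 = cable[None, :] - path_az                # cable-to-path angle at channel j
    R = np.cos(psi1) ** 2 * np.cos(psi2) ** 2
    L = np.sin(psi1) * np.cos(psi1) * np.sin(psi2) * np.cos(psi2)
    np.fill_diagonal(R, np.nan)                    # no path for i == j
    np.fill_diagonal(L, np.nan)
    return R, L

# Hypothetical check: straight north-south cable, then a pair offset at 45 degrees
R_line, L_line = orientation_factors([[0, 0], [0, 1], [0, 2]], [0, 0, 0])
R_skew, L_skew = orientation_factors([[0, 0], [1, 1]], [0, 0])
print(R_line[0, 1], L_line[0, 1], R_skew[0, 1], L_skew[0, 1])
```

For channels on a straight cable the Rayleigh factor is one and the Love factor vanishes. With this normalisation the magnitude of L never exceeds 0.25, reached when both cable segments lie at 45° to the path. L can be negative when the two angles have opposite sense, so its absolute value is the relevant quantity when comparing amplitudes.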

Conclusions
For both large-N array and DAS experiments it is possible to extract the correct dispersion behaviour for Rayleigh waves from cross-correlated records even though the amplitude factors differ from the true Green's function between the stations. With careful processing to remove extraneous signals, e.g., reflections from lateral structures, well-defined modal dispersion can be achieved for the fundamental mode at higher frequencies. The lower frequency limit depends on the maximum spacing between stations in an array deployment and on the maximum span with coherent behaviour along a straight segment of cable for DAS work. The high frequency end for DAS arises from the influence of gauge length averaging to produce the local strain-rate signal (e.g., Näsholm et al., 2022). In principle, the unaliased wavefield attainable with DAS allows the extraction of multiple modes, but this depends on the nature of the excitation. For surface sources such as traffic, low surface wavespeeds and a strong vertical gradient in wavespeed provide favourable conditions for higher mode excitation.
When working with traffic as a source of noise, good results can be achieved provided that a significant component of the noise sources lies inline with recorder pairs. For large-N arrays, the requirement of placing recorders very close to the traffic can be a limitation for deployment parallel to the road. Fortunately a perpendicular arrangement works just as well, though again local circumstances may affect the ease of deployment. For DAS, dark fibre is commonly within a conduit under or just at the side of roads, so recorder pairs are naturally in a suitable arrangement. Where cable is to be laid specifically, a line perpendicular to a road may well prove easier to install.
For situations with a broad distribution of noise sources it is desirable to use deployment configurations that provide a wide range of measurable azimuths. The use of rectangular grids for large-N array deployment does not meet this objective at all, even when deployment is non-ideal. Alternative space-filling designs can provide better azimuthal control. For DAS, even though the recording points have to lie along a continuous cable, we have been able to show that it is possible to achieve effective azimuthal coverage with simple configurations.
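The high-frequency limit from gauge length averaging can be illustrated with a simple model in which the strain is averaged uniformly along one gauge length, giving a sinc-shaped amplitude factor in the axial wavenumber. This is a sketch under stated assumptions: the uniform (boxcar) average, the function name, and the example values of a 10 m gauge length and a 500 m/s phase speed are illustrative and do not describe a particular interrogator.

```python
import numpy as np

def gauge_response(freq_hz, phase_speed, gauge_length, psi_deg=0.0):
    """Amplitude factor from uniform averaging of axial strain over one gauge length.

    freq_hz      : frequency or array of frequencies in Hz
    phase_speed  : horizontal phase speed in m/s
    gauge_length : gauge length in m
    psi_deg      : angle between propagation direction and cable, in degrees
    """
    freq = np.asarray(freq_hz, dtype=float)
    k_axial = 2.0 * np.pi * freq * np.cos(np.radians(psi_deg)) / phase_speed
    # np.sinc(x) is sin(pi x) / (pi x), so the argument is k G / (2 pi)
    return np.sinc(k_axial * gauge_length / (2.0 * np.pi))

# Hypothetical example: 10 m gauge length, 500 m/s surface waves travelling along the cable
freqs = np.array([0.0, 10.0, 25.0, 50.0])
factors = gauge_response(freqs, phase_speed=500.0, gauge_length=10.0)
print(factors)
```

In this model the response falls to zero when the apparent wavelength along the cable equals the gauge length, here at 50 Hz, so slower waves and longer gauge lengths lower the usable frequency band.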

Figure 1
Figure 1 Correlation simulation for 2 Hz waves with phase speed 500 m/s at stations 350 m apart. The positions of stations are shown by purple dots in each panel. (a) Geometric spreading effects from a source to the two stations. (b) The ratio of the difference between the distance from each source to the two stations and the inter-station path length. (c) Phase contributions to the cross-correlation. (d) Total effect for two seismometers of the terms in (a), (b) and (c), amplified by 5. (e) Orientation effects for DAS cable with orientation along the inter-station path. (f) Net effect for two DAS sensors of the terms (a), (b), (c) and (e), amplified by 10.
(a) is modulated by the geometric spreading effect from Figure 1(a), the result displayed in Figure 2(b) emphasises the zone in the immediate vicinity of the inter-station path (cf.

Figure 2
Figure 2 (a) Relative phase between the contributions from propagation from a source to the two stations and the inter-station path. (b) Net effect of modulating the phase term by geometrical spreading. The configuration is the same as in Figure 1.
vantage of the dominant signals from traffic noise. Figure 4(a) displays the filtered vertical-vertical (ZZ) cross-correlation functions for the station pair LG015-LG049 stacked over every 4-hour time window throughout the deployment time.

Figure 3
Figure 3 The station distribution of the Lake George Seismic array in Australia. The inset shows the geographic location of the array (red arrow) with respect to the Australian continent.

Figure 4
Figure 4 (a) The 4-hour stacked cross-correlation functions over the entire deployment time for the station pair of LG015 and LG049 plotted in matrix form. (b) The final stacked cross-correlation for the station pair by taking the mean of the 2D matrix in (a). (c) The correlation coefficients (CC) of each trace relative to the mean. The red dashed line denotes the correlation coefficient of 0.9.

Figure 6
Figure 6 (a) The final dispersion diagram for the receiver bin centred around the station LG015. The pink circles show the extracted phase velocities at each period with the error bars representing the associated uncertainties. (b) The variations of extracted dispersion across the five lines of the array with line 1 representing the northernmost and line 5 the southernmost. The horizontal axes represent the index of the corresponding receiver bin (not necessarily the same as the station location) from the east.

Figure 7
Figure 7 Map showing the fibre section used to produce cross-correlations (blue). The red circles a and b denote the positions of a reference northern and southern channel along the DAS line. Note the proximity of the Bern main train station to the southern end of the fibre. The inset map shows the location of Bern (red circle) within Switzerland.

Figure 10
Figure 10 Array responses for 36-element arrays, showing the geometrical layout, the co-array behaviour and the array function in slowness space for a 1 Hz signal. (a) Rectangular array with mild dithering; (b) a 36-element random array with similar aperture; (c) a 6-arm spiral array.

Figure 11
Figure 11 DAS array responses showing the geometrical layout with the cable orientation at each sample point marked, the co-array behaviour with orientation factors and the array function in slowness space for a 1 Hz signal. The arrays are scaled to match those in Figure 10. (a) Archimedean spiral with 36 elements; (b) 36-element fan using a continuous cable. The amplitudes of R and L orientation factors for Rayleigh and Love waves from equations 11 and 12 are plotted for each vector in the co-array, with the L factor superimposed on the centre of the larger R symbol. Stronger colours indicate the expectation of good cross-correlation results for the wave type.