Rayleigh-wave group velocities in Northwest Iran: SOLA Backus-Gilbert vs. Fast Marching tomographic methods

In this study, we focus on Northwest Iran and exploit a dataset of Rayleigh-wave group-velocity measurements obtained from ambient noise cross-correlations and earthquakes. We build group-velocity maps using the recently developed SOLA Backus-Gilbert linear tomographic scheme as well as the more traditional Fast-marching Surface-wave Tomography method. The SOLA approach produces robust, unbiased local averages of group velocities with detailed information on their local resolution and uncertainty; however, it does not as yet allow ray-path updates in the inversion process. The Fast-marching method, on the other hand, does allow ray-path updates, although it does not provide information on the resolution and uncertainties of the resulting models (at least not without great computational cost) and may suffer from bias due to model regularisation. The core of this work consists in comparing these two tomographic methods, in particular how they perform in the case of strong vs. weak seismic-velocity contrasts and good vs. poor data coverage. We demonstrate that the only case in which the Fast-marching inversion outperforms the SOLA inversion is for strong anomaly contrasts in regions with good path coverage; in all other configurations, the SOLA inversion produces more coherent anomalies with fewer artefacts.

As is common when comparing the results of tomographic studies of a single region, the surface-wave studies mentioned earlier agree on the main structures of the region -they all show high velocities at shallow depth and short periods in the Sanandaj-Sirjan zone and low velocities in the Zagros fold and thrust belt -but differ in the details.For example, the groupvelocities from Shakiba et al. (2020) are higher than those from Mortezanejad et al. (2019) and Zandi and Rahimi (2020), while the region south of Sahand volcano has slow group-velocities at short periods in Shakiba et al. (2020) and Mottaghi et al. (2013) and fast velocities in Mortezanejad et al. (2019), Zandi and Rahimi (2020), and Movaghari and Doloei (2019).There are multiple factors contributing to these discrepancies.We have previously discussed the diverse sources of surfacewave data (earthquakes or ambient noise) and variations in tomographic inversion methods.Additionally, we should consider disparities in uncertainty estimates for the measurements, differences in model parameterisation, and variations in the choice of trade-off parame-ters.In the absence of consistent resolution and uncertainty estimates of the tomographic models, meaningfully comparing them becomes difficult (e.g.Rawlinson et al., 2014).Of the studies cited above, only those using the linear-tomographic inversion method of Ditmar and Yanovskaya (1987) and Yanovskaya and Ditmar (1990) include spatial estimates of tomographic model resolution and uncertainties (Rahimi et al., 2014;Mortezanejad et al., 2019;Zandi and Rahimi, 2020;Shakiba et al., 2020), though not the full resolution-matrix with which to also quantify model bias.The other tomographic inversion methods used -the LSQR damped-least squares inversion method of Paige and Saunders (1982) for the studies using Partitioned waveform inversion and the subspace inversion method of Kennett et al. (1988) for those using Fast-marching seismic tomography -do not produce the full resolution-matrix either.For a good explanation of why producing the resolution matrix for such methods is computationally expensive, see Deal and Nolet (1996).
New seismic tomographic inversion methods are continuously being developed, including some that focus on the question of tomographic model resolution.One such method is called Subtractive Optimally Localised Averages (SOLA).This computationally efficient variant of the Backus-Gilbert linear-inversion paradigm (Backus andGilbert, 1967, 1968) was introduced in helioseismology (Pijpers andThompson, 1992, 1993) then adapted to seismic tomography (Zaroli, 2016;Zaroli et al., 2017;Zaroli, 2019;Latallerie et al., 2022).The SOLA method not only produces full resolution and uncertainty information for tomographic models, it also constrains the models to be unbiased, and allows users direct control on the trade-off between resolution and uncertainty.
In this study, we apply the SOLA tomographic inversion of Zaroli (2016) to Northwest Iran to construct maps of Rayleigh-wave group-velocities, using a dataset of Rayleigh-wave dispersion measurements obtained both from earthquakes and from seismic noise crosscorrelations.We compare the resulting tomographic images with those obtained using the same dataset and parameterisation but applying the Fast-marching surface-wave tomography method of Rawlinson and Sambridge (2005).We focus on how each method performs in cases of strong vs. weak seismic-velocity contrasts and good vs. poor data coverage.

Geological context
Northwest Iran forms part of the Arabia-Eurasia continental collision zone and is subject to a local transpressional tectonic regime with a high level of seismicity.The region is bounded in the North by the Lesser Caucasus thrust belt and the Kura depression, in the East by the Talesh mountains, the Alborz mountains, and the South Caspian Basin, in the South by the Zagros fold and thrust belt, and in the West by Eastern Anatolia.The crust and upper mantle structure of Northwest Iran has been strongly shaped by the convergence occurring on the southern edge of the Eurasian plate (e.g.Sengor, 1990).
The region contains two major active faults -the North Tabriz Fault and the Main Zagros Thrust Faultand can be divided into a handful of tectonic units, as shown in Fig. 1a.The North Tabriz Fault (NTF in Fig. 1a) has a clear surface expression, is considered one of the most active faults in Northwest Iran, and has been implicated in catastrophic historical earthquakes (Moradi et al., 2011).The Main Zagros Reverse Fault (MZRF in Fig. 1a) forms the suture between the Arabian plate and Central Iran block, which occurred after the closure of the Neotethys Ocean (Talebian and Jackson, 2002).
To the southwest of this suture, the Zagros Fold and Thrust Belt (ZFTB in Fig. 1a) contains a 12-km thick sequence of sediments over an altered Precambrian basement (Stocklin, 1968), with several active reverse faults that accommodate surface folding (Jackson and Fitch, 1981).To the northeast of the Main Zagros Thrust Fault lies the Sanandaj-Sirjan zone (SSZ in Fig. 1a), a metamorphic region that extends northwestwards into Eastern Anatolia and becomes the East-West trending Bitlis metamorphic massif.To the northeast of the Sanandaj-Sirjan zone lies the Urmieh-Dokhtar magmatic Arc (UMDA in Fig. 1a), composed of intrusive magmatic rocks related to the Neotethys subduction and mostly emplaced during the Eocene (Alavi, 1994).
Further northeast, beyond the Alborz and Talesh mountains, lies the South Caspian Basin, a relatively aseismic rigid basement block that has affected the deformation history of the surrounding continental regions.The South Caspian Basin and the Kura depres-sion to its west are thought to be a relic back-arc of the Tethyan Mesozoic subduction, or possibly a piece of unusually thick oceanic-like crust, trapped within a continental collision zone (Berberian, 1983;Mangino and Priestley, 1998;Brunet et al., 2003), similar to the Black Sea (Okay et al., 1994) and the eastern Mediterranean (de Voogd et al., 1992).Because of the South Caspian basin's low elevation and its southwest motion relative to central Iran, Talebian and Jackson (2002) and Allen et al. (2003) suggested that it underthrusts the Talesh and Alborz mountains on its western and southern margins.Accurate location of local seismicity along these margins by Aziz Zanjani et al. (2013) indicates that deep earthquakes beneath the Talesh mountain range only occur on its Caspian flank, implying that the underthrusting beneath the Talesh is not extensive.
Northwest Iran has experienced extensive volcanism throughout the Cenozoic and contains volcanic rocks that are Eocene to Quaternary in age.The Sahand and Sabalan volcanoes (Fig. 1a) are very large structures that dominate the Pliocene-Quaternary magmatic activity.The Eocene and Oligocene rocks of NW Iran are related to arc magmatism (e.g.Agard et al., 2011), while the late Miocene to Quaternary units are believed to have formed in a post-collisional setting and become progressively younger from West to East (Sengor and Kidd, 1979;Pang et al., 2013).The earliest post-collisional magmatism dated by Pang et al. (2013) occurred in the late Miocene (11 Ma) just east of Lake Urumieh, in the region of Sahand volcano, and was followed by eruptions in the late Miocene to Pliocene (6.5-4.2Ma) and then farther east by eruptions at the Sabalan volcano in the Quaternary (<0.4 Ma).

Data Processing and Measurement
We measured Rayleigh-wave group velocity dispersion curves on both earthquake recordings and ambient seismic noise cross-correlations in Northwest Iran.We used 55 broad-band and mid-band stations (see Fig. 1b and Table S1): 32 operated by the Iranian Seismological Center (IRSC, affiliated with the Institute of Geophysics, University of Tehran) and 23 belonging to the International Institute of Earthquake Engineering and Seismology (IIEES, operated by the Iranian National Broadband Seismological Center).

Data processing
We collected vertical component seismograms (due to noisier signal and the misorientation issue in the horizontal components, documented for Iranian stations by Movaghari et al. (2021) with clear surface-waves at distances between 100 and 800 km from 103 M >4.5 earthquakes that occurred between 2012 and 2022.To equalise the sampling frequency and reduce computational time and storage, the data were decimated to 2 Hz.We detrended the signals, removed the instrument responses and filtered between 5 and 120 s period.We chose the lower limit of this filter to be able to make measurements at 10 s period to constrain the crustal structure without measuring too close the the fil- ter's edge; we chose its upper limit because of the interstation distance (mean distance of 506 km) and the sismometer responses.We then visualised the waveforms and retained those with clear dispersed surface-waves.An example is shown in Fig. 2a.
We also collected continuous, vertical seismic records dating between January 2013 and December 2015 from 19 of the 55 available broad-band stations.Several of the broadband stations in the IRSC and IIEES networks were deployed after 2015 or at the end of 2014, so did not produce enough continuous data for our study.Moreover, some stations recorded part of the time as short-period stations and part of the time as broadband stations (instrument updates over the network).In some cases, we had insufficient coincident recording between two stations to cross-correlate and produce stable surface waves.We followed the procedures of Bensen et al. (2007), Lin et al. (2008), andPoli et al. (2012) to process continuous seismic noise data, and to extract Rayleigh-waves.We cut continuous data into one-day segments and decimated them to two samples per second.Then we removed the trend and instrument response from the daily segments and filtered them using a 5-120 s period band.We used a procedure similar to that of Zigone et al. (2015) to normalise the data and minimise the effects of transients and data irregularities: we cut the daily traces into 4-hour time windows then removed strong impulsive signals by discarding windows whose energy exceeded the daily average by over 30% and those with gaps over 10% of the total duration.We chose to remove signals based on 4-hour windows because there are often multiple aftershocks after larger impulsive earthquakes.We chose 30% for the daily average energy threshold by experimenting with a representative subset of our data: for larger values, some high amplitudes still remained in the signal and could perturb the correlations; for lower ones, windows without strong amplitudes started to be removed, reducing the overall amount of data available for correlation.We chose a 10% gap threshold to ensure that sufficient noise data was present in the selected windows before the computation of the correlation function.We then applied spectral-domain whitening between 5 and 120 s period and cut the processed data into one-hour windows to increase the speed of cross-correlation computation.Finally, we cross-correlated across all available station pairs and stacked the correlation functions over the fullest available time.
Fig. 2b shows the resulting stacked correlation functions sorted by inter-station distance.The amplitudes of the causal and a-causal parts are almost identical, indicating complete noise homogenisation over the threeyear recording time.We averaged the two sides of each stacked correlation function to create one-sided symmetric correlation functions and evaluated their quality using the period-dependent signal-to-noise ratio: ratio of the peak amplitude of the narrow-band filtered surface-wave to the root-mean-square of the trailing noise.We defined the time window of the surface-wave signal by the arrival times at the maximum (4 km/s) and minimum (2 km/s) surface-wave velocities.

Dispersion Curve Measurements
We measured dispersion curves from seismograms and noise correlation functions in the same way.We excluded all paths with an epicentral or inter-station distance smaller than 100 km (approximately 3 wavelengths) to ensure a good sampling of the medium along the path and rejected noise correlation functions whose signal-to-noise ratios were lower than 5.We applied the automated multiple filter technique of Pedersen et al. (2003) to create the group-velocity dispersion diagrams, as shown in Figure 3.We selected the period range within which the maximum of the dispersion diagram corresponded to the fundamental mode Rayleigh-wave while rejecting all parts of the dispersion curves affected by scattered waves, multi-pathing effects, overtones, or persistent noise sources.

Data Uncertainties
Uncertainties are important in all tomographic inversions, but even more so for SOLA Backus-Gilbert inver- Figure 4 (a) From the dispersion diagrams in Fig. 3, we extract plots of energy versus velocity at a given period (in this example 50 s).We estimate the uncertainty of the velocity measurement from the range of velocities at which the seismic energy is at least 90% of that at the apex of the curve.(b) Schematic figure of the data error related to the location of the earthquake.The error ellipse is drawn around the exact location of the events and uniform points with Gaussian distribution surrounding it.
sions in which they trade-off directly with the model resolution.It is impossible to perform a meaningful SOLA Backus-Gilbert inversion without robust estimates of data uncertainties.We considered only the two largest sources of data uncertainties: those connected with measuring the maximum energy for each period on the group-velocity dispersion diagrams and those connected with the uncertainties in earthquake locations.For the ambient seismic noise cross-correlations mea-surements, only the first kind applies; for the earthquake measurements, both apply and are combined as independent errors.
To extract the group-velocity uncertainties from the dispersion diagrams, we used the strategy of Zigone et al. (2015) and Ouattara et al. (2019): we fit a Gaussian to the energy in the diagram at each period, picked the maximum of the fitted Gaussian as the group-velocity, and used the spread of the Gaussian at 90% of the max- imum to define the uncertainty (Fig. 4a).The uncertainty is half of the interval at 90% amplitude.Note that choosing the standard half-width of the fitted Gaussian would result in over-estimated uncertainties: Gaussian half-widths correctly estimate the uncertainty of a random variable with a normal distribution, but here we were trying to estimate the uncertainty of picking the maximum energy of an envelope.
To evaluate the contribution of the location uncertainties to the uncertainties in group-velocities, we used the latitude and longitude errors given by the International Institute of Earthquake Engineering and Seismology (IIEES) and Iran Seismological Center (IRSC) networks for each earthquake to plot an error ellipsoid around the hypocenter and drew points from the resulting 2D-Gaussian distribution (Fig. 4b).We used the distances from each of these points to the station to determine the distribution of the resulting group-velocity estimates; as this distribution was approximately Gaussian, we considered its σ value as the contribution of the location uncertainty to the group-velocity uncertainty.
We rejected group-velocity measurements with uncertainties larger than 0.35 km/s (approximately 10% of maximum observed velocity at 50s period).The resulting number of measurements per period and average group-velocities are shown in Fig. 5. Maps of the measured group velocities for each period are shown in Figure 6.

Tomographic Methods
We inverted the group-velocity measurements shown in Fig. 6 to produce maps of group velocities at periods of 10, 20, 30, and 50 s using two different tomographic methods: the Fast-marching surface-wave tomography method of Rawlinson and Sambridge (2005) and the SOLA Backus-Gilbert method of Zaroli (2016) and Zaroli et al. (2017).For detailed explanations of the two methods, we refer the reader to these publications.However, since the main objectives of this study are to compare the two methods using an identical dataset and explore their respective advantages and disadvantages, we present below an overview of how each method ad-dresses the forward and inverse aspects of the tomographic problem.

Forward Problem
The forward problem is commonly expressed as the following path integral: (1) where t o is the time taken by the seismic wave to travel along its path from source to receiver, n(x) is the slowness (inverse of the velocity) of the seismic wave as a function of position, and s is a parametric variable that indicates the position along the path.Equation (1) is the integral form of the well-known eikonal equation that relates travel-times to the spatial distribution of slownesses.For group-velocity tomography, the equivalent formulation becomes where U o is the measured group-velocity, U (x) is the group-velocity as a function of position, and L is the length of the path.Equations ( 1) and ( 2) are linear in the integrands n(x) and 1/U (x).If the position of the path is known in advance, then we can say that the whole tomographic forward problem is linear.However, seismic-wave paths have a non-linear relationship with the spatial distribution of seismic velocities.If the velocity anomalies are small, the ray path can be approximated by the great circle connecting the source and receiver.Otherwise, deviations from the great-circle path may be important and cannot be neglected.
Since the slowness distribution is unknown, tomographers assume a starting slowness model, then use an inverse method to update their model based on differences (residuals) between the measurements (in our case group-velocity measurements) and the predicted values.In the SOLA Backus-Gilbert tomographic inversion, only the slowness distribution is updated while the ray path is fixed.In the Fast-marching tomographic inversion, the slowness distribution is updated first, then used to predict a new path that minimises the traveltime between source and receiver thanks to an eikonal solver (for more details on the workings of this specific eikonal solver, consult Rawlinson and Sambridge, 2005) and the process is repeated until the paths no longer move.As the path updates are a non-linear function of the slowness distributions, methods that perform such updates are termed non-linear-tomographic methods (amongst many examples, see Rawlinson and Spakman, 2016).

Inverse Problem
The inverse problem in seismic tomography consists in updating a starting slowness model m to minimise the residuals between the measurements and the corresponding predicted values.For simplicity, the following description assumes the slowness distribution is described by a set of values that represent slowness in a discrete set of geographical cells that span the region of interest.As the path is assumed to be known (albeit more or less accurately, as discussed above), we can write a discretised version of the integral expressions for the forward problem as the following matrix equation: (3) where d is a vector containing all measurements (in our case 1/U i for the i'th path), n is a vector containing the noise on the measurements, m is a vector containing the slowness value in all cells (in our case 1/U j for the j'th cell), and G is a matrix containing the length of each path in each cell (G ij is the length of path i in cell j).An equivalent and more convenient formulation of the forward problem interprets m as corrections to the starting model and d as the residuals.
The point of inverse methods is to estimate the slowness (or the slowness perturbations) m as a function of the measurements (or measurement differences) d.If the inverse method is linear, we can write the final slowness estimate as where G −g is called the generalised inverse of G (Penrose, 1955).The process of estimating m is complicated by many factors, including uncertainties in the measurements and uneven path coverage of the region lead- ing to some cells being traversed by many independent paths and others by none at all.This led to the development of various inverse methods, each with its own advantages, disadvantages, and trade-offs.Some methods search for an appropriate slowness estimate m by iteratively perturbing the prior slowness model without trying to construct the generalised inverse G −g , for example, the conjugate gradient method first proposed by Hestenes and Stiefel (1952) or the LSQR method developed by Paige and Saunders (1982); others try to estimate G −g directly, for example the singular value decomposition method first proposed by Penrose (1955) and clarified by Lanczos (1961).If G −g is estimated directly, we can combine equations ( 3) and (4) to obtain where R = G −g G is called the resolution matrix.This shows that the slowness estimate is actually a weighted average of the true slowness perturbed by noise.For a single cell, for example the k'th cell, the estimated slowness is , where the k'th row of the resolution matrix R acts as an averaging operator called an averaging kernel by Backus and Gilbert (1968) and denoted A (k) .
Inverse methods can be split into two general categories, based on the trade-offs they use to stabilise the inversions.Most inverse methods trade fitting the measurements against prior beliefs about the slowness distribution (its smoothness, for example); we call such methods data-fitting inversions and they are often exemplified by, though not limited to, general leastsquares inversions (e.g.Tarantola and Valette, 1982;Menke, 2015).Data fitting is the most intuitive inversion paradigm.The measurements force the model to update and where the measurements lack sensitivity or have large uncertainties, the model is not modified (e.g.Scales and Snieder, 1997).The Fast-marching surface-wave tomography method of Rawlinson and Sambridge (2005) fits into this paradigm and uses the subspace inversion method of Kennett et al. (1988) to minimise an objective function that includes the residuals, a damping factor that discourages changes in the starting model, and a smoothing factor that constrains the model smoothness.
Despite its intuitive appeal, data fitting is not the only inversion paradigm that exists.In the late 1960s, Backus and Gilbert introduced an inversion paradigm that constructed each row of the G −g matrix independently by requiring an optimal resolution of the slowness distribution, and proved that the resulting distribution still fits the measurements (Backus andGilbert, 1967, 1968).We call such methods resolution-optimising inversions, also known as optimised local average (OLA) methods.Although the Backus-Gilbert inversion paradigm can become numerically inefficient when the number of parameters and data is large and has been deemed difficult to apply to noisy data (Parker, 1994;Trampert, 1998;Nolet, 2008), it has been successfully applied to a few tomographic problems at scales ranging from local to global (e.g.Chou and Booker, 1979;Trampert and van Heijst, 2002;Bonadio et al., 2021).The comparatively small family of Backus-Gilbert inversion methods now contains a highly-efficient method first proposed in helioseismology (Pijpers and Thompson, 1993) then adapted to discrete and continuous tomographic inversions by Zaroli (2016), Zaroli et al. (2017), and Zaroli (2019): the SOLA (Subtractive Optimally Localised Average) Backus-Gilbert method.It estimates each row of G −g by minimising an objective function that includes the difference between the averaging kernel A (k) and a target kernel based on the path distribution, a trade-off factor called η that regulates the trade-off between resolution and uncertainties, and a condition that the sum of the values in the averaging kernel adds up to 1 (unimodular averaging kernels, the condition for obtaining unbiased model estimates).

Implementation Details
We used the Fast-marching Surface-wave Tomography (FMST) package developed by Rawlinson and Sambridge (2005) and the SOLA Backus-Gilbert code developed by Zaroli (2016) and Zaroli et al. (2017) and adapted to surface-wave tomography by Latallerie et al. (2022).We parameterised the slowness space in cells of 0.25 • in latitude and longitude.
For the Fast-marching method we selected the values of the damping and smoothing factors of the subspace inversion by examining the trade-offs between data residual and model variance as shown in Fig. 7a,b (often called L-curves): the chosen values were 5 for the damping factor and 30 for the smoothing factor, each located near the elbow of its L-curve.
For the SOLA tomographic method we chose the tar-  get kernels for each cell to be circles whose radii depended on the logarithm of the path density (Zaroli et al., 2017;Latallerie et al., 2022): where r is the target kernel radius, constrained to lie between r min and r max , and ρ is the path density (sum of the columns of G) whose minimum and maximum values are ρ min and ρ max .We chose r min = 50 km and r max = 250 km, based on the dominant wavelengths and path lengths within the dataset.The path densities and target kernel radii are shown in Fig. 8.We selected the value of the trade-off parameter η between resolution and uncertainty of the SOLA inversion by examining the trade-offs between the average resolution length and the average model uncertainties as shown in Fig. 7c.Fig. S3 of the Supplementary materials show tomographic models, uncertainty maps, and resolution lengths obtained using a range of η values.Fig. 9 shows three target kernels and the corresponding averaging kernels produced by the SOLA inversion, projected onto the 0.25 • grid.The size of the target kernels increases in regions of poor coverage and the shape of the averaging kernels shows potential smearing of information from outside the target kernel.As it would be unwieldy to plot averaging kernels for all points, in the following we summarise the size of the averaging kernel by a single number, the resolution length, which we take to be the mean of the semi-major and semi-minor axes of the ellipse that contains 68% of the averaging kernel (an approach roughly equivalent to that of Yanovskaya et al., 1998).

Results
Figure 10 shows Rayleigh-wave sensitivity kernels as a function of depth at various periods.Figures 11, 12, 13, and 14 show the group-velocity maps we obtained from our combined noise correlation and earthquake dataset using Fast-marching and SOLA tomographic approaches for periods of 10, 20, 30 and 50 s respectively.The figures also show the SOLA resolution lengths and uncertainties for each period.

Group velocities maps at 10 s period
At periods of 10 s, fundamental-mode Rayleigh-wave group velocities are expected to be primarily sensitive to the upper crust (Fig. 10).There is a strong correlation between the thick sedimentary basins of the upper crust and low short-period group velocities (e.g.Laske et al., 2013).Fig. 11 shows that the Fast-marching and SOLA maps share similar large-scale features though they differ in several details.Both maps contain high velocities along the Talesh mountains and the Sanandaj-Sirjan zone (SSZ in Fig. 11), sandwiched between low velocities in the South Caspian Basin and Kura Depression to the North and East and in Eastern Anatolia and the Zagros fault thrust belt (ZFTB in Fig. 11) to the South and West.
The slow velocities in the Caspian Sea basin and Kura Depression were observed in previous studies (e.g.Mortezanejad et al., 2019) and are related to thick, lowvelocity sediments overlying lower crust thought to be oceanic in origin (Mangino and Priestley, 1998).The shape of these low-velocity anomalies seems to be more strongly smeared towards the southwest in the Fastmarching map than in the SOLA map.
The two maps also show a sharp velocity contrast along the boundary between the Sanadaj-Sirjan zone and the Zagros fault thrust belt.The fast group velocities of the Sandaj-Sirjan zone are related to its metamorphic Paleozoic-Cretaceous rocks, which show little to no surface sedimentation or volcanic activity.The low group velocities in the Zagros fold and thrust belt are related to weakening and fracturing on shallow and low-angle reverse faults because of intense deformation (Jackson and Fitch, 1981) and thick Meso-Cenozoic sediment cover (Mouthereau, 2011).The boundary between these two regions more clearly coincides with the main Zagros reverse fault (MZRF in Fig. 11) in the SOLA map than in the Fast-marching map.Both maps also show low velocities in the Urumieh-Dokhtar region (UMDA in Fig. 11) that in previous studies were attributed to volcanic and sedimentary rocks left by lava flows through pre-existing pyroclastic deposits (e.g.Mottaghi et al., 2013).The Fast-marching map shows a strong velocity contrast between this region and the Alborz mountains to the North, a contrast which is more attenuated in the SOLA map.In the lesser Caucasus, the SOLA map shows low group velocities north of the Sablan volcano that transition to high group velocities farther East, in the region of the Sabalan volcano; the low velocities are less pronounced in the Fast-marching map.The transition has been seen before and has been interpreted as a transition between warm magmatic rocks to colder ones (e.g.Zandi and Rahimi, 2020).

Group-velocities maps at 20 s period
At periods of 20 seconds, fundamental mode Rayleighwave group velocities are primarily sensitive to the average shear-wave velocity of the crust at depths up to 20-25 km (Fig. 10).Fig. 12 shows that the Fastmarching and SOLA maps have similar features to those at 10 s period, with fast velocities in the Sandaj-Sirjan zone flanked by lower velocities in the South Caspian Basin and the Zagros fault thrust belt.The strong lowvelocities of the South Caspian and Kura basins visible in the Fast-marching map are elongated in the direction of the ray paths (Fig. 6), continuing to suggest these features may be both smeared and locally biased.The SOLA resolution lengths at 20 s period (Fig. 12c) are systematically larger than those at 10 s period (Fig. 11c), leading to SOLA maps that are smoother with less pronounced velocity contrasts.

Group-velocities maps at 30 s period
At periods of 30 seconds, fundamental mode Rayleighwave group velocities are expected to be primarily sensitive to the shear-wave velocity of the lower crust and uppermost mantle (Fig. 10).In continental regions, low group velocities at these periods indicate either a thick crust overlying a normal continental mantle lid or a normal crust overlying a weak lithospheric mantle, while high group velocities usually indicate a normal continental crust over a thick or oceanic-like mantle lid.Fig. 13 again shows similarities between the two maps, with stronger smearing in the South Caspian Basin region in the Fast-marching map.The low velocities in the Talesh region, visible in both images, may be related to a thickening of the crust seen by Mortezanejad et al. (2013) and confirmed by receiver functions that show a Moho depth of 54 km compared to values of 46 km in the South Caspian Basin and 47 km in the rest of Northwest Iran (Mortezanejad et al., 2019).The high velocities in the Zagros fold and thrust belt, although visible in both models, are smaller than the SOLA resolution length.Previous studies of the region based on the Fastmarching method observed a similar high-velocity region under the belt that contrasts with slower velocities in the Sandaj-Sirjan zone.This correlates with crustal thickening from 38 km in the Zagros fold and thrust belt, to 54 km in the Sandaj-Sirjan zone, to 44 km in the Urumieh-Dokhtar region (Mortezanejad et al., 2019).

Group-velocities maps at 50 s period
At periods of 50 seconds, fundamental mode Rayleighwave group velocities are expected to be primarily sensitive to the shear-wave velocity of the uppermost man- tle (Fig. 10).Slow velocities at these periods are usually attributed to thin or absent lithosphere, while fast velocities are attributed to a stable continental mantle lid or to an oceanic lithosphere.Fig. 14 shows that Fastmarching and SOLA group-velocity maps both contain apparent short-wavelength structures, stronger and at smaller spatial scales in the Fast-marching map, that differ significantly between the maps.Disagreement between tomographic images is common when coverage is poor, as the priors imposed in the inversion act more strongly in the absence of data.The Fast-marching map contains three prominent features probably cased by smearing and/or local bias: the elongated slow velocities in the South Caspian Basin, the fast velocities that trend South-West from the Kura Basin, parallel to a group of paths with the same orientation seen in Fig. 6d; and the high velocity in Zagros fault and thrust belt (ZFTB).These artefacts are absent from the SOLA map.It is hard to reconcile the group-velocity maps in Fig. 14 with previous studies that find deeper lithosphere-asthenosphere boundaries beneath the Alborz and South Caspian Basin (Bavali et al., 2016), or a thick high-velocity mantle lid under the Zagros and the South Caspian Sea (Mangino and Priestley, 1998), or that the lithosphere beneath the South Caspian Sea may underthrust the Talesh mountains (Mangino and Priestley, 1998;Bavali et al., 2016;Mortezanejad et al., 2019).

Discussion
We have constructed a dataset of fundamental mode group-velocity measurements from noise correlations and local/regional earthquakes in Northwest Iran and inverted them with two different techniques -Fastmarching and SOLA Backus-Gilbert -to obtain the group-velocity maps shown in Figs.11 to 14.In the previous section, we attempted to compare the groupvelocity maps produced by the two techniques to known geological features in the region seen by previous studies.Comparisons between tomographic images are notoriously difficult to make as their resolution and uncertainties differ.In our case, we know the resolution and uncertainties of the SOLA group-velocity map, but not those of the Fast-marching one.We have inferred several smearing artefacts and possible local bias in the Fast-marching maps by singling out regions with poor coverage where there is a convergence of paths, notably in the South Caspian Basin (g and h in Fig. 9) where the convergence is due to a few earthquakes on the Apscheron Sill that separates the northern and southern Caspian.Proving that these features are indeed artefacts requires careful construction of ad-hoc synthetic tests.
As the SOLA Backus-Gilbert tomographic inversion provides full-resolution and uncertainty information, we do not need to resort to such ad-hoc tests to identify robust features in the SOLA maps, but can follow a simple workflow proposed by Latallerie et al. (2022): 1. assume or construct a relevant reference model of seismic velocity to which we want to compare the tomographic images; 2. filter this reference model with the SOLA resolution to obtain the tomographic image that would be obtained if the Earth resembled the reference model; 3. subtract the filtered reference model from the SOLA tomographic image to obtain an anomaly map; 4. divide the anomaly map by the SOLA uncertainties σ m to obtain anomalies in "units" of σ m ; 5. mask regions where the anomalies are smaller than ±1σ m and/or ±2σ m ; 6. compare the sizes and shapes of the anomalies that remain with the resolution lengths and individual averaging kernels to spot unresolved regions and artefacts and identify significant anomalies for further analysis.

Anomaly analysis for SOLA groupvelocity maps
Although the outcrop geology of Northwest Iran is well known (see section 2), transforming this geology into expected group-velocity maps at different periods requires making assumptions about the detailed shapes and seismic velocities of these geological bodies at depth.Although such an endeavor would be scientifically worthy, it falls outside of our team's expertise and would take this study outside of its seismological scope.Instead, we will take as our reference models uniform velocities equal to the average group-velocity measured at each period.Such uniform models remain uniform after filtering with the SOLA resolution because the averaging kernels (i.e. the rows of the resolution matrix) are constrained to be uni-modular (the sum of the averaging weights is equal to 1).Figs 15 and 16 show the deviations of the SOLA tomographic maps from the uniform reference models in units of the model uncertainties, σ m , with masks at ±1σ m (equivalent to a 68% confidence threshold) and ±2σ m (equivalent to a 95% confidence threshold) for regions traversed by at least 3 surface-wave paths.Simply identifying anomalies as exceeding ±σ m or ±2σ m is not enough to declare them significant because even if the Earth were, in reality, identical to the reference model, we would still expect 32% of velocities to exceed ±σ m and 5% of them to exceed ±2σ m because of how the measurement uncertainties propagate into model uncertainties.We could be justified in declaring anomalies to be significant only if more points than expected exceed the ±σ m and ±2σ m thresholds, or if these points organised geographically in coherent regions and these anomalous regions could indeed be resolved by the tomography (anomalies larger than the resolving lengths) and showed no indication of smearing.This definition of significance is stricter than the one used in most tomographic studies and highlights the importance, when using SOLA, of correctly estimating the data uncertainties that feed into the estimates of σ m .
At 10 s period (Fig. 15a,b), 42% of cells in the image exceed ±1σ and 27% exceed ±2σ; the relevant resolution lengths and σ values are found in Fig. 11c,d.The large, slow anomaly in the Caspian basin is significant at both the 1 and 2-σ thresholds and its size (largest dimension ∼ 100 km×300 km) is of a similar order as the resolution length (radius of 125-225 km).It shows some smearing in the NE-SW direction, following the dominant direction of the paths (Fig. 6); to verify this smearing hypothesis, we can examine the target and averaging kernels at the same location (Fig. 9), which indeed show evidence of low amplitude recovery and smearing along the same direction.We can conclude that the Caspian Basin is indeed significantly slower than the rest of the region, and is probably even slower than indicated in Fig. 11b.Without detailing the full analysis for each of the features visible in the 10 s period maps, we can be confident in the high velocities shown in the Sanandaj-Sirjan Zone and their contrast with the slower velocities of the Zagros Fold and Thrust Belt, at least in the ∼ 100 km either side of the Iran-Irak border; the higher velocities under the Sabalan volcano also seem significant and wellresolved.However, many of the small scale features in the Talesh and Alborz mountains may be too small to be correctly resolved; should we be interested in resolving them in the future, we would need to improve the data coverage in this region.
At 20 s period (Fig. 15c,d), 38% of cells in the image exceed ±1σ and 23% exceed ±2σ; the relevant resolution lengths and σ values are found in Fig. 12c,d.The Caspian Basin is again significantly slower than average, with the same smearing problem.The contrast between the high velocity Sanandaj-Sirjan Zone and the low-velocity Zagros Fold and Thrust Belt seems again both significant and well-resolved, as do the high velocities under the Sabalan volcano and in the northwesternmost corner of Iran, north of the North Tabriz Fault and the Urumia lake.
At 30 s period (Fig. 16e,f), 22% of cells in the image exceed ±1σ and 13% exceed ±2σ; the relevant resolution lengths and σ values are found in Fig. 13c,d.The resolution lengths are systematically larger than for the 10 and 20 s maps (compare Fig. 13c with Figs.11c and 12c) and there are few anomalies that seem well-resolved and significant at the 2σ level, with the exception of the highvelocity anomaly sandwiched between the Sanandaj-Sirjan Zone end the Urumieh-Dokhtar Magmatic Arc.
At 50 s period (Fig. 16g,h), 14% of cells in the image exceed ±1σ and around 5% exceed ±2σ; the relevant resolution lengths and σ values are found in Fig. 14c,d.Here, we no longer have any well-resolved anomalies that are significant at the 2σ level, meaning that interpreting the group-velocity variations at this period (Fig. 14a,b) would probably be meaningless.Should we be interested in the deeper structure of this region, therefore, we would need to improve the data coverage and also reduce the uncertainties in the measurement of groupvelocity dispersion at long periods.

Fast-marching vs SOLA
We have observed that having full-resolution and uncertainty information allows us to visualise robust anomalies and identify artefacts.We have also seen that the Fast-marching method -more precisely the subspace inversion method used by the Fast-marching tomographic implementation we used here (Rawlinson and Sambridge, 2005) -did not provide this information and so limited our capacity to compare rigorously its re- sults with the SOLA results.Despite this, we do not believe that the current implementation of SOLA Backus-Gilbert tomography is necessarily the best method to use in all cases.
All tomography is data-driven: if the sensitivity of the data does not cover adequately the target region, tomographic studies that attempt to achieve resolutions finer than those compatible with the data coverage will produce images that are unreliable, while those that limit themselves to the resolution compatible with the data coverage will produce images that are uninformative, as this resolution is too poor to give meaningful information.On the other hand, we expect that where data coverage is sufficient and the forward tomographic problem (G in equation 3) identical at each iteration, then any well-implemented tomographic inversion should produce similar features.The relevance of these features could be analysed in detail if the inversion also provided resolution and uncertainty information at a reasonable computational cost.
Figure 16 Group-velocity anomalies scaled by the SOLA uncertainties σ m at periods of 30, and 50 s.(e) and (g) are masked at ±1σ m ; (f) and (h) are masked at ±2σ m .We masked all cells with fewer than three rays passing through them.
Therefore, where data coverage is good, both datafitting schemes and resolution-fitting tomographic inversion schemes should provide similar fits to the data and similar model resolutions and uncertainties.In such cases, it might be useful to choose the implementation of the forward problem (constructing G in equation 3) that is most adapted to the expected velocity or slowness variations of the region being studied.In particular, if we expect strong contrasts such as those created by geological units of different types, forward schemes that allow ray paths to adapt to the heterogeneous velocity or slowness distributions would be likely to allow the inversion to converge on tomographic models that are more similar to the true Earth compared to simpler forward schemes that predict straight ray paths and do not update them, such as the one implemented in the context of the SOLA Backus-Gilbert inversion we used here.
However, where data coverage is poor, we expect the implementation of the inverse problem to be a strong predictor of the quality of the final tomographic model.In particular, we expect that having the ability to influence the resolution, either indirectly through irregular parameterisation (Curtis and Sieder, 1997;Trampert, 1998;Bijwaard et al., 1998;Sambridge and Gudmundsson, 1998;Alinaghi et al., 2007) or directly using a Backus-Gilbert type inversion (Backus and Gilbert, 1968;Trampert and van Heijst, 2002;Zaroli, 2016;Bonadio et al., 2021;Latallerie et al., 2022) as we did here, would limit smearing and local bias artefacts.If fullresolution and uncertainty information are also available, then we could extract robust inferences from the tomographic models, know exactly how informative or uninformative they are, and spot artefacts.
If we apply this reasoning to the two tomographic methods we used in this study, we would expect Fastmarching to outperform (produce a more-interpretable seismic image with fewer smearing artefacts and/or better resolution of velocity contrasts) SOLA where velocity contrasts are strong and data coverage is good, for SOLA and Fast-marching to give equivalent results where velocity contrasts are weak and data coverage is good (in such cases the advantage may still go to SOLA because it constrains the resolution to neighboring areas and produces full-resolution and uncertainty information), and for SOLA to outperform Fast-marching where data coverage is poor, regardless of velocity contrasts.We tested this expectation with a series of synthetic tests based on the 20 s ray-coverage from Fig. 6: Fig. 17 shows strong and weak velocity contrasts in regions of poor ray-coverage (northern parts of Fig. 17a,d) and good ray-coverage (southern parts of Fig. 17a,d).The only case in which the Fast-marching inversion outperforms the SOLA inversion is indeed for the strong anomaly contrast in a good coverage region (southern part of Fig. 17b); in all other configurations, the SOLA inversion produces more coherent anomalies with fewer artefacts.

Conclusion
We have assembled a data-set of group-velocity measurements from ambient noise cross-correlations and earthquakes that cover Northwest Iran.Using this dataset, we have produced group-velocity maps using two tomographic techniques: the Fast-marching method of Rawlinson and Sambridge (2005) and the SOLA Backus-Gilbert approach of Zaroli (2016).We have compared them with each other and with known geological features in the region.Thanks to the resolution and uncertainty information provided by the SOLA inversion, we were able to single out robust features of the tomographic maps -for example the high velocities at short period (shallow depth) shown in the Sanandaj-Sirjan Zone and their contrast with the slower velocities of the Zagros Fold and Thrust Belt (Figs 15 and 16).We were also able to show that the SOLA method allows us to minimise artefacts caused by poor coverage and identify any ones that do remain.
Although the advantages of the SOLA Backus-Gilbert method for suppressing artefacts and allowing robust interpretation are clear and significant, the SOLA method as currently implemented (without path updating) may not produce the best tomographic images of regions with strong seismic-velocity contrasts and good data coverage.In such situations, the Fast-marching method may produce superior images albeit without the resolution and uncertainty information required for their robust interpretation.We suggest it could be advantageous to add path-updating capability to the SOLA Backus-Gilbert method, provided the uncertainties can be correctly estimated and the resolution correctly taken into account at each iteration.as well as the handling editor Suzan van der Lee.We would like to express our gratitude to Franck Latallerie who shared generously his experience in working with the SOLA Backus-Gilbert code.We wish to thank Ramin Movaghari (Southern University of Science and Technology, China) and Taghi Shirzad (Institute of Geophysics, Polish Academy of Sciences, Poland) for their useful comments and suggestions during the signal processing part of this work.Christophe Zaroli acknowledges the High-Performance Computing Center of the University of Strasbourg for supporting this work by providing scientific support and access to computing resources.

Data and code availability
All data used in this study are freely available.The continuous seismic data and raw earthquake data were provided by the Iranian Seismological Center (IRSC, http://irsc.ut.ac.ir/) and the International Institute of Earthquake Engineering and Seismology (IIEES, http://epp.iiees.ac.ir/datarequest/) to the corresponding author.
The processed data and codes that support the findings of this study are available from the corresponding author (https://doi.org/10.5281/zenodo.8004726).The Fast-marching surface-wave tomography (FMST) package developed by Nick Rawlinson is available at http://iearth.edu.au/codes/FMST/ and we used http://rses.anu.edu.au/~nick/surftomo.html to obtain 2D Rayleigh-wave dispersion maps of the Iran plateau.The SOLA tomography code we use in this study consists in running the LSQR algorithm of Paige and Saunders (1982) with specific input matrix and vectors.These inputs can be constructed from the (studydependent) sensitivity matrix and target kernels (denoted by G and T in this study) as detailed in the appendix A1 of Zaroli (2016).The LSQR code (Paige and Saunders, 1982) is freely downloadable on the web page of the Systems Optimisation Laboratory (Stanford University): https://web.stanford.edu/group/SOL/software/lsqr/.For those who prefer a pre-constructed software package for SOLA tomography, one is is available from Christophe Zaroli (c.zaroli@unistra.fr)upon e-mail request.

Figure 1
Figure 1 (a) The location map of the main seismotectonic units: the volcanic and intrusive rocks (brick red areas); the Zagros fold and thrust belt (ZFTB); the Sanandaj-Sirjan Zone (SSZ); the Urumieh-Dokhtar Magmatic Arc (UDMA), the Alborz and Talesh; the South Caspian Basin; and the Lesser Caucasus.Two major active faults are indicated with solid lines: the North Tabriz Fault (NTF) and the Main Zagros Reverse Fault (MZRF).The locations of the Sahand and Sabalan volcanoes are shown by black triangles.(b) Locations of the seismic stations and earthquakes used.Triangles indicate stations; red stars indicate earthquakes.Stations surrounded by circles were used for both earthquakes and ambient noise cross-correlations.

Figure 2
Figure 2 (a) Example of the vertical component of a seismogram (in velocity) used for dispersion measurements, filtered between 5 s and 120 s.Horizontal axes show time duration after origin time.The signal is detrended, decimated and station responses have been removed.Station names and times are shown.(b) Vertical two-sided noise correlation functions sorted by inter-station distance.The surface-wave move-out is between 2 and 4 km/s.Rayleigh-waves are observable at both positive (causal) and negative (acausal) lag times.Only the noise correlation functions with signal-to-noise ratios larger than 10 are plotted.

Figure 3
Figure 3 Examples of group-velocity dispersion diagrams for earthquakes and ambient seismic noise cross-correlations.Diagram (a) was measured from an earthquake waveform recorded at station HSB station at an epicentral distance of 571 km and related to the earthquake on 2020-04-29, 17:01:34; diagram (b) was measured from a noise correlation function for the station pair BZA-GHVR distant 311 km from each other.The color scale represents normalised energy.The maximum energy observed at different periods is indicated in red.

Figure 5
Figure 5 (a) The number of group-velocity measurements at periods of 10, 20, 30, and 50 s.(b) The average group-velocities at each period; the error bars show the average uncertainties of the measurements at each period.

Figure 6
Figure 6 Path coverage of group-velocity measurements at periods of 10, 20, 30, and 50 s.Paths are colored by their respective group-velocity values.

Figure 7
Figure 7 Trade-off curves (L-curves) for Fast-marching (a,b) and SOLA inversions (c).A trade-off between data residual and model variance for different values of the damping (a) and smoothing (b) parameters for the Fast-marching Surface-wave Tomography method.(c) A trade-off between average resolution and average uncertainty for the SOLA method for different values of the trade-off parameter η.The numbers in the boxes show the chosen trade-off value for the period of 10s.

Figure 8
Figure 8 Path densities and SOLA target kernel radii at 10 s period.(a) The number of paths per 0.25 • cell.(b) Radii of target kernels derived from the path densities using equation 6; cells containing no paths are masked in white.

Figure 9
Figure 9 Target and averaging kernels for the SOLA inversion at 10 s period.(a) Circular target kernels at locations with different path densities.(b) Averaging kernels obtained after a SOLA inversion with η = 0.4.

Figure 10
Figure 10 Sensitivity kernels of Rayleigh-wave group velocity at different periods.

Figure 11
Figure 11 Tomographic results for group velocities at 10 s period.(a) Group velocity map obtained using Fast-marching tomography.(b) Group velocity map obtained using SOLA Backus-Gilbert tomography.(c) Resolution lengths and (d) model variances from the SOLA inversion.

Figure 12
Figure 12 Tomographic results for group velocities at 20 s period.Panels as in Fig. 11.

Figure 13
Figure 13 Tomographic results for group velocities at 30 s period.Panels as in Fig. 11.

Figure 14
Figure 14 Tomographic results for group velocities at 50 s period.Panels as in Fig. 11.

Figure 15
Figure 15 Group-velocity anomalies scaled by the SOLA uncertainties σ m at periods of 10 and 20 s. (a), (c), are masked at ±1σ m ; (b), (d), are masked at ±2σ m .We masked all cells with fewer than three rays passing through them.

Figure 17
Figure 17 Synthetic tests for different regions.(a) and (d) are unfiltered synthetic model with sharp and weak anomaly contrasts, respectively.(b) and (e) are filtered models with Fast-marching method.(c) and (f) are filtered model with SOLA Backus-Gilbert.The ray-coverage used corresponds to the 20 s coverage from Fig. 6.