This paper: GeoComp'98, Fredricksburg, VA. (Note: some citation references missing in this paper, but these are in the References here.)
See also: M. Morehart, F. Murtagh and J.L. Starck, "Spatial Representation of Economic and Financial Measures used in Agriculture via Wavelet Analysis", International Journal of Geographical Information Science, 13, 557-576, 1999. (Email f.murtagh@qub.ac.uk for a copy.)

# Multiresolution Spatial Analysis

Mitchell Morehart 1, Fionn Murtagh 2, and Jean-Luc Starck 3

1Economic Research Service, USDA, 1800 M Street NW, Washington DC 20036, USA
Tel. +1 202 694-5581, Fax +1 202 694-5758, Email morehart@econ.ag.gov
2School of Computer Science, Queen's University of Belfast, Belfast BT7 1NN, Northern Ireland, UK
Tel. +44 28 9027-4620, Fax +44 28 9068-3890, Email f.murtagh@qub.ac.uk
3CEA/DSM/DAPNIA, 91191 Gif-sur-Yvette cedex, France
Tel. +33 1 69.08.57.64, Fax +33 1 69.98.65.77, Email jstarck@cea.fr

## Abstract

Geographic information systems (GIS) are increasingly used as tools for topographical applications and research. A comprehensive GIS is characterized by its capabilities in the areas of data processing, analysis, and post processing. This paper explores the use of the wavelet transform as a spatial analysis tool for modeling complex multivariate geographic relationships. The use of wavelets in spatial statistics is a relatively recent phenomenon that is rapidly developing. The appeal of wavelet methods stems from their ability to process noisy data with local structures and represent discontinuities such as jumps or peaks in a function. Several examples from agricultural date are used to illustrate the exploratory data analysis inherent in the wavelet transform. The resulting maps provide a convenient means of visually conveying tremendous amounts of information. The redundant à trous discrete wavelet transform is shown to aid enormously in feature detection and exploration in the succession of resolution views of the data. Analysis is carried out through use of the MR/1 multiresolution image and data analysis package.

Keywords: nonparametric regression, wavelets, multiresolution, spatial analysis

## 1  Introduction

Through advances in computer technology, telecommunications, and spatial statistical theory, practitioners have recently gained a powerful new ability to obtain, organize, and study complex relationships between farming and its natural resource base. Applications of this new capability are broad, ranging from investigating the farm-level benefits of precision farming to assessing the impact of conservation practices over wide geographic areas. Opportunities that arise from these developments also raise numerous questions relating to the interpretation, manipulation, and analysis of Geographic Information System (GIS ) data.

This paper explores the use of the wavelet transform as a spatial analysis tool for modeling complex multivariate geographic relationships. The wavelet transform is shown to facilitate several desirable functions such as multiresolution support, smoothing, and noise removal which enhance the construction and post processing of geographic data representations. Multiresolution support (Starck, Murtagh, and Bijaoui 1995) involves studying an image through information contained at various resolution levels. The statistical concept of smoothing is generally associated with nonparametric regression methods used to uncover a relationship within data, without strong a priori restrictions on its form. There are many examples of both liner and nonlinear applications of wavelet regression to this type of problem (See for example Donoho 1993, Donoho and Johnstone 1992, Walter 1992, DeVore and Lucier 1992, Kerkyacharian and Picard 1992, and Kerkyacharian and Picard 1993 ). We can advance two reasons why it is effective to work on tasks such as denoising or filtering in wavelet transform space. First, the wavelet transform is often found to be relatively sparse. Many coefficients are zero or small in value. Second, by virtue of the multiresolution decomposition we can base our filtering on scale related components in our data. Wavelet methods are also appealing because of their ability to process noisy data with local structures and represent discontinuities such as jumps or peaks in a function. This is an important consideration for estimating geographically-referenced multivariate surfaces where a high degree of spatial inhomogeneity is expected.

The next section presents an overview of wavelet analysis with emphasis on application to geographic data analysis. This is followed by a discussion of the data source and construction of a geographic information system. The third section provides an empirical framework for analysis and the underlying methodology. In the fourth section, spatial distribution problems for two agricultural indicators are used to illustrate multiresolution spatial analysis. The fist determine those areas of the country with the highest corn yield. The second involves identifying specific areas of the country where farms are most dependent on wheat as a source of income. Concluding remarks focus on the implications of these techniques for broader GIS application and areas for further research.

## 2  A Wavelet Transform for Geographic Data Analysis

Wavelets are inherently tied to the concept of multiresolution analysis. Using wavelets, a function can be depicted as a course overall shape, plus detail that ranges from broad to narrow. This ability to zoom in or zoom out that is characteristic of multiresolution analysis permits an image to be interpreted as a sum of details which appear at different resolutions. Furthermore, each scale of resolution may pick up different types of structure in the image.

The à trous DWT (Shensa 1992, Holschneider 1989, Starck, Murtagh, and Bijaoui 1998 ) differs from the standard Discrete Wavelet Transform (DWT) in that decimation or subsampling is not carried out at successive resolution levels, in spite of the fact that the information carried by these resolution levels decreases. In fact, detail coefficients remain as numerous as the input data, at these successive levels. This has great benefits for cartographic representation, as we will see below. This sort of redundancy also aids enormously in feature detection and exploration in the succession of resolution views of the data, which are provided by the detail coefficients. For given data x, a relationship based on the unknown function m using the à trous DWT gives us the following additive decomposition:

 x = m + e = d1 + d2 + ... + dJ + cJ,
(1)

where the residuals ei are assumed i.i.d. with mean zero and variance s2.

Each term here is a vector of length p, or an image of dimensions n × p, or even a data cube. The sum is simply carried out element-wise, or pixel-wise. The additive decomposition is one where the components are the succession of detail coefficients given by dj which are dependent on the resolution level, plus the underlying smooth version of the data which is given by cJ.

In the continuous wavelet transform, CWT, the detail coefficients at any given resolution scale are obtained by convolving (implying translation ) the given data with continuously dilated versions of the wavelet function or mother wavelet . The term wavelet is explained by the fact that some wavelet functions have the appearance of a central bump with one or two diminishing ripples, and others are more like a tidal wave. At any rate, leaving aside such informal description, wavelet functions are nearly always of compact support, and, in order to guarantee an exact inverse transform, produce wavelet coefficients of zero mean at each level.

Above, we have seen that dilated versions of a wavelet function are used to produce d1, d2,...dJ. A scaling function is used to produce cJ. The scaling function has the effect of smoothing the data. The à trous DWT has the nice property that dilated versions of a smoothing function are used to produce a set of successively smoothed versions of the input data, c1, c2, ... , cJ and then the wavelet coefficients can be defined straight away from these just by subtraction: d1 = x - c1, d2 = c1- c2, ... , dJ = cJ-1 - cJ. The scaling function commonly used is a B3 spline, known for its good interpolation properties which extend to irregularly sampled data. (In passing we note that the data which we are studying are irregularly sampled: the discrete convolution with the scaling function - whose shape and support determine the scale-related resolution - provide mappings of our data onto a regular grid.)

Data dependent noise modeling may be based on the noise model posed for the input data, such as (in the additive noise model case) x = { Xi +ei }i = 1n where e need not be i.i.d. Based on such a noise model, significance levels can be determined at successive resolution levels using the classical hypothesis testing approach.

The MR/1 multiresolution image and data analysis package (MR/1 1998, Starck, Murtagh, and Bijaoui 1998) was used in this work. This package handles (raster) image and point pattern data. An extensive range of wavelet and other multiscale transforms are supported (à trous, Mallat, Feauveau, Haar, and others, in various pyramidal and non-pyramidal versions; Laplacian, median and morphological transforms are also available). Data dependent noise models include Gaussian and Poisson, additive and multiplicative, stationary and non-stationary. Supported are such tasks as visualization, filtering, deconvolution, compression, feature and structure finding, and many other operations.

## 3  Data Source

Geographic information systems are increasingly used as tools for topographical applications and research. A comprehensive GIS is characterized by its capabilities in the areas of: (1) data processing, (2) analysis, and (3) post processing. Spatial data is fundamental to the construction of any GIS and involves the integration or collection of data comprised of spatial (geographic identification) and attribute information.

Spatial analysis of relationships between farming and its natural resource base are carried out using data collected from USDA's Agricultural Resource Management Study (ARMS). The ARMS is a personally enumerated survey, conducted since 1984 by the National Agricultural Statistics Service (NASS) and the Economic Research Service (ERS) of the U.S. Department of Agriculture. The ARMS is a probability-based multi frame, stratified survey that uses multiple questionnaire versions to collect information on farm production expenses, capital purchases, income, production practices, and other farm operating characteristics.

Local geographic references for sample points are often omitted from surveys designed to collect economic and financial information. In this instance, some form of data integration is required to associate these attributes with geographic references that are more specific than the usual political boundaries such as State or County. The simplest form of data integration involves attaching geographic references to sample points from survey data. More sophisticated forms of integration can be achieved by overlaying information from one GIS framework to another. In any event, the method of integration involves some assumption about the underlying continuity of sample data and the inherent level of statistical reliability established by the sampling scheme.

The most common geographic reference is the longitude/latitude coordinate system. Location may also be annotated by such systems as ZIP codes or highway mile markers. Any variable that can be located spatially can be fed into a GIS. For this study, geographic references were obtained for each of the 3,100 county population centroids. The assignment of latitude/longitude coordinates for each sample observations is based on the FIPS code. Not all counties contained samples due to survey coverage limitations or absence of agricultural activity. The degree of resolution is much more intense in the eastern half of the contiguous U.S. since the size and number of counties is much greater than in western states (Figure 1).

A final consideration for creating GIS data is map orientation or projection. A projection is a mathematical means of transferring information from the Earth's three-dimensional curved surface to a two-dimensional medium. Different projections are used for different types of maps because each projection differs in the way they handle distance, direction, shape and area. For example, a projection that accurately represents the shapes of the continents will distort their relative sizes. The Universal Transverse Mercator (UTM) projection is one of the most commonly used projections. Visually, it preserves shapes for small areas, but gives a distorted representation of the contiguous United States that is not easily recognized when compared with projections used by most atlas and road map presentations.

The projection selected for this analysis was the Albers-Equal-Area (AEA). The Albers Equal-Area Conic projection uses two standard parallels to reduce some of the distortion produced when only one standard parallel is used. Although neither shape nor linear scale are truly correct, the distortion of these properties is minimized in the region between the standard parallels. The country centroid reference data was originally specified in decimal degrees. The latitude/longitude coordinate system geo-references were converted to AEA projected meter geo-references. For example, the population centroid of Carrol County, Missouri (FIPS 29033) is represented as -39.404 latitude and 93.498 longitude in decimal degrees and 1,822,938.2851079 meters from 23 degrees latitude and 213,456.11668433 meters from -96 degrees longitude.

Another component of data preparation for this study is binning. Binning methods originated with development of the histogram as a means to graphically display univariate data. More recently, binning has been used to reduce computational difficulties associated with applying nonparametric regression techniques for smoothing large data sets([], [], and []). In this context, binning involves reducing the original data to a relatively small number of estimates at equally spaced intervals. While in practice this treatment of sample data has been used to support various kernel-type estimators such as the Average Shifted Histogram ([]), bivariate local linear regression ([]), and Fast Fourier Transform (FFT) computation methods ([]). [] provide examples for density estimation based on wavelet thresholding, while [] incorporated binning directly into a nonparametric wavelet regression estimator.

In addition to computational efficiencies, binning offers other important benefits to practical problems associated with spatial modeling of complex survey data. First, the binning algorithm developed by Scott allows the complex, probability-based sample characteristics to be incorporated in the analysis ([]). The second advantage of this approach is that the selection of bin width or grid spacing allows the practitioner some flexibility in controlling the degree of resolution given the statistical reliability constraints of the sample data. Results by [] for kernel density estimation suggested that the accuracy of binning and the choice of grid size are dependent on the relative smoothness of the underlying density and sample size. Finally, as noted by [], binning can be used to compensate for several restrictions that pertain to widely used wavelet estimators (section 2) such as fixed equidistant design, homoscedasticity, and sample sizes that are a power of two.

In the context of the à trous DWT introduced in section 2, binning arises naturally as the discrete convolution of a scaling function with the given input data. We use, below, a heteroscedastic noise model with this wavelet transform.

## 4  Model Specification and Methodology

Having established a GIS framework from which to examine multivariate relationships, the next step involves implementation of the wavelet transform in a two-dimensional setting. The major goal of this process is to produce a sequence of versions of the binned approximation of the original data at more and more coarse resolutions separating noise from salient features in the data.

Extension of the nonparametric regression principles for wavelet analysis to the two dimensional problems presented by the geographic data system described in the previous section is based on estimating the unknown mean response function:

 ^m (x,y) = E(Z|X = x,Y = y),
(2)
where (x,y) represents the center of one of the geographically referenced bins and Z represents the variable of interest. The preconditioned [Z\tilde] matrix is constructed by calculating the weighted average value within each of the bins as
^
z

j,l1,l2
=
 n i = 1 wiZi

 n i = 1 wi
,
(3)
where n is the number of Zi's in Bj,l1,l2 and wi represents the corresponding sampling weights for each observation. In many cases, the estimate will be mathematically undefined or zero. The first situation occurs when there are no sample observations in the bin, while the latter reflects no positive response for the variable of interest.

The redundant à trous method has a range of useful properties for the handling of sparse spatial data. Compared to other wavelet transforms, we may note the following:

Boundary treatment
The handling of data boundaries in the DWT always requires attention. Implementations of DWT methods which use decimation usually require that the input data dimensions be integer powers of two; or else that the input data be padded to be of such pre-specified dimensions. In the case of the redundant à trous method, we do not have such a constraint. In practice, we usually use reflection in the boundary (the virtual'' value at position n+1 is taken as the known value at n, position n+2 is given by n-1, and so on), which makes the convolutions with the scaling function well-defined at all stages.

Stationary transform
The redundancy involved in the succession of applications of the scaling function, yielding c1, c2, ... , means that we ought not take each translated value into account when constructing the following resolution level: in fact, a sampling scheme based on gaps of {2i}J-1i = 0 unused values between values used in the convolution does the job admirably. Such a sampling scheme is compatible with what would have been the case, had we decimated. (Note that decimation can be carried out with the à trous wavelet transform, if so desired.) This works well all the more so since we are using a scaling function with excellent interpolation properties. It is for this reason that the à trous (with holes'' or gaps) method bears this name.

Computational efficiency
The à trous DWT is of linear computational complexity, i.e. O(n) for n input values (and is thus more efficient than, e.g. the FFT).

Due to redundancy, the à trous transform provides intuitively understandable visualization. The noise, or other, filtering of sub-sampled data can be entirely avoided. And as noted in section 3, we have at our disposal a wide range of powerful data dependent noise models.

## 5  Results

One of the unique features of wavelet analysis is that no matter what the function of interest a number of alternative views that represent detail in the data are presented graphically. Visualization of the wavelet transform provides knowledge about the structure of the data and presents a convenient forum for conducting wavelet-based diagnostics prior to reconstruction. The multiresolution aspect of the transform also permits the detection and parsing of objects within an image.

Figures and show perspective plots of the data as spatially mapped over the surface of the continental United States, from a due south direction, of the wheat dependence and debt data. These provide wheat and debt topographies' at varying resolution scale. Our original data, in both cases, is the sum of the 4 scales shown. The last scale is the most smoothed version retained of the data. The detail or wavelet coefficients are shown in figures and , with positive values only being represented in these figures.

Figure 1: From top to bottom: Wheat dependency data; detail or wavelet coefficients (non-negative values shown) at successive resolution levels, d1, d2, d3, c3

Figure 2: From top to bottom: Debt capacity utilization ratio, DCUR, data; detail or wavelet coefficients (non-negative values shown) at successive resolution levels, d1, d2, d3, c3.

Regression smoothing, or nonparametric function estimation, is necessary for interpretation of the spatially represented data under consideration here. A family of smoothing methods avoids undue dependence on just one estimate ([]). Such a family of smooths provides not just a set of alternative smoothing possibilities, but the transition from one smooth to the next may be important for change point detection and picking out scale-related features in the data. A condition for this is that the analyst can `visually connect'' ([]) successive members of the family. This possibility immediately follows from the à trous wavelet transform. Figures and show such a family of fits to the wheat and debt functions under consideration.

Simple visualizations of the data based on a stationary Gaussian model are portrayed in Figures and . There are various ways to analyze empirical wavelet coefficients that enhance statistical applications such as nonparametric estimation. Graphic displays of wavelet coefficients provide good diagnostics for data analysis since the data or function of interest is completely captured in the wavelet coefficients. Time-scale (scaleogram) and time-frequency plots (spectogram) are useful for studying the tradeoff between time and scale localization ([]). Image querying, multiresolution editing, and multiresolution plots provide alternative means of visually exploring complex relationships in the data. Other standard methods of graphical data analysis such as box plots and Q-Q plots can also be applied to wavelet coefficients. We now turn our attention to more detailed noise modeling, aiming at a definitive regression smooth of our data.

## 6  Discussion

A foundation for the application of wavelets was put forth for the statistical problem of nonparametric function estimation. This provided a framework from which to estimate a smooth nonparametric function which describes complex multivariate relationships embedded in spatial data. The exploratory data analysis inherent in the wavelet transformation and resulting maps provide a means of visually conveying tremendous amounts of information. This portrayal of the complex relationship between economic attributes and demographic characteristics makes an important contribution to understanding the spatial implications of government policy.

The spatial analysis developed and demonstrated in this paper could be augmented by further integration of demographic characteristics and resource-based attributes. These types of applications might include: (1) identifying the proximity of farm businesses that are dependent on off-farm income to areas of rural and urban non-agricultural employment, (2) determining the extent to which highly leveraged farms are located in disaster prone areas such as flood planes, and (3) examining how these geographic dependencies change over time and the extent to which soil quality, weather, and other natural occurrences influence these changes. Implementation would involve either extending the nonparametric model to accommodate higher dimensions or a more complex post-processing activity focused on merging geographic representations.

In comparing density estimation methods (an adaptive kernel density estimator and the maximum penalized likelihood method) and the à trous wavelet transform using extensive one-dimensional simulations, [] find that the wavelet transform performs well when there are local gaps in the data distribution, and when small substructures are superimposed on larger structures. (They also argue for the robustness of kernel methods.) To this we would add the wide range of possibilities for exploratory visualization, exemplified in this paper, when using the wavelet transform. Finally, we would stress the importance of a multiscale transform approach whenever our data is taken as a set of superimposed or compounded resolution scale-related phenomena.

## References

Antoniadis, A., and Pham, D.T., 1995 , Wavelet regression for random or irregular design. Technical Report, University of Grenoble.

Bruce, A. and Gao, H., 1996 , Applied Wavelet Analysis with S-Plus (New York: Springer).

Chui, C. K., 1992 , An Introduction to Wavelets (New York: Academic Press).

Daubechies, I., 1988, Orthonormal bases of compactly supported wavelets. Communications in Pure and Applied Mathematics, 27, 1271-1283.

DeVore, R. A., and Lucier, B. J., 1992 , Fast wavelet techniques for near optimal processing. In Proceedings of the IEEE Military Communications Conference , 48.3.1-48.3.7.

Donoho, D. L., 1993 , Nonlinear wavelet methods for recovery of signals, densities, and spectra from indirect and noisy data. in Proceedings of Symosia in Applied Mathematics , 47, 173-205.

Donoho, D. L., and Johnstone, I. M., 1992 , Nonlinear solution for linear inverse problems by wavelet-vaguelet decomposition. Technical Report 403, Department of Statistics, Stanford University, Stanford, California.

Donoho, D. L., Johnstone, I. M., 1994 , Ideal spatial adaptation via wavelet shrinkage. Biometrika, 81, 425-455.

Donoho et al.(1995)]don95 Donoho, D. L., Johnstone, I. M., Kerkyacharian, G., and Picard, D., 1995 , Wavelet shrinkage: Asymptopia? Journal of the Royal Statistical Society, Series B, 57, 301-369.

Fadda, D., Slezak, E., and Bijaoui, A., 1998 , Desity estimation with non-parametric methods. Astronomy and Astrophysics Supplement Series, 127, 335-352.

Fan, J., and Marron, S., 1994 , Fast implementations of nonparametric curve estimators. Journal of Computational and Graphical Statistics, 3, 35-56.

Hall, P., and Nason G., 1996 , On choosing a non-integer resolution level when using wavelet methods. Preprint, Department of Mathematics, University of Bristol, UK.

Hall, P., and Patil, P., 1996 , On the choice of smoothing parameter, threshold, and truncation in nonparametric regression by nonlinear wavelet methods. Journal of the Royal Statistical Society, Series B, 58, 361-377.

Hall, P., and Wand, M., 1996 , On the accuracy of binned kernel density estimators. Journal of Multivariate Analysis, 56, 165-184.

Härdle, W., and Scott, D., 1992 , Smoothing by weighted averaging of shifted points. Computational Statistics, 7, 97-128.

Holschneider, M., Kronland-Martinet, R., Morlet, J., and Tchamitchian, Ph., 1989 , A real-time algorithm for signal analysis with the help of the wavelet transform. In J.M. Combes,A. Grossman and Ph. Tchamitchian, Eds., Wavelets: Time-Frequency Methods and Phase Space (Berlin: Springer-Verlag),  286-297.

Kerkyacharian, G., and Piccard D., 1992 , Density estimation in Besov spaces. Statistics and Probability Letters, 13, 14-24.

Kerkyacharian, G., and Picard, D., 1993 , Density estimation by kernel and wavelet methods: Optimality of Besov spaces. Statistics and Probability Letters, 18, 327-336.

Mallat, S. G., 1989 , A theory for multiresolution signal decomposition: The wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 674-693.

Marron, J.S., and Chung, S.S., 1997 , Presentation of smoothers: the family approach. Preprint, Department of Statistics, University of North Carolina, Chapel Hill.

Multi Resolutions Ltd., MR/1: The Multiresolution Analysis Software, User Manual, Version 2.0, 200 pp. (http://www.multiresolution.com or email info@multiresolution.com)

Nason, G., 1995 , Choice of the threshold parameter in wavelet function estimation. In Wavelets and Statistics, Eds Antoniadis A., and Oppenheim G. (New York: Springer-Verlag) 281-299.

Ogden, R, T., 1997 , Essential Wavelets for Statistical Applications and Data Analysis (Boston: Birkhäuser).

Pinheiro, A., Vidakovic, B., 1995 , Estimating the square root of a density via compactly supported wavelets. DP 95-14 ISDS, Duke University.

Scott, D. W., 1992 , Multivariate Density Estimation: Theory, Practice, and Visualization (New York: Wiley).

Scott, D. W., and Whittaker, G., 1996 , Multivariate applications of the ASH in regression. Communications in Statistics A: Theory and Methods, 25, 2521-2530.

Shensa, M. J., 1992 , Discrete Wavelet Transforms: Wedding the à trous and Mallat algorithms. IEEE Transactions on Signal Processing, 40, 2464-2482.

Starck, J.-L., Murtagh, F., and Bijaoui, F., 1995 , Multiresolution support applied to image filtering and decomposition. Graphical Models and Image Processing, 57, 420-431.

Starck, J.-L., Murtagh, F., and Bijaoui, F., 1998 , Image Processing and Data Analysis: The Multiscale Approach (Cambridge: Cambridge University Press).

U.S. Department of Agriculture, 1996 , Crop Production, 1996 Summary,CrPr 2-1(97). National Agricultural Statistics Service, Washington, D.C.

Vannucci, M., and Vidakovic, B., 1996 , Preventing the Dirac disaster: wavelet based density estimation. DP96- ISDs, Duke University.

Walter, G. G., 1992 , Approximation of the delta function by wavelets. Journal of Approximation Theory, 71, 329-343.

Walter, G. G., 1994 , Wavelets and Other Orthogonal Systems With Applications (Boca Raton, Florida: CRC Press).

Wand, M. P., and Jones, M. C., 1994 , Kernel Smoothing (London: Chapman & Hall).

Werthenbach, C., and Herrmann, E., 1998 , A fast and stable updating algorithm for bivariate nonparametric curve estimation. Journal of Computational and Graphical Statistics, 7,(1), 61-76.

File translated from TEX by TTH, version 2.00.
On 10 Feb 1999, 11:25.