Introduction to Spatial Statistics

Reflection of Introduction to Spatial Statistics

by TAN SU MEI

Tags: Introduction to Spatial Statistics

Lecture 1 - Introduction to Spatial Statistics

1.1 Conventional vs Spatial Statistics

Conventional statistics is the science of collecting, organizing, summarizing, and analyzing data and drawing conclusions from it. It uses a sample to describe or estimate an unknown population parameter. Spatial statistics, by contrast, is a collection of statistical techniques designed to investigate the spatial patterns displayed by various types of phenomena. It mainly involves descriptive spatial statistics, spatial pattern analysis, spatial regression, and spatial interpolation, the last of which is usually categorized as geostatistics.

Conventional statistics focuses on analyzing independent observations without considering spatial relationships. It is used when the spatial context is not relevant to the analysis. For example, it can be used to analyze the heights of individuals in a population. Spatial statistics takes into account the spatial context or relationships associated with each observation. It is used to analyze data that exhibit spatial patterns or dependencies. For example, it can be used to examine the incidence of disease across different regions, where the disease occurrence may be influenced by spatial proximity.

1.2 Spatial Statistics and GIS Science

Spatial statistics focuses on the statistical analysis of spatial patterns and relationships in data, while GIS science involves managing, analyzing, and visualizing geographic data using software systems. Spatial statistics helps uncover hidden patterns and relationships in spatial data, while GIS science provides tools and techniques for organizing, manipulating, and visualizing spatial data. Together, they enhance our understanding of spatial phenomena and support decision-making in various fields.

1.3 Nature of geospatial data

Geospatial data is characterized by its location, geometry, attributes, scale, resolution, and temporal dimension. It represents features on the Earth's surface and their associated information. Geospatial data can be in the form of points, lines, or polygons, and can be represented as raster or vector data.

Lecture 2 - Sampling Theory

2.1 Population, Sample and other statistical Concepts

A population consists of all subjects that are being studied, and a sample is a group of subjects selected from a population.

A variable is a characteristic or attribute that can assume different values, while a random variable is a variable whose values are determined by chance.

The parameter is an unknown fixed value of a population variable.

A statistic is a known value computed from a sample that is used to represent the corresponding population parameter.

Probability is used in inferential statistics, for showing the chance of an event occurring or not occurring.

A distribution describes the probabilities of the possible outcomes of an event when it is repeated indefinitely. Many common distributions, such as the normal distribution, are bell-shaped.

There are two types of sample: independent samples and dependent samples.

2.2 Types of Variables

There are qualitative and quantitative variables. Qualitative variables can be placed into distinct categories according to some characteristic or attribute. For example, nominal data.

Quantitative variables are numerical and can be ordered or ranked. For example, interval and ratio data. Ordinal data can also be ranked, although they are often treated as categorical rather than numerical.

2.3 The theory of Sampling

1. Simple Random Sampling: Each individual has an equal chance of being selected.

2. Stratified Random Sampling: The population is divided into subgroups, and samples are randomly selected from each subgroup.

3. Systematic Sampling: Elements are selected at regular intervals from a randomly chosen starting point.

4. Cluster Sampling: Clusters or groups are randomly selected, and all individuals within the selected clusters are included in the sample.

5. Multi-stage Sampling: It involves a combination of different sampling techniques, often starting with cluster sampling and further sampling within selected clusters.

6. Convenience Sampling: Individuals or units are selected based on convenience and accessibility.
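The first four sampling schemes above can be sketched in a few lines of Python. This is only an illustrative sketch: the population of 100 numbered units, the grouping functions, and all function names are assumptions made for the example, not part of the lecture material.

```python
import random


def simple_random_sample(population, n, rng):
    # Every unit has the same chance of being selected.
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, n_per_stratum, rng):
    # Split the population into subgroups, then sample randomly inside each one.
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample


def systematic_sample(population, step, rng):
    # Random starting point, then every `step`-th unit.
    start = rng.randrange(step)
    return population[start::step]


def cluster_sample(population, cluster_of, n_clusters, rng):
    # Randomly pick whole clusters and keep every unit inside them.
    clusters = {}
    for unit in population:
        clusters.setdefault(cluster_of(unit), []).append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]


rng = random.Random(42)          # fixed seed so the sketch is repeatable
population = list(range(100))    # hypothetical units numbered 0..99

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(population, lambda u: u // 25, 3, rng)  # 4 strata
syst = systematic_sample(population, 10, rng)
clus = cluster_sample(population, lambda u: u // 10, 2, rng)      # 10 clusters
```

Multi-stage sampling could be built by applying one of these functions again inside the selected clusters.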

Lecture 3 - Point Data Analysis

3.1 Spatial Density

[Image: spatial density]

3.2 Spatial Centography

Mean centre is the mean of X and Y coordinates for a set of points. Weighted mean centre is produced by weighting each x and y coordinate by another variable.

Formula of mean centre:

[Image: formula of mean centre]

Formula of weighted mean centre:

[Image: formula of weighted mean centre]
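As a rough illustration of the two definitions in this section, the sketch below computes the mean centre as the average of the x and y coordinates, and the weighted mean centre as the weight-scaled coordinate sums divided by the total weight. The four points and their weights are made-up values, and the function names are hypothetical.

```python
def mean_centre(points):
    # Mean of the x coordinates and mean of the y coordinates.
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def weighted_mean_centre(points, weights):
    # Each coordinate is multiplied by its weight before averaging.
    total = sum(weights)
    xw = sum(w * x for (x, _), w in zip(points, weights)) / total
    yw = sum(w * y for (_, y), w in zip(points, weights)) / total
    return (xw, yw)


# Hypothetical points at the corners of a 4 x 4 square.
points = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
weights = [1, 1, 3, 3]  # e.g. a population attached to each point (assumed)

centre = mean_centre(points)                      # (2.0, 2.0)
w_centre = weighted_mean_centre(points, weights)  # (2.0, 3.0)
```

The weighted centre is pulled towards the two upper points because they carry more weight.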

3.3 Spatial Pattern

Quadrat analysis is a method to assess the distribution of objects by dividing an area into grids and counting the objects in each grid to determine if the distribution is random or non-random.

If:

VMR (the variance-to-mean ratio of the quadrat counts) close to 0, pattern is dispersed.

VMR around 1, pattern is random.

VMR above 1, pattern is clustered.
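A minimal sketch of quadrat analysis follows, assuming square quadrats of equal size and the sample variance (dividing by the number of quadrats minus one). The grid extent, the example points, and the function names are assumptions for illustration only.

```python
from collections import Counter


def quadrat_counts(points, xmin, ymin, cell, ncols, nrows):
    # Count how many points fall in each square quadrat of side `cell`.
    counts = Counter()
    for x, y in points:
        col = min(int((x - xmin) / cell), ncols - 1)  # clamp points on the far edge
        row = min(int((y - ymin) / cell), nrows - 1)
        counts[(row, col)] += 1
    # Include empty quadrats as zero counts.
    return [counts[(r, c)] for r in range(nrows) for c in range(ncols)]


def variance_mean_ratio(counts):
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)  # sample variance
    return variance / mean


# Hypothetical 2 x 2 study area divided into four 1 x 1 quadrats.
dispersed = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
clustered = [(0.1, 0.1), (0.2, 0.3), (0.3, 0.2), (0.4, 0.4)]

vmr_dispersed = variance_mean_ratio(quadrat_counts(dispersed, 0, 0, 1.0, 2, 2))  # 0.0
vmr_clustered = variance_mean_ratio(quadrat_counts(clustered, 0, 0, 1.0, 2, 2))  # 4.0
```

One point per quadrat gives a VMR of 0 (dispersed), while all points in a single quadrat give a VMR well above 1 (clustered).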

Nearest neighbor analysis quantifies the spatial arrangement of points by calculating the average distance to the nearest neighboring point, helping identify clustering or dispersion patterns compared to random distribution.

If:

NNI = 1, random pattern.

NNI = 2.149, perfectly dispersed pattern (the maximum value).

NNI = 0, clustered pattern.
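The index can be sketched as the observed mean nearest-neighbour distance divided by the distance expected under a random pattern, which is commonly taken as 0.5 / sqrt(n / A) for n points in a study area A. The code below assumes that formula, ignores edge corrections, and uses hypothetical point sets.

```python
import math


def nearest_neighbour_index(points, area):
    n = len(points)
    # Distance from each point to its closest other point.
    nearest = []
    for i, (xi, yi) in enumerate(points):
        nearest.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)  # expected mean distance if random
    return observed / expected


# Hypothetical examples in a 4 x 4 study area (area = 16).
grid = [(x + 0.5, y + 0.5) for x in range(4) for y in range(4)]  # regular lattice
stacked = [(1.0, 1.0)] * 4                                      # all at one spot

nni_grid = nearest_neighbour_index(grid, 16.0)        # 2.0, strongly dispersed
nni_stacked = nearest_neighbour_index(stacked, 16.0)  # 0.0, fully clustered
```

A square lattice gives 2.0; the maximum of 2.149 is reached only by a triangular (hexagonal) lattice.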

Lecture 4 - Line Data Analysis

Strahler Stream Order

-All fingertip tributaries are given first order

-Stream order increases by one each time two streams of equal order meet.

-A stream can receive tributaries of lower order without any change in its order.
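The three rules above can be sketched as a short recursive function. The network here is a hypothetical one, stored as a dictionary that maps each stream segment to the segments flowing into it; the segment names and the function name are assumptions for the example.

```python
def strahler_order(upstream, segment):
    tributaries = upstream.get(segment, [])
    if not tributaries:
        return 1  # fingertip tributary: first order
    orders = sorted(
        (strahler_order(upstream, t) for t in tributaries), reverse=True
    )
    if len(orders) > 1 and orders[0] == orders[1]:
        return orders[0] + 1  # two streams of equal order meet: order goes up
    return orders[0]          # lower-order tributary: order is unchanged


# Hypothetical network: segment -> segments that flow into it.
upstream = {
    "outlet": ["p", "q"],
    "p": ["a", "b"],   # two first-order streams meet
    "q": ["r", "c"],   # an order-2 stream receives a first-order tributary
    "r": ["d", "e"],   # two first-order streams meet
}

orders = {s: strahler_order(upstream, s) for s in ["a", "p", "r", "q", "outlet"]}
# orders == {"a": 1, "p": 2, "r": 2, "q": 2, "outlet": 3}
```

Segment "q" stays at order 2 even though it receives tributary "c", and the outlet reaches order 3 only because two order-2 streams join.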

Here is an example with the formula:

[Image: Strahler stream order example with formula]

The next measure is based on the law of stream numbers.

Example:
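The original worked example is not reproduced here. As a stand-in illustration, the law of stream numbers is usually summarized by the bifurcation ratio, the number of streams of one order divided by the number of streams of the next higher order. The stream counts below are made-up values, and the function name is hypothetical.

```python
def bifurcation_ratios(stream_counts):
    # stream_counts maps Strahler order -> number of streams of that order.
    # Assumes the orders are consecutive integers (1, 2, 3, ...).
    orders = sorted(stream_counts)
    return {u: stream_counts[u] / stream_counts[u + 1] for u in orders[:-1]}


# Hypothetical counts: 40 first-order, 10 second-order, 2 third-order streams.
counts = {1: 40, 2: 10, 3: 2}

ratios = bifurcation_ratios(counts)                # {1: 4.0, 2: 5.0}
mean_ratio = sum(ratios.values()) / len(ratios)    # 4.5
```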

Circuit Network

-Consists of a number of line segments that are interconnected.

-Each line segment is defined by a start/end node.

-Line segment may contain intermediate points of known locations.

[Image: circuit network]

Network Connectivity

-The connectivity of a network can be examined by constructing a connectivity matrix, C.

-May increase the matrix according to the requirements.
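A connectivity matrix can be sketched as a square table with one row and column per node, where a 1 marks two nodes joined directly by a line segment. The sketch below also multiplies C by itself to count two-step connections; treating that as the way the matrix is extended is an assumption, as are the four-node network and the function names.

```python
def connectivity_matrix(n_nodes, segments):
    # c[i][j] = 1 when nodes i and j are directly joined by a line segment.
    c = [[0] * n_nodes for _ in range(n_nodes)]
    for a, b in segments:
        c[a][b] = 1
        c[b][a] = 1  # segments are treated as two-way links
    return c


def matrix_product(a, b):
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical network: four nodes joined in a chain 0 - 1 - 2 - 3.
segments = [(0, 1), (1, 2), (2, 3)]

C = connectivity_matrix(4, segments)
C2 = matrix_product(C, C)  # C2[i][j] counts two-step routes from i to j

links_per_node = [sum(row) for row in C]  # [1, 2, 2, 1]
```

Here `C2[0][2]` is 1 because node 0 reaches node 2 through node 1, while `C2[0][3]` is 0 because no two-step route exists between the ends of the chain.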

Lecture 5 - Area Data Analysis

There are two types of spatial autocorrelation:

1. Global Spatial Autocorrelation

Global spatial autocorrelation examines the overall spatial pattern across the entire study area. It assesses whether the variable of interest exhibits spatial dependence or clustering at a global scale. This analysis considers the entire dataset as a whole and provides a single measure or statistic that represents the overall spatial pattern.
A common measure used to assess global spatial autocorrelation is Moran's I, which measures the correlation between the value of a variable at each location and the values at its neighboring locations, as defined by a spatial weights matrix. Moran's I typically ranges from about -1 (indicating perfect dispersion) to +1 (indicating perfect clustering), with values near 0 representing no spatial autocorrelation.

Global spatial autocorrelation helps identify broad-scale spatial patterns and can be useful for understanding the overall structure and processes shaping the distribution of a variable across the study area.

2. Local Spatial Autocorrelation

Local spatial autocorrelation focuses on analyzing spatial patterns at a local scale by examining individual locations within the study area. It investigates whether certain locations exhibit significant clustering or dispersion compared to the neighboring locations.
One commonly used measure for local spatial autocorrelation is the Local Moran's I, which calculates the correlation between a variable at one location and the average value of the variable in its surrounding neighborhood. Local Moran's I produces a statistic for each location, indicating whether it exhibits significant clustering (high-high or low-low) or dispersion (high-low or low-high) patterns.

Local spatial autocorrelation helps identify specific areas or hotspots of high or low values, providing insights into the presence of localized patterns or spatial outliers that may not be captured by global measures.
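A minimal sketch of Local Moran's I is given below, assuming one common form of the statistic: the deviation of each location from the mean, divided by the mean squared deviation, multiplied by the weighted sum of its neighbors' deviations. The four regions in a row, their values, and the binary weights are hypothetical, and significance testing is left out.

```python
def local_morans_i(values, weights):
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n  # mean squared deviation
    results = []
    for i in range(n):
        lag = sum(weights[i][j] * z[j] for j in range(n))  # neighbors' deviations
        if z[i] > 0:
            label = "high-high" if lag > 0 else "high-low"
        else:
            label = "low-low" if lag < 0 else "low-high"
        results.append((z[i] / m2 * lag, label))
    return results


# Hypothetical: four regions in a row; neighbors share a border.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
values = [1.0, 2.0, 3.0, 4.0]

local = local_morans_i(values, weights)
labels = [label for _, label in local]
# labels == ["low-low", "low-low", "high-high", "high-high"]
```

Every location gets its own value and label, which is what allows clusters and outliers to be mapped individually.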

Join Count Statistics

Join count statistics is a spatial analysis technique used to assess the clustering or dispersion of similar values or categories in a geographic area. It counts the number of neighboring locations that share the same category and compares it to what would be expected by chance. This analysis helps determine if there is significant spatial autocorrelation and provides insights into the pattern and relationships of categorical data.

[Image: join count statistics]
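The counting step of join count statistics can be sketched as follows, assuming a binary variable coded "B" and "W" and a list of joins between neighboring areas. The 2 x 2 grid, its rook-style joins, and the function name are assumptions for illustration, and the comparison against the counts expected by chance is not included.

```python
def join_counts(category, joins):
    # Tally joins by the categories of the two areas they connect.
    counts = {"BB": 0, "WW": 0, "BW": 0}
    for i, j in joins:
        if category[i] != category[j]:
            counts["BW"] += 1
        elif category[i] == "B":
            counts["BB"] += 1
        else:
            counts["WW"] += 1
    return counts


# Hypothetical 2 x 2 grid of areas numbered
#   0 1
#   2 3
# with joins between areas that share an edge.
joins = [(0, 1), (2, 3), (0, 2), (1, 3)]

clustered = join_counts(["B", "B", "W", "W"], joins)     # like values side by side
checkerboard = join_counts(["B", "W", "W", "B"], joins)  # like values kept apart
# clustered == {"BB": 1, "WW": 1, "BW": 2}
# checkerboard == {"BB": 0, "WW": 0, "BW": 4}
```

Many BB and WW joins relative to BW joins point towards clustering, while a dominance of BW joins points towards dispersion.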

Moran's I Spatial Autocorrelation

-Mainly used for interval or ratio type of data.

Negative autocorrelation / Dispersed = -1

No autocorrelation / Random = 0

Positive autocorrelation / Clustered = 1
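As a rough sketch, global Moran's I can be computed as (n / S0) multiplied by the weighted sum of cross-products of deviations from the mean, divided by the sum of squared deviations, where S0 is the sum of all weights. The four regions in a row, the binary weights, and the two value sets below are hypothetical.

```python
def morans_i(values, weights):
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in z)


# Hypothetical: four regions in a row; neighbors share a border.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

i_trend = morans_i([1.0, 2.0, 3.0, 4.0], weights)        # about 0.333, similar values adjoin
i_alternating = morans_i([1.0, 4.0, 1.0, 4.0], weights)  # -1.0, unlike values adjoin
```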

Geary's C Spatial Autocorrelation

No autocorrelation/ random = 1

Positive autocorrelation/ clustered = 0

Negative autocorrelation/ dispersed = 2

The expected value of Geary's C under no spatial autocorrelation is always 1.
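Geary's C can be sketched as (n - 1) multiplied by the weighted sum of squared differences between pairs of locations, divided by 2 x S0 x the sum of squared deviations from the mean, where S0 is the sum of all weights. As before, the four regions in a row and their values are hypothetical.

```python
def gearys_c(values, weights):
    n = len(values)
    mean = sum(values) / n
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    squared_diffs = sum(
        weights[i][j] * (values[i] - values[j]) ** 2
        for i in range(n) for j in range(n)
    )
    total_ss = sum((v - mean) ** 2 for v in values)
    return (n - 1) * squared_diffs / (2 * s0 * total_ss)


# Hypothetical: four regions in a row; neighbors share a border.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

c_trend = gearys_c([1.0, 2.0, 3.0, 4.0], weights)        # 0.3, below 1: clustered
c_alternating = gearys_c([1.0, 4.0, 1.0, 4.0], weights)  # 1.5, above 1: dispersed
```

Unlike Moran's I, the scale runs the other way: values below 1 indicate positive autocorrelation and values above 1 indicate negative autocorrelation.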

Updated on 30 July 2023, 11:28 PM