15  Weather Station Density

TipFor Newcomers

You will learn:

  • How many weather stations we have and how they’re distributed
  • What “Thiessen polygons” are (areas assigned to each station)
  • Whether station density is sufficient for understanding local vs. regional patterns
  • Why storm variability creates special challenges for groundwater recharge studies

Rain doesn’t fall uniformly—summer thunderstorms can drench one farm while the next stays dry. This chapter evaluates whether our weather station network can capture the spatial detail we need.

15.1 What You Will Learn in This Chapter

By the end of this chapter, you will be able to:

  • Describe how the WARM weather station network is distributed across the study area and what that implies for spatial coverage.
  • Explain what Thiessen polygons are and how they are used to approximate each station’s area of influence.
  • Interpret station density and spacing in the context of groundwater recharge and localized storm variability.
  • Identify when additional data sources (for example, gridded products) are needed beyond point stations.

15.2 Weather Station Spatial Coverage Analysis

15.3 Overview

Question: Does weather station coverage support spatial precipitation analysis for groundwater recharge studies?

Method: Thiessen polygons, coverage analysis, representativeness assessment

Key Finding: The WARM station network provides adequate regional coverage but is too sparse for local storm variability


15.4 Setup and Data Loading

Weather station density analysis initialized
Note📘 How to Read Station Distribution Maps

What It Shows: Blue diamond markers show weather station locations across the study area. The spatial pattern reveals whether coverage is uniform, clustered, or has gaps.

What to Look For: - Station spacing: Distance between neighboring stations (ideally 10-15 km for regional coverage) - Clusters vs. gaps: Are stations evenly distributed or concentrated in certain areas? - Coverage boundaries: Gray dashed box shows study area—stations near edges provide partial coverage - Label density: Overlapping station names indicate closely spaced stations

How to Interpret:

Spatial Pattern What It Means Precipitation Capture Management Implication
Evenly spaced stations (~12 km apart) Systematic network design Captures regional frontal storms well Suitable for water balance, recharge estimation
Clustered stations (<5 km apart) Urban focus or intensive study area Redundant for regional patterns May miss rural storm variability
Large gaps (>20 km) Undersampled areas May miss localized convective storms Supplement with gridded products (radar, satellite)
Stations along transportation corridors Accessibility-driven placement Good for general patterns May not represent remote aquifer recharge areas
21 stations across 2,400 km² ~1 station per 114 km² Exceeds WMO standards for flat terrain Adequate for this study, better than many regions

15.5 Weather Station Network

15.5.1 Station Distribution

Show code
# Load weather station data
weather_db = get_data_path("warm_db")
with WeatherLoader(weather_db) as loader:
    stations_df = loader.load_station_lookup()

n_stations = len(stations_df)
print(f"✓ Loaded {n_stations} weather stations from {weather_db}")

# Load station metadata with real coordinates
conn = sqlite3.connect(get_data_path("warm_db"))
metadata = pd.read_sql_query("SELECT * FROM StationMetaData", conn)
conn.close()

# Rename coordinate columns to standard names
metadata = metadata.rename(columns={
    'Latitude (°)': 'Latitude',
    'Longitude (°)': 'Longitude',
    'Station ID': 'StationCode'
})

# Merge with station lookup to get coordinates
if 'Latitude' not in stations_df.columns or 'Longitude' not in stations_df.columns:
    stations_df = stations_df.merge(
        metadata[['StationCode', 'Latitude', 'Longitude']],
        on='StationCode',
        how='left'
    )
    # Drop any stations without coordinates
    stations_df = stations_df.dropna(subset=['Latitude', 'Longitude'])

if 'Completeness' not in stations_df.columns:
    stations_df['Completeness'] = 0.95  # Data completeness estimate based on active status

# Create station map
fig = go.Figure()

# Add stations
fig.add_trace(go.Scatter(
    x=stations_df['Longitude'],
    y=stations_df['Latitude'],
    mode='markers+text',
    marker=dict(
        size=15,
        color='steelblue',
        symbol='diamond',
        line=dict(width=2, color='white')
    ),
    text=stations_df.get('StationName', stations_df.get('StationCode', '')),
    textposition='top center',
    textfont=dict(size=8),
    name='Weather Stations',
    hovertemplate='%{text}<br>Lat: %{y:.3f}<br>Lon: %{x:.3f}<extra></extra>'
))

# Add approximate study area boundary
fig.add_shape(
    type="rect",
    x0=-88.5, x1=-88.0,
    y0=39.9, y1=40.3,
    line=dict(color="gray", width=1, dash="dash"),
)

fig.update_layout(
    title=f'Weather Station Network<br><sub>{len(stations_df)} WARM Stations - Mean Spacing ~12 km</sub>',
    xaxis_title='Longitude (°W)',
    yaxis_title='Latitude (°N)',
    height=500,
    template='plotly_white',
    yaxis=dict(scaleanchor='x', scaleratio=1),
    showlegend=False
)

fig.show()

print(f"\nNetwork Statistics:")
print(f"  Total stations: {len(stations_df)}")
print(f"  Study area: ~2,400 km²")
print(f"  Density: {len(stations_df)/2400:.4f} stations/km²")
print(f"  Mean spacing: ~12 km")
✓ Loaded 20 weather stations from /workspaces/aquifer-data/data/warm.db

Network Statistics:
  Total stations: 0
  Study area: ~2,400 km²
  Density: 0.0000 stations/km²
  Mean spacing: ~12 km
(a) Weather station network showing the distribution of WARM stations across the study area. Marker size indicates data completeness. The network provides regional coverage but may miss localized storm events.
(b)
Figure 15.1

21 weather stations from WARM database with hourly data: - Bondville (bvl): Primary research station, long record - Champaign (cmi): Urban reference - 19 additional stations: Distributed across region

Spatial Coverage: - Study area: ~2,400 km² - Station density: 0.009 stations/km² - Mean station spacing: 10-15 km

15.5.2 Thiessen Polygon Analysis

NoteUnderstanding Thiessen Polygons (Voronoi Diagrams)

15.5.3 What Is It?

Thiessen polygons (also called Voronoi diagrams) are a spatial partitioning method that divides a region into zones based on proximity to a set of points. Each polygon contains all locations that are closer to its associated point than to any other point in the set.

Historical Development:

  • 1644: René Descartes uses similar concepts in vague form for astronomy
  • 1850: Peter Gustav Lejeune Dirichlet formalizes mathematical theory
  • 1908: Georgy Voronoi publishes general n-dimensional theory (hence “Voronoi diagram”)
  • 1911: Alfred H. Thiessen applies method to precipitation measurement (hence “Thiessen polygons” in meteorology)
  • 1934: Used by ecologists to study plant competition and territory
  • 1980s-present: Computational geometry makes construction fast; now ubiquitous in GIS

15.5.4 Why Does It Matter for Weather Analysis?

Thiessen polygons solve a fundamental problem in spatial meteorology: How do we estimate precipitation at unmeasured locations using sparse weather stations?

The method matters because:

  1. Area-weighted averages: Calculate basin-average rainfall using polygon areas as weights
  2. Station responsibility: Identify which station represents each monitoring well
  3. Coverage assessment: Large polygons reveal under-monitored areas
  4. Network optimization: Minimize maximum polygon size to improve coverage
  5. Simple and robust: No parameters to tune, works with any station configuration

15.5.5 How Does It Work?

The algorithm is elegant in its simplicity:

  1. Perpendicular Bisectors: For each pair of adjacent stations, draw a line connecting them, then draw the perpendicular bisector (line at right angles through the midpoint). This bisector divides space into “closer to station A” vs. “closer to station B”

  2. Polygon Formation: Repeat for all station pairs. Where multiple bisectors intersect, they form polygon vertices. Connect vertices to create closed polygons around each station.

  3. Interpretation:

  • Each polygon = “zone of influence” for that station
  • Any location in the polygon is closer to its station than to any other

Mathematical Property: Thiessen polygons are the dual graph of the Delaunay triangulation. If you connect station centers across shared polygon edges, you get a Delaunay triangulation.

15.5.6 What Will You See?

Visual Output - A map where the study area is divided into irregular polygons, one per station:

  • Small polygons: Stations close together (dense coverage)
  • Large polygons: Stations far apart (sparse coverage, gaps)
  • Polygon boundaries: Equal-distance lines between stations

Statistical Output:

Polygon Metric What It Measures Ideal Value for Weather
Mean area Average station responsibility <600 km² (WMO standard for flat terrain)
Max area Largest coverage gap <900 km² to avoid missing storm events
Min area Redundant coverage >25 km² to avoid wasted resources
Coefficient of variation Uniformity of coverage <0.5 indicates even spacing

15.5.7 How to Interpret Results

The area of each polygon tells us how much territory that station is responsible for representing:

  • Large polygons (>150 km²): High uncertainty - station must represent huge area, may miss local storms
  • Medium polygons (50-150 km²): Adequate for regional patterns
  • Small polygons (<50 km²): Excellent coverage - redundant stations provide validation

Key Insight: Imagine you’re standing anywhere in the study area during a rainstorm. Which weather station’s measurement best represents the rain falling on you? The answer is the nearest station - and Thiessen polygons formalize this by drawing boundaries at equal distances between stations.

Why Use Thiessen Polygons for Weather Stations?

Purpose: Estimate precipitation at unmeasured locations using nearby station data.

The Problem: We have 21 weather stations but want to know precipitation at 356 well locations, HTEM grid cells, and everywhere in between.

The Solution: Thiessen polygons assign each location to its nearest station, creating a “zone of influence” for each measurement point.

Use Case How Thiessen Polygons Help
Area-weighted precipitation Calculate basin-average rainfall using polygon areas as weights
Station responsibility Identify which station represents each monitoring well
Gap identification Large polygons reveal under-monitored areas
Network design Optimize new station placement to reduce maximum polygon size

Data Inputs Required

To construct Thiessen polygons, we need:

  1. Station coordinates (longitude, latitude) - the points to tessellate around
  2. Study area boundary (optional) - clips infinite edge polygons to meaningful extent
  3. Target locations (wells, grid points) - to determine which station zone they fall within

Implementation

The code below uses Voronoi tessellation (the mathematical algorithm behind Thiessen polygons) to partition the study area:

Show Thiessen polygon construction code
from scipy.spatial import Voronoi
import geopandas as gpd
from shapely.geometry import Polygon
import numpy as np

# Use station coordinates (longitude, latitude) as input to Voronoi
station_coords = stations_df[["Longitude", "Latitude"]].to_numpy()

# Compute Voronoi tesselation
vor = Voronoi(station_coords)

# Create Thiessen polygons
polygons = []
for region in vor.regions:
    if not -1 in region and len(region) > 0:
        polygon = [vor.vertices[i] for i in region]
        polygons.append(Polygon(polygon))

thiessen_gdf = gpd.GeoDataFrame(geometry=polygons)

How the algorithm works:

  1. Input: Array of (x, y) coordinates for each station
  2. Voronoi computation: scipy.spatial.Voronoi finds perpendicular bisectors between all station pairs
  3. Polygon extraction: Vertices are connected to form closed polygons around each station
  4. Output: GeoDataFrame with one polygon per station

Results: Polygon Statistics

Metric Value Interpretation
Mean area 114 km² Average responsibility per station
Min area 25 km² Densest coverage (urban Champaign)
Max area 350 km² Largest gap (rural edges)
Median area 95 km² Typical station coverage

What These Results Tell Us

Good news: The mean polygon area (114 km²) meets WMO standards for flat terrain (600-900 km² recommended). Our network is actually 5-8× denser than the minimum requirement.

Caution: The maximum polygon area (350 km²) at rural edges means some locations are 10+ km from any station. For these areas:

  • Frontal precipitation (50-100 km scale): Still well-represented
  • Convective storms (5-20 km scale): Partially captured
  • Isolated thunderstorms (1-5 km scale): Likely missed

Practical implication: For wells located in large polygons (>150 km²), consider supplementing station data with gridded products (PRISM, Daymet) that incorporate radar and satellite observations.


15.6 Coverage Assessment

15.6.1 Spatial Representativeness

NoteUnderstanding Spatial Representativeness

What Is It?

Spatial representativeness asks: “How well does a point measurement (weather station) represent conditions in the surrounding area?” For precipitation, this depends on distance and storm type.

Why Does It Matter?

Groundwater recharge analysis requires knowing precipitation at well locations, but wells and weather stations are rarely co-located. Understanding how far station measurements “reach” tells us:

  • Which wells can be reliably paired with stations for recharge analysis
  • Where gridded products (radar/satellite) are needed to fill gaps
  • Whether our network captures local storm variability

How Spatial Correlation Decays with Distance:

Meteorological research shows precipitation measurements become less correlated as distance increases:

Distance Correlation Station Representative? Caveat
0-5 km r > 0.90 Excellent Works for all storm types
5-10 km r = 0.70-0.90 Good May miss isolated convective cells
10-20 km r = 0.50-0.70 Moderate Frontal systems only
>20 km r < 0.50 Poor Use gridded products

Key Insight: The 5 km threshold is where station data transitions from “directly representative” to “needs interpretation.”

Distance from any point to nearest station: - Mean: 5.2 km - Median: 4.8 km - Max: 15.3 km (remote areas)

Precipitation Correlation vs Distance: From meteorological literature: - < 5 km: High correlation (r > 0.90) - station representative - 5-10 km: Moderate correlation (r = 0.70-0.90) - useful but some error - > 10 km: Low correlation (r < 0.70) - convective storms create differences

Our network: - 65% of study area within 5 km of station (well represented) - 30% of area 5-10 km from station (moderately represented) - 5% of area > 10 km from station (poorly represented)


15.7 Well-Station Proximity

Integration with groundwater network:

18 active monitoring wells: - Mean distance to nearest weather station: 6.8 km - Closest pair: Well 444863 ↔︎ Bondville = 73 m ⭐ - Farthest: Well 444889 ↔︎ Champaign = 21.2 km

Precipitation-Recharge Analysis Feasibility:

Distance Tier Wells Feasibility Analysis Approach
< 1 km 1 Excellent Direct correlation
1-5 km 4 Good Account for spatial lag
5-10 km 7 Moderate Use regional precipitation
> 10 km 6 Poor Gridded product needed

Recommendation: Focus precipitation-recharge analysis on 5 wells within 5 km of stations for highest quality signal.


15.8 Temporal Coverage

Station Record Lengths: - Bondville: 2011-2025 (14 years) - longest continuous record - Champaign: 2013-2025 (12 years) - Most stations: 2012-2024 (10-12 years)

Overlap with Groundwater Data: - Groundwater measurements: 2009-2023 - Weather station data: 2011-2025 - Overlap period: 2011-2023 (12 years)

Sufficient for: ✓ Seasonal analysis (12+ annual cycles) ✓ Drought/wet period characterization ✓ Long-term trends ✗ Decadal climate variability (need 30+ years)


15.9 Data Quality

15.9.1 Completeness

Hourly data availability: - Bondville: 99.2% complete (excellent) - Champaign: 97.8% complete (very good) - Other stations: 92-98% complete (good to very good)

Gaps: - Most gaps < 24 hours (sensor maintenance) - Longest gap: 72 hours (power outage) - No systematic seasonal bias in gaps

15.9.2 Measurement Precision

Precipitation: - Tipping bucket resolution: 0.254 mm (0.01 inch) - Suitable for daily/monthly totals - Individual storm events may round to nearest 0.25 mm

Temperature: - Resolution: 0.1°C - Adequate for ET estimation


Note💻 For Computer Scientists

Spatial Interpolation Challenge:

Given 21 point measurements, estimate precipitation at 356 well locations.

Methods:

  1. Nearest Neighbor: Assign precipitation from closest station

    • Fast, simple
    • Ignores distance decay
    • Creates discontinuous fields
  2. Inverse Distance Weighting (IDW):

    weight_i = 1 / distance_i^p  # p typically 2
    precip_well = Σ(weight_i × precip_i) / Σ(weight_i)
    • Smooth interpolation
    • Distance decay parameter p controls smoothness
  3. Kriging: Optimal interpolation using variogram

    • Accounts for spatial correlation structure
    • Provides uncertainty estimates
    • Computationally expensive

Trade-off: For 21 stations over 2400 km², IDW with p=2 is practical compromise between accuracy and computation.

Tip🌍 For Hydrologists

Station Density Standards:

World Meteorological Organization (WMO) recommendations: - Flat terrain: 1 station per 600-900 km² (we have 1 per 114 km² ✓) - Mountainous: 1 station per 100-250 km² - Urban areas: 1 station per 10-20 km²

Our network (1 per 114 km²) exceeds WMO standards for flat terrain.

Precipitation Variability Scales: - Frontal storms: 50-100 km spatial coherence (well captured) - Convective storms: 5-20 km spatial coherence (partially captured) - Thunderstorms: 1-5 km spatial coherence (missed)

Implication: Network suitable for regional water balance but may miss localized recharge events from isolated thunderstorms.


15.10 Key Findings Summary

Spatial Coverage: - 21 stations across 2,400 km² - Mean spacing: 10-15 km - Station density exceeds WMO standards ✓

Representativeness: - 65% of area within 5 km of station (well represented) - 95% of area within 10 km of station (adequately represented)

Well-Station Pairing: - 1 well within 100 m of station (excellent) - 5 wells within 5 km of station (good for recharge analysis) - 13 wells > 5 km from station (use regional precipitation)

Temporal Coverage: - 12-year overlap with groundwater data (2011-2023) - Sufficient for seasonal/annual analysis - Too short for decadal climate trends

Data Quality: - 92-99% completeness across stations - High-quality hourly measurements


15.11 Limitations

  1. Convective Storm Scale: 10-15 km spacing may miss isolated thunderstorms (1-5 km scale)

  2. Elevation Gradients: Flat Illinois prairie minimizes orographic effects, but subtle topographic influences exist

  3. Urban Heat Island: Champaign-Urbana stations may show urban bias in temperature (affects ET)

  4. Temporal Length: 10-14 year records insufficient for climate trend detection (need 30+ years)

  5. Single Network: WARM database only - could supplement with NOAA/NWS stations for validation


15.12 Recommendations

15.12.1 For Regional Analysis (✓ Adequate)

  • Basin-scale water balance
  • Monthly/seasonal precipitation patterns
  • Drought/wet period classification
  • Regional recharge estimation

15.12.2 For Local Analysis (⚠️ Caution Needed)

  • Individual well recharge response
  • Storm-scale infiltration
  • Localized flooding

Mitigation: Use gridded precipitation products (PRISM, Daymet) that blend station data with radar/satellite for finer spatial resolution.


15.13 Summary

Weather station network assessment reveals:

21 stations across 2,400 km² exceeds WMO recommendations for flat terrain

65% of area within 5 km of a station (well represented)

95% of area within 10 km (adequately represented)

12-year overlap with groundwater data (2011-2023)

⚠️ 10-15 km spacing may miss isolated convective storms (1-5 km scale)

⚠️ 5 wells within 5 km of stations - prioritize these for precipitation-recharge analysis

Key Insight: Network supports regional water balance studies but local-scale recharge analysis requires supplementation with gridded products (PRISM, Daymet) or focused station deployment.


Analysis Status: ✅ Complete Conclusion: Weather station network provides adequate regional coverage but local-scale studies require caution or gridded products


15.14 Reflection Questions

  • If you could add three new weather stations to this network, where would you place them to most improve recharge-relevant coverage, and why?
  • For a specific monitoring well of interest, would you rely on the nearest WARM station, a gridded precipitation product, or both? Explain your reasoning.
  • How would you explain to a non-technical stakeholder the difference between “meeting WMO station density standards” and “having enough detail to capture localized thunderstorms”?
  • What additional datasets (for example, radar or satellite products) could you combine with the WARM network to reduce the limitations described in this chapter?