14  Stream Proximity Analysis

Tip: For Newcomers

You will learn:

  • How streams and groundwater interact (gaining vs. losing streams)
  • Why monitoring wells need to be near stream gauges for interaction studies
  • What happens when monitoring networks are designed independently
  • A real example of how data gaps block important analysis

This chapter tells an important cautionary tale: we tried to study stream-groundwater interactions and discovered we couldn’t—not because of complex science, but because the wells and stream gauges are too far apart. Sometimes the most important finding is what we can’t analyze.

14.1 What You Will Learn in This Chapter

By the end of this chapter, you will be able to:

  • Describe how the current placement of wells and stream gauges limits direct stream–groundwater interaction studies.
  • Summarize the spatial proximity between gauges and wells, and why distances of 3–25 km are too large for correlation analysis.
  • Explain why this is a data-availability/network-design problem rather than a methodological limitation.
  • Identify what kind of purpose-built monitoring network would be needed to answer stream–aquifer interaction questions in this region.

14.2 Stream-Groundwater Interaction Study

This chapter merges well-station proximity analysis with stream-groundwater interaction assessment to evaluate what the existing monitoring networks can and cannot resolve spatially.

14.3 Executive Summary

Important: 🔍 Critical Discovery (Spatial)

We attempted to analyze stream-groundwater interactions by correlating stream stage with nearby groundwater levels. We discovered we cannot perform this analysis - not because of methodology limitations, but because of a fundamental gap in monitoring network design.

The Finding:

  • 41 stream-well pairs exist within 5 km (ideal for interaction studies)
  • NONE of these wells have measurement data (0/41)
  • Wells WITH data are 3-25 km away from stream gauges
  • Monitoring networks were designed independently for different purposes

Why This Matters: This spatial gap explains why we cannot directly test whether streams are gaining (GW→stream) or losing (stream→GW) - a fundamental question for water resource management.


14.4 The Research Question

14.4.1 What We Wanted to Test

Hypothesis: Two-aquifer system

  • Wells monitor deep confined aquifer (Unit D, rising)
  • Streams interact with shallow unconfined aquifer (declining)

Test: If hypothesis is correct, stream stage should show weak or no correlation with nearby deep well levels.

14.4.2 The Ideal Analysis

Method: Cross-correlation analysis

  1. Find wells within 1-5 km of stream gauges
  2. Correlate stream stage with groundwater level over time
  3. Calculate lag time (how long for stream changes to affect GW, or vice versa)
  4. Classify reaches:
       • Gaining: GW level > stream stage → groundwater feeds stream
       • Losing: Stream stage > GW level → stream recharges aquifer
       • Neutral/Disconnected: No correlation
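The correlation and lag steps of this method can be sketched as follows. This is a minimal illustration, not the study's implementation: the function name `lagged_correlation`, the daily resampling, and the synthetic series are all assumptions made for the example, since no co-located data exist to run it on.

```python
import numpy as np
import pandas as pd


def lagged_correlation(stage, gw_level, max_lag_days=90):
    """Correlate stage[t] with gw_level[t + lag] for each lag in days.

    A positive best lag means groundwater responds after the stream.
    """
    # Align both series on a regular daily index so a row shift equals a day shift
    df = pd.concat({"stage": stage, "gw": gw_level}, axis=1).sort_index().asfreq("D")
    lags = range(-max_lag_days, max_lag_days + 1)
    corr = {lag: df["stage"].corr(df["gw"].shift(-lag)) for lag in lags}
    return pd.Series(corr, name="correlation")


# Synthetic demonstration data (not measurements from this study)
dates = pd.date_range("2020-01-01", periods=365, freq="D")
rng = np.random.default_rng(0)
stage = pd.Series(
    np.sin(np.arange(365) / 20) + 0.05 * rng.standard_normal(365), index=dates
)
gw_level = stage.shift(10)  # groundwater mimics the stream 10 days later

result = lagged_correlation(stage, gw_level, max_lag_days=30)
best_lag = int(result.idxmax())
print(f"Best lag: {best_lag} days, r = {result.max():.2f}")
```

With real data, the peak correlation and its lag would be read from `result` in the same way. Under the two-aquifer hypothesis the peak would be expected to stay low at every lag.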

Expected Result: If two-aquifer system exists, we’d see weak correlations because streams and wells monitor different layers.


14.5 Interactive Visualizations

Note: 📘 How to Read This Map

What It Shows: Red markers show USGS stream gauge locations—these are the points where we have continuous stream flow and stage measurements. The map reveals which streams are monitored vs. unmonitored.

What to Look For:

  • Clustered red markers: Multiple gauges on the same stream system (e.g., Boneyard Creek)
  • Isolated red markers: Single gauge representing a larger watershed
  • Gaps between markers: Unmonitored stream reaches or tributaries

How to Interpret:

| Map Pattern | What It Means | Data Availability | Study Limitations |
| --- | --- | --- | --- |
| Red markers near urban areas | Flood monitoring priority (Champaign-Urbana) | High-quality, high-frequency data | Urban influence may bias natural stream-aquifer interaction |
| Red markers at watershed outlets | Integrates entire upstream drainage | Total discharge captured | Cannot distinguish local vs. regional contributions |
| Large areas with no red markers | Small streams ungauged | No direct stream data | Must use regional discharge estimates or install new gauges |
| Red marker + well location overlap | Potential for stream-GW correlation | Ideal for interaction studies | Unfortunately rare in this network (see analysis below) |

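import plotly.express as px

# get_data_path and USGSStreamLoader are project helpers, assumed to be
# imported in the chapter's setup cell
usgs_root = get_data_path("usgs_stream")
usgs_loader = USGSStreamLoader(usgs_root)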
sites = usgs_loader.sites[
    ["site_no", "station_nm", "dec_lat_va", "dec_long_va"]
].drop_duplicates()

fig = px.scatter_mapbox(
    sites,
    lat="dec_lat_va",
    lon="dec_long_va",
    hover_name="station_nm",
    hover_data={"site_no": True, "dec_lat_va": ":.4f", "dec_long_va": ":.4f"},
    color_discrete_sequence=["red"],
    zoom=9,
    height=500,
)

fig.update_layout(
    mapbox_style="open-street-map",
    title="USGS Stream Gauge Network",
    showlegend=False,
)

fig.show()

print("\n**Stream Gauge Summary:**")
print(f"- Total USGS stream gauges: {len(sites)}")
print("- Monitoring major tributaries and urban streams")

**Stream Gauge Summary:**
- Total USGS stream gauges: 9
- Monitoring major tributaries and urban streams
Figure 14.1: USGS stream gauge locations across the study area. Gauges monitor major streams including Boneyard Creek, Salt Fork, and regional tributaries.

Note: 📘 How to Read Time Series Discharge Patterns

What It Shows: This line graph displays stream discharge (flow rate) over time measured in cubic feet per second (cfs). Each colored line represents a different stream gauge station.

What to Look For:

  • Sharp spikes: Rapid discharge increases from storm events (flashy response)
  • Baseline flow: Low-flow periods between storms represent baseflow (groundwater contribution)
  • Seasonal patterns: Higher discharge in spring (snowmelt + rain), lower in summer/fall
  • Different colored lines: Compare how different streams respond to the same precipitation events

How to Interpret:

| Discharge Pattern | What It Means | Stream-Aquifer Connection | Implication for GW Studies |
| --- | --- | --- | --- |
| Sharp spikes with rapid return to baseline | Flashy urban stream (e.g., Boneyard Creek) | Minimal groundwater buffering | Stream responds to surface runoff, not deep aquifer |
| Gradual rise and slow recession | Groundwater-dominated stream | Strong aquifer-stream connection | Good candidate for GW-stream interaction studies |
| High baseline (>20 cfs) between storms | Substantial baseflow contribution | Aquifer discharging to stream (gaining) | Deep aquifer may sustain stream during droughts |
| Very low baseline (<5 cfs) | Limited baseflow | Stream may recharge aquifer (losing) | Aquifer disconnected or stream perched above water table |
| Baseline declining over time | Decreasing groundwater contribution | Potential aquifer depletion | Concerning trend; investigate pumping or recharge changes |

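import pandas as pd
import plotly.express as px

# get_data_path and USGSStreamLoader are project helpers, assumed to be
# imported in the chapter's setup cell
usgs_root = get_data_path("usgs_stream")
usgs_loader = USGSStreamLoader(usgs_root)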

# Use up to three example sites for visualization
site_nos = usgs_loader.get_site_list()[:3]
discharge_frames = []

for site_no in site_nos:
    df = usgs_loader.load_daily_discharge(site_no)
    if not df.empty:
        df = df.copy()
        df["gauge"] = site_no
        discharge_frames.append(df)

all_discharge = pd.concat(discharge_frames, ignore_index=True)
all_discharge = all_discharge[all_discharge["date"] >= "2020-01-01"]

fig = px.line(
    all_discharge,
    x="date",
    y="discharge_cfs",
    color="gauge",
    labels={
        "date": "Date",
        "discharge_cfs": "Discharge (cubic feet/second)",
        "gauge": "Stream Gauge",
    },
    title="Stream Discharge Patterns (2020–Present)",
)

fig.update_layout(height=400, hovermode="x unified")
fig.show()

stats = all_discharge.groupby("gauge")["discharge_cfs"].agg(
    ["mean", "std", "min", "max"]
)
print("\n**Discharge Statistics (2020–Present):**")
for gauge in stats.index:
    print(f"\n{gauge}:")
    print(f"  - Mean: {stats.loc[gauge, 'mean']:.1f} cfs")
    print(f"  - Std Dev: {stats.loc[gauge, 'std']:.1f} cfs")
    print(
        f"  - Range: {stats.loc[gauge, 'min']:.1f} "
        f"- {stats.loc[gauge, 'max']:.1f} cfs"
    )
Figure 14.2: Daily discharge patterns from USGS stream gauges showing temporal variability in streamflow. High variability indicates flashy response to precipitation events.

**Discharge Statistics (2020–Present):**

03336890:
  - Mean: 31.5 cfs
  - Std Dev: 72.0 cfs
  - Range: 1.1 - 1190.0 cfs

03336900:
  - Mean: 100.8 cfs
  - Std Dev: 188.9 cfs
  - Range: 0.8 - 2420.0 cfs
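One simple way to put a number on the "high variability" noted in the caption is the coefficient of variation (standard deviation divided by mean). The chapter does not compute this metric, so treat it as an illustrative assumption. The sketch below uses only the mean and standard deviation printed above.

```python
# Mean and standard deviation (cfs) as printed in the discharge statistics above
reported = {
    "03336890": {"mean": 31.5, "std": 72.0},
    "03336900": {"mean": 100.8, "std": 188.9},
}

# Coefficient of variation: values above 1 mean the spread exceeds the mean flow
cv = {gauge: s["std"] / s["mean"] for gauge, s in reported.items()}

for gauge, value in cv.items():
    print(f"{gauge}: CV = {value:.2f}")
```

Both gauges have a standard deviation roughly twice the mean, which is consistent with a flashy response to storm events.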

14.6 Data Used

14.6.1 USGS Stream Gauge Network

9 USGS Stream Gauges with gage height (stage) data:

  • Boneyard Creek: 4 stations (urban Champaign-Urbana)
  • Salt Fork, Spoon River, Sangamon River: Regional coverage
  • Period of Record: 1979-2025 (46 years)

Spatial Coverage: Excellent - streams distributed across study area

14.6.2 Groundwater Monitoring Network

  • 356 wells with valid GPS coordinates
  • Only 18 wells (5%) have water level measurements
  • Measurement period: 2009-2023 (overlaps with stream data)

Critical Observation: The vast majority of wells are observation points without active sensors.


14.7 Method

Note: Understanding Buffer Analysis for Stream-Groundwater Proximity

What Is It?

Buffer analysis is a GIS technique that creates zones of specified distance around features (points, lines, polygons). In hydrogeology, buffers identify which monitoring points are close enough to detect interactions between surface water and groundwater.

Brief History:

Buffer analysis emerged with early GIS software in the 1960s and 1970s (Roger Tomlinson’s Canada Geographic Information System, followed by Jack Dangermond’s work at ESRI). It became common in environmental studies in the 1980s, when EPA began using buffers for wellhead protection zones. Today, it’s a standard tool in GIS platforms.

Why Does It Matter for Stream-Aquifer Studies?

The distance between a stream and a monitoring well determines what we can measure:

  • <100m: Direct interaction (capture daily stream stage fluctuations)
  • 100m-1km: Local interaction (detect seasonal exchange patterns)
  • 1-5km: Regional influence (see long-term trends, large events)
  • >5km: No detectable interaction (signal too weak, too delayed)

For this study: We need wells within 1-5 km of stream gauges to test whether streams interact with the deep confined aquifer (Unit D) or only the shallow unconfined system.

How Does It Work?

The buffer analysis process:

  1. Define threshold distance: Based on hydrogeological theory (typically correlation length)
  2. Calculate pairwise distances: Between all stream gauges and all wells
  3. Filter by threshold: Keep only pairs within acceptable distance
  4. Check data availability: Verify wells have temporal measurements
  5. Assess adequacy: Do we have enough pairs for statistical analysis?

What Will You See?

Proximity analysis results show spatial coverage:

| Distance Category | Expected Interaction Strength | Analysis Capability | Data Quality Needed |
| --- | --- | --- | --- |
| 0-500m | Very strong (hours-days lag) | Direct hydraulic connection | Sub-daily measurements |
| 500m-2km | Strong (days-weeks lag) | Seasonal exchange patterns | Daily measurements |
| 2-5km | Moderate (weeks-months lag) | Long-term trends | Weekly measurements |
| 5-10km | Weak (months-years lag) | Regional background | Monthly measurements |
| >10km | None (signal lost in noise) | No correlation expected | N/A |

How to Interpret Results:

  • Many pairs <1km with data: Ideal - can quantify stream-aquifer exchange rates
  • Pairs 1-5km with data: Good - can detect interaction, limited process detail
  • Pairs 3-10km with data: Marginal - correlation analysis possible but weak
  • Pairs >10km or no data: Inadequate - cannot study interaction
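The distance categories in the table above can be applied to a list of pair distances with `pandas.cut`. This is a sketch: the bin edges come from the table, while the short labels, the function name `interaction_class`, and the choice to treat each upper edge as inclusive are assumptions for illustration. The example distances are values quoted elsewhere in this chapter.

```python
import pandas as pd

# Bin edges in km, taken from the distance-category table
BINS_KM = [0, 0.5, 2, 5, 10, float("inf")]
LABELS = ["very strong", "strong", "moderate", "weak", "none"]


def interaction_class(distance_km):
    """Label each distance with its expected interaction strength."""
    return pd.cut(distance_km, bins=BINS_KM, labels=LABELS, include_lowest=True)


# Closest pair (0.56 km) and three of the wells-with-data distances in this chapter
example = pd.Series([0.56, 3.64, 5.82, 25.05], name="distance_km")
classes = interaction_class(example)
print(pd.DataFrame({"distance_km": example, "expected_interaction": classes}))
```

Classifying by distance only says what a pair could resolve in principle. Whether the well actually has measurements is a separate check.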

14.7.1 Step 1: Calculate Spatial Proximity

Objective: Find stream-well pairs close enough for interaction studies

Threshold: 5 km (literature suggests significant interaction within 1-5 km)

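import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist

# Calculate distances (flat approximation: 1 degree ≈ 111 km; this overstates
# east-west separation at ~40°N, so treat distances as approximate)
station_coords = stations[['dec_lat_va', 'dec_long_va']].values
well_coords = wells_all[['lat', 'lon']].values
distances_km = cdist(station_coords, well_coords, metric='euclidean') * 111

# Create pairs within 5 km (rows index stations, columns index wells)
station_idx, well_idx = np.where(distances_km <= 5.0)
pairs_df = pd.DataFrame({
    'site_no': stations['site_no'].values[station_idx],
    'well_id': wells_all['well_id'].values[well_idx],  # assumed ID column name
    'distance_km': distances_km[station_idx, well_idx],
})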

Result: 41 stream-well pairs within 5 km - perfect for interaction studies!

Closest Pairs:

  • Station 03337000 ↔︎ Well 381635: 0.56 km
  • Station 03337000 ↔︎ Well 381636: 0.57 km
  • Station 03337100 ↔︎ Well 381635: 0.67 km

Interpretation: Excellent spatial proximity - if data existed, we could analyze interaction at <1 km scale.
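Multiplying degree differences by 111 treats a degree of longitude as being as long as a degree of latitude, which is not true at this latitude. A great-circle (haversine) distance avoids that. The sketch below is an alternative, not the calculation used for the results above. The function name, the Earth radius constant, and the two demonstration coordinates are assumptions for illustration.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_matrix(lat1, lon1, lat2, lon2):
    """Great-circle distances (km): rows are points in set 1, columns in set 2."""
    lat1, lon1 = np.radians(lat1)[:, None], np.radians(lon1)[:, None]
    lat2, lon2 = np.radians(lat2)[None, :], np.radians(lon2)[None, :]
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# Hypothetical gauge and well at the same latitude, 0.05 degrees of longitude apart
gauge_lat, gauge_lon = [40.10], [-88.20]
well_lat, well_lon = [40.10], [-88.25]

great_circle_km = haversine_matrix(gauge_lat, gauge_lon, well_lat, well_lon)[0, 0]
degree_based_km = 0.05 * 111  # what the degree-times-111 approximation gives

print(f"Haversine: {great_circle_km:.2f} km, degrees x 111: {degree_based_km:.2f} km")
```

In this example the degree-based figure falls outside a 5 km threshold while the great-circle distance falls inside it, so the choice of distance formula can change which pairs are counted. A threshold can be applied to the returned matrix with `np.where(matrix <= 5.0)`.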


14.7.2 Step 2: Check Data Availability

Objective: Verify temporal overlap between stream stage and well water levels

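# Check which wells have data (more than 365 non-null water level records)
# P_Number is aliased to well_id so it matches the column used in pairs_df
wells_with_measurements = pd.read_sql_query("""
    SELECT P_Number AS well_id, COUNT(*) AS num_measurements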
    FROM OB_WELL_MEASUREMENTS_CHAMPAIGN_COUNTY
    WHERE Water_Surface_Elevation IS NOT NULL
    GROUP BY P_Number
    HAVING COUNT(*) > 365
""", conn)

# Check if any nearby wells have data
nearby_well_ids = pairs_df['well_id'].unique()
nearby_with_data = wells_with_measurements[
    wells_with_measurements['well_id'].isin(nearby_well_ids)
]
Warning: ⚠️ Critical Finding: ZERO

Result: Of the wells in the 41 stream-well pairs within 5 km of stream gauges, ZERO have measurement data.

Implication: We cannot perform stream-groundwater correlation analysis with the available monitoring network.
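The availability check can also be written as a left join that keeps every nearby pair and flags whether its well has data. This is a sketch on small made-up tables: the frames below are hypothetical stand-ins that only mirror the column names used above (`well_id`, `num_measurements`), not the study's data.

```python
import pandas as pd

# Hypothetical stand-ins for the real tables
pairs_df = pd.DataFrame({
    "site_no": ["A", "A", "B"],
    "well_id": [1, 2, 1],
    "distance_km": [0.6, 0.7, 0.9],
})
wells_with_measurements = pd.DataFrame({
    "well_id": [7, 8],
    "num_measurements": [4000, 900],
})

# Left join keeps all nearby pairs; the indicator shows which ones found a match
usable = pairs_df.merge(
    wells_with_measurements, on="well_id", how="left", indicator=True
)
usable["has_data"] = usable["_merge"] == "both"

n_pairs = len(usable)
n_with_data = int(usable["has_data"].sum())
print(f"{n_with_data}/{n_pairs} nearby pairs have well measurements")
```

Before accepting a zero, it is worth confirming that both ID columns share one type. `isin` compares values exactly, so integer IDs never match string IDs, and casting both sides to the same type rules out a false zero.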


14.8 Findings

14.8.1 Discovery: The Spatial Monitoring Gap

Wells WITH data: Distance to Nearest Stream:

| Well ID | Nearest Stream | Distance (km) |
| --- | --- | --- |
| 268557 | Copper Slough | 3.64 |
| 444889 | Boneyard Creek | 5.82 |
| 444890 | Boneyard Creek | 6.15 |
| 444863 | Bondville | 7.34 |
| 381684 | Salt Fork | 8.21 |
| 434983 | Champaign | 9.87 |
| | | 10-25 km |

Statistics:

  • Mean distance: 12.8 km
  • Minimum distance: 3.64 km
  • Maximum distance: 25.05 km

Critical Insight: At these distances, stream-groundwater interaction signal is too weak to detect. Literature suggests <1-2 km for direct interaction studies.


14.8.2 Why This Gap Exists

Independent Network Design:

Stream Gauge Network (USGS):

  • Purpose: Flood forecasting, water supply, ecological flows
  • Placement: At bridges, downstream of watersheds, urban areas
  • Criterion: Capture total streamflow from drainage area

Groundwater Monitoring Network (ISWS):

  • Purpose: Regional aquifer water levels, drought monitoring, long-term trends
  • Placement: Representative of major aquifer units, rural areas, away from pumping
  • Criterion: Minimize local disturbances (wells, pumping, urbanization)

Result: Networks optimized for different objectives → spatial non-overlap

Note: 💻 For Computer Scientists

This is a data availability problem, not a methodology problem. Our algorithms are correct, but we lack the input data.

Analogy: Trying to train a supervised ML model when training labels exist for completely different data points than features. The fundamental prerequisite (co-located measurements) is missing.

Lesson: Always check data availability BEFORE designing complex analyses. Spatial joins don’t guarantee temporal overlap or data quality.

Tip: 🌍 For Hydrologists

This monitoring gap is common in water resources. Stream-aquifer interaction studies typically require:

  1. Purpose-built transects: Wells installed specifically near streams at multiple distances (10m, 100m, 500m)
  2. High-frequency measurements: Sub-daily to capture storm responses
  3. Nested piezometers: Multiple depths to identify which aquifer layer interacts

Example: USGS Groundwater/Surface-Water Interaction studies install custom monitoring networks - they don’t rely on existing regional well networks.

Our networks were never designed to answer this question.


14.9 Implications

14.9.1 What We Cannot Test

Due to the spatial monitoring gap, we CANNOT:

  1. ✗ Directly validate the two-aquifer hypothesis
  2. ✗ Identify gaining vs losing stream reaches
  3. ✗ Quantify baseflow contribution from specific aquifer layers
  4. ✗ Measure lag times between stream stage and groundwater response
  5. ✗ Test whether Boneyard Creek interacts with shallow vs deep aquifer

14.9.2 What We CAN Conclude

From the monitoring gap itself:

  1. Regional monitoring ≠ interaction studies: Existing networks serve different purposes
  2. Spatial scale mismatch: 3-25 km distances preclude direct correlation
  3. Investment needed: Purpose-built transects required for GW-SW studies
  4. Two-aquifer hypothesis remains plausible: Lack of correlation data doesn’t disprove it

Warning: ❌ Approach That Failed: Direct Stream-Groundwater Correlation Analysis

What we tried: Cross-correlation analysis between stream stage (from USGS gauges) and nearby groundwater levels (from monitoring wells) to quantify stream-aquifer interaction strength and lag times.

Why it failed: Spatial mismatch between monitoring networks. The wells in the 41 stream-well pairs within ideal proximity (<5 km) of stream gauges have zero measurements. Wells WITH data are 3-25 km away, far beyond the correlation length for direct hydraulic interaction.

Lesson learned: Regional monitoring networks and purpose-built research networks serve fundamentally different objectives. Stream gauges optimize for flood forecasting at watershed outlets; groundwater wells optimize for aquifer trends away from local disturbances. Neither was designed for stream-aquifer interaction studies.

Better approach: Install purpose-built stream-aquifer transects—nested wells at 10m, 50m, 100m, and 500m from stream channels, with multiple depths (5m, 15m, 30m) to identify which aquifer layer interacts with surface water. Cost: ~$50K per transect, but provides data that regional networks cannot.

Key insight: Sometimes the most important scientific finding is what you cannot analyze. This negative result guides future monitoring investments more effectively than speculative analysis with inadequate data.


14.10 Alternative Approaches

Since we cannot correlate stream and well data directly, we consider three alternative methods:

14.10.1 Approach 1: Baseflow Separation

Method: Separate streamflow into baseflow (groundwater) and quickflow (surface runoff)

Result:

  • BFI = 50.9-61.7% (streams are groundwater-dominated)
  • Baseflow declining (-0.20 cfs/yr) despite rising GW

Limitation: Doesn’t identify which aquifer layer contributes baseflow
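The chapter does not say which separation method produced these BFI values. As an illustration only, the sketch below uses a single forward pass of the Lyne-Hollick recursive digital filter, one common choice. The filter parameter `alpha=0.925` is a conventional default rather than a value from this study, and the hydrograph is synthetic.

```python
import numpy as np


def lyne_hollick_baseflow(q, alpha=0.925):
    """Single-pass Lyne-Hollick filter: returns the baseflow series for flows q."""
    q = np.asarray(q, dtype=float)
    quick = np.zeros_like(q)
    for t in range(1, len(q)):
        f = alpha * quick[t - 1] + 0.5 * (1 + alpha) * (q[t] - q[t - 1])
        # Quickflow cannot be negative or exceed total flow
        quick[t] = min(max(f, 0.0), q[t])
    return q - quick


def baseflow_index(q, alpha=0.925):
    """BFI: total baseflow volume divided by total streamflow volume."""
    q = np.asarray(q, dtype=float)
    return lyne_hollick_baseflow(q, alpha).sum() / q.sum()


# Synthetic daily hydrograph (cfs): steady flow with one storm peak
flow = [10.0] * 5 + [100.0, 60.0, 30.0, 15.0] + [10.0] * 5
baseflow = lyne_hollick_baseflow(flow)
bfi = baseflow_index(flow)
print(f"BFI = {bfi:.1%}")
```

Published applications often run the filter in several forward and backward passes, which lowers the BFI compared with a single pass. Whichever variant is used, the result describes total groundwater contribution, not which aquifer layer supplies it.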

14.10.2 Approach 2: Water Balance Method

Method: Close the water balance at watershed scale:

  • Precipitation − ET − Discharge − ΔStorage = 0
  • Unaccounted terms reveal unmeasured flows
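A sketch of the residual calculation follows, with all terms expressed as depth over the watershed (mm per year). The function names, the watershed area, and the annual totals are hypothetical values chosen for illustration. Only the form of the balance comes from the text above.

```python
CFS_TO_M3_PER_S = 0.028316846592  # one cubic foot per second in m^3/s
SECONDS_PER_YEAR = 365.25 * 86400


def cfs_to_mm_per_year(mean_cfs, area_km2):
    """Convert mean discharge (cfs) to runoff depth (mm/yr) over a watershed."""
    volume_m3 = mean_cfs * CFS_TO_M3_PER_S * SECONDS_PER_YEAR
    return volume_m3 / (area_km2 * 1e6) * 1000.0


def water_balance_residual(precip_mm, et_mm, discharge_mm, delta_storage_mm):
    """Residual of P - ET - Q - dS; nonzero values point to unmeasured flows."""
    return precip_mm - et_mm - discharge_mm - delta_storage_mm


# Hypothetical annual totals (mm/yr), not values from this study
residual = water_balance_residual(
    precip_mm=1000.0, et_mm=650.0, discharge_mm=300.0, delta_storage_mm=10.0
)
runoff_depth = cfs_to_mm_per_year(mean_cfs=100.0, area_km2=1000.0)

print(f"Residual: {residual:.0f} mm/yr")
print(f"100 cfs over 1000 km2 = {runoff_depth:.1f} mm/yr")
```

Because each term carries its own measurement error, a small residual may be noise rather than evidence of an unmeasured flow.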

14.10.3 Approach 3: Install Purpose-Built Network

Method: New monitoring wells near streams

  • Multiple distances: 10m, 50m, 100m, 500m from stream
  • Multiple depths: Shallow (5m), intermediate (15m), deep (>30m)
  • High-frequency logging: 15-minute intervals

Cost: ~$50K per transect (drilling + sensors)

Timeline: 2-3 years to capture seasonal/hydrologic variability

Warning: ⚠️ Recommendation

If stream-groundwater interaction is a priority research question, targeted monitoring investments are required. Existing regional networks cannot answer this question.

Priority locations:

  1. Boneyard Creek near Well 268557 (currently 11.5 km apart)
  2. Copper Slough near Well 268557 (currently 3.6 km apart, the closest pair)
  3. Sangamon River near Well 505586 (currently 5.8 km apart)


14.11 Key Takeaways

14.11.1 1. Monitoring Network Design

Lesson: Regional networks and targeted studies serve different purposes. Don’t assume existing data can answer all questions.

14.11.2 2. Spatial Scale is Critical

Lesson: 3-25 km separation is too large for direct stream-GW interaction. Need <1-2 km for process studies.

14.11.3 3. Negative Results Have Value

Lesson: Discovering data gaps is a legitimate scientific finding. It explains limitations and guides future investments.

14.11.4 4. Multi-Line Evidence Provides Constraints

Lesson: Even without direct correlation, we can infer a two-aquifer system from:

  • Confined aquifer characteristics (temporal analysis)
  • Baseflow-groundwater inconsistency (baseflow separation)
  • Lack of precipitation response (lag analysis)

14.11.5 5. Purpose-Built Monitoring Required

Lesson: To answer stream-aquifer interaction questions, custom transects are needed - cannot rely on existing infrastructure.


Analysis Status: ✅ Complete - Major Finding Documented

Key Finding: Spatial monitoring gap prevents direct stream-GW correlation analysis


14.12 Summary

Stream proximity analysis reveals a critical spatial monitoring gap:

3-25 km separation - Wells too far from streams for direct correlation

No stream-GW transects - Cannot validate interaction processes

Regional vs. targeted mismatch - Existing data serves different purpose

Negative result has value - Discovering gaps guides future investments

Multi-line evidence - Can still infer two-aquifer system indirectly

Key Insight: To answer stream-aquifer interaction questions, custom transects are required. Cannot rely on existing regional infrastructure.


14.14 Reflection Questions

  • Given the distances between wells with data and the nearest stream gauges, which areas would you prioritize for installing purpose-built stream–groundwater transects, and why?
  • How would you explain to a non-technical stakeholder that the limitation here is network design and data availability, not a lack of analytical tools?
  • If you did install a small number of new co-located wells and gauges, what measurements and frequencies would you choose to best capture stream–aquifer interactions?