14 Stream Proximity Analysis

For Newcomers

You will learn:

How streams and groundwater interact (gaining vs. losing streams)
Why monitoring wells need to be near stream gauges for interaction studies
What happens when monitoring networks are designed independently
A real example of how data gaps block important analysis

This chapter tells an important cautionary tale: we tried to study stream-groundwater interactions and discovered we couldn’t—not because of complex science, but because the wells and stream gauges are too far apart. Sometimes the most important finding is what we can’t analyze.

14.1 What You Will Learn in This Chapter

By the end of this chapter, you will be able to:

Describe how the current placement of wells and stream gauges limits direct stream–groundwater interaction studies.
Summarize the spatial proximity between gauges and wells, and why distances of 3–25 km are too large for correlation analysis.
Explain why this is a data-availability/network-design problem rather than a methodological limitation.
Identify what kind of purpose-built monitoring network would be needed to answer stream–aquifer interaction questions in this region.

14.2 Stream-Groundwater Interaction Study

This chapter merges well-station proximity analysis with stream-groundwater interaction assessment to evaluate what the existing monitoring networks can and cannot support.

14.3 Executive Summary

🔍 Critical Discovery: Spatial Monitoring Gap

We attempted to analyze stream-groundwater interactions by correlating stream stage with nearby groundwater levels. We discovered we cannot perform this analysis - not because of methodology limitations, but because of a fundamental gap in monitoring network design.

The Finding:

- 41 stream-well pairs exist within 5 km (ideal for interaction studies)
- NONE of these wells have measurement data (0/41)
- Wells WITH data are 3-25 km away from stream gauges
- Monitoring networks were designed independently for different purposes

Why This Matters: This spatial gap explains why we cannot directly test whether streams are gaining (GW→stream) or losing (stream→GW) - a fundamental question for water resource management.

14.4 The Research Question

14.4.1 What We Wanted to Test

Hypothesis: Two-aquifer system

- Wells monitor deep confined aquifer (Unit D, rising)
- Streams interact with shallow unconfined aquifer (declining)

Test: If the hypothesis is correct, stream stage should show weak or no correlation with nearby deep well levels.

14.4.2 The Ideal Analysis

Method: Cross-correlation analysis

1. Find wells within 1-5 km of stream gauges
2. Correlate stream stage with groundwater level over time
3. Calculate lag time (how long stream changes take to affect GW, or vice versa)
4. Classify reaches:
   - Gaining: GW level > stream stage → groundwater feeds stream
   - Losing: Stream stage > GW level → stream recharges aquifer
   - Neutral/Disconnected: No correlation

Expected Result: If a two-aquifer system exists, we'd see weak correlations because streams and wells monitor different layers.
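The correlation and lag steps above can be sketched as follows. This is a minimal illustration on synthetic data, not the analysis pipeline used in this book: the function name `lagged_correlation`, the 60-day lag window, and the synthetic series are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def lagged_correlation(stage: pd.Series, gw: pd.Series, max_lag_days: int = 60) -> pd.Series:
    """Pearson correlation between stream stage and groundwater level at each lag.

    A positive lag means the groundwater series responds after the stream.
    """
    lags = range(-max_lag_days, max_lag_days + 1)
    # gw.shift(-lag) lines up groundwater at time t + lag with stage at time t
    return pd.Series({lag: stage.corr(gw.shift(-lag)) for lag in lags})


# Synthetic demo (assumed data): groundwater copies the stage signal 10 days later
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=400, freq="D")
stage = pd.Series(rng.normal(size=400), index=dates).rolling(5, min_periods=1).mean()
gw = stage.shift(10) + rng.normal(scale=0.05, size=400)

corr = lagged_correlation(stage, gw)
best_lag = int(corr.idxmax())
print(f"Strongest correlation at lag {best_lag} days (r = {corr.max():.2f})")
```

With real data, a flat correlation curve with no clear peak would be the pattern the two-aquifer hypothesis predicts for deep wells.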

14.5 Interactive Visualizations

📘 How to Read This Map

What It Shows: Red markers show USGS stream gauge locations—these are the points where we have continuous stream flow and stage measurements. The map reveals which streams are monitored vs. unmonitored.

What to Look For:

- Clustered red markers: Multiple gauges on the same stream system (e.g., Boneyard Creek)
- Isolated red markers: Single gauge representing a larger watershed
- Gaps between markers: Unmonitored stream reaches or tributaries

How to Interpret:

Map Pattern	What It Means	Data Availability	Study Limitations
Red markers near urban areas	Flood monitoring priority (Champaign-Urbana)	High-quality, high-frequency data	Urban influence may bias natural stream-aquifer interaction
Red markers at watershed outlets	Integrates entire upstream drainage	Total discharge captured	Cannot distinguish local vs. regional contributions
Large areas with no red markers	Small streams ungauged	No direct stream data	Must use regional discharge estimates or install new gauges
Red marker + well location overlap	Potential for stream-GW correlation	Ideal for interaction studies	Unfortunately rare in this network (see analysis below)


usgs_root = get_data_path("usgs_stream")
usgs_loader = USGSStreamLoader(usgs_root)
sites = usgs_loader.sites[
    ["site_no", "station_nm", "dec_lat_va", "dec_long_va"]
].drop_duplicates()

fig = px.scatter_mapbox(
    sites,
    lat="dec_lat_va",
    lon="dec_long_va",
    hover_name="station_nm",
    hover_data={"site_no": True, "dec_lat_va": ":.4f", "dec_long_va": ":.4f"},
    color_discrete_sequence=["red"],
    zoom=9,
    height=500,
)

fig.update_layout(
    mapbox_style="open-street-map",
    title="USGS Stream Gauge Network",
    showlegend=False,
)

fig.show()

print("\n**Stream Gauge Summary:**")
print(f"- Total USGS stream gauges: {len(sites)}")
print("- Monitoring major tributaries and urban streams")


**Stream Gauge Summary:**
- Total USGS stream gauges: 9
- Monitoring major tributaries and urban streams

Figure 14.1: USGS stream gauge locations across the study area. Gauges monitor major streams including Boneyard Creek, Salt Fork, and regional tributaries.

📘 How to Read Time Series Discharge Patterns

What It Shows: This line graph displays stream discharge (flow rate) over time measured in cubic feet per second (cfs). Each colored line represents a different stream gauge station.

What to Look For:

- Sharp spikes: Rapid discharge increases from storm events (flashy response)
- Baseline flow: Low-flow periods between storms represent baseflow (groundwater contribution)
- Seasonal patterns: Higher discharge in spring (snowmelt + rain), lower in summer/fall
- Different colored lines: Compare how different streams respond to the same precipitation events

How to Interpret:

Discharge Pattern	What It Means	Stream-Aquifer Connection	Implication for GW Studies
Sharp spikes with rapid return to baseline	Flashy urban stream (e.g., Boneyard Creek)	Minimal groundwater buffering	Stream responds to surface runoff, not deep aquifer
Gradual rise and slow recession	Groundwater-dominated stream	Strong aquifer-stream connection	Good candidate for GW-stream interaction studies
High baseline (>20 cfs) between storms	Substantial baseflow contribution	Aquifer discharging to stream (gaining)	Deep aquifer may sustain stream during droughts
Very low baseline (<5 cfs)	Limited baseflow	Stream may recharge aquifer (losing)	Aquifer disconnected or stream perched above water table
Baseline declining over time	Decreasing groundwater contribution	Potential aquifer depletion	Concerning trend—investigate pumping or recharge changes
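The "flashy" versus "groundwater-dominated" distinction in the table can be put on a numeric scale. One commonly used measure is the Richards-Baker flashiness index: the sum of absolute day-to-day discharge changes divided by total discharge. The sketch below is an illustration only; this chapter does not compute the index, and the two example series are made up.

```python
import numpy as np


def rb_flashiness(daily_q) -> float:
    """Richards-Baker flashiness index for a daily discharge series.

    Higher values indicate a flashier stream; 0 means perfectly steady flow.
    """
    q = np.asarray(daily_q, dtype=float)
    # Sum of absolute day-to-day changes, normalized by the flow on those days
    return float(np.abs(np.diff(q)).sum() / q[1:].sum())


# Made-up examples: a steady baseflow-fed stream vs. a spiky urban stream
steady = [20, 21, 20, 19, 20, 21, 20]
flashy = [2, 60, 5, 2, 80, 6, 2]

print(f"steady: {rb_flashiness(steady):.2f}, flashy: {rb_flashiness(flashy):.2f}")
```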


usgs_root = get_data_path("usgs_stream")
usgs_loader = USGSStreamLoader(usgs_root)

# Use up to three example sites for visualization
site_nos = usgs_loader.get_site_list()[:3]
discharge_frames = []

for site_no in site_nos:
    df = usgs_loader.load_daily_discharge(site_no)
    if not df.empty:
        df = df.copy()
        df["gauge"] = site_no
        discharge_frames.append(df)

all_discharge = pd.concat(discharge_frames, ignore_index=True)
all_discharge = all_discharge[all_discharge["date"] >= "2020-01-01"]

fig = px.line(
    all_discharge,
    x="date",
    y="discharge_cfs",
    color="gauge",
    labels={
        "date": "Date",
        "discharge_cfs": "Discharge (cubic feet/second)",
        "gauge": "Stream Gauge",
    },
    title="Stream Discharge Patterns (2020–Present)",
)

fig.update_layout(height=400, hovermode="x unified")
fig.show()

stats = all_discharge.groupby("gauge")["discharge_cfs"].agg(
    ["mean", "std", "min", "max"]
)
print("\n**Discharge Statistics (2020–Present):**")
for gauge in stats.index:
    print(f"\n{gauge}:")
    print(f"  - Mean: {stats.loc[gauge, 'mean']:.1f} cfs")
    print(f"  - Std Dev: {stats.loc[gauge, 'std']:.1f} cfs")
    print(
        f"  - Range: {stats.loc[gauge, 'min']:.1f} "
        f"- {stats.loc[gauge, 'max']:.1f} cfs"
    )

Figure 14.2: Daily discharge patterns from USGS stream gauges showing temporal variability in streamflow. High variability indicates flashy response to precipitation events.


**Discharge Statistics (2020–Present):**

03336890:
  - Mean: 31.5 cfs
  - Std Dev: 72.0 cfs
  - Range: 1.1 - 1190.0 cfs

03336900:
  - Mean: 100.8 cfs
  - Std Dev: 188.9 cfs
  - Range: 0.8 - 2420.0 cfs

14.6 Data Used

14.6.1 USGS Stream Gauge Network

9 USGS Stream Gauges with gage height (stage) data:

- Boneyard Creek: 4 stations (urban Champaign-Urbana)
- Salt Fork, Spoon River, Sangamon River: Regional coverage
- Period of Record: 1979-2025 (46 years)

Spatial Coverage: Excellent - streams distributed across study area

14.6.2 Groundwater Monitoring Network

- 356 wells with valid GPS coordinates
- Only 18 wells (5%) have water level measurements
- Measurement period: 2009-2023 (overlaps with stream data)

Critical Observation: The vast majority of wells are observation points without active sensors.

14.7 Method

Understanding Buffer Analysis for Stream-Groundwater Proximity

What Is It?

Buffer analysis is a GIS technique that creates zones of specified distance around features (points, lines, polygons). In hydrogeology, buffers identify which monitoring points are close enough to detect interactions between surface water and groundwater.

Brief History:

Buffer analysis emerged with early GIS in the 1960s and 1970s (pioneered by Roger Tomlinson's Canada Geographic Information System and by Jack Dangermond at ESRI). It became essential for environmental studies in the 1980s, when the EPA began using buffers for wellhead protection zones. Today, it is a standard tool in GIS platforms.

Why Does It Matter for Stream-Aquifer Studies?

The distance between a stream and a monitoring well determines what we can measure:

<100m: Direct interaction (capture daily stream stage fluctuations)
100m-1km: Local interaction (detect seasonal exchange patterns)
1-5km: Regional influence (see long-term trends, large events)
>5km: No detectable interaction (signal too weak, too delayed)

For this study: We need wells within 1-5 km of stream gauges to test whether streams interact with the deep confined aquifer (Unit D) or only the shallow unconfined system.

How Does It Work?

The buffer analysis process:

1. Define threshold distance: Based on hydrogeological theory (typically correlation length)
2. Calculate pairwise distances: Between all stream gauges and all wells
3. Filter by threshold: Keep only pairs within acceptable distance
4. Check data availability: Verify wells have temporal measurements
5. Assess adequacy: Do we have enough pairs for statistical analysis?

What Will You See?

Proximity analysis results show spatial coverage:

Distance Category	Expected Interaction Strength	Analysis Capability	Data Quality Needed
0-500m	Very strong (hours-days lag)	Direct hydraulic connection	Sub-daily measurements
500m-2km	Strong (days-weeks lag)	Seasonal exchange patterns	Daily measurements
2-5km	Moderate (weeks-months lag)	Long-term trends	Weekly measurements
5-10km	Weak (months-years lag)	Regional background	Monthly measurements
>10km	None (signal lost in noise)	No correlation expected	N/A

How to Interpret Results:

Many pairs <1km with data: Ideal - can quantify stream-aquifer exchange rates
Pairs 1-5km with data: Good - can detect interaction, limited process detail
Pairs 3-10km with data: Marginal - correlation analysis possible but weak
Pairs >10km or no data: Inadequate - cannot study interaction
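The distance categories in the table above can be applied to a list of gauge-to-well distances with a simple binning step. This is a sketch under stated assumptions: the bin edges are copied from the table, while the helper name `distance_category` and the example distances are illustrative.

```python
import numpy as np
import pandas as pd

# Bin edges in km, taken from the distance table above
EDGES_KM = [0, 0.5, 2, 5, 10, np.inf]
LABELS = ["0-500m", "500m-2km", "2-5km", "5-10km", ">10km"]


def distance_category(distances_km) -> pd.Series:
    """Assign each distance (km) to its category from the table."""
    return pd.cut(pd.Series(distances_km), bins=EDGES_KM, labels=LABELS, include_lowest=True)


# Example distances: the closest pair, and the min / mean / max for wells with data
cats = distance_category([0.56, 3.64, 12.8, 25.05])
print(cats.tolist())
```

Counting pairs per category, and then counting only those pairs whose well has data, gives the adequacy check in the last step of the process.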

14.7.1 Step 1: Calculate Spatial Proximity

Objective: Find stream-well pairs close enough for interaction studies

Threshold: 5 km (literature suggests significant interaction within 1-5 km)

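import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist

# Approximate distances: Euclidean distance in degrees x 111 km per degree.
# This treats a degree of longitude like a degree of latitude, so
# east-west separations are overstated at this latitude.
station_coords = stations[['dec_lat_va', 'dec_long_va']].values
well_coords = wells_all[['lat', 'lon']].values
distances_km = cdist(station_coords, well_coords, metric='euclidean') * 111

# Keep the (station, well) pairs within 5 km, not just the distance values
station_idx, well_idx = np.where(distances_km <= 5.0)
pairs_df = pd.DataFrame({
    'site_no': stations['site_no'].values[station_idx],
    'well_id': wells_all['P_Number'].values[well_idx],  # assumed ID column name
    'distance_km': distances_km[station_idx, well_idx],
})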

Result: 41 stream-well pairs within 5 km, which is close enough in principle for interaction studies.

Closest Pairs:

- Station 03337000 ↔ Well 381635: 0.56 km
- Station 03337000 ↔ Well 381636: 0.57 km
- Station 03337100 ↔ Well 381635: 0.67 km

Interpretation: Excellent spatial proximity - if data existed, we could analyze interaction at <1 km scale.
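Multiplying a Euclidean distance in degrees by 111 km is a rough approximation: near 40°N a degree of longitude spans only about 85 km, so east-west separations come out too large. A great-circle (haversine) distance avoids this. The sketch below is an alternative to the calculation above, not the code that produced the 41-pair result; the helper names and the example coordinates are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def pairwise_km(station_latlon, well_latlon):
    """Distance matrix of shape (n_stations, n_wells) from (lat, lon) rows."""
    s = np.asarray(station_latlon, dtype=float)
    w = np.asarray(well_latlon, dtype=float)
    # Broadcasting: stations along rows, wells along columns
    return haversine_km(s[:, None, 0], s[:, None, 1], w[None, :, 0], w[None, :, 1])


# Hypothetical coordinates: one gauge, one well 0.1 deg east, one well 0.1 deg north
d = pairwise_km([(40.11, -88.23)], [(40.11, -88.13), (40.21, -88.23)])
print(np.round(d, 2))  # the flat x111 rule would report about 11.1 km for both wells
```

Switching methods can change which pairs fall inside the 5 km threshold, so the pair count should be recomputed if this approach is adopted.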

14.7.2 Step 2: Check Data Availability

Objective: Verify that nearby wells have enough water-level measurements to overlap in time with the stream stage record

# Count water-level records per well; keep wells with more than 365 records
wells_with_measurements = pd.read_sql_query("""
    SELECT P_Number AS well_id, COUNT(*) AS num_measurements
    FROM OB_WELL_MEASUREMENTS_CHAMPAIGN_COUNTY
    WHERE Water_Surface_Elevation IS NOT NULL
    GROUP BY P_Number
    HAVING COUNT(*) > 365
""", conn)

# Check whether any well from the 5 km pairs appears in that list
nearby_well_ids = pairs_df['well_id'].unique()
nearby_with_data = wells_with_measurements[
    wells_with_measurements['well_id'].isin(nearby_well_ids)
]

⚠️ Critical Finding: Zero Nearby Wells with Data

Result: Of the wells in the 41 stream-well pairs within 5 km, ZERO have measurement data.

Implication: We cannot perform stream-groundwater correlation analysis with the available monitoring network.
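The availability check can also be written as a left join, which keeps every proximity pair and marks whether its well has usable data. This makes a "0 of N" result explicit instead of returning an empty table. The sketch uses small made-up frames; the column names `site_no`, `well_id`, `distance_km`, and `num_measurements` and the helper `flag_pairs_with_data` are assumptions for the example.

```python
import pandas as pd


def flag_pairs_with_data(pairs: pd.DataFrame, counts: pd.DataFrame,
                         min_records: int = 365) -> pd.DataFrame:
    """Left-join proximity pairs to per-well record counts and flag usable wells."""
    merged = pairs.merge(counts, on="well_id", how="left")
    # Wells absent from the measurements table get a count of zero
    merged["num_measurements"] = merged["num_measurements"].fillna(0).astype(int)
    merged["has_data"] = merged["num_measurements"] > min_records
    return merged


# Made-up example: three nearby pairs, and the only measured well is not among them
pairs = pd.DataFrame({
    "site_no": ["A", "A", "B"],
    "well_id": [1, 2, 1],
    "distance_km": [0.6, 0.7, 0.9],
})
counts = pd.DataFrame({"well_id": [3], "num_measurements": [5000]})

flagged = flag_pairs_with_data(pairs, counts)
print(f"{flagged['has_data'].sum()} of {len(flagged)} nearby pairs have data")
```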

14.8 Findings

14.8.1 Discovery: The Spatial Monitoring Gap

Wells WITH data: Distance to Nearest Stream:

Well ID	Nearest Stream	Distance (km)
268557	Copper Slough	3.64
444889	Boneyard Creek	5.82
444890	Boneyard Creek	6.15
444863	Bondville	7.34
381684	Salt Fork	8.21
434983	Champaign	9.87
…	…	10-25 km

Statistics:

- Mean distance: 12.8 km
- Minimum distance: 3.64 km
- Maximum distance: 25.05 km

Critical Insight: At these distances, the stream-groundwater interaction signal is too weak to detect. The literature suggests separations of less than 1-2 km for direct interaction studies.
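A nearest-gauge table and summary statistics of this kind can be derived from a gauge-by-well distance matrix. The following is a minimal sketch with a made-up matrix; the function name `nearest_gauge` and the site and well labels are hypothetical, and the numbers are not the study's data.

```python
import numpy as np
import pandas as pd


def nearest_gauge(distances_km, site_ids, well_ids) -> pd.DataFrame:
    """For each well (column), find the closest gauge (row) and its distance."""
    d = np.asarray(distances_km, dtype=float)  # shape (n_sites, n_wells)
    nearest = d.argmin(axis=0)
    return pd.DataFrame({
        "well_id": well_ids,
        "nearest_site": [site_ids[i] for i in nearest],
        "distance_km": d.min(axis=0),
    })


# Made-up distances (km): 2 gauges x 3 wells that have measurement data
d = [[3.0, 12.0, 25.0],
     [9.0, 6.0, 30.0]]
table = nearest_gauge(d, ["S1", "S2"], ["W1", "W2", "W3"])
summary = table["distance_km"].agg(["mean", "min", "max"])
print(table)
print(summary.round(2))
```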

14.8.2 Why This Gap Exists

Independent Network Design:

Stream Gauge Network (USGS):

- Purpose: Flood forecasting, water supply, ecological flows
- Placement: At bridges, downstream of watersheds, urban areas
- Criterion: Capture total streamflow from drainage area

Groundwater Monitoring Network (ISWS):

- Purpose: Regional aquifer water levels, drought monitoring, long-term trends
- Placement: Representative of major aquifer units, rural areas, away from pumping
- Criterion: Minimize local disturbances (wells, pumping, urbanization)

Result: Networks optimized for different objectives → spatial non-overlap

💻 For Computer Scientists

This is a data availability problem, not a methodology problem. Our algorithms are correct, but we lack the input data.

Analogy: Trying to train a supervised ML model when training labels exist for completely different data points than features. The fundamental prerequisite (co-located measurements) is missing.

Lesson: Always check data availability BEFORE designing complex analyses. Spatial joins don’t guarantee temporal overlap or data quality.

🌍 For Hydrologists

This monitoring gap is common in water resources. Stream-aquifer interaction studies typically require:

Purpose-built transects: Wells installed specifically near streams at multiple distances (10m, 100m, 500m)
High-frequency measurements: Sub-daily to capture storm responses
Nested piezometers: Multiple depths to identify which aquifer layer interacts

Example: USGS Groundwater/Surface-Water Interaction studies install custom monitoring networks - they don’t rely on existing regional well networks.

Our networks were never designed to answer this question.

14.9 Implications

14.9.1 What We Cannot Test

Due to the spatial monitoring gap, we CANNOT:

✗ Directly validate the two-aquifer hypothesis
✗ Identify gaining vs losing stream reaches
✗ Quantify baseflow contribution from specific aquifer layers
✗ Measure lag times between stream stage and groundwater response
✗ Test whether Boneyard Creek interacts with shallow vs deep aquifer

14.9.2 What We CAN Conclude

From the monitoring gap itself:

✓ Regional monitoring ≠ interaction studies: Existing networks serve different purposes
✓ Spatial scale mismatch: 3-25 km distances preclude direct correlation
✓ Investment needed: Purpose-built transects required for GW-SW studies
✓ Two-aquifer hypothesis remains plausible: Lack of correlation data doesn’t disprove it

❌ Approach That Failed: Direct Stream-Groundwater Correlation Analysis

What we tried: Cross-correlation analysis between stream stage (from USGS gauges) and nearby groundwater levels (from monitoring wells) to quantify stream-aquifer interaction strength and lag times.

Why it failed: Spatial mismatch between monitoring networks. None of the wells in the 41 stream-well pairs within 5 km of a stream gauge has any measurements. Wells WITH data are 3-25 km away, far beyond the correlation length for direct hydraulic interaction.

Lesson learned: Regional monitoring networks and purpose-built research networks serve fundamentally different objectives. Stream gauges optimize for flood forecasting at watershed outlets; groundwater wells optimize for aquifer trends away from local disturbances. Neither was designed for stream-aquifer interaction studies.

Better approach: Install purpose-built stream-aquifer transects—nested wells at 10m, 50m, 100m, and 500m from stream channels, with multiple depths (5m, 15m, 30m) to identify which aquifer layer interacts with surface water. Cost: ~$50K per transect, but provides data that regional networks cannot.

Key insight: Sometimes the most important scientific finding is what you cannot analyze. This negative result guides future monitoring investments more effectively than speculative analysis with inadequate data.

14.10 Alternative Approaches

Because we cannot correlate stream and well data directly, we consider three alternative methods:

14.10.1 Approach 1: Baseflow Separation

Method: Separate streamflow into baseflow (groundwater) and quickflow (surface runoff)

Result:

- Baseflow index (BFI) = 50.9-61.7% (streams are groundwater-dominated)
- Baseflow declining (-0.20 cfs/yr) despite rising groundwater levels

Limitation: Doesn’t identify which aquifer layer contributes baseflow
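For readers unfamiliar with baseflow separation, the sketch below shows one widely used technique, the Lyne-Hollick one-parameter recursive digital filter. This chapter does not state which separation method produced the BFI values above, so treat this as an illustration of the idea rather than a reproduction of that result. The filter parameter 0.925 is a conventional default, the example hydrograph is made up, and only a single forward pass is applied (practitioners often run several passes).

```python
import numpy as np


def lyne_hollick_baseflow(q, alpha: float = 0.925) -> np.ndarray:
    """Single forward pass of the Lyne-Hollick filter; returns the baseflow series."""
    q = np.asarray(q, dtype=float)
    quick = np.zeros_like(q)
    for t in range(1, len(q)):
        f = alpha * quick[t - 1] + 0.5 * (1 + alpha) * (q[t] - q[t - 1])
        # Quickflow cannot be negative or exceed total streamflow
        quick[t] = min(max(f, 0.0), q[t])
    return q - quick


def baseflow_index(q, alpha: float = 0.925) -> float:
    """BFI = total baseflow volume / total streamflow volume."""
    q = np.asarray(q, dtype=float)
    return float(lyne_hollick_baseflow(q, alpha).sum() / q.sum())


# Made-up daily hydrograph (cfs): steady flow, one storm peak, then recession
q = np.array([10, 10, 50, 30, 20, 15, 12, 10, 10, 10], dtype=float)
baseflow = lyne_hollick_baseflow(q)
print(f"BFI = {baseflow_index(q):.2f}")
```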

14.10.2 Approach 2: Water Balance Method

Method: Close the water balance at watershed scale

- Precipitation − ET − Discharge − ΔStorage = 0
- Unaccounted terms reveal unmeasured flows
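The balance can be sketched numerically once every term is expressed as a depth over the watershed (for example, mm per year). The function names, the watershed area, and all input values below are hypothetical and serve only to show the arithmetic, including the conversion of mean discharge in cfs to an annual depth.

```python
CFS_TO_M3_PER_S = 0.0283168  # 1 cubic foot per second in cubic meters per second
SECONDS_PER_YEAR = 86400 * 365


def cfs_to_mm_per_year(mean_cfs: float, area_km2: float) -> float:
    """Convert mean discharge (cfs) to an annual runoff depth (mm) over a watershed."""
    volume_m3 = mean_cfs * CFS_TO_M3_PER_S * SECONDS_PER_YEAR
    return volume_m3 / (area_km2 * 1e6) * 1000.0


def water_balance_residual(precip_mm, et_mm, discharge_mm, delta_storage_mm) -> float:
    """Residual of P - ET - Q - dS; a nonzero value points to unmeasured flows or errors."""
    return precip_mm - et_mm - discharge_mm - delta_storage_mm


# Hypothetical annual values for a 500 km2 watershed
runoff_mm = cfs_to_mm_per_year(mean_cfs=100.0, area_km2=500.0)
residual = water_balance_residual(1000.0, 650.0, 300.0, 20.0)
print(f"runoff depth: {runoff_mm:.1f} mm/yr, residual: {residual:.1f} mm/yr")
```

In practice the residual also absorbs measurement error in every term, so a small value does not by itself rule out unmeasured exchange.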

14.10.3 Approach 3: Install Purpose-Built Network

Method: New monitoring wells near streams

- Multiple distances: 10m, 50m, 100m, 500m from stream
- Multiple depths: Shallow (5m), intermediate (15m), deep (>30m)
- High-frequency logging: 15-minute intervals

Cost: ~$50K per transect (drilling + sensors)

Timeline: 2-3 years to capture seasonal/hydrologic variability

⚠️ Recommendation

If stream-groundwater interaction is a priority research question, targeted monitoring investments are required. Existing regional networks cannot answer this question.

Priority locations:

1. Boneyard Creek near Well 268557 (currently 11.5 km apart)
2. Copper Slough near Well 268557 (currently 3.6 km apart - closest pair)
3. Sangamon River near Well 505586 (currently 5.8 km apart)

14.11 Key Takeaways

14.11.1 1. Monitoring Network Design

Lesson: Regional networks and targeted studies serve different purposes. Don’t assume existing data can answer all questions.

14.11.2 2. Spatial Scale is Critical

Lesson: 3-25 km separation is too large for direct stream-GW interaction. Need <1-2 km for process studies.

14.11.3 3. Negative Results Have Value

Lesson: Discovering data gaps is a legitimate scientific finding. It explains limitations and guides future investments.

14.11.4 4. Multi-Line Evidence Provides Constraints

Lesson: Even without direct correlation, we can infer two-aquifer system from:

- Confined aquifer characteristics (temporal analysis)
- Baseflow-groundwater inconsistency (baseflow separation)
- Lack of precipitation response (lag analysis)

14.11.5 5. Purpose-Built Monitoring Required

Lesson: To answer stream-aquifer interaction questions, custom transects are needed - cannot rely on existing infrastructure.

Analysis Status: ✅ Complete - Major Finding Documented

Key Finding: Spatial monitoring gap prevents direct stream-GW correlation analysis

14.12 Summary

Stream proximity analysis reveals a critical spatial monitoring gap:

❌ 3-25 km separation - Wells too far from streams for direct correlation

❌ No stream-GW transects - Cannot validate interaction processes

❌ Regional vs. targeted mismatch - Existing data serves different purpose

✅ Negative result has value - Discovering gaps guides future investments

✅ Multi-line evidence - Can still infer two-aquifer system indirectly

Key Insight: To answer stream-aquifer interaction questions, custom transects are required. Cannot rely on existing regional infrastructure.

14.14 Reflection Questions

Given the distances between wells with data and the nearest stream gauges, which areas would you prioritize for installing purpose-built stream–groundwater transects, and why?
How would you explain to a non-technical stakeholder that the limitation here is network design and data availability, not a lack of analytical tools?
If you did install a small number of new co-located wells and gauges, what measurements and frequencies would you choose to best capture stream–aquifer interactions?

--- title: "Stream Proximity Analysis" code-fold: true --- ::: {.callout-tip icon=false} ## For Newcomers **You will learn:** - How streams and groundwater interact (gaining vs. losing streams) - Why monitoring wells need to be near stream gauges for interaction studies - What happens when monitoring networks are designed independently - A real example of how data gaps block important analysis This chapter tells an important cautionary tale: we tried to study stream-groundwater interactions and discovered we couldn't—not because of complex science, but because the wells and stream gauges are too far apart. Sometimes the most important finding is what we can't analyze. ::: ## What You Will Learn in This Chapter By the end of this chapter, you will be able to: - Describe how the current placement of wells and stream gauges limits direct stream–groundwater interaction studies. - Summarize the spatial proximity between gauges and wells, and why distances of 3–25 km are too large for correlation analysis. - Explain why this is a data-availability/network-design problem rather than a methodological limitation. - Identify what kind of purpose-built monitoring network would be needed to answer stream–aquifer interaction questions in this region. ## Stream-Groundwater Interaction Study {#sec-stream-proximity} Merging well-station proximity analysis with stream-groundwater interaction assessment to understand spatial monitoring capabilities. ## Executive Summary ::: {.callout-important icon=false} ## 🔍 Critical Discovery Spatial We attempted to analyze stream-groundwater interactions by correlating stream stage with nearby groundwater levels. **We discovered we cannot perform this analysis** - not because of methodology limitations, but because of a **fundamental gap in monitoring network design**. 
**The Finding:** - **41 stream-well pairs exist within 5 km** (ideal for interaction studies) - **NONE of these wells have measurement data** (0/41) - **Wells WITH data are 3-25 km away** from stream gauges - **Monitoring networks were designed independently** for different purposes **Why This Matters:** This spatial gap explains why we cannot directly test whether streams are gaining (GW→stream) or losing (stream→GW) - a fundamental question for water resource management. ::: --- ## The Research Question ### What We Wanted to Test **Hypothesis:** Two-aquifer system - Wells monitor **deep confined aquifer** (Unit D, rising) - Streams interact with **shallow unconfined aquifer** (declining) **Test:** If hypothesis is correct, stream stage should show **weak or no correlation** with nearby deep well levels. ### The Ideal Analysis **Method:** Cross-correlation analysis 1. Find wells within 1-5 km of stream gauges 2. Correlate stream stage with groundwater level over time 3. Calculate lag time (how long for stream changes to affect GW, or vice versa) 4. Classify reaches: - **Gaining:** GW level > stream stage → groundwater feeds stream - **Losing:** Stream stage > GW level → stream recharges aquifer - **Neutral/Disconnected:** No correlation **Expected Result:** If two-aquifer system exists, we'd see weak correlations because streams and wells monitor different layers. 
--- ## Interactive Visualizations ```{python} #| label: setup #| echo: false import os import sys from pathlib import Path import sqlite3 import pandas as pd import numpy as np import plotly.express as px import plotly.graph_objects as go from plotly.subplots import make_subplots import warnings warnings.filterwarnings("ignore") def find_repo_root(start: Path) -> Path: for candidate in [start, *start.parents]: if (candidate / "src").exists(): return candidate return start quarto_project = Path(os.environ.get("QUARTO_PROJECT_DIR", str(Path.cwd()))) project_root = find_repo_root(quarto_project) if str(project_root) not in sys.path: sys.path.append(str(project_root)) from src.data_loaders.usgs_stream_loader import USGSStreamLoader from src.utils import get_data_path ``` ::: {.callout-note icon=false} ## 📘 How to Read This Map **What It Shows:** Red markers show USGS stream gauge locations—these are the points where we have continuous stream flow and stage measurements. The map reveals which streams are monitored vs. unmonitored. **What to Look For:** - **Clustered red markers:** Multiple gauges on the same stream system (e.g., Boneyard Creek) - **Isolated red markers:** Single gauge representing a larger watershed - **Gaps between markers:** Unmonitored stream reaches or tributaries **How to Interpret:** | Map Pattern | What It Means | Data Availability | Study Limitations | |-------------|---------------|-------------------|-------------------| | Red markers near urban areas | Flood monitoring priority (Champaign-Urbana) | High-quality, high-frequency data | Urban influence may bias natural stream-aquifer interaction | | Red markers at watershed outlets | Integrates entire upstream drainage | Total discharge captured | Cannot distinguish local vs. 
regional contributions | | Large areas with no red markers | Small streams ungauged | No direct stream data | Must use regional discharge estimates or install new gauges | | Red marker + well location overlap | Potential for stream-GW correlation | Ideal for interaction studies | Unfortunately rare in this network (see analysis below) | ::: ```{python} #| label: fig-stream-locations #| fig-cap: "USGS stream gauge locations across the study area. Gauges monitor major streams including Boneyard Creek, Salt Fork, and regional tributaries." usgs_root = get_data_path("usgs_stream") usgs_loader = USGSStreamLoader(usgs_root) sites = usgs_loader.sites[ ["site_no", "station_nm", "dec_lat_va", "dec_long_va"] ].drop_duplicates() fig = px.scatter_mapbox( sites, lat="dec_lat_va", lon="dec_long_va", hover_name="station_nm", hover_data={"site_no": True, "dec_lat_va": ":.4f", "dec_long_va": ":.4f"}, color_discrete_sequence=["red"], zoom=9, height=500, ) fig.update_layout( mapbox_style="open-street-map", title="USGS Stream Gauge Network", showlegend=False, ) fig.show() print("\n**Stream Gauge Summary:**") print(f"- Total USGS stream gauges: {len(sites)}") print("- Monitoring major tributaries and urban streams") ``` ::: {.callout-note icon=false} ## 📘 How to Read Time Series Discharge Patterns **What It Shows:** This line graph displays stream discharge (flow rate) over time measured in cubic feet per second (cfs). Each colored line represents a different stream gauge station. 
**What to Look For:** - **Sharp spikes:** Rapid discharge increases from storm events (flashy response) - **Baseline flow:** Low-flow periods between storms represent baseflow (groundwater contribution) - **Seasonal patterns:** Higher discharge in spring (snowmelt + rain), lower in summer/fall - **Different colored lines:** Compare how different streams respond to the same precipitation events **How to Interpret:** | Discharge Pattern | What It Means | Stream-Aquifer Connection | Implication for GW Studies | |-------------------|---------------|---------------------------|---------------------------| | Sharp spikes with rapid return to baseline | Flashy urban stream (e.g., Boneyard Creek) | Minimal groundwater buffering | Stream responds to surface runoff, not deep aquifer | | Gradual rise and slow recession | Groundwater-dominated stream | Strong aquifer-stream connection | Good candidate for GW-stream interaction studies | | High baseline (>20 cfs) between storms | Substantial baseflow contribution | Aquifer discharging to stream (gaining) | Deep aquifer may sustain stream during droughts | | Very low baseline (<5 cfs) | Limited baseflow | Stream may recharge aquifer (losing) | Aquifer disconnected or stream perched above water table | | Baseline declining over time | Decreasing groundwater contribution | Potential aquifer depletion | Concerning trend—investigate pumping or recharge changes | ::: ```{python} #| label: fig-discharge-patterns #| fig-cap: "Daily discharge patterns from USGS stream gauges showing temporal variability in streamflow. High variability indicates flashy response to precipitation events." 
usgs_root = get_data_path("usgs_stream") usgs_loader = USGSStreamLoader(usgs_root) # Use up to three example sites for visualization site_nos = usgs_loader.get_site_list()[:3] discharge_frames = [] for site_no in site_nos: df = usgs_loader.load_daily_discharge(site_no) if not df.empty: df = df.copy() df["gauge"] = site_no discharge_frames.append(df) all_discharge = pd.concat(discharge_frames, ignore_index=True) all_discharge = all_discharge[all_discharge["date"] >= "2020-01-01"] fig = px.line( all_discharge, x="date", y="discharge_cfs", color="gauge", labels={ "date": "Date", "discharge_cfs": "Discharge (cubic feet/second)", "gauge": "Stream Gauge", }, title="Stream Discharge Patterns (2020–Present)", ) fig.update_layout(height=400, hovermode="x unified") fig.show() stats = all_discharge.groupby("gauge")["discharge_cfs"].agg( ["mean", "std", "min", "max"] ) print("\n**Discharge Statistics (2020–Present):**") for gauge in stats.index: print(f"\n{gauge}:") print(f" - Mean: {stats.loc[gauge, 'mean']:.1f} cfs") print(f" - Std Dev: {stats.loc[gauge, 'std']:.1f} cfs") print( f" - Range: {stats.loc[gauge, 'min']:.1f} " f"- {stats.loc[gauge, 'max']:.1f} cfs" ) ``` --- ## Data Used ### USGS Stream Gauge Network **9 USGS Stream Gauges** with gage height (stage) data: - **Boneyard Creek:** 4 stations (urban Champaign-Urbana) - **Salt Fork, Spoon River, Sangamon River:** Regional coverage - **Period of Record:** 1979-2025 (46 years) **Spatial Coverage:** Excellent - streams distributed across study area ### Groundwater Monitoring Network **356 wells** with valid GPS coordinates **Only 18 wells** (5%) have water level measurements **Measurement period:** 2009-2023 (overlaps with stream data) **Critical Observation:** The vast majority of wells are **observation points without active sensors**. 
--- ## Method ::: {.callout-note icon=false} ## Understanding Buffer Analysis for Stream-Groundwater Proximity **What Is It?** Buffer analysis is a GIS technique that creates zones of specified distance around features (points, lines, polygons). In hydrogeology, buffers identify which monitoring points are close enough to detect interactions between surface water and groundwater. **Brief History:** Buffer analysis emerged with early GIS systems in the 1970s (pioneered by Jack Dangermond at ESRI and Roger Tomlinson's Canada Geographic Information System). It became essential for environmental studies in the 1980s when EPA began using buffers for wellhead protection zones. Today, it's a standard tool in every GIS platform. **Why Does It Matter for Stream-Aquifer Studies?** The distance between a stream and a monitoring well determines what we can measure: - **<100m**: Direct interaction (capture daily stream stage fluctuations) - **100m-1km**: Local interaction (detect seasonal exchange patterns) - **1-5km**: Regional influence (see long-term trends, large events) - **>5km**: No detectable interaction (signal too weak, too delayed) **For this study**: We need wells within 1-5 km of stream gauges to test whether streams interact with the deep confined aquifer (Unit D) or only the shallow unconfined system. **How Does It Work?** The buffer analysis process: 1. **Define threshold distance**: Based on hydrogeological theory (typically correlation length) 2. **Calculate pairwise distances**: Between all stream gauges and all wells 3. **Filter by threshold**: Keep only pairs within acceptable distance 4. **Check data availability**: Verify wells have temporal measurements 5. **Assess adequacy**: Do we have enough pairs for statistical analysis? 

**What Will You See?**

Proximity analysis results show spatial coverage:

| Distance Category | Expected Interaction Strength | Analysis Capability | Data Quality Needed |
|-------------------|------------------------------|---------------------|---------------------|
| **0-500 m** | Very strong (hours-days lag) | Direct hydraulic connection | Sub-daily measurements |
| **500 m-2 km** | Strong (days-weeks lag) | Seasonal exchange patterns | Daily measurements |
| **2-5 km** | Moderate (weeks-months lag) | Long-term trends | Weekly measurements |
| **5-10 km** | Weak (months-years lag) | Regional background | Monthly measurements |
| **>10 km** | None (signal lost in noise) | No correlation expected | N/A |

**How to Interpret Results:**

- **Many pairs <1 km with data**: Ideal - can quantify stream-aquifer exchange rates
- **Pairs 1-5 km with data**: Good - can detect interaction, limited process detail
- **Pairs 3-10 km with data**: Marginal - correlation analysis possible but weak
- **Pairs >10 km or no data**: Inadequate - cannot study interaction
:::

### Step 1: Calculate Spatial Proximity

**Objective:** Find stream-well pairs close enough for interaction studies

**Threshold:** 5 km (literature suggests significant interaction within 1-5 km)

```python
import numpy as np
from scipy.spatial.distance import cdist

# Degrees to approximate km: 1° lat ≈ 111 km, 1° lon ≈ 111 km × cos(latitude)
lat0 = np.radians(stations['dec_lat_va'].mean())
km_per_deg = np.array([111.0, 111.0 * np.cos(lat0)])
station_km = stations[['dec_lat_va', 'dec_long_va']].to_numpy() * km_per_deg
well_km = wells_all[['lat', 'lon']].to_numpy() * km_per_deg
distances_km = cdist(station_km, well_km, metric='euclidean')

# Keep (station, well) pairs within 5 km; ID column names are assumed
s_idx, w_idx = np.nonzero(distances_km <= 5.0)
pairs_df = pd.DataFrame({
    'site_no': stations['site_no'].to_numpy()[s_idx],
    'well_id': wells_all['P_Number'].to_numpy()[w_idx],
    'distance_km': distances_km[s_idx, w_idx],
})
```

**Result:** **41 stream-well pairs within 5 km** - in principle, close enough for interaction studies.

**Closest Pairs:**

- Station 03337000 ↔ Well 381635: **0.56 km**
- Station 03337000 ↔ Well 381636: **0.57 km**
- Station 03337100 ↔ Well 381635: **0.67 km**

**Interpretation:** Excellent spatial proximity - if data existed, we could analyze interaction at <1 km scale.
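
Distances computed from latitude and longitude are sensitive to how a degree of longitude is converted to kilometres, which matters when a fixed 5 km cutoff decides which pairs are kept. The sketch below compares a great-circle (haversine) distance with a single-factor conversion for two hypothetical points near 40.1°N. The points and function names are illustrative and not taken from the project code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def flat_km(lat1, lon1, lat2, lon2, km_per_degree=111.0):
    """Euclidean distance in degree space scaled by one factor for both axes."""
    return math.hypot(lat2 - lat1, lon2 - lon1) * km_per_degree


# Two hypothetical points at the same latitude, 0.05° of longitude apart
point_a = (40.1, -88.25)
point_b = (40.1, -88.20)

great_circle = haversine_km(*point_a, *point_b)
flat = flat_km(*point_a, *point_b)

print(f"Haversine:     {great_circle:.2f} km")
print(f"Single factor: {flat:.2f} km")
```

For this east-west pair the single-factor conversion gives about 5.55 km, while the great-circle distance is about 4.25 km, so the same pair falls outside or inside a 5 km buffer depending on the method. At this latitude a degree of longitude is roughly three quarters the length of a degree of latitude.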

---

### Step 2: Check Data Availability

**Objective:** Verify that nearby wells have water level records that overlap the stream stage record

```python
# Wells with more than 365 non-null water level measurements
wells_with_measurements = pd.read_sql_query("""
    SELECT P_Number AS well_id, COUNT(*) AS num_measurements
    FROM OB_WELL_MEASUREMENTS_CHAMPAIGN_COUNTY
    WHERE Water_Surface_Elevation IS NOT NULL
    GROUP BY P_Number
    HAVING COUNT(*) > 365
""", conn)

# Check whether any nearby wells have data
nearby_well_ids = pairs_df['well_id'].unique()
nearby_with_data = wells_with_measurements[
    wells_with_measurements['well_id'].isin(nearby_well_ids)
]
```

::: {.callout-warning icon=false}
## ⚠️ Critical Finding ZERO

**Result:** Of the wells in the 41 stream-well pairs within 5 km, **ZERO** have measurement data.

**Implication:** We cannot perform stream-groundwater correlation analysis with the available monitoring network.
:::

---

## Findings

### Discovery: The Spatial Monitoring Gap

**Wells with data: distance to nearest stream**

| Well ID | Nearest Stream | Distance (km) |
|---------|---------------|---------------|
| 268557 | Copper Slough | **3.64** |
| 444889 | Boneyard Creek | 5.82 |
| 444890 | Boneyard Creek | 6.15 |
| 444863 | Bondville | 7.34 |
| 381684 | Salt Fork | 8.21 |
| 434983 | Champaign | 9.87 |
| ... | ... | 10-25 |

**Statistics:**

- **Mean distance:** 12.8 km
- **Minimum distance:** 3.64 km
- **Maximum distance:** 25.05 km

**Critical Insight:** At these distances, the stream-groundwater interaction signal is too weak to detect. Literature suggests separations of less than 1-2 km for direct interaction studies.
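
The nearest-stream distances and summary statistics above follow a simple pattern: for each well with data, take the minimum distance to any gauge, then summarize. Here is a minimal sketch of that pattern, using made-up coordinates assumed to be already projected to kilometres. `nearest_gauge_km` is a hypothetical helper, and the numbers are not the study's values.

```python
import math
from statistics import mean


def nearest_gauge_km(well_xy, gauges):
    """Return (gauge_id, distance_km) for the gauge closest to one well."""
    return min(
        (
            (g_id, math.hypot(well_xy[0] - gx, well_xy[1] - gy))
            for g_id, (gx, gy) in gauges.items()
        ),
        key=lambda item: item[1],
    )


# Made-up projected coordinates in km, for illustration only
gauges = {"G1": (0.0, 0.0), "G2": (20.0, 0.0)}
wells_with_data = {"W1": (3.0, 4.0), "W2": (20.0, 15.0), "W3": (8.0, 6.0)}

nearest = {w_id: nearest_gauge_km(xy, gauges) for w_id, xy in wells_with_data.items()}
distances = [dist for _, dist in nearest.values()]
summary = {"mean": mean(distances), "min": min(distances), "max": max(distances)}

for w_id, (g_id, dist) in nearest.items():
    print(f"{w_id}: nearest gauge {g_id} at {dist:.2f} km")
print(summary)
```

The minimum is the most informative number: if even the closest measured well is beyond the distance at which interaction is detectable, no pairing in the network can support a correlation study.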

---

### Why This Gap Exists

**Independent Network Design:**

**Stream Gauge Network (USGS):**

- **Purpose:** Flood forecasting, water supply, ecological flows
- **Placement:** At bridges, downstream of watersheds, urban areas
- **Criterion:** Capture total streamflow from drainage area

**Groundwater Monitoring Network (ISWS):**

- **Purpose:** Regional aquifer water levels, drought monitoring, long-term trends
- **Placement:** Representative of major aquifer units, rural areas, away from pumping
- **Criterion:** Minimize local disturbances (wells, pumping, urbanization)

**Result:** Networks optimized for different objectives → spatial non-overlap

::: {.callout-note icon=false}
## 💻 For Computer Scientists

This is a **data availability problem**, not a methodology problem. Our algorithms are correct, but we lack the input data.

**Analogy:** Trying to train a supervised ML model when the training labels exist for completely different data points than the features. The fundamental prerequisite (co-located measurements) is missing.

**Lesson:** Always check data availability BEFORE designing complex analyses. Spatial joins don't guarantee temporal overlap or data quality.
:::

::: {.callout-tip icon=false}
## 🌍 For Hydrologists

This monitoring gap is **common in water resources**. Stream-aquifer interaction studies typically require:

1. **Purpose-built transects:** Wells installed specifically near streams at multiple distances (10m, 100m, 500m)
2. **High-frequency measurements:** Sub-daily to capture storm responses
3. **Nested piezometers:** Multiple depths to identify which aquifer layer interacts

**Example:** USGS Groundwater/Surface-Water Interaction studies install custom monitoring networks - they don't rely on existing regional well networks.

**Our networks were never designed to answer this question.**
:::

---

## Implications

### What We Cannot Test

**Due to the spatial monitoring gap, we CANNOT:**

1. ✗ Directly validate the two-aquifer hypothesis
2. ✗ Identify gaining vs losing stream reaches
3. ✗ Quantify baseflow contribution from specific aquifer layers
4. ✗ Measure lag times between stream stage and groundwater response
5. ✗ Test whether Boneyard Creek interacts with shallow vs deep aquifer

### What We CAN Conclude

**From the monitoring gap itself:**

1. ✓ **Regional monitoring ≠ interaction studies:** Existing networks serve different purposes
2. ✓ **Spatial scale mismatch:** 3-25 km distances preclude direct correlation
3. ✓ **Investment needed:** Purpose-built transects required for GW-SW studies
4. ✓ **Two-aquifer hypothesis remains plausible:** Lack of correlation data doesn't disprove it

---

::: {.callout-warning icon=false}
## ❌ Approach That Failed: Direct Stream-Groundwater Correlation Analysis

**What we tried:** Cross-correlation analysis between stream stage (from USGS gauges) and nearby groundwater levels (from monitoring wells) to quantify stream-aquifer interaction strength and lag times.

**Why it failed:** Spatial mismatch between monitoring networks. The wells in the 41 pairs within ideal proximity (<5 km) to stream gauges have **zero measurements**. Wells WITH data are 3-25 km away, far beyond the correlation length for direct hydraulic interaction.

**Lesson learned:** Regional monitoring networks and purpose-built research networks serve fundamentally different objectives. Stream gauges optimize for flood forecasting at watershed outlets; groundwater wells optimize for aquifer trends away from local disturbances. Neither was designed for stream-aquifer interaction studies.

**Better approach:** Install purpose-built stream-aquifer transects: nested wells at 10m, 50m, 100m, and 500m from stream channels, with multiple depths (5m, 15m, 30m) to identify which aquifer layer interacts with surface water. Cost: ~$50K per transect, but this provides data that regional networks cannot.

**Key insight**: Sometimes the most important scientific finding is **what you cannot analyze**.
This negative result guides future monitoring investments more effectively than speculative analysis with inadequate data.
:::

---

## Alternative Approaches

Since we cannot correlate stream-well data directly, we consider three alternative methods:

### Approach 1: Baseflow Separation

**Method:** Separate streamflow into baseflow (groundwater) and quickflow (surface runoff)

**Result:**

- BFI = 50.9-61.7% (streams are groundwater-dominated)
- Baseflow declining (-0.20 cfs/yr) despite rising GW

**Limitation:** Doesn't identify which aquifer layer contributes baseflow

### Approach 2: Water Balance Method

**Method:** Close the water balance at watershed scale

- Precipitation - ET - Discharge - ΔStorage = 0
- Unaccounted terms reveal unmeasured flows

### Approach 3: Install Purpose-Built Network

**Method:** New monitoring wells near streams

- Multiple distances: 10m, 50m, 100m, 500m from stream
- Multiple depths: Shallow (5m), intermediate (15m), deep (>30m)
- High-frequency logging: 15-minute intervals

**Cost:** ~$50K per transect (drilling + sensors)

**Timeline:** 2-3 years to capture seasonal/hydrologic variability

::: {.callout-warning icon=false}
## ⚠️ Recommendation

If stream-groundwater interaction is a priority research question, **targeted monitoring investments are required**. Existing regional networks cannot answer this question.

**Priority locations:**

1. Boneyard Creek near Well 268557 (currently 11.5 km apart)
2. Copper Slough near Well 268557 (currently 3.6 km apart - closest pair)
3. Sangamon River near Well 505586 (currently 5.8 km apart)
:::

---

## Key Takeaways

### 1. Monitoring Network Design

**Lesson:** Regional networks and targeted studies serve different purposes. Don't assume existing data can answer all questions.

### 2. Spatial Scale is Critical

**Lesson:** 3-25 km separation is too large for direct stream-GW interaction. Process studies need separations of less than 1-2 km.

### 3. Negative Results Have Value

**Lesson:** Discovering data gaps is a legitimate scientific finding.
It explains limitations and guides future investments.

### 4. Multi-Line Evidence Provides Constraints

**Lesson:** Even without direct correlation, we can infer two-aquifer system from:

- Confined aquifer characteristics (temporal analysis)
- Baseflow-groundwater inconsistency (baseflow separation)
- Lack of precipitation response (lag analysis)

### 5. Purpose-Built Monitoring Required

**Lesson:** To answer stream-aquifer interaction questions, **custom transects are needed** - cannot rely on existing infrastructure.

---

**Analysis Status:** ✅ Complete - Major Finding Documented

**Key Finding:** Spatial monitoring gap prevents direct stream-GW correlation analysis

---

## Summary

Stream proximity analysis reveals a **critical spatial monitoring gap**:

- ❌ **3-25 km separation** - Wells too far from streams for direct correlation
- ❌ **No stream-GW transects** - Cannot validate interaction processes
- ❌ **Regional vs. targeted mismatch** - Existing data serves different purpose
- ✅ **Negative result has value** - Discovering gaps guides future investments
- ✅ **Multi-line evidence** - Can still infer two-aquifer system indirectly

**Key Insight**: To answer stream-aquifer interaction questions, **custom transects are required**. Cannot rely on existing regional infrastructure.

---

## Related Chapters

- [Stream Gauge Network](../part-1-foundations/stream-gauge-network.qmd) - Available stream data
- [Stream-Aquifer Exchange](../part-4-fusion/stream-aquifer-exchange.qmd) - Fusion analysis approach
- [Monitoring Gap Analysis](monitoring-gap-analysis.qmd) - Overall gap assessment
- [Well Network Analysis](../part-1-foundations/well-network-analysis.qmd) - Well network limitations

## Reflection Questions

- Given the distances between wells with data and the nearest stream gauges, which areas would you prioritize for installing purpose-built stream–groundwater transects, and why?
- How would you explain to a non-technical stakeholder that the limitation here is network design and data availability, not a lack of analytical tools?
- If you did install a small number of new co-located wells and gauges, what measurements and frequencies would you choose to best capture stream–aquifer interactions?