8 Stream Gauge Network

For Newcomers

You will learn:

How streams and aquifers are connected underground
What “base flow” means and why it reveals aquifer health
How to read flow duration curves (a key hydrologic tool)
Why monitoring coverage gaps limit regional analysis

Streams are like windows into the aquifer—during dry periods, the water you see flowing is actually groundwater seeping out. By measuring stream flow, we indirectly monitor the aquifer itself.

8.1 What You Will Learn in This Chapter

By the end of this chapter, you will be able to:

Describe how USGS stream gauges observe surface water flows that are partly driven by groundwater discharge (base flow).
Summarize the current stream gauge network for the study area, including spatial coverage and temporal record length.
Read and interpret basic flow duration curves and flow-regime metrics (Q10, Q50, Q90, base flow index).
Explain the main spatial limitations of the current gauge network and how they affect regional stream–aquifer analyses and fusion with HTEM and wells.

8.2 Streams as Windows into the Aquifer

Imagine the aquifer as a vast underground reservoir. Streams are discharge points where the aquifer naturally reveals itself at the surface. Stream gauges become powerful indirect sensors of aquifer health.

The fundamental connection:

During dry periods when rain stops, streams don’t immediately go dry
Water continues flowing: this is base flow, groundwater discharging to the stream
Base flow = indirect indicator of aquifer storage and transmissivity

This chapter explores Champaign County’s USGS stream gauge network: coverage, historical records, flow patterns, and critical spatial gaps.

⚠️ Critical Finding: Severe Coverage Gap

USGS stream gauge network: Only 21.6% of HTEM area is within 5km of a gauge

9 gauges total
Only 3 gauges inside HTEM extent (all in urban Boneyard Creek watershed)
78% of study area has no nearby stream monitoring

Implication: Regional stream-aquifer connectivity analysis infeasible with current network.

8.3 Part 1: The Surface-Groundwater Connection

💧 What Is Base Flow? (Simple Explanation)

Base flow is the water that keeps streams flowing even when it hasn’t rained for weeks.

Where does it come from? The aquifer underground.

Think of the aquifer as a giant sponge beneath the ground. During wet periods, rain soaks into this sponge (recharge). During dry periods, water slowly seeps out of the sponge into nearby streams (discharge). Base flow is this slow, steady groundwater seepage.

Why does it matter for aquifer management?

Aquifer health indicator: A sustained decline in base flow suggests the aquifer is being depleted
Drought resilience: Streams with high base flow don’t dry up during droughts
Water availability: Base flow represents water the aquifer “gives” to streams
Ecosystem support: Fish and aquatic life depend on base flow during dry months

Simple test: If a stream still flows in late summer after weeks without rain, it’s receiving base flow from the aquifer. If it dries up, it has little or no sustained aquifer connection at that location.

Technical term: Hydrologists call this a “gaining stream” (gaining water from the aquifer).

Understanding Base Flow Separation

What Is It? Base flow is the portion of stream discharge that comes from groundwater seeping into the stream channel. The concept was formalized by hydrologists in the 1930s-40s who realized that streams continue flowing during rainless periods—this sustained flow comes from the aquifer, not surface runoff. Base flow separation is the technique of mathematically splitting stream discharge into two components: fast surface runoff and slow groundwater discharge.

Historical Context: Robert Horton (1933) pioneered hydrograph analysis, showing that storm runoff and groundwater contributions have distinct signatures in stream flow records.

Why Does It Matter? Base flow is a practical indicator of aquifer-stream connectivity:

Aquifer health indicator: Declining base flow = declining aquifer storage
Drought resilience: High base flow means streams stay wet during droughts
Water quality: Base flow often has different chemistry than runoff
Ecological function: Base flow sustains aquatic habitat during dry periods

For water managers, base flow reveals how much the aquifer contributes to surface water resources.

How Does It Work?

# Stream discharge has two components:
total_discharge = surface_runoff + base_flow

# Surface runoff: Precipitation → stream (fast, flashy)
#   - Responds within hours to days
#   - Peaks sharply after storms
#   - Declines rapidly

# Base flow: Groundwater discharge → stream (slow, sustained)
#   - Responds over weeks to months
#   - Changes gradually
#   - Provides sustained minimum flow

# Base flow ≈ aquifer storage indicator!

Separation methods:

1. Graphical: Draw straight lines under hydrograph peaks (manual)
2. Recession analysis: Fit exponential decay curves to recession limbs
3. Digital filters: Automated algorithms (Lyne-Hollick, Eckhardt filters)
4. HYSEP: USGS program (fixed-interval, sliding-interval, and local-minimum methods)
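To make the digital-filter option concrete, here is a minimal sketch of a single forward pass of the Lyne-Hollick recursive filter. It is an illustration rather than the method used elsewhere in this chapter. The function names, the starting condition (no quickflow on the first day), and the default `alpha=0.925` are assumptions chosen for the example.

```python
def lyne_hollick_baseflow(discharge, alpha=0.925):
    """One forward pass of the Lyne-Hollick recursive digital filter.

    discharge: list of daily flows in any consistent unit (e.g. cfs).
    alpha: filter parameter; 0.925 is a commonly quoted value for daily data.
    Returns a list of base flow values, each between 0 and the total flow.
    """
    if not discharge:
        return []
    quick_prev = 0.0                     # assumption: no quickflow on day 0
    baseflow = [discharge[0]]
    for t in range(1, len(discharge)):
        # Quickflow responds to the day-to-day change in total discharge
        quick = alpha * quick_prev + 0.5 * (1 + alpha) * (discharge[t] - discharge[t - 1])
        quick = min(max(quick, 0.0), discharge[t])   # keep 0 <= quickflow <= total
        baseflow.append(discharge[t] - quick)
        quick_prev = quick
    return baseflow


def base_flow_index(discharge, alpha=0.925):
    """BFI = total base flow volume / total streamflow volume."""
    total = sum(discharge)
    if total <= 0:
        return float("nan")
    return sum(lyne_hollick_baseflow(discharge, alpha)) / total


# Synthetic record: steady 2 cfs with one storm peak
flows = [2.0, 2.0, 2.0, 20.0, 12.0, 6.0, 3.5, 2.5, 2.0, 2.0]
bfi = base_flow_index(flows)
print(f"BFI for the synthetic record: {bfi:.2f}")
```

Published applications usually run the filter in several passes (forward, backward, forward) and tune `alpha` per site, so treat the single pass here as a starting point only.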

What Will You See? Flow duration curves (FDC) show base flow indirectly through Q90 (flow exceeded 90% of the time). Low values indicate low base flow and poor aquifer connectivity.

How to Interpret

Base Flow Index (BFI)	Stream Type	Aquifer Connection	Management Implication
BFI > 0.7	Groundwater-dominated	Strong connectivity	Aquifer pumping affects streams
BFI 0.4-0.7	Mixed regime	Moderate connectivity	Seasonal aquifer influence
BFI < 0.4	Runoff-dominated	Weak connectivity	Streams respond to rain, not aquifer
BFI declining	Degrading connection	Aquifer depletion or stream incision	Investigate causes
BFI = 0	Ephemeral stream	Disconnected	No aquifer support

💻 For Computer Scientists

Stream Discharge as Groundwater Proxy:

Base flow = indirect measurement of aquifer through groundwater-fed streams!

ML Applications:

Feature engineering: Q90 (flow exceeded 90% of the time, i.e. the 10th percentile of daily flow) = low-flow baseline from aquifer
Recession analysis: Fit exponential decay to hydrograph recession to estimate a recession constant, which relates to aquifer transmissivity and storage
Multi-source integration: Stream + Well + HTEM = three views of same system
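The recession-analysis idea can be sketched as a log-linear least-squares fit of Q(t) = Q0 · exp(−t / k) to a rainless stretch of the hydrograph. The function name `fit_recession` and the synthetic data are assumptions for illustration. Turning the fitted constant k into a transmissivity estimate needs further assumptions about aquifer geometry and storage that are not covered here.

```python
import math


def fit_recession(flows):
    """Fit Q(t) = Q0 * exp(-t / k) to consecutive daily flows of a recession limb.

    flows: daily discharge values (all > 0) from a rainless period.
    Returns (Q0, k), where k is the recession constant in days.
    """
    n = len(flows)
    if n < 2 or any(q <= 0 for q in flows):
        raise ValueError("need at least two positive flow values")
    log_q = [math.log(q) for q in flows]
    t_mean = (n - 1) / 2.0
    y_mean = sum(log_q) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(log_q))
    slope = sxy / sxx                      # slope of ln(Q) against time
    if slope >= 0:
        raise ValueError("flows are not receding")
    q0 = math.exp(y_mean - slope * t_mean)
    return q0, -1.0 / slope


# Synthetic recession: starts at 5 cfs and decays with k = 20 days
synthetic = [5.0 * math.exp(-day / 20.0) for day in range(15)]
q0, k = fit_recession(synthetic)
print(f"Q0 = {q0:.2f} cfs, k = {k:.1f} days")
```

A larger k means a slower recession, which is the behaviour expected of a stream draining a well-connected aquifer.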

🌍 For Hydrologists

Stream-Aquifer Connectivity:

Gaining streams (groundwater discharge):

Stream receives water from aquifer
Base flow sustained during dry periods
Reflects regional water table elevation

Flow Regime Indicators:

Base Flow Index (BFI): Fraction of streamflow from groundwater (0 to 1)
High BFI (>0.6): Strong aquifer connection
Low BFI (<0.3): Flashy, runoff-dominated
Q90/Q50 Ratio: Aquifer buffering capacity

HTEM Integration: High HTEM resistivity (sand/gravel) → High BFI (transmissive aquifer)

8.4 Part 2: The Monitoring Network

📘 Understanding Stream Gauge Networks

What Is a Gauge Network? A stream gauge network is a system of measurement stations that continuously monitor river and stream discharge (flow rate). The U.S. Geological Survey (USGS) operates the nation’s primary network, established in the late 1800s.

Why Does It Matter for Aquifers? Stream gauges provide indirect aquifer monitoring through base flow—the groundwater component of streamflow:

Base flow = aquifer discharge to streams
Declining base flow = declining aquifer storage
Flow duration curves = aquifer buffering capacity

How to Assess Network Quality:

Network Metric	Excellent	Good	This Study
Spatial coverage	>70% of area	40-70%	22%
Temporal coverage	>50 years	20-50 years	75+ years ✓
Record continuity	<5% gaps	5-10% gaps	<5% gaps ✓

This Network Paradox: Excellent temporal data (75+ years) but poor spatial coverage (only 22% of HTEM area within 5km of gauge).

✓ USGS Stream Loader initialized
  Sites found: 9

8.4.1 Site Inventory

📘 Interpreting Gauge Site Metadata

What Does This Table Show? Each row represents one USGS stream gauge with its location and elevation.

Why These Details Matter:

Column	What It Tells You	Management Use
Site Number	Unique USGS identifier	Data retrieval, cross-referencing
Station Name	Stream and location	Geographic context
Latitude/Longitude	Precise location	Mapping, proximity analysis
Elevation	Land surface height	Topographic position, drainage area

How to Read the Table:

Urban vs. rural names: “Boneyard Creek at Urbana” = urban watershed; “Sangamon River near Oakford” = rural
Elevation range: Higher elevations = headwaters; lower = downstream positions
Naming convention: “at” = specific location; “near” = approximate location

Expected Pattern: Mix of urban (small watersheds, flashy response) and rural (large watersheds, base flow dominated) gauges for comprehensive monitoring.

# Load site metadata
sites_df = usgs_loader.sites

site_summary = sites_df[[
    'site_no',
    'station_nm',
    'dec_lat_va',
    'dec_long_va',
    'alt_va'
]].copy()

site_summary.columns = [
    'Site Number',
    'Station Name',
    'Latitude',
    'Longitude',
    'Elevation (ft)'
]

site_summary

	Site Number	Station Name	Latitude	Longitude	Elevation (ft)
0	03336890	SPOON RIVER NEAR ST. JOSEPH, IL	40.164194	-88.027500	650.00
1	03336900	SALT FORK NEAR ST. JOSEPH, IL	40.149556	-88.033639	649.59
2	03336998	BONEYARD CREEK BELOW 6TH STREET AT CHAMPAIGN, IL	40.111194	-88.229889	693.88
3	03337000	BONEYARD CREEK AT URBANA, IL	40.111306	-88.226556	693.88
4	03337100	BONEYARD CREEK AT LINCOLN AVE AT URBANA, IL	40.111361	-88.219417	693.88
5	03337570	SALINE BRANCH ABOVE 1700E NEAR URBANA, IL	40.129833	-88.151667	670.47
6	03343350	BLACK SLOUGH AT CR 500N NR PHILO, IL	39.952611	-88.169222	645.86
7	05570910	SANGAMON RIVER AT FISHER, IL	40.310997	-88.322361	683.20
8	05590050	COPPER SLOUGH AT CHAMPAIGN, IL	40.097222	-88.307250	694.80

Network spans:

9 stream gauges across multiple watersheds
Elevation range ~646-695 ft (about 49 ft of relief) drives gravitational flow
Mix of urban (Boneyard Creek) and agricultural watersheds
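The coordinates in the site table also show how tightly the three Boneyard Creek gauges are clustered. The sketch below computes great-circle distances between them with the haversine formula. The helper name `haversine_km` and the spherical-Earth radius of 6371 km are assumptions of this example, while the coordinates are copied from the table above.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    radius_km = 6371.0                      # mean Earth radius (spherical approximation)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Latitude/longitude of the three Boneyard Creek gauges from the site table
boneyard = {
    "03336998": (40.111194, -88.229889),
    "03337000": (40.111306, -88.226556),
    "03337100": (40.111361, -88.219417),
}

ids = sorted(boneyard)
pairs = {
    (a, b): haversine_km(*boneyard[a], *boneyard[b])
    for i, a in enumerate(ids)
    for b in ids[i + 1:]
}
max_sep = max(pairs.values())
for (a, b), dist in pairs.items():
    print(f"{a} to {b}: {dist:.2f} km")
```

All three gauges sit within about one kilometre of each other, so together they sample essentially one location in the network.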

8.5 Part 3: Spatial Coverage Analysis

📘 What/Why/How: Assessing Spatial Coverage

What Is Spatial Coverage? The percentage of the study area within effective monitoring distance (typically 5km) of a stream gauge.

Why Does Coverage Matter? Sparse coverage creates blind spots:

Cannot assess regional patterns: 3 gauges in 2,300 km² = 1 gauge per 767 km²
Cannot validate HTEM: Need gauges near HTEM grid to correlate resistivity with base flow
Cannot detect spatial heterogeneity: Local stream-aquifer interactions invisible

How to Calculate Coverage:

Buffer analysis: Draw 5km radius around each gauge (effective monitoring area)
Overlay with HTEM: What % of HTEM area falls within buffers?
Compare to target: This chapter uses 70% coverage as the target for regional analysis
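The buffer-and-overlay steps above can be approximated without GIS software by sampling a grid over the survey rectangle and counting the cells that lie within 5 km of any gauge. This is a simplified sketch. The function name, the planar kilometre coordinates, and the gauge positions in the example layout are all hypothetical and do not represent the real network geometry.

```python
def coverage_fraction(gauges_xy_km, width_km, height_km, radius_km=5.0, cell_km=0.5):
    """Fraction of a width x height rectangle lying within radius_km of any gauge.

    gauges_xy_km: list of (x, y) gauge positions in km from the rectangle's corner.
    The rectangle is sampled at the centre of each grid cell.
    """
    nx = max(1, round(width_km / cell_km))
    ny = max(1, round(height_km / cell_km))
    dx, dy = width_km / nx, height_km / ny
    r2 = radius_km ** 2
    covered = 0
    for i in range(nx):
        x = (i + 0.5) * dx
        for j in range(ny):
            y = (j + 0.5) * dy
            if any((x - gx) ** 2 + (y - gy) ** 2 <= r2 for gx, gy in gauges_xy_km):
                covered += 1
    return covered / (nx * ny)


# Sanity check: one gauge centred in a 10 km square should cover about pi * 25 / 100
check = coverage_fraction([(5.0, 5.0)], 10.0, 10.0, cell_km=0.1)

# Hypothetical layout: three gauges clustered within 1 km inside a 44 km x 52 km footprint
clustered = [(20.0, 25.0), (20.3, 25.0), (20.9, 25.0)]
clustered_cov = coverage_fraction(clustered, 44.0, 52.0)

# Coverage figure reported in this chapter: 495 km2 of 2,288 km2
reported = 495 / 2288
print(f"check={check:.3f}  clustered={clustered_cov:.3f}  reported={reported:.3f}")
```

The clustered layout shows why gauge count alone is misleading: overlapping buffers add almost no new area. A production analysis would project real coordinates and clip buffers to the actual HTEM boundary.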

How to Interpret:

Coverage %	Assessment	Capability	Action
>70%	Excellent	Regional stream-aquifer analysis	Maintain network
40-70%	Good	Limited regional analysis	Acceptable
20-40%	Poor	Point observations only	Expand network
<20%	Critical failure	Cannot assess regionally	Urgent expansion

This Study: 21.6% coverage = Poor, only just above the 20% threshold for critical failure.

How well does the gauge network cover the study area? For meaningful stream-aquifer analysis, we need gauges distributed across the landscape—not clustered in one watershed. The analysis below assesses spatial coverage relative to the HTEM survey footprint.

# Coverage statistics
coverage = pd.Series({
    'number_of_sites': len(sites_df),
    'min_latitude': sites_df['dec_lat_va'].min(),
    'max_latitude': sites_df['dec_lat_va'].max(),
    'min_longitude': sites_df['dec_long_va'].min(),
    'max_longitude': sites_df['dec_long_va'].max(),
    'min_elevation_ft': sites_df['alt_va'].min(),
    'max_elevation_ft': sites_df['alt_va'].max()
})

coverage

number_of_sites       9.000000
min_latitude         39.952611
max_latitude         40.310997
min_longitude       -88.322361
max_longitude       -88.027500
min_elevation_ft    645.860000
max_elevation_ft    694.800000
dtype: float64

Spatial Coverage Gap

HTEM area: 2,288 km² (44 km × 52 km)
Effective stream gauge coverage: 495 km² (21.6%)

Only 3 of 9 stations fall within HTEM extent, all in urban Boneyard Creek watershed (27.8 mi²):

Urban watersheds: Over-represented (3 gauges in 27.8 mi²)
Agricultural watersheds: Under-represented (0 gauges in rest of HTEM)

Need: ≥5 additional gauges in agricultural watersheds to achieve 70% coverage target

8.6 Part 4: Historical Records Analysis

📘 Interpreting Temporal Coverage Metrics

What Will You See? A summary table quantifying the stream gauge monitoring history.

Why Long Records Matter: Temporal depth enables:

Trend detection: 50+ years needed to detect climate change signals
Drought/flood context: Compare current conditions to historical extremes
Pre-development baseline: See aquifer conditions before heavy pumping
Seasonal patterns: Decades of data reveal typical vs. anomalous years

How to Read the Metrics:

Metric	What It Shows	Interpretation Guide
Total sites	Network size	More sites = better spatial coverage
Total measurements	Data volume	Millions = excellent temporal resolution
First observation	Historical depth	Pre-1950 = exceptional; 1950-1980 = good; post-1980 = limited
Last observation	Currency	Recent = currently operational; old = historical archive
Duration	Record length	>50 years = trend detection possible

This Network Strength: 75+ year records (1948-2025) provide exceptional temporal depth for detecting long-term aquifer changes.

# Get temporal coverage
temporal = usgs_loader.get_temporal_coverage()

temporal_summary = pd.DataFrame({
    'metric': [
        'Total Sites',
        'Total Daily Measurements',
        'First Observation',
        'Last Observation',
        'Monitoring Duration (years)'
    ],
    'value': [
        temporal['number_of_sites'],
        f"{temporal['total_measurements']:,}",
        temporal['first_measurement'],
        temporal['last_measurement'],
        f"{temporal['duration_years']:.1f}"
    ]
})

temporal_summary

	metric	value
0	Total Sites	7
1	Total Daily Measurements	92,034
2	First Observation	1948-07-15 00:00:00
3	Last Observation	2025-10-29 00:00:00
4	Monitoring Duration (years)	77.3

Value of long records:

Records span 75+ years in some cases (7 of the 9 sites have daily discharge data)
Captures multiple drought/wet cycles (1988 drought, 1993 flood, 2012 drought)
Enables climate change impact detection
Provides pre-development baseline

8.7 Part 5: Discharge Analysis

📘 Understanding Discharge Statistics Framework

What Is Discharge? Stream discharge is the volume of water flowing past a point per unit time, measured in cubic feet per second (cfs) or cubic meters per second (cms).

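Why Use Percentiles? Stream flow varies 1000-fold (drought to flood). Percentiles compress this into interpretable metrics. Two naming conventions are in use, and they run in opposite directions:

Q10 (exceedance): Flow exceeded 10% of the time = high flow/flood regime (same value as the 90th percentile, P90)
Q50 (exceedance): Flow exceeded 50% of the time = median/typical flow (same value as P50)
Q90 (exceedance): Flow exceeded 90% of the time = low flow/base flow from aquifer (same value as the 10th percentile, P10)

The statistics table in this part reports P10 and P90 as ordinary percentiles, so its P10 column holds the low-flow (Q90) value and its P90 column holds the high-flow (Q10) value.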

How to Interpret the Statistics Table:

Statistic	Physical Meaning	Aquifer Connection	Management Use
Mean	Average flow	Overall water availability	Water supply planning
Median (P50)	Typical flow	Normal stream condition	Flow targets
Min	Lowest recorded	Drought of record	Worst-case planning
Max	Highest recorded	Flood of record	Infrastructure design
P10	Low flow threshold (= Q90)	Aquifer discharge (base flow)	Aquifer health indicator
P90	High flow threshold (= Q10)	Flood frequency	Stormwater management

Key Insight: The low-flow statistic (Q90, listed as P10 in the table) is the most important metric for aquifer analysis because it represents sustained groundwater discharge during dry periods.

Stream discharge (measured in cubic feet per second, cfs) varies enormously, from trickles during drought to floods during storms. Statistical summaries like percentiles (P10, P50, P90) compress this variability into actionable metrics. Critically, the low-flow end of the distribution (Q90, which is the 10th percentile, P10) reflects groundwater base flow: the aquifer’s sustained contribution to streams.
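Because exceedance flows (Q) and ordinary percentiles (P) count from opposite ends of the distribution, a short sketch may help keep them straight. The helper names and the eleven sample flows are made up for illustration. The interpolation rule is plain linear interpolation between sorted values.

```python
def percentile(values, p):
    """p-th percentile (0-100) using linear interpolation between sorted values."""
    data = sorted(values)
    if not data:
        raise ValueError("need at least one value")
    pos = (len(data) - 1) * p / 100.0      # fractional index into the sorted data
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def exceedance_flow(values, exceed_pct):
    """Flow equalled or exceeded exceed_pct percent of the time (e.g. Q90)."""
    return percentile(values, 100.0 - exceed_pct)


# Hypothetical daily flows in cfs
daily = [0.5, 0.8, 1.1, 1.5, 2.4, 3.0, 4.2, 6.5, 9.4, 25.0, 60.0]

q10 = exceedance_flow(daily, 10)   # high flow: same value as the 90th percentile
q50 = exceedance_flow(daily, 50)   # median
q90 = exceedance_flow(daily, 90)   # low flow: same value as the 10th percentile
print(q10, q50, q90)
```

When a table column is labelled P10 or P90, checking whether the value sits below or above the median is a quick way to confirm which convention it follows.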

# Calculate statistics for each site
site_stats = []

for site_no in usgs_loader.get_site_list():
    stats = usgs_loader.get_site_statistics(site_no)
    if stats:
        site_stats.append(stats)

stats_df = pd.DataFrame(site_stats)

discharge_stats = stats_df[[
    'site_no',
    'discharge_count',
    'discharge_mean',
    'discharge_median',
    'discharge_min',
    'discharge_max',
    'discharge_p10',
    'discharge_p90'
]].copy()

discharge_stats.columns = [
    'Site Number',
    'Count',
    'Mean (cfs)',
    'Median (cfs)',
    'Min (cfs)',
    'Max (cfs)',
    'P10 (cfs)',
    'P90 (cfs)'
]

# Round numeric columns
numeric_cols = ['Mean (cfs)', 'Median (cfs)', 'Min (cfs)', 'Max (cfs)', 'P10 (cfs)', 'P90 (cfs)']
for col in numeric_cols:
    if col in discharge_stats.columns:
        discharge_stats[col] = discharge_stats[col].round(2)

discharge_stats

	Site Number	Count	Mean (cfs)	Median (cfs)	Min (cfs)	Max (cfs)	P10 (cfs)	P90 (cfs)
0	03336890	4544.0	36.83	15.80	1.05	1600.0	2.61	70.07
1	03336900	19839.0	121.36	51.60	0.67	5550.0	9.67	256.00
2	03336998	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3	03337000	28231.0	4.75	2.38	0.03	241.0	1.10	9.40
4	03337100	8712.0	7.15	3.85	0.87	214.0	2.10	13.50
5	03337570	6056.0	99.18	54.70	10.60	2910.0	20.00	197.50
6	03343350	NaN	NaN	NaN	NaN	NaN	NaN	NaN
7	05570910	17196.0	221.82	87.00	0.00	9250.0	5.50	520.50
8	05590050	7456.0	11.30	5.82	0.22	598.0	2.09	22.40

Key insights:

P10 column (= Q90): Low flows, primarily groundwater contribution
P90 column (= Q10): High flows, storm and flood events
Factor difference: Max is often 1000× the min or more (extreme variability)

🎯 Management Interpretation of Discharge Statistics

What do Q10, Q50, Q90 tell water managers about the aquifer? (In the statistics table, Q90 is the P10 column and Q10 is the P90 column.)

These three numbers reveal the aquifer’s role in sustaining streams:

8.7.1 Q90 (Low Flow) - The Aquifer’s Contribution

Physical meaning: Flow exceeded 90% of the time = the flow during dry periods when rain has stopped. This is the 10th percentile of daily flow, so it appears in the P10 column of the statistics table.

Aquifer connection:

Q90 > 1 cfs: Stream stays wet during droughts → Aquifer actively supports stream → Good aquifer-stream connectivity
Q90 = 0.1-1 cfs: Stream has minimal flow → Weak aquifer connection → Marginal support during droughts
Q90 ≈ 0 cfs: Stream goes dry → No aquifer connection → Ephemeral stream, aquifer disconnected

Management implication: If Q90 is declining over time, the aquifer may be losing storage or connectivity. This is an early warning signal.

8.7.2 P50 (Median Flow) - Typical Water Availability

Physical meaning: Half the time flow is above this, half below = “normal” stream condition

Use: Water supply planning, habitat protection, flow targets for stream restoration

8.7.3 Q10 (High Flow) - Flood Regime

Physical meaning: Flow exceeded only 10% of the time = high flows and floods (the P90 column of the statistics table)

Use: Bridge/culvert design, floodplain management, stormwater infrastructure sizing

8.7.4 Example: Stream Aquifer Health Assessment

Scenario A - Healthy Aquifer Connection

Q90 = 2.5 cfs, Q50 = 8 cfs, Q10 = 45 cfs
Interpretation: Stream maintains 2.5 cfs even during droughts → Aquifer provides reliable base flow → Good for water supply, ecology

Scenario B - Degraded Connection

Q90 = 0.1 cfs, Q50 = 12 cfs, Q10 = 200 cfs
Interpretation: Stream nearly dries during droughts (Q90 ≈ 0) but floods heavily (Q10 >> Q50) → Flashy runoff-dominated system → Poor aquifer buffering

Scenario C - Declining Trend (CRITICAL)

1990s: Q90 = 3.0 cfs
2020s: Q90 = 0.8 cfs
Interpretation: Base flow declining → Aquifer storage declining or stream incising (disconnecting) → Investigate causes immediately

Action: Compare Q90 trends with well water levels and HTEM transmissivity to diagnose aquifer health.
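A declining low-flow statistic like the one in Scenario C can be quantified with a simple least-squares slope through annual low-flow values. The sketch below uses a made-up series that falls linearly from 3.0 cfs in 1990 to 0.8 cfs in 2020. The function name and the data are assumptions, and a real analysis would more likely use a rank-based test such as Mann-Kendall.

```python
def annual_low_flow_trend(low_flow_by_year):
    """Least-squares slope of annual low flow against year, in cfs per year.

    low_flow_by_year: dict mapping year -> annual low-flow value
    (for example, the flow exceeded 90% of the time in that year).
    """
    years = sorted(low_flow_by_year)
    n = len(years)
    if n < 2:
        raise ValueError("need at least two years of data")
    year_mean = sum(years) / n
    flow_mean = sum(low_flow_by_year[y] for y in years) / n
    sxx = sum((y - year_mean) ** 2 for y in years)
    sxy = sum((y - year_mean) * (low_flow_by_year[y] - flow_mean) for y in years)
    return sxy / sxx


# Hypothetical series: 3.0 cfs in 1990 falling linearly to 0.8 cfs in 2020
series = {1990 + i: 3.0 - (2.2 / 30) * i for i in range(31)}
slope = annual_low_flow_trend(series)
change_per_decade = 10 * slope
print(f"Trend: {change_per_decade:.2f} cfs per decade")
```

A negative slope on its own does not identify the cause. Pumping, stream incision, and climate variability can all produce one, which is why the comparison with well levels matters.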

8.8 Part 6: Flow Duration Curves

Understanding Flow Duration Curves (FDC)

What Is It? A Flow Duration Curve (FDC) is a graph showing the percentage of time that stream discharge equals or exceeds a given value. The FDC compresses thousands of daily measurements into a single curve that reveals a stream’s flow regime. It is essentially the complement of the cumulative distribution function of streamflow (exceedance = 1 − CDF), a statistical signature of watershed hydrology.

Historical Context: Foster (1934) introduced flow duration analysis for hydroelectric planning. Today, FDCs are standard tools in hydrology for comparing watersheds and assessing water availability.

Why Does It Matter? The FDC shape reveals fundamental watershed characteristics:

Steep slope: Flashy, runoff-dominated stream (urban, tile-drained agricultural)
Gentle slope: Stable, groundwater-dominated stream (forested, good aquifer connection)
Q90 value: Base flow from aquifer—the “minimum reliable flow”
Q10 value: Flood regime—infrastructure design implications

For aquifer analysis, FDC slope and Q90 position indicate aquifer buffering capacity.

How Does It Work?

Step-by-step construction of a Flow Duration Curve:

1. Collect daily discharge data
   - Example: 20 years × 365 days = 7,300 daily measurements
   - Data: [125 cfs, 0.5 cfs, 450 cfs, 2.3 cfs, …]
2. Sort flows from highest to lowest
   - Highest: 450 cfs (flood event)
   - …
   - Lowest: 0.5 cfs (drought)
3. Assign rank to each flow
   - Rank 1 = highest flow (450 cfs)
   - Rank 7,300 = lowest flow (0.5 cfs)
4. Calculate exceedance probability for each rank
   ```
   Exceedance % = (Rank / Total days) × 100
   ```
   - Flow 450 cfs → Rank 1 → Exceeded 0.01% of time (rare flood)
   - Flow 2.3 cfs → Rank 6,570 → Exceeded 90% of time (base flow)
5. Plot on log scale
   - X-axis: Exceedance probability (0% to 100%)
   - Y-axis: Discharge (cfs), logarithmic scale
   - Log scale reveals low-flow details (base flow range compressed on linear scale)
6. Mark key percentiles
   - Q10: Flow exceeded 10% of time (high flows, floods)
   - Q50: Flow exceeded 50% of time (median flow)
   - Q90: Flow exceeded 90% of time (low flows, base flow)
7. Interpret the curve slope
   - Steep slope (vertical drop): Flashy regime, large flow variability → Urban/tile-drained watershed, poor aquifer buffering
   - Gentle slope (gradual decline): Stable regime, low variability → Aquifer-fed stream, good buffering
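The sort, rank, and exceedance steps of this construction can be sketched in a few lines. This is an illustrative helper, not the `calculate_flow_duration_curve` method of the chapter's loader, whose internals are not shown here. It uses the same rank / total formula as the text, and the ten sample flows are invented.

```python
def flow_duration_curve(daily_flows):
    """Return (exceedance_percent, flow) pairs ordered from highest to lowest flow.

    Exceedance % = (rank / total days) * 100, with rank 1 for the highest flow.
    """
    ranked = sorted(daily_flows, reverse=True)
    n = len(ranked)
    return [(100.0 * rank / n, flow) for rank, flow in enumerate(ranked, start=1)]


# Ten hypothetical daily flows in cfs
flows = [125.0, 0.5, 450.0, 2.3, 12.0, 7.5, 1.1, 30.0, 4.0, 0.9]
fdc = flow_duration_curve(flows)

for exceedance, flow in fdc:
    print(f"{exceedance:5.1f}%  {flow:7.1f} cfs")
```

Some references divide by (total days + 1) instead, the Weibull plotting position, so that no flow is assigned exactly 100% exceedance. For records of thousands of days the difference is negligible.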

What the slope reveals about the aquifer:

Gentle slope: Aquifer releases water slowly and steadily → Good storage, high transmissivity
Steep slope: Aquifer doesn’t buffer flow → Either disconnected or low transmissivity
Flat at high flows, steep at low flows: Aquifer exhausted during droughts → Limited storage

What Will You See? A downward-sloping curve on a log scale. The curve starts high (left side = high flows that are exceeded only a small share of the time) and drops to low values (right side = low flows that are exceeded most of the time, where base flow dominates). Red markers highlight Q10, Q50, and Q90.

How to Interpret

FDC Characteristic	Meaning	Aquifer Implication
Steep curve	Flashy, high variability	Poor aquifer buffering
Gentle curve	Stable, low variability	Strong aquifer buffering
Q90 > 1 cfs	Sustained base flow	Good aquifer connection
Q90 ≈ 0 cfs	Stream goes dry	Ephemeral, no base flow
Q10/Q90 > 100	Extreme flow range	Urban/tile-drained watershed
Q10/Q90 < 10	Modest flow range	Forested/natural watershed
High Q50	Abundant water	Large contributing area or wet climate
Low Q50	Limited water	Small watershed or dry climate

Example Interpretation:

Gaining stream (aquifer-fed): Q90 = 2 cfs, gentle slope
Losing stream (recharging aquifer): Q90 = 0.1 cfs, steep slope

# Find longest-record site
longest_site = stats_df.loc[stats_df['record_length_years'].idxmax(), 'site_no']
longest_site_name = sites_df.loc[sites_df['site_no'] == longest_site, 'station_nm'].values[0]

print(f"Longest record: {longest_site} - {longest_site_name}")

# Calculate flow duration curve
fdc = usgs_loader.calculate_flow_duration_curve(longest_site)

# Plot FDC
fig = go.Figure()

fig.add_trace(
    go.Scatter(
        x=fdc['exceedance_probability'],
        y=fdc['discharge_cfs'],
        mode='lines',
        line=dict(color='steelblue', width=3),
        name=longest_site_name,
        hovertemplate='Exceedance: %{x:.0f}%<br>Discharge: %{y:.1f} cfs<extra></extra>'
    )
)

# Mark key percentiles
key_probs = [10, 50, 90]
for prob in key_probs:
    val = fdc.loc[fdc['exceedance_probability'] == prob, 'discharge_cfs'].values[0]
    fig.add_trace(
        go.Scatter(
            x=[prob],
            y=[val],
            mode='markers+text',
            marker=dict(color='red', size=12, symbol='circle'),
            text=[f'Q{prob} = {val:.1f} cfs'],
            textposition='top center',
            showlegend=False
        )
    )

fig.update_layout(
    title=f'Flow Duration Curve: {longest_site_name}<br><sub>Log scale reveals base flow dynamics</sub>',
    xaxis_title='Exceedance Probability (%)',
    yaxis_title='Discharge (cfs)',
    yaxis_type='log',
    height=600,
    template='plotly_white'
)

fig.show()

Longest record: 03337000 - BONEYARD CREEK AT URBANA, IL

Figure 8.1: Flow duration curve for the longest-record stream gauge. Q10 (high flow), Q50 (median), and Q90 (low/base flow) are marked. The slope of the FDC indicates aquifer buffering capacity.

💻 For Computer Scientists

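Flow Duration Curves (FDC) = Empirical Exceedance Curve of Discharge

FDC is the complementary cumulative distribution function of streamflow (1 − CDF, also called the survival function), drawn with probability on the x-axis. It is a compact hydrologic signature!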

Why FDC Matters for ML:

Dimensionality reduction: 10,000+ daily values → 100 quantiles (100× compression!)
Watershed classification: Cluster watersheds by FDC shape
Transfer learning: Similar FDC = similar watershed (transfer models)
Synthetic generation: Generate realistic hydrographs from FDC + autocorrelation
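As a rough illustration of the dimensionality-reduction and watershed-comparison ideas, the sketch below compresses a daily record into 100 log-flow values at evenly spaced exceedance levels and compares two records by root-mean-square distance. The function names, the 0.01 cfs floor, and both synthetic records are assumptions for the example rather than an established method.

```python
import math


def fdc_signature(daily_flows, n_points=100, floor_cfs=0.01):
    """Compress a daily record into n_points log10 flows at evenly spaced exceedance levels."""
    if not daily_flows:
        raise ValueError("need at least one flow value")
    ranked = sorted(daily_flows, reverse=True)    # high to low = increasing exceedance
    n = len(ranked)
    signature = []
    for k in range(n_points):
        pos = (n - 1) * (k + 0.5) / n_points      # fractional index into ranked flows
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        flow = ranked[lo] + (ranked[hi] - ranked[lo]) * (pos - lo)
        signature.append(math.log10(max(flow, floor_cfs)))  # floor avoids log10(0)
    return signature


def signature_distance(sig_a, sig_b):
    """Root-mean-square difference between two signatures of equal length."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)) / len(sig_a))


# Two synthetic 10-year records: a steady stream and a flashy one
stable = [3.0 + 2.0 * math.sin(2 * math.pi * d / 365) for d in range(3650)]
flashy = [0.2 + 40.0 * math.exp(-(d % 30) / 2.0) for d in range(3650)]

sig_stable = fdc_signature(stable)
sig_flashy = fdc_signature(flashy)
distance = signature_distance(sig_stable, sig_flashy)
print(f"{len(stable)} daily values -> {len(sig_stable)} features; distance = {distance:.2f}")
```

Working in log space keeps the low-flow end of the curve from being swamped by flood peaks, which matters when base flow is the quantity of interest.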

Reading the FDC:

Q10 (high flow): Flood regime
Q50 (median): Typical streamflow
Q90 (low flow): Primarily groundwater base flow

If a stream maintains measurable flow at Q90, it is likely connected to the aquifer. If Q90 approaches zero, the stream is ephemeral and probably disconnected.

8.9 Part 7: Flow Regime Analysis

Understanding Flow Regime Metrics

What Are They? Flow regime metrics are numerical summaries that characterize stream hydrology. The most important are:

Q10, Q50, Q90: Percentile flows from the FDC
Flow Variability Ratio (Q10/Q90): Range between high and low flows
Base Flow Index (BFI): Proportion of streamflow from groundwater

These metrics were standardized by hydrologists in the 1960s-70s to enable comparison across watersheds and regions.

Why Do They Matter? These metrics compress complex flow records into actionable numbers:

Q90: Water availability during droughts (critical for aquatic habitat, irrigation)
Q10: Flood magnitude (bridge/culvert design, floodplain management)
Q10/Q90 ratio: Watershed flashiness (drainage design, aquifer connectivity)
BFI: Aquifer contribution (validates HTEM interpretations)

For aquifer management, Q90 and BFI directly indicate groundwater discharge to streams.

How Do They Work?

Extract percentiles from FDC:
- Q10 = 90th percentile (high flow)
- Q50 = 50th percentile (median)
- Q90 = 10th percentile (low flow)

Calculate ratios:

Flow Variability = Q10 / Q90
Base Flow Index ≈ Q90 / Q50

Classify watershed regime:
- High BFI + Low variability = Aquifer-buffered
- Low BFI + High variability = Flashy, runoff-dominated

What Will You See? A table showing the five key metrics with numerical values. Compare these to the interpretation guide below to classify the stream’s flow regime.

How to Interpret

Metric	Value Range	Interpretation	Aquifer Connection
Q90	> 1 cfs	Good base flow	Strong aquifer discharge
Q90	0.1-1 cfs	Moderate base flow	Some aquifer connection
Q90	< 0.1 cfs	Minimal base flow	Weak/no connection
Q10/Q90	> 100	Very flashy	Urban or tile-drained
Q10/Q90	20-100	Moderately flashy	Agricultural, some buffering
Q10/Q90	< 20	Stable	Forested or strong aquifer
BFI (Q90/Q50)	> 0.6	Groundwater-dominated	Excellent connection
BFI (Q90/Q50)	0.3-0.6	Mixed regime	Moderate connection
BFI (Q90/Q50)	< 0.3	Runoff-dominated	Poor connection

Example: Stream with Q90=0.5 cfs, Q10/Q90=150, BFI=0.25

Interpretation: Flashy runoff-dominated stream with only modest base flow
Likely cause: Urban watershed with impervious surfaces or tile-drained agriculture
Aquifer connection: Weak. The stream responds to rain, not groundwater
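The interpretation guide can be written as a small classifier. The sketch below uses the thresholds from the table in this part. The function name is an assumption, and the Q10 and Q50 values are chosen only so that the ratios match the worked example (Q10/Q90 = 150, Q90/Q50 = 0.25).

```python
def classify_flow_regime(q10, q50, q90):
    """Classify a stream from exceedance flows using this chapter's thresholds."""
    if q50 <= 0:
        raise ValueError("Q50 must be positive")
    variability = q10 / q90 if q90 > 0 else float("inf")
    bfi_proxy = q90 / q50                  # Q90/Q50 proxy, not a separated BFI

    if variability > 100:
        flashiness = "very flashy"
    elif variability >= 20:
        flashiness = "moderately flashy"
    else:
        flashiness = "stable"

    if bfi_proxy > 0.6:
        regime = "groundwater-dominated"
    elif bfi_proxy >= 0.3:
        regime = "mixed regime"
    else:
        regime = "runoff-dominated"

    return {
        "variability": variability,
        "bfi_proxy": bfi_proxy,
        "flashiness": flashiness,
        "regime": regime,
    }


# Hypothetical flows reproducing the ratios in the worked example
example = classify_flow_regime(q10=75.0, q50=2.0, q90=0.5)
print(example)
```

Since Q90/Q50 is only a proxy, a BFI obtained from hydrograph separation may place the same stream in a different class.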

# Calculate base flow index
q10 = fdc.loc[fdc['exceedance_probability'] == 10, 'discharge_cfs'].iloc[0]
q50 = fdc.loc[fdc['exceedance_probability'] == 50, 'discharge_cfs'].iloc[0]
q90 = fdc.loc[fdc['exceedance_probability'] == 90, 'discharge_cfs'].iloc[0]

flow_regime = pd.Series({
    'Q10_high_flow_cfs': q10,
    'Q50_median_flow_cfs': q50,
    'Q90_low_flow_cfs': q90,
    'flow_variability_q10_q90_ratio': q10 / q90,
    'base_flow_index_estimate': q90 / q50
})

flow_regime = flow_regime.round(2)
flow_regime

Q10_high_flow_cfs                 9.40
Q50_median_flow_cfs               2.38
Q90_low_flow_cfs                  1.10
flow_variability_q10_q90_ratio    8.55
base_flow_index_estimate          0.46
dtype: float64

Interpreting metrics:

Flow Variability (Q10/Q90): Ratio >100 = flashy (urban/tile-drained), <10 = stable (aquifer-fed)
Base Flow Index (Q90/Q50): BFI >0.6 = groundwater-dominated, <0.3 = runoff-dominated

8.10 Part 8: Key Findings

🎯 Critical Findings

8.10.1 1. Severe Spatial Coverage Gap

Evidence: Only 21.6% of HTEM area within 5km of gauge

Impact:

Regional stream-aquifer analysis infeasible
Cannot assess spatial heterogeneity
Urban monitoring bias (3 gauges in 27.8 mi² urban, 0 in 856 mi² agricultural)

Action: Install ≥5 additional gauges in agricultural watersheds

8.10.2 2. Excellent Temporal Coverage

Achievement: 75+ year records, 92,000+ daily measurements

Value:

Detects multi-decadal trends
Captures full range of drought/flood cycles
Pre-development baseline available

8.10.3 3. Flow Duration Curves Reveal Aquifer Connection

Tool: FDC shows aquifer buffering capacity

Application: Compare base flow index with HTEM transmissivity—validation of geophysical interpretations

8.10.4 4. Urban vs. Agricultural Monitoring Bias

Problem: All 3 gauges in HTEM are in urban watershed

Limitation: Urban systems (impervious, stormwater) behave fundamentally differently than agricultural (tile drainage, natural base flow)

Action: Prioritize agricultural watershed monitoring

8.11 Integration Roadmap

Stream gauge data enables:

Part 2: Spatial Patterns

Overlay gauge locations on HTEM grids
Stream proximity analysis and monitoring gaps
Delineate watersheds for each gauge

Part 3: Temporal Dynamics

Streamflow variability and trends over time
Correlate discharge with well water levels
Event response analysis (droughts, floods)

Part 4: Data Fusion Insights

Stream-aquifer exchange analysis
Base flow separation (isolate groundwater contribution)
Recharge estimation from streamflow
Water balance closure validation

Part 5: Predictive Operations

Water level forecasting using stream data
Scenario analysis (pumping impacts on streamflow)
Early warning systems for low-flow conditions

8.12 Recommendations

8.12.1 Immediate (0-6 months)

Contact USGS to verify temporal data quality
Update analyses to clarify spatial limitations
Prioritize gauge site selection in agricultural areas

8.12.2 Short-term (6-18 months)

Install 3-5 new gauges within HTEM extent
Target intermediate scale (20-100 mi²) agricultural watersheds
Achieve ≥50% spatial coverage

8.12.3 Long-term (2-5 years)

Expand to 8-10 gauges for 70% coverage
Co-locate gauges with monitoring wells
Implement real-time telemetry

8.13 Dependencies & Outputs

Data source: usgs_stream (local site metadata + daily values)
Loader: src.data_loaders.USGSStreamLoader
Outputs: Flow duration curves, site statistics, optional exports to outputs/phase-1/usgs/

To access stream data:

from src.data_loaders import USGSStreamLoader
loader = USGSStreamLoader()

# Load discharge time series
discharge = loader.load_daily_discharge(site_no='03337000')

# Calculate flow duration curve
fdc = loader.calculate_flow_duration_curve(site_no='03337000')

8.14 Summary

USGS stream gauge network provides exceptional temporal depth but limited spatial coverage:

✅ 75+ years of records - 92,000+ daily measurements capture full drought/flood cycles

✅ Pre-development baselines - Long-term trends reveal aquifer changes over time

✅ Flow duration curves - Reveal aquifer buffering capacity through base flow analysis

⚠️ Spatial bias - Only 3 gauges within HTEM footprint, all in urban watersheds

⚠️ Agricultural gap - No gauges in tile-drained agricultural areas (different hydrology)

Key Insight: Stream data provides temporal calibration for HTEM snapshots, but with only about 22% of the HTEM area within 5 km of a gauge, regional validation is limited. Prioritize agricultural watershed gauge installation.

8.15 Related Chapters

Well Network Analysis - Co-location opportunities with wells
Weather Station Data - Precipitation-discharge relationships
Streamflow Variability - Temporal analysis of flow patterns
Stream-Aquifer Exchange - Fusion of stream and groundwater data

8.16 Reflection Questions

Based on the flow duration curve and flow-regime metrics, would you classify the longest-record gauge as groundwater-dominated, runoff-dominated, or mixed, and why?
If you were tasked with adding 3–5 new stream gauges, which parts of the HTEM area would you prioritize to reduce spatial bias between urban and agricultural watersheds?
How would you combine streamflow metrics (like Q90 or base flow index) with well and HTEM data to cross-check interpretations of aquifer transmissivity and connectivity?

--- title: "Stream Gauge Network" description: "USGS stream discharge monitoring network coverage and base flow analysis" code-fold: true --- ::: {.callout-tip icon=false} ## For Newcomers **You will learn:** - How streams and aquifers are connected underground - What "base flow" means and why it reveals aquifer health - How to read flow duration curves (a key hydrologic tool) - Why monitoring coverage gaps limit regional analysis Streams are like **windows into the aquifer**—during dry periods, the water you see flowing is actually groundwater seeping out. By measuring stream flow, we indirectly monitor the aquifer itself. ::: ## What You Will Learn in This Chapter By the end of this chapter, you will be able to: - Describe how USGS stream gauges observe surface water flows that are partly driven by groundwater discharge (base flow). - Summarize the current stream gauge network for the study area, including spatial coverage and temporal record length. - Read and interpret basic flow duration curves and flow-regime metrics (Q10, Q50, Q90, base flow index). - Explain the main spatial limitations of the current gauge network and how they affect regional stream–aquifer analyses and fusion with HTEM and wells. ## Streams as Windows into the Aquifer Imagine the aquifer as a vast underground reservoir. Streams are **discharge points** where the aquifer naturally reveals itself at the surface. Stream gauges become powerful **indirect sensors** of aquifer health. **The fundamental connection:** - During dry periods when rain stops, streams don't immediately go dry - Water continues flowing—this is **base flow**, groundwater discharging to the stream - **Base flow = direct measurement of aquifer storage and transmissivity** This chapter explores Champaign County's USGS stream gauge network: coverage, historical records, flow patterns, and critical spatial gaps. 
::: {.callout-warning icon=false} ## ⚠️ Critical Finding: Severe Coverage Gap **USGS stream gauge network**: Only **21.6% of HTEM area** is within 5km of a gauge - 9 gauges total - Only **3 gauges inside HTEM extent** (all in urban Boneyard Creek watershed) - 78% of study area has **no nearby stream monitoring** **Implication**: Regional stream-aquifer connectivity analysis infeasible with current network. ::: --- ## Part 1: The Surface-Groundwater Connection ::: {.callout-tip icon=false} ## 💧 What Is Base Flow? (Simple Explanation) **Base flow** is the water that keeps streams flowing even when it hasn't rained for weeks. **Where does it come from?** The aquifer underground. Think of the aquifer as a giant sponge beneath the ground. During wet periods, rain soaks into this sponge (recharge). During dry periods, water slowly seeps out of the sponge into nearby streams (discharge). **Base flow is this slow, steady groundwater seepage.** **Why does it matter for aquifer management?** - **Aquifer health indicator**: If base flow decreases, the aquifer is being depleted - **Drought resilience**: Streams with high base flow don't dry up during droughts - **Water availability**: Base flow represents water the aquifer "gives" to streams - **Ecosystem support**: Fish and aquatic life depend on base flow during dry months **Simple test**: If a stream still flows in late summer after weeks without rain, it's receiving base flow from the aquifer. If it dries up, there's no aquifer connection. **Technical term**: Hydrologists call this a "gaining stream" (gaining water from the aquifer). ::: ::: {.callout-note icon=false} ## Understanding Base Flow Separation **What Is It?** **Base flow** is the portion of stream discharge that comes from groundwater seeping into the stream channel. The concept was formalized by hydrologists in the 1930s-40s who realized that streams continue flowing during rainless periods—this sustained flow comes from the aquifer, not surface runoff. 
**Base flow separation** is the technique of mathematically splitting stream discharge into two components: fast surface runoff and slow groundwater discharge. **Historical Context**: Robert Horton (1933) pioneered hydrograph analysis, showing that storm runoff and groundwater contributions have distinct signatures in stream flow records. **Why Does It Matter?** Base flow is a **direct measurement of aquifer-stream connectivity**: - **Aquifer health indicator**: Declining base flow = declining aquifer storage - **Drought resilience**: High base flow means streams stay wet during droughts - **Water quality**: Base flow often has different chemistry than runoff - **Ecological function**: Base flow sustains aquatic habitat during dry periods For water managers, base flow reveals how much the aquifer contributes to surface water resources. **How Does It Work?** ```python # Stream discharge has two components: total_discharge = surface_runoff + base_flow # Surface runoff: Precipitation → stream (fast, flashy) # - Responds within hours to days # - Peaks sharply after storms # - Declines rapidly # Base flow: Groundwater discharge → stream (slow, sustained) # - Responds over weeks to months # - Changes gradually # - Provides sustained minimum flow # Base flow ≈ aquifer storage indicator! ``` **Separation methods**: 1. **Graphical**: Draw straight lines under hydrograph peaks (manual) 2. **Recession analysis**: Fit exponential decay curves to recession limbs 3. **Digital filters**: Automated algorithms (Lyne-Hollick, Eckhardt filters) 4. **HYSEP**: USGS program using local minima **What Will You See?** Flow duration curves (FDC) show base flow indirectly through Q90 (flow exceeded 90% of the time). Low values indicate low base flow and poor aquifer connectivity. 
**How to Interpret** | Base Flow Index (BFI) | Stream Type | Aquifer Connection | Management Implication | |----------------------|-------------|-------------------|----------------------| | BFI > 0.7 | Groundwater-dominated | Strong connectivity | Aquifer pumping affects streams | | BFI 0.4-0.7 | Mixed regime | Moderate connectivity | Seasonal aquifer influence | | BFI < 0.4 | Runoff-dominated | Weak connectivity | Streams respond to rain, not aquifer | | BFI declining | Degrading connection | Aquifer depletion or stream incision | Investigate causes | | BFI = 0 | Ephemeral stream | Disconnected | No aquifer support | ::: ::: {.callout-note icon=false} ## 💻 For Computer Scientists **Stream Discharge as Groundwater Proxy:** Base flow = **indirect measurement** of aquifer through groundwater-fed streams! **ML Applications:** - **Feature engineering**: `Q90` (90th percentile flow) = low-flow baseline from aquifer - **Recession analysis**: Fit exponential decay to hydrograph recession to estimate transmissivity - **Multi-source integration**: Stream + Well + HTEM = three views of same system ::: ::: {.callout-tip icon=false} ## 🌍 For Hydrologists **Stream-Aquifer Connectivity:** **Gaining streams** (groundwater discharge): - Stream receives water from aquifer - Base flow sustained during dry periods - Reflects regional water table elevation **Flow Regime Indicators:** - **Base Flow Index (BFI)**: % of streamflow from groundwater - High BFI (>0.6): Strong aquifer connection - Low BFI (<0.3): Flashy, runoff-dominated - **Q90/Q50 Ratio**: Aquifer buffering capacity **HTEM Integration:** High HTEM resistivity (sand/gravel) → High BFI (transmissive aquifer) ::: --- ## Part 2: The Monitoring Network ::: {.callout-note icon=false} ## 📘 Understanding Stream Gauge Networks **What Is a Gauge Network?** A stream gauge network is a system of measurement stations that continuously monitor river and stream discharge (flow rate). The U.S. 
Geological Survey (USGS) operates the nation's primary network, established in the late 1800s. **Why Does It Matter for Aquifers?** Stream gauges provide indirect aquifer monitoring through base flow—the groundwater component of streamflow: - **Base flow = aquifer discharge** to streams - **Declining base flow** = declining aquifer storage - **Flow duration curves** = aquifer buffering capacity **How to Assess Network Quality:** | Network Metric | Excellent | Good | Poor (This Study) | |---------------|-----------|------|-------------------| | Spatial coverage | >70% of area | 40-70% | **22%** | | Temporal coverage | >50 years | 20-50 years | 75+ years ✓ | | Record continuity | <5% gaps | 5-10% gaps | <5% gaps ✓ | **This Network Paradox:** Excellent temporal data (75+ years) but poor spatial coverage (only 22% of HTEM area within 5km of gauge). ::: ```{python} #| label: setup #| echo: false import os import sys from pathlib import Path import pandas as pd import numpy as np import plotly.express as px import plotly.graph_objects as go from plotly.subplots import make_subplots def find_repo_root(start: Path) -> Path: for candidate in [start, *start.parents]: if (candidate / "src").exists(): return candidate return start quarto_project = Path(os.environ.get("QUARTO_PROJECT_DIR", str(Path.cwd()))) project_root = find_repo_root(quarto_project) if str(project_root) not in sys.path: sys.path.append(str(project_root)) from src.data_loaders.usgs_stream_loader import USGSStreamLoader from src.utils import get_data_path # Initialize loader usgs_loader = USGSStreamLoader( data_root=get_data_path("usgs_stream") ) print(f"✓ USGS Stream Loader initialized") print(f" Sites found: {len(usgs_loader.get_site_list())}") ``` ### Site Inventory ::: {.callout-note icon=false} ## 📘 Interpreting Gauge Site Metadata **What Does This Table Show?** Each row represents one USGS stream gauge with its location and elevation. 
**Why These Details Matter:** | Column | What It Tells You | Management Use | |--------|------------------|----------------| | **Site Number** | Unique USGS identifier | Data retrieval, cross-referencing | | **Station Name** | Stream and location | Geographic context | | **Latitude/Longitude** | Precise location | Mapping, proximity analysis | | **Elevation** | Land surface height | Topographic position, drainage area | **How to Read the Table:** - **Urban vs. rural names**: "Boneyard Creek at Urbana" = urban watershed; "Sangamon River near Oakford" = rural - **Elevation range**: Higher elevations = headwaters; lower = downstream positions - **Naming convention**: "at" = specific location; "near" = approximate location **Expected Pattern:** Mix of urban (small watersheds, flashy response) and rural (large watersheds, base flow dominated) gauges for comprehensive monitoring. ::: ```{python} # Load site metadata sites_df = usgs_loader.sites site_summary = sites_df[[ 'site_no', 'station_nm', 'dec_lat_va', 'dec_long_va', 'alt_va' ]].copy() site_summary.columns = [ 'Site Number', 'Station Name', 'Latitude', 'Longitude', 'Elevation (ft)' ] site_summary ``` **Network spans**: - 9 stream gauges across multiple watersheds - Elevation range ~600-800 ft drives gravitational flow - Mix of urban (Boneyard Creek) and agricultural watersheds --- ## Part 3: Spatial Coverage Analysis ::: {.callout-note icon=false} ## 📘 What/Why/How: Assessing Spatial Coverage **What Is Spatial Coverage?** The percentage of the study area within effective monitoring distance (typically 5km) of a stream gauge. **Why Does Coverage Matter?** Sparse coverage creates blind spots: - **Cannot assess regional patterns**: 3 gauges in 2,300 km² = 1 gauge per 767 km² - **Cannot validate HTEM**: Need gauges near HTEM grid to correlate resistivity with base flow - **Cannot detect spatial heterogeneity**: Local stream-aquifer interactions invisible **How to Calculate Coverage:** 1. 
**Buffer analysis**: Draw 5km radius around each gauge (effective monitoring area) 2. **Overlay with HTEM**: What % of HTEM area falls within buffers? 3. **Compare to target**: Industry standard = 70% coverage for regional analysis **How to Interpret:** | Coverage % | Assessment | Capability | Action | |-----------|------------|------------|--------| | **>70%** | Excellent | Regional stream-aquifer analysis | Maintain network | | **40-70%** | Good | Limited regional analysis | Acceptable | | **20-40%** | Poor | Point observations only | Expand network | | **<20%** | Critical failure | Cannot assess regionally | Urgent expansion | **This Study:** 21.6% coverage = Critical failure. ::: How well does the gauge network cover the study area? For meaningful stream-aquifer analysis, we need gauges distributed across the landscape—not clustered in one watershed. The analysis below assesses spatial coverage relative to the HTEM survey footprint. ```{python} # Coverage statistics coverage = pd.Series({ 'number_of_sites': len(sites_df), 'min_latitude': sites_df['dec_lat_va'].min(), 'max_latitude': sites_df['dec_lat_va'].max(), 'min_longitude': sites_df['dec_long_va'].min(), 'max_longitude': sites_df['dec_long_va'].max(), 'min_elevation_ft': sites_df['alt_va'].min(), 'max_elevation_ft': sites_df['alt_va'].max() }) coverage ``` ::: {.callout-warning icon=false} ## Spatial Coverage Gap **HTEM area**: 2,288 km² (44 km × 52 km) **Effective stream gauge coverage**: 495 km² (21.6%) **Only 3 of 9 stations fall within HTEM extent**, all in urban Boneyard Creek watershed (27.8 mi²): - Urban watersheds: **Over-represented** (3 gauges in 27.8 mi²) - Agricultural watersheds: **Under-represented** (0 gauges in rest of HTEM) **Need**: ≥5 additional gauges in agricultural watersheds to achieve 70% coverage target ::: --- ## Part 4: Historical Records Analysis ::: {.callout-note icon=false} ## 📘 Interpreting Temporal Coverage Metrics **What Will You See?** A summary table quantifying the 
stream gauge monitoring history. **Why Long Records Matter:** Temporal depth enables: - **Trend detection**: 50+ years needed to detect climate change signals - **Drought/flood context**: Compare current conditions to historical extremes - **Pre-development baseline**: See aquifer conditions before heavy pumping - **Seasonal patterns**: Decades of data reveal typical vs. anomalous years **How to Read the Metrics:** | Metric | What It Shows | Interpretation Guide | |--------|--------------|---------------------| | **Total sites** | Network size | More sites = better spatial coverage | | **Total measurements** | Data volume | Millions = excellent temporal resolution | | **First observation** | Historical depth | Pre-1950 = exceptional; 1950-1980 = good; post-1980 = limited | | **Last observation** | Currency | Recent = currently operational; old = historical archive | | **Duration** | Record length | >50 years = trend detection possible | **This Network Strength:** 75+ year records (1948-2025) provide exceptional temporal depth for detecting long-term aquifer changes. 
::: ```{python} # Get temporal coverage temporal = usgs_loader.get_temporal_coverage() temporal_summary = pd.DataFrame({ 'metric': [ 'Total Sites', 'Total Daily Measurements', 'First Observation', 'Last Observation', 'Monitoring Duration (years)' ], 'value': [ temporal['number_of_sites'], f"{temporal['total_measurements']:,}", temporal['first_measurement'], temporal['last_measurement'], f"{temporal['duration_years']:.1f}" ] }) temporal_summary ``` **Value of long records**: - Records span **75+ years** in some cases - Captures multiple drought/wet cycles (1988 drought, 1993 flood, 2012 drought) - Enables climate change impact detection - Provides pre-development baseline --- ## Part 5: Discharge Analysis ::: {.callout-note icon=false} ## 📘 Understanding Discharge Statistics Framework **What Is Discharge?** Stream discharge is the volume of water flowing past a point per unit time, measured in cubic feet per second (cfs) or cubic meters per second (cms). **Why Use Percentiles?** Stream flow varies 1000-fold (drought to flood). 
Percentiles compress this into interpretable metrics: - **P10 (or Q10)**: Flow exceeded 10% of time = high flow/flood regime - **P50 (or Q50)**: Flow exceeded 50% of time = median/typical flow - **P90 (or Q90)**: Flow exceeded 90% of time = low flow/**base flow from aquifer** **How to Interpret the Statistics Table:** | Statistic | Physical Meaning | Aquifer Connection | Management Use | |-----------|-----------------|-------------------|---------------| | **Mean** | Average flow | Overall water availability | Water supply planning | | **Median (P50)** | Typical flow | Normal stream condition | Flow targets | | **Min** | Lowest recorded | Drought of record | Worst-case planning | | **Max** | Highest recorded | Flood of record | Infrastructure design | | **P10** | High flow threshold | Flood frequency | Stormwater management | | **P90** | **Groundwater base flow** | **Aquifer discharge** | **Aquifer health indicator** | **Key Insight:** P90 is the most important metric for aquifer analysis—it represents sustained groundwater discharge during dry periods. ::: Stream discharge (measured in cubic feet per second, cfs) varies enormously—from trickles during drought to floods during storms. Statistical summaries like percentiles (P10, P50, P90) compress this variability into actionable metrics. Critically, **P90 (low flow) reflects groundwater base flow**—the aquifer's sustained contribution to streams. 
```{python} # Calculate statistics for each site site_stats = [] for site_no in usgs_loader.get_site_list(): stats = usgs_loader.get_site_statistics(site_no) if stats: site_stats.append(stats) stats_df = pd.DataFrame(site_stats) discharge_stats = stats_df[[ 'site_no', 'discharge_count', 'discharge_mean', 'discharge_median', 'discharge_min', 'discharge_max', 'discharge_p10', 'discharge_p90' ]].copy() discharge_stats.columns = [ 'Site Number', 'Count', 'Mean (cfs)', 'Median (cfs)', 'Min (cfs)', 'Max (cfs)', 'P10 (cfs)', 'P90 (cfs)' ] # Round numeric columns numeric_cols = ['Mean (cfs)', 'Median (cfs)', 'Min (cfs)', 'Max (cfs)', 'P10 (cfs)', 'P90 (cfs)'] for col in numeric_cols: if col in discharge_stats.columns: discharge_stats[col] = discharge_stats[col].round(2) discharge_stats ``` **Key insights**: - **P90**: Low flows, primarily **groundwater contribution** - **P10**: High flows, flood events - **Factor difference**: Often 1000× between min and max (extreme variability) ::: {.callout-tip icon=false} ## 🎯 Management Interpretation of Discharge Statistics **What do P10, P50, P90 tell water managers about the aquifer?** These three numbers reveal the aquifer's role in sustaining streams: ### P90 (Low Flow) - The Aquifer's Contribution **Physical meaning**: Flow exceeded 90% of the time = the flow during dry periods when rain has stopped **Aquifer connection**: - **P90 > 1 cfs**: Stream stays wet during droughts → Aquifer actively supports stream → **Good aquifer-stream connectivity** - **P90 = 0.1-1 cfs**: Stream has minimal flow → Weak aquifer connection → **Marginal support during droughts** - **P90 ≈ 0 cfs**: Stream goes dry → No aquifer connection → **Ephemeral stream, aquifer disconnected** **Management implication**: If P90 is declining over time, the aquifer is losing storage or connectivity. This is an early warning signal. 
### P50 (Median Flow) - Typical Water Availability **Physical meaning**: Half the time flow is above this, half below = "normal" stream condition **Use**: Water supply planning, habitat protection, flow targets for stream restoration ### P10 (High Flow) - Flood Regime **Physical meaning**: Flow exceeded only 10% of time = high flows and floods **Use**: Bridge/culvert design, floodplain management, stormwater infrastructure sizing ### Example: Stream Aquifer Health Assessment **Scenario A - Healthy Aquifer Connection** - P90 = 2.5 cfs, P50 = 8 cfs, P10 = 45 cfs - **Interpretation**: Stream maintains 2.5 cfs even during droughts → Aquifer provides reliable base flow → Good for water supply, ecology **Scenario B - Degraded Connection** - P90 = 0.1 cfs, P50 = 12 cfs, P10 = 200 cfs - **Interpretation**: Stream nearly dries during droughts (P90 ≈ 0) but floods heavily (P10 >> P50) → Flashy runoff-dominated system → Poor aquifer buffering **Scenario C - Declining Trend (CRITICAL)** - 1990s: P90 = 3.0 cfs - 2020s: P90 = 0.8 cfs - **Interpretation**: Base flow declining → Aquifer storage declining or stream incising (disconnecting) → **Investigate causes immediately** **Action**: Compare P90 trends with well water levels and HTEM transmissivity to diagnose aquifer health. ::: --- ## Part 6: Flow Duration Curves ::: {.callout-note icon=false} ## Understanding Flow Duration Curves (FDC) **What Is It?** A **Flow Duration Curve (FDC)** is a graph showing the percentage of time that stream discharge equals or exceeds a given value. Developed by hydrologists in the 1950s, the FDC compresses thousands of daily measurements into a single curve that reveals a stream's flow regime. It's essentially the **cumulative distribution function (CDF)** of streamflow—a statistical signature of watershed hydrology. **Historical Context**: Foster (1934) introduced flow duration analysis for hydroelectric planning. 
Today, FDCs are standard tools in hydrology for comparing watersheds and assessing water availability. **Why Does It Matter?** The FDC shape reveals fundamental watershed characteristics: - **Steep slope**: Flashy, runoff-dominated stream (urban, tile-drained agricultural) - **Gentle slope**: Stable, groundwater-dominated stream (forested, good aquifer connection) - **Q90 value**: Base flow from aquifer—the "minimum reliable flow" - **Q10 value**: Flood regime—infrastructure design implications For aquifer analysis, **FDC slope and Q90 position indicate aquifer buffering capacity**. **How Does It Work?** **Step-by-step construction of a Flow Duration Curve:** 1. **Collect daily discharge data** - Example: 20 years × 365 days = 7,300 daily measurements - Data: [125 cfs, 0.5 cfs, 450 cfs, 2.3 cfs, ...] 2. **Sort flows from highest to lowest** - Highest: 450 cfs (flood event) - ... - Lowest: 0.5 cfs (drought) 3. **Assign rank to each flow** - Rank 1 = highest flow (450 cfs) - Rank 7,300 = lowest flow (0.5 cfs) 4. **Calculate exceedance probability for each rank** ``` Exceedance % = (Rank / Total days) × 100 ``` - Flow 450 cfs → Rank 1 → Exceeded 0.01% of time (rare flood) - Flow 2.3 cfs → Rank 6,570 → Exceeded 90% of time (base flow) 5. **Plot on log scale** - X-axis: Exceedance probability (0% to 100%) - Y-axis: Discharge (cfs), logarithmic scale - Log scale reveals low-flow details (base flow range compressed on linear scale) 6. **Mark key percentiles** - **Q10**: Flow exceeded 10% of time (high flows, floods) - **Q50**: Flow exceeded 50% of time (median flow) - **Q90**: Flow exceeded 90% of time (low flows, base flow) 7. 
**Interpret the curve slope** - **Steep slope** (vertical drop): Flashy regime, large flow variability → Urban/tile-drained watershed, poor aquifer buffering - **Gentle slope** (gradual decline): Stable regime, low variability → Aquifer-fed stream, good buffering **What the slope reveals about the aquifer:** - **Gentle slope**: Aquifer releases water slowly and steadily → Good storage, high transmissivity - **Steep slope**: Aquifer doesn't buffer flow → Either disconnected or low transmissivity - **Flat at high flows, steep at low flows**: Aquifer exhausted during droughts → Limited storage **What Will You See?** A downward-sloping curve on a log scale. The curve starts high (left side = floods that occur 10% of the time) and drops to low values (right side = base flow present 90% of the time). Red markers highlight Q10, Q50, and Q90. **How to Interpret** | FDC Characteristic | Meaning | Aquifer Implication | |-------------------|---------|-------------------| | Steep curve | Flashy, high variability | Poor aquifer buffering | | Gentle curve | Stable, low variability | Strong aquifer buffering | | Q90 > 1 cfs | Sustained base flow | Good aquifer connection | | Q90 ≈ 0 cfs | Stream goes dry | Ephemeral, no base flow | | Q10/Q90 > 100 | Extreme flow range | Urban/tile-drained watershed | | Q10/Q90 < 10 | Modest flow range | Forested/natural watershed | | High Q50 | Abundant water | Large contributing area or wet climate | | Low Q50 | Limited water | Small watershed or dry climate | **Example Interpretation**: - **Gaining stream** (aquifer-fed): Q90 = 2 cfs, gentle slope - **Losing stream** (recharging aquifer): Q90 = 0.1 cfs, steep slope ::: ```{python} #| label: fig-flow-duration-curve #| fig-cap: "Flow duration curve for the longest-record stream gauge. Q10 (high flow), Q50 (median), and Q90 (low/base flow) are marked. The slope of the FDC indicates aquifer buffering capacity." 
# Find longest-record site longest_site = stats_df.loc[stats_df['record_length_years'].idxmax(), 'site_no'] longest_site_name = sites_df.loc[sites_df['site_no'] == longest_site, 'station_nm'].values[0] print(f"Longest record: {longest_site} - {longest_site_name}") # Calculate flow duration curve fdc = usgs_loader.calculate_flow_duration_curve(longest_site) # Plot FDC fig = go.Figure() fig.add_trace( go.Scatter( x=fdc['exceedance_probability'], y=fdc['discharge_cfs'], mode='lines', line=dict(color='steelblue', width=3), name=longest_site_name, hovertemplate='Exceedance: %{x:.0f}%<br>Discharge: %{y:.1f} cfs<extra></extra>' ) ) # Mark key percentiles key_probs = [10, 50, 90] for prob in key_probs: val = fdc.loc[fdc['exceedance_probability'] == prob, 'discharge_cfs'].values[0] fig.add_trace( go.Scatter( x=[prob], y=[val], mode='markers+text', marker=dict(color='red', size=12, symbol='circle'), text=[f'Q{prob} = {val:.1f} cfs'], textposition='top center', showlegend=False ) ) fig.update_layout( title=f'Flow Duration Curve: {longest_site_name}<br><sub>Log scale reveals base flow dynamics</sub>', xaxis_title='Exceedance Probability (%)', yaxis_title='Discharge (cfs)', yaxis_type='log', height=600, template='plotly_white' ) fig.show() ``` ::: {.callout-note icon=false} ## 💻 For Computer Scientists **Flow Duration Curves (FDC) = Empirical CDF of Discharge** FDC is the **cumulative distribution function** of streamflow—a compact hydrologic signature! **Why FDC Matters for ML:** 1. **Dimensionality reduction**: 10,000+ daily values → 100 quantiles (100× compression!) 2. **Watershed classification**: Cluster watersheds by FDC shape 3. **Transfer learning**: Similar FDC = similar watershed (transfer models) 4. 
**Synthetic generation**: Generate realistic hydrographs from FDC + autocorrelation ::: **Reading the FDC**: - **Q10** (high flow): Flood regime - **Q50** (median): Typical streamflow - **Q90** (low flow): **Primarily groundwater base flow** If stream maintains flow at Q90, it's **connected to aquifer**. If Q90 approaches zero, stream is **disconnected** (ephemeral). --- ## Part 7: Flow Regime Analysis ::: {.callout-note icon=false} ## Understanding Flow Regime Metrics **What Are They?** Flow regime metrics are numerical summaries that characterize stream hydrology. The most important are: - **Q10, Q50, Q90**: Percentile flows from the FDC - **Flow Variability Ratio (Q10/Q90)**: Range between high and low flows - **Base Flow Index (BFI)**: Proportion of streamflow from groundwater These metrics were standardized by hydrologists in the 1960s-70s to enable comparison across watersheds and regions. **Why Do They Matter?** These metrics compress complex flow records into actionable numbers: - **Q90**: Water availability during droughts (critical for aquatic habitat, irrigation) - **Q10**: Flood magnitude (bridge/culvert design, floodplain management) - **Q10/Q90 ratio**: Watershed flashiness (drainage design, aquifer connectivity) - **BFI**: Aquifer contribution (validates HTEM interpretations) For aquifer management, **Q90 and BFI directly indicate groundwater discharge to streams**. **How Do They Work?** 1. **Extract percentiles from FDC**: - Q10 = 90th percentile (high flow) - Q50 = 50th percentile (median) - Q90 = 10th percentile (low flow) 2. **Calculate ratios**: ``` Flow Variability = Q10 / Q90 Base Flow Index ≈ Q90 / Q50 ``` 3. **Classify watershed regime**: - High BFI + Low variability = Aquifer-buffered - Low BFI + High variability = Flashy, runoff-dominated **What Will You See?** A table showing the five key metrics with numerical values. Compare these to the interpretation guide below to classify the stream's flow regime. 
**How to Interpret**

| Metric | Value Range | Interpretation | Aquifer Connection |
|--------|------------|---------------|-------------------|
| **Q90** | > 1 cfs | Good base flow | Strong aquifer discharge |
| **Q90** | 0.1-1 cfs | Moderate base flow | Some aquifer connection |
| **Q90** | < 0.1 cfs | Minimal base flow | Weak/no connection |
| **Q10/Q90** | > 100 | Very flashy | Urban or tile-drained |
| **Q10/Q90** | 20-100 | Moderately flashy | Agricultural, some buffering |
| **Q10/Q90** | < 20 | Stable | Forested or strong aquifer |
| **BFI (Q90/Q50)** | > 0.6 | Groundwater-dominated | Excellent connection |
| **BFI (Q90/Q50)** | 0.3-0.6 | Mixed regime | Moderate connection |
| **BFI (Q90/Q50)** | < 0.3 | Runoff-dominated | Poor connection |

**Example**: Stream with Q90=0.5 cfs, Q10/Q90=150, BFI=0.25

- **Interpretation**: Flashy, runoff-dominated stream with only moderate base flow
- **Likely cause**: Urban watershed with impervious surfaces, or tile-drained agriculture
- **Aquifer connection**: Weak; the stream responds to rain, not groundwater

:::

```{python}
# Pull Q10, Q50 and Q90 from the flow duration curve: the discharge
# exceeded 10%, 50% and 90% of the time, respectively
q10 = fdc.loc[fdc['exceedance_probability'] == 10, 'discharge_cfs'].iloc[0]
q50 = fdc.loc[fdc['exceedance_probability'] == 50, 'discharge_cfs'].iloc[0]
q90 = fdc.loc[fdc['exceedance_probability'] == 90, 'discharge_cfs'].iloc[0]

# Summarize the flow regime; Q90/Q50 is used as a simple base flow index estimate
flow_regime = pd.Series({
    'Q10_high_flow_cfs': q10,
    'Q50_median_flow_cfs': q50,
    'Q90_low_flow_cfs': q90,
    'flow_variability_q10_q90_ratio': q10 / q90,
    'base_flow_index_estimate': q90 / q50
})

flow_regime = flow_regime.round(2)
flow_regime
```

**Interpreting metrics**:

- **Flow Variability (Q10/Q90)**: Ratio >100 = very flashy (urban/tile-drained), 20-100 = moderately flashy, <20 = stable (aquifer-fed)
- **Base Flow Index (Q90/Q50)**: BFI >0.6 = groundwater-dominated, 0.3-0.6 = mixed, <0.3 = runoff-dominated. Q90/Q50 is a quick proxy; a formal BFI requires base flow separation.

---

## Part 8: Key Findings

::: {.callout-important icon=true}
## 🎯 Critical Findings

### 1. Severe Spatial Coverage Gap

**Evidence**: Only 21.6% of the HTEM area is within 5 km of a gauge

**Impact**:

- Regional stream-aquifer analysis is infeasible
- Spatial heterogeneity cannot be assessed
- Urban monitoring bias (3 gauges in 27.8 mi² urban, 0 in 856 mi² agricultural)

**Action**: Install ≥5 additional gauges in agricultural watersheds

### 2. Excellent Temporal Coverage

**Achievement**: 75+ year records, 480,000+ daily measurements

**Value**:

- Detects multi-decadal trends
- Captures the full range of drought/flood cycles
- Provides a pre-development baseline

### 3. Flow Duration Curves Reveal Aquifer Connection

**Tool**: The FDC shows aquifer buffering capacity

**Application**: Compare the base flow index with HTEM transmissivity to validate geophysical interpretations

### 4. Urban vs. Agricultural Monitoring Bias

**Problem**: All 3 gauges inside the HTEM extent are in an urban watershed

**Limitation**: Urban systems (impervious surfaces, stormwater drainage) behave fundamentally differently from agricultural systems (tile drainage, natural base flow)

**Action**: Prioritize agricultural watershed monitoring
:::

---

## Integration Roadmap

Stream gauge data enables the following analyses in later parts:

**Part 2: Spatial Patterns**

- Overlay gauge locations on HTEM grids
- Stream proximity analysis and monitoring gaps
- Delineate watersheds for each gauge

**Part 3: Temporal Dynamics**

- Streamflow variability and trends over time
- Correlate discharge with well water levels
- Event response analysis (droughts, floods)

**Part 4: Data Fusion Insights**

- Stream-aquifer exchange analysis
- Base flow separation (isolate groundwater contribution)
- Recharge estimation from streamflow
- Water balance closure validation

**Part 5: Predictive Operations**

- Water level forecasting using stream data
- Scenario analysis (pumping impacts on streamflow)
- Early warning systems for low-flow conditions

---

## Recommendations

### Immediate (0-6 months)

1. Contact USGS to verify temporal data quality
2. Update analyses to clarify spatial limitations
3. Prioritize gauge site selection in agricultural areas

### Short-term (6-18 months)

4. Install 3-5 new gauges within the HTEM extent
5. Target intermediate-scale (20-100 mi²) agricultural watersheds
6. Achieve ≥50% spatial coverage

### Long-term (2-5 years)

7. Expand to 8-10 gauges for 70% coverage
8. Co-locate gauges with monitoring wells
9. Implement real-time telemetry

---

## Dependencies & Outputs

- **Data source**: `usgs_stream` (local site metadata + daily values)
- **Loader**: `src.data_loaders.USGSStreamLoader`
- **Outputs**: Flow duration curves, site statistics, optional exports to `outputs/phase-1/usgs/`

To access stream data:

```python
from src.data_loaders import USGSStreamLoader

loader = USGSStreamLoader()

# Load discharge time series
discharge = loader.load_daily_discharge(site_no='03337000')

# Calculate flow duration curve
fdc = loader.calculate_flow_duration_curve(site_no='03337000')
```

---

## Summary

The USGS stream gauge network provides **exceptional temporal depth but limited spatial coverage**:

✅ **75+ years of records** - 480,000+ daily measurements capture full drought/flood cycles

✅ **Pre-development baselines** - Long-term trends reveal aquifer changes over time

✅ **Flow duration curves** - Reveal aquifer buffering capacity through base flow analysis

⚠️ **Spatial bias** - Only 3 gauges within the HTEM footprint, all in urban watersheds

⚠️ **Agricultural gap** - No gauges in tile-drained agricultural areas (different hydrology)

**Key Insight**: Stream data provides **temporal calibration** for HTEM snapshots, but with only about 22% of the HTEM area within 5 km of a gauge, regional validation is limited. Prioritize gauge installation in agricultural watersheds.
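The spatial coverage figure quoted in this chapter is the share of the HTEM area lying within 5 km of any gauge. The sketch below shows one simple way such a share could be computed, assuming the area is represented by grid-cell centres in a projected coordinate system (metres). The helper name `coverage_fraction`, the synthetic 20 km × 20 km grid and the single gauge location are all made up for illustration; they are not the study data and not necessarily the method used for the 21.6% figure.

```python
from math import hypot


def coverage_fraction(cells, gauges, radius_m=5_000.0):
    """Fraction of grid cells whose centre lies within radius_m of any gauge.

    Hypothetical helper: cells and gauges are lists of (x, y) tuples in metres.
    """
    if not cells:
        raise ValueError("need at least one grid cell")
    covered = sum(
        1
        for cx, cy in cells
        if any(hypot(cx - gx, cy - gy) <= radius_m for gx, gy in gauges)
    )
    return covered / len(cells)


# Synthetic 20 km x 20 km area sampled at 500 m spacing (40 x 40 cell centres)
cells = [(250.0 + 500.0 * i, 250.0 + 500.0 * j)
         for i in range(40) for j in range(40)]

# One synthetic gauge in the middle of the area
gauges = [(10_000.0, 10_000.0)]

frac = coverage_fraction(cells, gauges)
print(f"{frac:.1%} of the synthetic area is within 5 km of a gauge")
```

Running the same function with candidate gauge sites appended to `gauges` gives a quick way to compare how much each proposed location would raise coverage toward the ≥50% target.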
---

## Related Chapters

- [Well Network Analysis](well-network-analysis.qmd) - Co-location opportunities with wells
- [Weather Station Data](weather-station-data.qmd) - Precipitation-discharge relationships
- [Streamflow Variability](../part-3-temporal/streamflow-variability.qmd) - Temporal analysis of flow patterns
- [Stream-Aquifer Exchange](../part-4-fusion/stream-aquifer-exchange.qmd) - Fusion of stream and groundwater data

## Reflection Questions

- Based on the flow duration curve and flow-regime metrics, would you classify the longest-record gauge as groundwater-dominated, runoff-dominated, or mixed, and why?
- If you were tasked with adding 3–5 new stream gauges, which parts of the HTEM area would you prioritize to reduce spatial bias between urban and agricultural watersheds?
- How would you combine streamflow metrics (like Q90 or base flow index) with well and HTEM data to cross-check interpretations of aquifer transmissivity and connectivity?
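As a starting point for the first reflection question, the thresholds from the interpretation table in Part 7 can be applied mechanically. The sketch below is illustrative only: `classify_flow_regime` is a hypothetical helper, not part of `src.data_loaders`, and it takes Q10, Q50 and Q90 in cfs as plain numbers. How values that fall exactly on a boundary (for example BFI = 0.3) are binned is an assumption, since the table does not specify it.

```python
def classify_flow_regime(q10: float, q50: float, q90: float) -> dict:
    """Apply the chapter's interpretation thresholds to Q10, Q50 and Q90 (cfs).

    Hypothetical helper for illustration; boundary handling is an assumption.
    """
    if q50 <= 0 or q90 <= 0:
        # The ratios are undefined if the stream is dry at these percentiles
        raise ValueError("Q50 and Q90 must be positive to compute ratios")

    variability = q10 / q90  # Q10/Q90 flow variability ratio
    bfi = q90 / q50          # Q90/Q50 base flow index estimate

    if q90 > 1:
        base_flow = "Good base flow"
    elif q90 >= 0.1:
        base_flow = "Moderate base flow"
    else:
        base_flow = "Minimal base flow"

    if variability > 100:
        flashiness = "Very flashy"
    elif variability >= 20:
        flashiness = "Moderately flashy"
    else:
        flashiness = "Stable"

    if bfi > 0.6:
        regime = "Groundwater-dominated"
    elif bfi >= 0.3:
        regime = "Mixed regime"
    else:
        regime = "Runoff-dominated"

    return {
        "q10_q90_ratio": variability,
        "bfi_estimate": bfi,
        "base_flow": base_flow,
        "flashiness": flashiness,
        "regime": regime,
    }


# Values chosen to match the worked example in Part 7:
# Q90 = 0.5 cfs, Q10/Q90 = 150, BFI = 0.25
example = classify_flow_regime(q10=75.0, q50=2.0, q90=0.5)
print(example)
```

With a real gauge, the three inputs would come from the `flow_regime` series computed earlier. The labels are only as good as the table's thresholds, so treat them as a first-pass screen and not a diagnosis.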
