8  Stream Gauge Network

Tip: For Newcomers

You will learn:

  • How streams and aquifers are connected underground
  • What “base flow” means and why it reveals aquifer health
  • How to read flow duration curves (a key hydrologic tool)
  • Why monitoring coverage gaps limit regional analysis

Streams are like windows into the aquifer—during dry periods, the water you see flowing is actually groundwater seeping out. By measuring stream flow, we indirectly monitor the aquifer itself.

8.1 What You Will Learn in This Chapter

By the end of this chapter, you will be able to:

  • Describe how USGS stream gauges observe surface water flows that are partly driven by groundwater discharge (base flow).
  • Summarize the current stream gauge network for the study area, including spatial coverage and temporal record length.
  • Read and interpret basic flow duration curves and flow-regime metrics (Q10, Q50, Q90, base flow index).
  • Explain the main spatial limitations of the current gauge network and how they affect regional stream–aquifer analyses and fusion with HTEM and wells.

8.2 Streams as Windows into the Aquifer

Imagine the aquifer as a vast underground reservoir. Streams are discharge points where the aquifer naturally reveals itself at the surface. Stream gauges become powerful indirect sensors of aquifer health.

The fundamental connection:

  • During dry periods when rain stops, streams don’t immediately go dry
  • Water continues flowing: this is base flow, groundwater discharging to the stream
  • Base flow is therefore an indirect indicator of aquifer storage and transmissivity

This chapter explores Champaign County’s USGS stream gauge network: coverage, historical records, flow patterns, and critical spatial gaps.

Warning⚠️ Critical Finding: Severe Coverage Gap

USGS stream gauge network: Only 21.6% of HTEM area is within 5km of a gauge

  • 9 gauges total
  • Only 3 gauges inside HTEM extent (all in urban Boneyard Creek watershed)
  • 78% of study area has no nearby stream monitoring

Implication: Regional stream-aquifer connectivity analysis infeasible with current network.


8.3 Part 1: The Surface-Groundwater Connection

Tip💧 What Is Base Flow? (Simple Explanation)

Base flow is the water that keeps streams flowing even when it hasn’t rained for weeks.

Where does it come from? The aquifer underground.

Think of the aquifer as a giant sponge beneath the ground. During wet periods, rain soaks into this sponge (recharge). During dry periods, water slowly seeps out of the sponge into nearby streams (discharge). Base flow is this slow, steady groundwater seepage.

Why does it matter for aquifer management?

  • Aquifer health indicator: A sustained decline in base flow suggests the aquifer is losing storage
  • Drought resilience: Streams with high base flow don’t dry up during droughts
  • Water availability: Base flow represents water the aquifer “gives” to streams
  • Ecosystem support: Fish and aquatic life depend on base flow during dry months

Simple test: If a stream still flows in late summer after weeks without rain, it’s receiving base flow from the aquifer. If it dries up, the stream is probably losing water to the ground or is disconnected from the aquifer.

Technical term: Hydrologists call this a “gaining stream” (gaining water from the aquifer).

Note: Understanding Base Flow Separation

What Is It? Base flow is the portion of stream discharge that comes from groundwater seeping into the stream channel. The concept was formalized by hydrologists in the 1930s-40s who realized that streams continue flowing during rainless periods—this sustained flow comes from the aquifer, not surface runoff. Base flow separation is the technique of mathematically splitting stream discharge into two components: fast surface runoff and slow groundwater discharge.

Historical Context: Robert Horton (1933) pioneered hydrograph analysis, showing that storm runoff and groundwater contributions have distinct signatures in stream flow records.

Why Does It Matter? Base flow is a practical indicator of aquifer-stream connectivity:

  • Aquifer health indicator: Declining base flow = declining aquifer storage
  • Drought resilience: High base flow means streams stay wet during droughts
  • Water quality: Base flow often has different chemistry than runoff
  • Ecological function: Base flow sustains aquatic habitat during dry periods

For water managers, base flow reveals how much the aquifer contributes to surface water resources.

How Does It Work?

# Stream discharge has two components:
total_discharge = surface_runoff + base_flow

# Surface runoff: Precipitation → stream (fast, flashy)
#   - Responds within hours to days
#   - Peaks sharply after storms
#   - Declines rapidly

# Base flow: Groundwater discharge → stream (slow, sustained)
#   - Responds over weeks to months
#   - Changes gradually
#   - Provides sustained minimum flow

# Base flow ≈ aquifer storage indicator!

Separation methods:

  1. Graphical: Draw straight lines under hydrograph peaks (manual)
  2. Recession analysis: Fit exponential decay curves to recession limbs
  3. Digital filters: Automated algorithms (Lyne-Hollick, Eckhardt filters)
  4. HYSEP: USGS program offering fixed-interval, sliding-interval, and local-minimum methods
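The digital-filter approach can be sketched in a few lines. The example below is a minimal single-pass version of the Lyne-Hollick filter, not the chapter's own implementation: the function name, the default `alpha = 0.925`, and the synthetic hydrograph are assumptions for illustration, and production workflows usually apply several forward and backward passes.

```python
import numpy as np

def lyne_hollick_baseflow(q, alpha=0.925):
    """Single forward pass of the Lyne-Hollick recursive digital filter.

    q     : daily discharge values (any consistent unit, e.g. cfs)
    alpha : filter parameter; 0.925 is a commonly used default, not a calibrated value
    Returns the estimated base flow series (same length as q).
    """
    q = np.asarray(q, dtype=float)
    quick = np.zeros_like(q)  # filtered quickflow (surface runoff) component
    for t in range(1, len(q)):
        f = alpha * quick[t - 1] + 0.5 * (1 + alpha) * (q[t] - q[t - 1])
        # Quickflow cannot be negative or exceed total discharge
        quick[t] = min(max(f, 0.0), q[t])
    return q - quick

# Synthetic hydrograph: steady 2 cfs with one storm peak (illustrative values only)
q = np.array([2.0, 2.0, 2.0, 30.0, 18.0, 9.0, 5.0, 3.0, 2.5, 2.2, 2.0, 2.0])
base = lyne_hollick_baseflow(q)
bfi = base.sum() / q.sum()  # base flow index = base flow volume / total volume
print(f"BFI = {bfi:.2f}")
```

Because the filter only removes the fast-responding part of the signal, a perfectly steady record is returned unchanged (BFI = 1), while storm peaks are attributed mostly to runoff.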

What Will You See? Flow duration curves (FDC) show base flow indirectly through Q90 (flow exceeded 90% of the time). Low values indicate low base flow and poor aquifer connectivity.

How to Interpret

| Base Flow Index (BFI) | Stream Type | Aquifer Connection | Management Implication |
|---|---|---|---|
| BFI > 0.6 | Groundwater-dominated | Strong connectivity | Aquifer pumping affects streams |
| BFI 0.3-0.6 | Mixed regime | Moderate connectivity | Seasonal aquifer influence |
| BFI < 0.3 | Runoff-dominated | Weak connectivity | Streams respond to rain, not aquifer |
| BFI declining | Degrading connection | Aquifer depletion or stream incision | Investigate causes |
| BFI = 0 | Ephemeral stream | Disconnected | No aquifer support |

Note💻 For Computer Scientists

Stream Discharge as Groundwater Proxy:

Base flow = indirect measurement of aquifer through groundwater-fed streams!

ML Applications:

  • Feature engineering: Q90 (flow exceeded 90% of the time, i.e. the 10th percentile of daily flow) = low-flow baseline from aquifer
  • Recession analysis: Fit exponential decay to hydrograph recession to characterize aquifer drainage (related to transmissivity)
  • Multi-source integration: Stream + Well + HTEM = three views of same system
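The recession-analysis idea can be illustrated with a log-linear fit. This is a sketch under stated assumptions: the function name and the synthetic recession are hypothetical, and the model assumed is the simple linear-reservoir form Q(t) = Q0 · exp(−t / k). Converting the time constant k into a transmissivity estimate additionally requires aquifer geometry and storativity, which are not shown here.

```python
import numpy as np

def fit_recession_constant(q_recession):
    """Fit Q(t) = Q0 * exp(-t / k) to a recession limb by log-linear least squares.

    q_recession : strictly positive daily flows from a rainless recession period
    Returns (Q0, k) where k is the recession time constant in days.
    """
    q = np.asarray(q_recession, dtype=float)
    t = np.arange(len(q))
    slope, intercept = np.polyfit(t, np.log(q), 1)  # ln Q = ln Q0 - t / k
    return float(np.exp(intercept)), float(-1.0 / slope)

# Synthetic recession with a known 20-day time constant (illustrative only)
t = np.arange(15)
q_obs = 12.0 * np.exp(-t / 20.0)
q0, k = fit_recession_constant(q_obs)
print(f"Q0 = {q0:.1f} cfs, k = {k:.1f} days")
```

A larger k means the stream drains the aquifer more slowly, which is the behaviour expected of a well-buffered, groundwater-fed stream.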

Tip🌍 For Hydrologists

Stream-Aquifer Connectivity:

Gaining streams (groundwater discharge):

  • Stream receives water from aquifer
  • Base flow sustained during dry periods
  • Reflects regional water table elevation

Flow Regime Indicators:

  • Base Flow Index (BFI): Fraction of streamflow from groundwater
  • High BFI (>0.6): Strong aquifer connection
  • Low BFI (<0.3): Flashy, runoff-dominated
  • Q90/Q50 Ratio: Aquifer buffering capacity

HTEM Integration: High HTEM resistivity (sand/gravel) → High BFI (transmissive aquifer)


8.4 Part 2: The Monitoring Network

Note📘 Understanding Stream Gauge Networks

What Is a Gauge Network? A stream gauge network is a system of measurement stations that continuously monitor river and stream discharge (flow rate). The U.S. Geological Survey (USGS) operates the nation’s primary network, established in the late 1800s.

Why Does It Matter for Aquifers? Stream gauges provide indirect aquifer monitoring through base flow—the groundwater component of streamflow:

  • Base flow = aquifer discharge to streams
  • Declining base flow = declining aquifer storage
  • Flow duration curves = aquifer buffering capacity

How to Assess Network Quality:

| Network Metric | Excellent | Good | This Study |
|---|---|---|---|
| Spatial coverage | >70% of area | 40-70% | 22% (poor) |
| Temporal coverage | >50 years | 20-50 years | 75+ years ✓ |
| Record continuity | <5% gaps | 5-10% gaps | <5% gaps ✓ |

This Network Paradox: Excellent temporal data (75+ years) but poor spatial coverage (only 22% of HTEM area within 5km of gauge).

✓ USGS Stream Loader initialized
  Sites found: 9

8.4.1 Site Inventory

Note📘 Interpreting Gauge Site Metadata

What Does This Table Show? Each row represents one USGS stream gauge with its location and elevation.

Why These Details Matter:

| Column | What It Tells You | Management Use |
|---|---|---|
| Site Number | Unique USGS identifier | Data retrieval, cross-referencing |
| Station Name | Stream and location | Geographic context |
| Latitude/Longitude | Precise location | Mapping, proximity analysis |
| Elevation | Land surface height | Topographic position, drainage area |

How to Read the Table:

  • Urban vs. rural names: “Boneyard Creek at Urbana” = urban watershed; “Salt Fork near St. Joseph” = rural
  • Elevation range: Higher elevations = headwaters; lower = downstream positions
  • Naming convention: “at” = specific location; “near” = approximate location

Expected Pattern: Mix of urban (small watersheds, flashy response) and rural (large watersheds, base flow dominated) gauges for comprehensive monitoring.

# Load site metadata
sites_df = usgs_loader.sites

site_summary = sites_df[[
    'site_no',
    'station_nm',
    'dec_lat_va',
    'dec_long_va',
    'alt_va'
]].copy()

site_summary.columns = [
    'Site Number',
    'Station Name',
    'Latitude',
    'Longitude',
    'Elevation (ft)'
]

site_summary
Site Number Station Name Latitude Longitude Elevation (ft)
0 03336890 SPOON RIVER NEAR ST. JOSEPH, IL 40.164194 -88.027500 650.00
1 03336900 SALT FORK NEAR ST. JOSEPH, IL 40.149556 -88.033639 649.59
2 03336998 BONEYARD CREEK BELOW 6TH STREET AT CHAMPAIGN, IL 40.111194 -88.229889 693.88
3 03337000 BONEYARD CREEK AT URBANA, IL 40.111306 -88.226556 693.88
4 03337100 BONEYARD CREEK AT LINCOLN AVE AT URBANA, IL 40.111361 -88.219417 693.88
5 03337570 SALINE BRANCH ABOVE 1700E NEAR URBANA, IL 40.129833 -88.151667 670.47
6 03343350 BLACK SLOUGH AT CR 500N NR PHILO, IL 39.952611 -88.169222 645.86
7 05570910 SANGAMON RIVER AT FISHER, IL 40.310997 -88.322361 683.20
8 05590050 COPPER SLOUGH AT CHAMPAIGN, IL 40.097222 -88.307250 694.80

Network spans:

  • 9 stream gauges across multiple watersheds
  • Gauge elevations range from about 646 to 695 ft, roughly 49 ft of relief that drives gravitational flow
  • Mix of urban (Boneyard Creek) and agricultural watersheds


8.5 Part 3: Spatial Coverage Analysis

Note📘 What/Why/How: Assessing Spatial Coverage

What Is Spatial Coverage? The percentage of the study area within effective monitoring distance (typically 5km) of a stream gauge.

Why Does Coverage Matter? Sparse coverage creates blind spots:

  • Cannot assess regional patterns: 3 gauges in 2,300 km² = 1 gauge per 767 km²
  • Cannot validate HTEM: Need gauges near HTEM grid to correlate resistivity with base flow
  • Cannot detect spatial heterogeneity: Local stream-aquifer interactions invisible

How to Calculate Coverage:

  1. Buffer analysis: Draw 5km radius around each gauge (effective monitoring area)
  2. Overlay with HTEM: What % of HTEM area falls within buffers?
  3. Compare to target: This chapter uses 70% coverage as the target for regional analysis
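The buffer-and-overlay steps above can be approximated on a regular grid. The sketch below is not the analysis used for the 21.6% figure: it assumes a rectangular 44 km × 52 km study area, planar coordinates in kilometres, and hypothetical gauge positions chosen only to contrast clustered and spread-out layouts.

```python
import numpy as np

def buffer_coverage_percent(gauges_km, width_km, height_km, radius_km=5.0, cell_km=0.1):
    """Percent of a rectangular study area within radius_km of any gauge.

    gauges_km : iterable of (x, y) gauge positions in km, in the rectangle's local frame
    Uses a regular grid of cell centres, so the result is approximate.
    """
    xs = np.arange(cell_km / 2, width_km, cell_km)
    ys = np.arange(cell_km / 2, height_km, cell_km)
    gx, gy = np.meshgrid(xs, ys)
    covered = np.zeros(gx.shape, dtype=bool)
    for x0, y0 in gauges_km:
        # Union of circular buffers: a cell counts once even if several gauges reach it
        covered |= (gx - x0) ** 2 + (gy - y0) ** 2 <= radius_km ** 2
    return 100.0 * covered.mean()

# Hypothetical layouts in a 44 x 52 km area (positions are illustrative, not real gauges)
clustered = [(20.0, 25.0), (20.5, 25.0), (21.0, 25.0)]  # like an urban cluster
spread = [(10.0, 12.0), (22.0, 26.0), (34.0, 40.0)]     # same count, dispersed
print(f"clustered: {buffer_coverage_percent(clustered, 44, 52):.1f}%")
print(f"spread:    {buffer_coverage_percent(spread, 44, 52):.1f}%")
```

Two points follow from the geometry. A single 5 km buffer covers at most π × 5² ≈ 78.5 km², about 3.4% of a 2,288 km² area, and clustered gauges add little because their buffers overlap.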

How to Interpret:

| Coverage % | Assessment | Capability | Action |
|---|---|---|---|
| >70% | Excellent | Regional stream-aquifer analysis | Maintain network |
| 40-70% | Good | Limited regional analysis | Acceptable |
| 20-40% | Poor | Point observations only | Expand network |
| <20% | Critical failure | Cannot assess regionally | Urgent expansion |

This Study: 21.6% coverage = Poor, just above the critical-failure threshold of 20%.

How well does the gauge network cover the study area? For meaningful stream-aquifer analysis, we need gauges distributed across the landscape—not clustered in one watershed. The analysis below assesses spatial coverage relative to the HTEM survey footprint.

# Coverage statistics
coverage = pd.Series({
    'number_of_sites': len(sites_df),
    'min_latitude': sites_df['dec_lat_va'].min(),
    'max_latitude': sites_df['dec_lat_va'].max(),
    'min_longitude': sites_df['dec_long_va'].min(),
    'max_longitude': sites_df['dec_long_va'].max(),
    'min_elevation_ft': sites_df['alt_va'].min(),
    'max_elevation_ft': sites_df['alt_va'].max()
})

coverage
number_of_sites       9.000000
min_latitude         39.952611
max_latitude         40.310997
min_longitude       -88.322361
max_longitude       -88.027500
min_elevation_ft    645.860000
max_elevation_ft    694.800000
dtype: float64

Warning: Spatial Coverage Gap

HTEM area: 2,288 km² (44 km × 52 km)
Effective stream gauge coverage: 495 km² (21.6%)

Only 3 of 9 stations fall within HTEM extent, all in urban Boneyard Creek watershed (27.8 mi²):

  • Urban watersheds: Over-represented (3 gauges in 27.8 mi²)
  • Agricultural watersheds: Under-represented (0 gauges in rest of HTEM)

Need: ≥5 additional gauges in agricultural watersheds as a first step toward the 70% coverage target. A single 5 km buffer covers at most about 78.5 km² (3.4% of the HTEM area), so each new gauge adds only a few percentage points of coverage.


8.6 Part 4: Historical Records Analysis

Note📘 Interpreting Temporal Coverage Metrics

What Will You See? A summary table quantifying the stream gauge monitoring history.

Why Long Records Matter: Temporal depth enables:

  • Trend detection: 50+ years needed to detect climate change signals
  • Drought/flood context: Compare current conditions to historical extremes
  • Pre-development baseline: See aquifer conditions before heavy pumping
  • Seasonal patterns: Decades of data reveal typical vs. anomalous years

How to Read the Metrics:

| Metric | What It Shows | Interpretation Guide |
|---|---|---|
| Total sites | Network size | More sites = better spatial coverage |
| Total measurements | Data volume | Millions = excellent temporal resolution |
| First observation | Historical depth | Pre-1950 = exceptional; 1950-1980 = good; post-1980 = limited |
| Last observation | Currency | Recent = currently operational; old = historical archive |
| Duration | Record length | >50 years = trend detection possible |

This Network Strength: 75+ year records (1948-2025) provide exceptional temporal depth for detecting long-term aquifer changes.

# Get temporal coverage
temporal = usgs_loader.get_temporal_coverage()

temporal_summary = pd.DataFrame({
    'metric': [
        'Total Sites',
        'Total Daily Measurements',
        'First Observation',
        'Last Observation',
        'Monitoring Duration (years)'
    ],
    'value': [
        temporal['number_of_sites'],
        f"{temporal['total_measurements']:,}",
        temporal['first_measurement'],
        temporal['last_measurement'],
        f"{temporal['duration_years']:.1f}"
    ]
})

temporal_summary
metric value
0 Total Sites 7
1 Total Daily Measurements 92,034
2 First Observation 1948-07-15 00:00:00
3 Last Observation 2025-10-29 00:00:00
4 Monitoring Duration (years) 77.3

Value of long records:

  • Records span 75+ years in some cases
  • Captures multiple drought/wet cycles (1988 drought, 1993 flood, 2012 drought)
  • Enables climate change impact detection
  • Provides pre-development baseline
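The duration figure can be reproduced from the first and last observation dates, and the same arithmetic gives a quick completeness check. This is a sketch, not the loader's method: it assumes the network-wide first and last dates belong to site 03337000, whose daily-value count of 28,231 is reported in the discharge statistics later in this chapter.

```python
import pandas as pd

first = pd.Timestamp("1948-07-15")
last = pd.Timestamp("2025-10-29")

span_days = (last - first).days + 1            # inclusive count of calendar days
duration_years = (last - first).days / 365.25  # average year length, including leap days

# Completeness = observed daily values / calendar days in the span.
# 28,231 is the count reported for site 03337000 (assumed to own both dates).
observed = 28231
completeness = 100.0 * observed / span_days

print(f"{duration_years:.1f} years, {completeness:.1f}% complete")
```

Under that assumption the observed count equals the number of calendar days in the span, which supports the "<5% gaps" rating for record continuity at this site.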


8.7 Part 5: Discharge Analysis

Note📘 Understanding Discharge Statistics Framework

What Is Discharge? Stream discharge is the volume of water flowing past a point per unit time, measured in cubic feet per second (cfs) or cubic meters per second (cms).

Why Use Percentiles? Stream flow can vary 1000-fold between drought and flood. Percentiles compress this range into interpretable metrics. The text uses exceedance notation, where Q10, Q50, and Q90 are the flows equaled or exceeded 10%, 50%, and 90% of the time:

  • Q10: Flow exceeded 10% of time = high flow/flood regime (the 90th percentile of daily flows)
  • Q50: Flow exceeded 50% of time = median/typical flow
  • Q90: Flow exceeded 90% of time = low flow/base flow from aquifer (the 10th percentile of daily flows)

The statistics table below reports ordinary percentiles, so its P10 column is the low-flow value (Q90) and its P90 column is the high-flow value (Q10).

How to Interpret the Statistics Table:

| Statistic | Physical Meaning | Aquifer Connection | Management Use |
|---|---|---|---|
| Mean | Average flow | Overall water availability | Water supply planning |
| Median (P50 = Q50) | Typical flow | Normal stream condition | Flow targets |
| Min | Lowest recorded | Drought of record | Worst-case planning |
| Max | Highest recorded | Flood of record | Infrastructure design |
| P90 (= Q10) | High flow threshold | Flood frequency | Stormwater management |
| P10 (= Q90) | Groundwater base flow | Aquifer discharge | Aquifer health indicator |

Key Insight: Q90 (the P10 column) is the most important metric for aquifer analysis. It represents sustained groundwater discharge during dry periods.

Stream discharge (measured in cubic feet per second, cfs) varies enormously, from trickles during drought to floods during storms. Statistical summaries like Q10, Q50, and Q90 compress this variability into actionable metrics. Critically, Q90 (low flow) reflects groundwater base flow: the aquifer’s sustained contribution to streams.
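Exceedance flows and ordinary percentiles are easy to confuse, so a short check helps. The sketch below uses synthetic lognormal flows (an assumption, not gauge data) and a hypothetical helper name to show that the flow exceeded p% of the time is the (100 − p)th percentile of the daily values.

```python
import numpy as np

rng = np.random.default_rng(0)
flows = rng.lognormal(mean=1.0, sigma=1.2, size=5000)  # synthetic daily flows, cfs

def exceedance_flow(flows, exceed_pct):
    """Flow equaled or exceeded exceed_pct percent of the time."""
    return float(np.percentile(flows, 100 - exceed_pct))

q10, q50, q90 = (exceedance_flow(flows, p) for p in (10, 50, 90))
share_above_q90 = np.mean(flows >= q90)  # should be close to 0.90

print(f"Q10 = {q10:.2f}, Q50 = {q50:.2f}, Q90 = {q90:.2f} cfs")
print(f"Share of days at or above Q90: {share_above_q90:.2f}")
```

The ordering Q10 > Q50 > Q90 is a useful sanity check whenever a table labels columns by percentile rather than by exceedance.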

# Calculate statistics for each site
site_stats = []

for site_no in usgs_loader.get_site_list():
    stats = usgs_loader.get_site_statistics(site_no)
    if stats:
        site_stats.append(stats)

stats_df = pd.DataFrame(site_stats)

discharge_stats = stats_df[[
    'site_no',
    'discharge_count',
    'discharge_mean',
    'discharge_median',
    'discharge_min',
    'discharge_max',
    'discharge_p10',
    'discharge_p90'
]].copy()

discharge_stats.columns = [
    'Site Number',
    'Count',
    'Mean (cfs)',
    'Median (cfs)',
    'Min (cfs)',
    'Max (cfs)',
    'P10 (cfs)',
    'P90 (cfs)'
]

# Round numeric columns
numeric_cols = ['Mean (cfs)', 'Median (cfs)', 'Min (cfs)', 'Max (cfs)', 'P10 (cfs)', 'P90 (cfs)']
for col in numeric_cols:
    if col in discharge_stats.columns:
        discharge_stats[col] = discharge_stats[col].round(2)

discharge_stats
Site Number Count Mean (cfs) Median (cfs) Min (cfs) Max (cfs) P10 (cfs) P90 (cfs)
0 03336890 4544.0 36.83 15.80 1.05 1600.0 2.61 70.07
1 03336900 19839.0 121.36 51.60 0.67 5550.0 9.67 256.00
2 03336998 NaN NaN NaN NaN NaN NaN NaN
3 03337000 28231.0 4.75 2.38 0.03 241.0 1.10 9.40
4 03337100 8712.0 7.15 3.85 0.87 214.0 2.10 13.50
5 03337570 6056.0 99.18 54.70 10.60 2910.0 20.00 197.50
6 03343350 NaN NaN NaN NaN NaN NaN NaN
7 05570910 17196.0 221.82 87.00 0.00 9250.0 5.50 520.50
8 05590050 7456.0 11.30 5.82 0.22 598.0 2.09 22.40

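Key insights:

  • P10 column: Low flows, primarily groundwater contribution. This ordinary 10th percentile is Q90 in exceedance notation (1.10 cfs at site 03337000)
  • P90 column: High flows. This ordinary 90th percentile is Q10 in exceedance notation (9.40 cfs at site 03337000)
  • Factor difference: Often 1000× or more between min and max (extreme variability)
  • Sites 03336998 and 03343350 return no discharge statistics (NaN)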
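The min-to-max variability can be checked directly from the table. The values below are copied from the statistics output above; the column names are chosen for this sketch, and the zero minimum at one site is masked so the ratio is reported as undefined rather than infinite.

```python
import pandas as pd

# Min and max daily discharge (cfs) copied from the statistics table above
table = pd.DataFrame(
    {
        "site_no": ["03336890", "03336900", "03337000", "03337100",
                    "03337570", "05570910", "05590050"],
        "min_cfs": [1.05, 0.67, 0.03, 0.87, 10.60, 0.00, 0.22],
        "max_cfs": [1600.0, 5550.0, 241.0, 214.0, 2910.0, 9250.0, 598.0],
    }
)

# A zero minimum makes the ratio undefined, so mask it instead of dividing by zero
safe_min = table["min_cfs"].where(table["min_cfs"] > 0)
table["max_min_ratio"] = (table["max_cfs"] / safe_min).round(0)
print(table)
```

Four of the seven sites with data exceed a 1000× ratio, and the site with a recorded minimum of 0.00 cfs has gone dry at least once.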

Tip🎯 Management Interpretation of Discharge Statistics

What do Q10, Q50, Q90 tell water managers about the aquifer?

These three exceedance flows reveal the aquifer’s role in sustaining streams. In the statistics table above, Q90 is the P10 column and Q10 is the P90 column.

8.7.1 Q90 (Low Flow) - The Aquifer’s Contribution

Physical meaning: Flow exceeded 90% of the time = the flow during dry periods when rain has stopped

Aquifer connection:

  • Q90 > 1 cfs: Stream stays wet during droughts → Aquifer actively supports stream → Good aquifer-stream connectivity
  • Q90 = 0.1-1 cfs: Stream has minimal flow → Weak aquifer connection → Marginal support during droughts
  • Q90 ≈ 0 cfs: Stream goes dry → No aquifer connection → Ephemeral stream, aquifer disconnected

Management implication: If Q90 is declining over time, the aquifer may be losing storage or connectivity. This is an early warning signal.

8.7.2 Q50 (Median Flow) - Typical Water Availability

Physical meaning: Half the time flow is above this, half below = “normal” stream condition

Use: Water supply planning, habitat protection, flow targets for stream restoration

8.7.3 Q10 (High Flow) - Flood Regime

Physical meaning: Flow exceeded only 10% of time = high flows and floods

Use: Bridge/culvert design, floodplain management, stormwater infrastructure sizing

8.7.4 Example: Stream Aquifer Health Assessment

Values below use exceedance notation (Q90 = flow exceeded 90% of the time).

Scenario A - Healthy Aquifer Connection

  • Q90 = 2.5 cfs, Q50 = 8 cfs, Q10 = 45 cfs
  • Interpretation: Stream maintains 2.5 cfs even during droughts → Aquifer provides reliable base flow → Good for water supply, ecology

Scenario B - Degraded Connection

  • Q90 = 0.1 cfs, Q50 = 12 cfs, Q10 = 200 cfs
  • Interpretation: Stream nearly dries during droughts (Q90 ≈ 0) but floods heavily (Q10 >> Q50) → Flashy runoff-dominated system → Poor aquifer buffering

Scenario C - Declining Trend (CRITICAL)

  • 1990s: Q90 = 3.0 cfs
  • 2020s: Q90 = 0.8 cfs
  • Interpretation: Base flow declining → Aquifer storage declining or stream incising (disconnecting) → Investigate causes immediately

Action: Compare Q90 trends with well water levels and HTEM transmissivity to diagnose aquifer health.
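A declining-trend check like Scenario C can be sketched by computing the low-flow statistic per decade. The series below is synthetic, with a decline built in on purpose; the function name is hypothetical, and the low-flow value is taken as the 10th percentile of daily flows (the flow exceeded 90% of the time).

```python
import numpy as np
import pandas as pd

def q90_by_decade(daily):
    """Flow exceeded 90% of the time, computed for each decade of a daily Series."""
    decade = (np.asarray(daily.index.year) // 10) * 10
    return daily.groupby(decade).quantile(0.10)  # 10th percentile = 90% exceedance

# Synthetic record whose flows decline over time (illustrative, not observed data)
dates = pd.date_range("1990-01-01", "2019-12-31", freq="D")
rng = np.random.default_rng(1)
decline = np.linspace(1.0, 0.4, len(dates))
daily = pd.Series(rng.lognormal(1.5, 0.8, len(dates)) * decline, index=dates)

q90 = q90_by_decade(daily)
print(q90.round(2))
```

With real gauge data, a drop like this would be a prompt to compare against well hydrographs before attributing it to the aquifer, since land-use and channel changes can produce the same signal.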


8.8 Part 6: Flow Duration Curves

Note: Understanding Flow Duration Curves (FDC)

What Is It? A Flow Duration Curve (FDC) is a graph showing the percentage of time that stream discharge equals or exceeds a given value. The FDC compresses thousands of daily measurements into a single curve that reveals a stream’s flow regime. It is essentially the complement of the cumulative distribution function (CDF) of streamflow (exceedance = 1 − CDF), a statistical signature of watershed hydrology.

Historical Context: Foster (1934) introduced flow duration analysis for hydroelectric planning. Today, FDCs are standard tools in hydrology for comparing watersheds and assessing water availability.

Why Does It Matter? The FDC shape reveals fundamental watershed characteristics:

  • Steep slope: Flashy, runoff-dominated stream (urban, tile-drained agricultural)
  • Gentle slope: Stable, groundwater-dominated stream (forested, good aquifer connection)
  • Q90 value: Base flow from aquifer—the “minimum reliable flow”
  • Q10 value: Flood regime—infrastructure design implications

For aquifer analysis, FDC slope and Q90 position indicate aquifer buffering capacity.

How Does It Work?

Step-by-step construction of a Flow Duration Curve:

  1. Collect daily discharge data

    • Example: 20 years × 365 days = 7,300 daily measurements
    • Data: [125 cfs, 0.5 cfs, 450 cfs, 2.3 cfs, …]
  2. Sort flows from highest to lowest

    • Highest: 450 cfs (flood event)
    • Lowest: 0.5 cfs (drought)
  3. Assign rank to each flow

    • Rank 1 = highest flow (450 cfs)
    • Rank 7,300 = lowest flow (0.5 cfs)
  4. Calculate exceedance probability for each rank

    Exceedance % = (Rank / Total days) × 100
    • Flow 450 cfs → Rank 1 → Exceeded 0.01% of time (rare flood)
    • Flow 2.3 cfs → Rank 6,570 → Exceeded 90% of time (base flow)
  5. Plot on log scale

    • X-axis: Exceedance probability (0% to 100%)
    • Y-axis: Discharge (cfs), logarithmic scale
    • Log scale reveals low-flow details (base flow range compressed on linear scale)
  6. Mark key percentiles

    • Q10: Flow exceeded 10% of time (high flows, floods)
    • Q50: Flow exceeded 50% of time (median flow)
    • Q90: Flow exceeded 90% of time (low flows, base flow)
  7. Interpret the curve slope

    • Steep slope (vertical drop): Flashy regime, large flow variability → Urban/tile-drained watershed, poor aquifer buffering
    • Gentle slope (gradual decline): Stable regime, low variability → Aquifer-fed stream, good buffering

What the slope reveals about the aquifer:

  • Gentle slope: Aquifer releases water slowly and steadily → Good storage, high transmissivity
  • Steep slope: Aquifer doesn’t buffer flow → Either disconnected or low transmissivity
  • Flat at high flows, steep at low flows: Aquifer exhausted during droughts → Limited storage
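The construction steps (sort, rank, convert rank to exceedance) fit in a short function. This sketch uses the rank / n formula given in the steps and the four example flows from step 1. It is an illustration only and may differ from what `calculate_flow_duration_curve` does internally.

```python
import numpy as np

def flow_duration_curve(flows):
    """Return (exceedance_percent, sorted_flows) for a set of daily discharges."""
    flows = np.asarray(flows, dtype=float)
    flows = flows[~np.isnan(flows)]            # drop missing days
    sorted_flows = np.sort(flows)[::-1]        # step 2: highest to lowest
    ranks = np.arange(1, len(sorted_flows) + 1)  # step 3: rank 1 = highest flow
    exceedance_pct = 100.0 * ranks / len(sorted_flows)  # step 4
    return exceedance_pct, sorted_flows

# The four example flows from step 1 (cfs)
exceed, q = flow_duration_curve([125, 0.5, 450, 2.3])
for e, v in zip(exceed, q):
    print(f"{v:7.1f} cfs is equaled or exceeded {e:5.1f}% of the time")
```

With the 7,300-day example in the text, rank 6,570 gives 100 × 6570 / 7300 = 90%, which is why the flow at that rank is Q90.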

What Will You See? A downward-sloping curve on a log scale. The curve starts high on the left (high flows that are exceeded only rarely; Q10 is exceeded 10% of the time) and drops to low values on the right (low flows that are exceeded most of the time; Q90 is exceeded 90% of the time). Red markers highlight Q10, Q50, and Q90.

How to Interpret

| FDC Characteristic | Meaning | Aquifer Implication |
|---|---|---|
| Steep curve | Flashy, high variability | Poor aquifer buffering |
| Gentle curve | Stable, low variability | Strong aquifer buffering |
| Q90 > 1 cfs | Sustained base flow | Good aquifer connection |
| Q90 ≈ 0 cfs | Stream goes dry | Ephemeral, no base flow |
| Q10/Q90 > 100 | Extreme flow range | Urban/tile-drained watershed |
| Q10/Q90 < 10 | Modest flow range | Forested/natural watershed |
| High Q50 | Abundant water | Large contributing area or wet climate |
| Low Q50 | Limited water | Small watershed or dry climate |

Example Interpretation:

  • Gaining stream (aquifer-fed): Q90 = 2 cfs, gentle slope
  • Losing stream (recharging aquifer): Q90 = 0.1 cfs, steep slope

# Find longest-record site
longest_site = stats_df.loc[stats_df['record_length_years'].idxmax(), 'site_no']
longest_site_name = sites_df.loc[sites_df['site_no'] == longest_site, 'station_nm'].values[0]

print(f"Longest record: {longest_site} - {longest_site_name}")

# Calculate flow duration curve
fdc = usgs_loader.calculate_flow_duration_curve(longest_site)

# Plot FDC
fig = go.Figure()

fig.add_trace(
    go.Scatter(
        x=fdc['exceedance_probability'],
        y=fdc['discharge_cfs'],
        mode='lines',
        line=dict(color='steelblue', width=3),
        name=longest_site_name,
        hovertemplate='Exceedance: %{x:.0f}%<br>Discharge: %{y:.1f} cfs<extra></extra>'
    )
)

# Mark key percentiles
key_probs = [10, 50, 90]
for prob in key_probs:
    val = fdc.loc[fdc['exceedance_probability'] == prob, 'discharge_cfs'].values[0]
    fig.add_trace(
        go.Scatter(
            x=[prob],
            y=[val],
            mode='markers+text',
            marker=dict(color='red', size=12, symbol='circle'),
            text=[f'Q{prob} = {val:.1f} cfs'],
            textposition='top center',
            showlegend=False
        )
    )

fig.update_layout(
    title=f'Flow Duration Curve: {longest_site_name}<br><sub>Log scale reveals base flow dynamics</sub>',
    xaxis_title='Exceedance Probability (%)',
    yaxis_title='Discharge (cfs)',
    yaxis_type='log',
    height=600,
    template='plotly_white'
)

fig.show()
Longest record: 03337000 - BONEYARD CREEK AT URBANA, IL
Figure 8.1: Flow duration curve for the longest-record stream gauge. Q10 (high flow), Q50 (median), and Q90 (low/base flow) are marked. The slope of the FDC indicates aquifer buffering capacity.

Note💻 For Computer Scientists

Flow Duration Curves (FDC) = Empirical Exceedance Curve of Discharge

The FDC is the complement of the empirical cumulative distribution function of streamflow (1 − CDF, also called the survival function), a compact hydrologic signature.

Why FDC Matters for ML:

  1. Dimensionality reduction: 10,000+ daily values → 100 quantiles (100× compression!)
  2. Watershed classification: Cluster watersheds by FDC shape
  3. Transfer learning: Similar FDC = similar watershed (transfer models)
  4. Synthetic generation: Generate realistic hydrographs from FDC + autocorrelation

Reading the FDC:

  • Q10 (high flow): Flood regime
  • Q50 (median): Typical streamflow
  • Q90 (low flow): Primarily groundwater base flow

If stream maintains flow at Q90, it’s connected to aquifer. If Q90 approaches zero, stream is disconnected (ephemeral).
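The dimensionality-reduction and similarity ideas can be sketched as follows. Everything here is an assumption for illustration: the helper names, the choice of 99 exceedance levels (1% to 99%), the log-space distance, and the synthetic lognormal records standing in for a stable and a flashy stream.

```python
import numpy as np

def fdc_signature(flows, n_points=99):
    """Compress a daily flow record into flows at evenly spaced exceedance levels (1%..99%)."""
    exceed = np.linspace(1, 99, n_points)
    return np.percentile(np.asarray(flows, dtype=float), 100 - exceed)

def fdc_distance(sig_a, sig_b):
    """Root-mean-square difference between two signatures in log10 space (flows must be > 0)."""
    return float(np.sqrt(np.mean((np.log10(sig_a) - np.log10(sig_b)) ** 2)))

rng = np.random.default_rng(2)
stable = rng.lognormal(mean=1.0, sigma=0.4, size=10000)  # narrow flow range
flashy = rng.lognormal(mean=1.0, sigma=1.5, size=10000)  # wide flow range

sig_stable = fdc_signature(stable)
sig_flashy = fdc_signature(flashy)
print(f"signature length: {len(sig_stable)}")
print(f"stable vs flashy distance: {fdc_distance(sig_stable, sig_flashy):.2f}")
```

A fixed-length signature like this can feed a clustering algorithm directly, since every watershed is described by the same number of features regardless of record length.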


8.9 Part 7: Flow Regime Analysis

Note: Understanding Flow Regime Metrics

What Are They? Flow regime metrics are numerical summaries that characterize stream hydrology. The most important are:

  • Q10, Q50, Q90: Percentile flows from the FDC
  • Flow Variability Ratio (Q10/Q90): Range between high and low flows
  • Base Flow Index (BFI): Proportion of streamflow from groundwater

These metrics were standardized by hydrologists in the 1960s-70s to enable comparison across watersheds and regions.

Why Do They Matter? These metrics compress complex flow records into actionable numbers:

  • Q90: Water availability during droughts (critical for aquatic habitat, irrigation)
  • Q10: Flood magnitude (bridge/culvert design, floodplain management)
  • Q10/Q90 ratio: Watershed flashiness (drainage design, aquifer connectivity)
  • BFI: Aquifer contribution (validates HTEM interpretations)

For aquifer management, Q90 and BFI directly indicate groundwater discharge to streams.

How Do They Work?

  1. Extract percentiles from FDC:

    • Q10 = 90th percentile (high flow)
    • Q50 = 50th percentile (median)
    • Q90 = 10th percentile (low flow)
  2. Calculate ratios:

    Flow Variability = Q10 / Q90
    Base Flow Index ≈ Q90 / Q50
  3. Classify watershed regime:

    • High BFI + Low variability = Aquifer-buffered
    • Low BFI + High variability = Flashy, runoff-dominated

What Will You See? A table showing the five key metrics with numerical values. Compare these to the interpretation guide below to classify the stream’s flow regime.

How to Interpret

| Metric | Value Range | Interpretation | Aquifer Connection |
|---|---|---|---|
| Q90 | > 1 cfs | Good base flow | Strong aquifer discharge |
| Q90 | 0.1-1 cfs | Moderate base flow | Some aquifer connection |
| Q90 | < 0.1 cfs | Minimal base flow | Weak/no connection |
| Q10/Q90 | > 100 | Very flashy | Urban or tile-drained |
| Q10/Q90 | 20-100 | Moderately flashy | Agricultural, some buffering |
| Q10/Q90 | < 20 | Stable | Forested or strong aquifer |
| BFI (Q90/Q50) | > 0.6 | Groundwater-dominated | Excellent connection |
| BFI (Q90/Q50) | 0.3-0.6 | Mixed regime | Moderate connection |
| BFI (Q90/Q50) | < 0.3 | Runoff-dominated | Poor connection |

Example: Stream with Q90 = 0.5 cfs, Q10/Q90 = 150, BFI = 0.25

  • Interpretation: Flashy, runoff-dominated stream with only moderate base flow (Q90 falls in the 0.1-1 cfs band)
  • Likely cause: Urban watershed with impervious surfaces or tile-drained agriculture
  • Aquifer connection: Weak; stream responds to rain, not groundwater

# Calculate base flow index
q10 = fdc.loc[fdc['exceedance_probability'] == 10, 'discharge_cfs'].iloc[0]
q50 = fdc.loc[fdc['exceedance_probability'] == 50, 'discharge_cfs'].iloc[0]
q90 = fdc.loc[fdc['exceedance_probability'] == 90, 'discharge_cfs'].iloc[0]

flow_regime = pd.Series({
    'Q10_high_flow_cfs': q10,
    'Q50_median_flow_cfs': q50,
    'Q90_low_flow_cfs': q90,
    'flow_variability_q10_q90_ratio': q10 / q90,
    'base_flow_index_estimate': q90 / q50
})

flow_regime = flow_regime.round(2)
flow_regime
Q10_high_flow_cfs                 9.40
Q50_median_flow_cfs               2.38
Q90_low_flow_cfs                  1.10
flow_variability_q10_q90_ratio    8.55
base_flow_index_estimate          0.46
dtype: float64

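Interpreting metrics:

  • Flow Variability (Q10/Q90): Ratio >100 = flashy (urban/tile-drained), <10 = stable (aquifer-fed)
  • Base Flow Index (Q90/Q50): BFI >0.6 = groundwater-dominated, <0.3 = runoff-dominated
  • This gauge: Q10/Q90 = 8.55 falls in the stable range, and Q90/Q50 = 0.46 indicates a mixed regime
  • Q90/Q50 is a quick proxy; a true BFI requires base flow separation of the daily hydrograph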
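The threshold tables can be applied mechanically. The sketch below is a hypothetical helper that encodes the bands from the interpretation table in this section (Q90 at 0.1 and 1 cfs, Q10/Q90 at 20 and 100, Q90/Q50 at 0.3 and 0.6) and applies them to the values computed above for site 03337000. The band edges are the chapter's rules of thumb, not universal standards.

```python
def classify_flow_regime(q10, q50, q90):
    """Classify a stream from its exceedance flows using this chapter's threshold bands."""
    ratio = q10 / q90 if q90 > 0 else float("inf")  # flow variability
    bfi_proxy = q90 / q50                           # Q90/Q50 stand-in for base flow index

    if q90 > 1:
        base = "good base flow"
    elif q90 >= 0.1:
        base = "moderate base flow"
    else:
        base = "minimal base flow"

    if ratio > 100:
        flash = "very flashy"
    elif ratio >= 20:
        flash = "moderately flashy"
    else:
        flash = "stable"

    if bfi_proxy > 0.6:
        regime = "groundwater-dominated"
    elif bfi_proxy >= 0.3:
        regime = "mixed regime"
    else:
        regime = "runoff-dominated"

    return {"q10_q90": round(ratio, 2), "bfi_proxy": round(bfi_proxy, 2),
            "base_flow": base, "flashiness": flash, "regime": regime}

# Values from the flow-regime output above (site 03337000)
result = classify_flow_regime(q10=9.40, q50=2.38, q90=1.10)
print(result)
```

Keeping the thresholds in one function makes it straightforward to re-run the classification for every gauge, or to test how sensitive the labels are to the band edges.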


8.10 Part 8: Key Findings

Important🎯 Critical Findings

8.10.1 1. Severe Spatial Coverage Gap

Evidence: Only 21.6% of HTEM area within 5km of gauge

Impact:

  • Regional stream-aquifer analysis infeasible
  • Cannot assess spatial heterogeneity
  • Urban monitoring bias (3 gauges in 27.8 mi² urban, 0 in 856 mi² agricultural)

Action: Install ≥5 additional gauges in agricultural watersheds

8.10.2 2. Excellent Temporal Coverage

Achievement: 75+ year records, 92,000+ daily measurements (92,034 across the 7 sites with data)

Value:

  • Detects multi-decadal trends
  • Captures full range of drought/flood cycles
  • Pre-development baseline available

8.10.3 3. Flow Duration Curves Reveal Aquifer Connection

Tool: FDC shows aquifer buffering capacity

Application: Compare base flow index with HTEM transmissivity—validation of geophysical interpretations

8.10.4 4. Urban vs. Agricultural Monitoring Bias

Problem: All 3 gauges in HTEM are in urban watershed

Limitation: Urban systems (impervious, stormwater) behave fundamentally differently than agricultural (tile drainage, natural base flow)

Action: Prioritize agricultural watershed monitoring


8.11 Integration Roadmap

Stream gauge data enables:

Part 2: Spatial Patterns

  • Overlay gauge locations on HTEM grids
  • Stream proximity analysis and monitoring gaps
  • Delineate watersheds for each gauge

Part 3: Temporal Dynamics

  • Streamflow variability and trends over time
  • Correlate discharge with well water levels
  • Event response analysis (droughts, floods)

Part 4: Data Fusion Insights

  • Stream-aquifer exchange analysis
  • Base flow separation (isolate groundwater contribution)
  • Recharge estimation from streamflow
  • Water balance closure validation

Part 5: Predictive Operations

  • Water level forecasting using stream data
  • Scenario analysis (pumping impacts on streamflow)
  • Early warning systems for low-flow conditions


8.12 Recommendations

8.12.1 Immediate (0-6 months)

  1. Contact USGS to verify temporal data quality
  2. Update analyses to clarify spatial limitations
  3. Prioritize gauge site selection in agricultural areas

8.12.2 Short-term (6-18 months)

  1. Install 3-5 new gauges within HTEM extent
  2. Target intermediate scale (20-100 mi²) agricultural watersheds
  3. Achieve ≥50% spatial coverage

8.12.3 Long-term (2-5 years)

  1. Expand to 8-10 gauges for 70% coverage
  2. Co-locate gauges with monitoring wells
  3. Implement real-time telemetry

8.13 Dependencies & Outputs

  • Data source: usgs_stream (local site metadata + daily values)
  • Loader: src.data_loaders.USGSStreamLoader
  • Outputs: Flow duration curves, site statistics, optional exports to outputs/phase-1/usgs/

To access stream data:

from src.data_loaders import USGSStreamLoader
loader = USGSStreamLoader()

# Load discharge time series
discharge = loader.load_daily_discharge(site_no='03337000')

# Calculate flow duration curve
fdc = loader.calculate_flow_duration_curve(site_no='03337000')

8.14 Summary

USGS stream gauge network provides exceptional temporal depth but limited spatial coverage:

75+ years of records - 92,000+ daily measurements capture full drought/flood cycles

Pre-development baselines - Long-term trends reveal aquifer changes over time

Flow duration curves - Reveal aquifer buffering capacity through base flow analysis

⚠️ Spatial bias - Only 3 gauges within HTEM footprint, all in urban watersheds

⚠️ Agricultural gap - No gauges in tile-drained agricultural areas (different hydrology)

Key Insight: Stream data provides temporal calibration for HTEM snapshots, but 22% spatial coverage limits regional validation. Prioritize agricultural watershed gauge installation.


8.16 Reflection Questions

  • Based on the flow duration curve and flow-regime metrics, would you classify the longest-record gauge as groundwater-dominated, runoff-dominated, or mixed, and why?
  • If you were tasked with adding 3–5 new stream gauges, which parts of the HTEM area would you prioritize to reduce spatial bias between urban and agricultural watersheds?
  • How would you combine streamflow metrics (like Q90 or base flow index) with well and HTEM data to cross-check interpretations of aquifer transmissivity and connectivity?