---
title: "Stream Gauge Network"
description: "USGS stream discharge monitoring network coverage and base flow analysis"
code-fold: true
---
::: {.callout-tip icon=false}
## For Newcomers
**You will learn:**
- How streams and aquifers are connected underground
- What "base flow" means and why it reveals aquifer health
- How to read flow duration curves (a key hydrologic tool)
- Why monitoring coverage gaps limit regional analysis
Streams are like **windows into the aquifer**—during dry periods, the water you see flowing is actually groundwater seeping out. By measuring stream flow, we indirectly monitor the aquifer itself.
:::
## What You Will Learn in This Chapter
By the end of this chapter, you will be able to:
- Describe how USGS stream gauges observe surface water flows that are partly driven by groundwater discharge (base flow).
- Summarize the current stream gauge network for the study area, including spatial coverage and temporal record length.
- Read and interpret basic flow duration curves and flow-regime metrics (Q10, Q50, Q90, base flow index).
- Explain the main spatial limitations of the current gauge network and how they affect regional stream–aquifer analyses and fusion with HTEM and wells.
## Streams as Windows into the Aquifer
Imagine the aquifer as a vast underground reservoir. Streams are **discharge points** where the aquifer naturally reveals itself at the surface. Stream gauges become powerful **indirect sensors** of aquifer health.
**The fundamental connection:**
- During dry periods when rain stops, streams don't immediately go dry
- Water continues flowing—this is **base flow**, groundwater discharging to the stream
- **Base flow = an indirect but informative indicator of aquifer storage and transmissivity**
This chapter explores Champaign County's USGS stream gauge network: coverage, historical records, flow patterns, and critical spatial gaps.
::: {.callout-warning icon=false}
## ⚠️ Critical Finding: Severe Coverage Gap
**USGS stream gauge network**: Only **21.6% of HTEM area** is within 5km of a gauge
- 9 gauges total
- Only **3 gauges inside HTEM extent** (all in urban Boneyard Creek watershed)
- 78% of study area has **no nearby stream monitoring**
**Implication**: Regional stream-aquifer connectivity analysis infeasible with current network.
:::
---
## Part 1: The Surface-Groundwater Connection
::: {.callout-tip icon=false}
## 💧 What Is Base Flow? (Simple Explanation)
**Base flow** is the water that keeps streams flowing even when it hasn't rained for weeks.
**Where does it come from?** The aquifer underground.
Think of the aquifer as a giant sponge beneath the ground. During wet periods, rain soaks into this sponge (recharge). During dry periods, water slowly seeps out of the sponge into nearby streams (discharge). **Base flow is this slow, steady groundwater seepage.**
**Why does it matter for aquifer management?**
- **Aquifer health indicator**: If base flow decreases, the aquifer is being depleted
- **Drought resilience**: Streams with high base flow don't dry up during droughts
- **Water availability**: Base flow represents water the aquifer "gives" to streams
- **Ecosystem support**: Fish and aquatic life depend on base flow during dry months
**Simple test**: If a stream still flows in late summer after weeks without rain, it's receiving base flow from the aquifer. If it dries up, it is probably not receiving sustained groundwater inflow at that point.
**Technical term**: Hydrologists call this a "gaining stream" (gaining water from the aquifer).
:::
::: {.callout-note icon=false}
## Understanding Base Flow Separation
**What Is It?**
**Base flow** is the portion of stream discharge that comes from groundwater seeping into the stream channel. The concept was formalized by hydrologists in the 1930s-40s who realized that streams continue flowing during rainless periods—this sustained flow comes from the aquifer, not surface runoff. **Base flow separation** is the technique of mathematically splitting stream discharge into two components: fast surface runoff and slow groundwater discharge.
**Historical Context**: Robert Horton (1933) pioneered hydrograph analysis, showing that storm runoff and groundwater contributions have distinct signatures in stream flow records.
**Why Does It Matter?**
Base flow is a **practical indicator of aquifer-stream connectivity**:
- **Aquifer health indicator**: Declining base flow = declining aquifer storage
- **Drought resilience**: High base flow means streams stay wet during droughts
- **Water quality**: Base flow often has different chemistry than runoff
- **Ecological function**: Base flow sustains aquatic habitat during dry periods
For water managers, base flow reveals how much the aquifer contributes to surface water resources.
**How Does It Work?**
```python
# Stream discharge has two components:
total_discharge = surface_runoff + base_flow
# Surface runoff: Precipitation → stream (fast, flashy)
# - Responds within hours to days
# - Peaks sharply after storms
# - Declines rapidly
# Base flow: Groundwater discharge → stream (slow, sustained)
# - Responds over weeks to months
# - Changes gradually
# - Provides sustained minimum flow
# Base flow ≈ aquifer storage indicator!
```
**Separation methods**:
1. **Graphical**: Draw straight lines under hydrograph peaks (manual)
2. **Recession analysis**: Fit exponential decay curves to recession limbs
3. **Digital filters**: Automated algorithms (Lyne-Hollick, Eckhardt filters)
4. **HYSEP**: USGS program using local minima
**What Will You See?**
Flow duration curves (FDC) show base flow indirectly through Q90 (flow exceeded 90% of the time). Low values indicate low base flow and poor aquifer connectivity.
**How to Interpret**
| Base Flow Index (BFI) | Stream Type | Aquifer Connection | Management Implication |
|----------------------|-------------|-------------------|----------------------|
| BFI > 0.7 | Groundwater-dominated | Strong connectivity | Aquifer pumping affects streams |
| BFI 0.4-0.7 | Mixed regime | Moderate connectivity | Seasonal aquifer influence |
| BFI < 0.4 | Runoff-dominated | Weak connectivity | Streams respond to rain, not aquifer |
| BFI declining | Degrading connection | Aquifer depletion or stream incision | Investigate causes |
| BFI = 0 | Ephemeral stream | Disconnected | No aquifer support |
:::
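As a rough illustration of the digital-filter approach listed among the separation methods, the sketch below applies a single forward pass of the Lyne-Hollick filter to a short synthetic hydrograph. The function name `lyne_hollick_baseflow`, the filter parameter `alpha = 0.925` (a commonly used default), and the discharge values are assumptions for illustration; this is not the separation method used elsewhere in this chapter.

```python
import numpy as np

def lyne_hollick_baseflow(discharge, alpha=0.925):
    """Single forward pass of the Lyne-Hollick recursive digital filter.

    Returns the estimated base flow series (same units as `discharge`).
    """
    q = np.asarray(discharge, dtype=float)
    quick = np.zeros_like(q)  # filtered quickflow (surface runoff) component
    for t in range(1, len(q)):
        raw = alpha * quick[t - 1] + 0.5 * (1 + alpha) * (q[t] - q[t - 1])
        # Quickflow cannot be negative or exceed total discharge
        quick[t] = min(max(raw, 0.0), q[t])
    return q - quick

# Synthetic daily discharge (cfs): steady flow, one storm peak, then recession
discharge_cfs = np.array([5, 5, 50, 30, 15, 8, 6, 5, 5, 5], dtype=float)
base_flow_cfs = lyne_hollick_baseflow(discharge_cfs)

# Base Flow Index: fraction of total flow volume attributed to groundwater
bfi = base_flow_cfs.sum() / discharge_cfs.sum()
print(f"BFI for the synthetic record: {bfi:.2f}")
```

In practice the filter is often run in several passes (forward, backward, forward) to smooth the result; a single pass keeps the sketch short.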
::: {.callout-note icon=false}
## 💻 For Computer Scientists
**Stream Discharge as Groundwater Proxy:**
Base flow = **indirect measurement** of aquifer through groundwater-fed streams!
**ML Applications:**
- **Feature engineering**: `Q90` (flow exceeded 90% of the time, i.e. the 10th percentile of daily flow) = low-flow baseline from aquifer
- **Recession analysis**: Fit exponential decay to hydrograph recession; the decay constant reflects aquifer transmissivity and storage
- **Multi-source integration**: Stream + Well + HTEM = three views of same system
:::
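The recession-analysis idea can be sketched as a log-linear fit. The example assumes a simple exponential recession, Q(t) = Q0 · exp(−t / k), over a rainless period; the function name `fit_recession` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_recession(discharge):
    """Fit Q(t) = Q0 * exp(-t / k) to a rainless recession segment.

    Returns (k, q0): recession constant in time steps and fitted initial flow.
    """
    q = np.asarray(discharge, dtype=float)
    t = np.arange(len(q), dtype=float)
    # ln Q = ln Q0 - t / k is a straight line, so ordinary least squares suffices
    slope, intercept = np.polyfit(t, np.log(q), 1)
    return float(-1.0 / slope), float(np.exp(intercept))

# Synthetic 30-day recession with a known constant of 20 days
days = np.arange(30)
synthetic_cfs = 10.0 * np.exp(-days / 20.0)
k_days, q0_cfs = fit_recession(synthetic_cfs)
print(f"k = {k_days:.1f} days, Q0 = {q0_cfs:.1f} cfs")
```

A larger `k` means a slower recession. Converting `k` into a transmissivity value needs additional assumptions about aquifer geometry and storage, so the constant is best treated as a comparative feature.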
::: {.callout-tip icon=false}
## 🌍 For Hydrologists
**Stream-Aquifer Connectivity:**
**Gaining streams** (groundwater discharge):
- Stream receives water from aquifer
- Base flow sustained during dry periods
- Reflects regional water table elevation
**Flow Regime Indicators:**
- **Base Flow Index (BFI)**: Fraction of streamflow from groundwater
  - High BFI (>0.6): Strong aquifer connection
  - Low BFI (<0.3): Flashy, runoff-dominated
- **Q90/Q50 Ratio**: Aquifer buffering capacity
**HTEM Integration:** High HTEM resistivity (sand/gravel) → High BFI (transmissive aquifer)
:::
---
## Part 2: The Monitoring Network
::: {.callout-note icon=false}
## 📘 Understanding Stream Gauge Networks
**What Is a Gauge Network?**
A stream gauge network is a system of measurement stations that continuously monitor river and stream discharge (flow rate). The U.S. Geological Survey (USGS) operates the nation's primary network, established in the late 1800s.
**Why Does It Matter for Aquifers?**
Stream gauges provide indirect aquifer monitoring through base flow—the groundwater component of streamflow:
- **Base flow = aquifer discharge** to streams
- **Declining base flow** = declining aquifer storage
- **Flow duration curves** = aquifer buffering capacity
**How to Assess Network Quality:**
| Network Metric | Excellent | Good | Poor (This Study) |
|---------------|-----------|------|-------------------|
| Spatial coverage | >70% of area | 40-70% | **22%** |
| Temporal coverage | >50 years | 20-50 years | 75+ years ✓ |
| Record continuity | <5% gaps | 5-10% gaps | <5% gaps ✓ |
**This Network Paradox:** Excellent temporal data (75+ years) but poor spatial coverage (only 22% of HTEM area within 5km of gauge).
:::
```{python}
#| label: setup
#| echo: false
import os
import sys
from pathlib import Path
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
def find_repo_root(start: Path) -> Path:
    for candidate in [start, *start.parents]:
        if (candidate / "src").exists():
            return candidate
    return start
quarto_project = Path(os.environ.get("QUARTO_PROJECT_DIR", str(Path.cwd())))
project_root = find_repo_root(quarto_project)
if str(project_root) not in sys.path:
    sys.path.append(str(project_root))
from src.data_loaders.usgs_stream_loader import USGSStreamLoader
from src.utils import get_data_path
# Initialize loader
usgs_loader = USGSStreamLoader(
    data_root=get_data_path("usgs_stream")
)
print(f"✓ USGS Stream Loader initialized")
print(f" Sites found: {len(usgs_loader.get_site_list())}")
```
### Site Inventory
::: {.callout-note icon=false}
## 📘 Interpreting Gauge Site Metadata
**What Does This Table Show?**
Each row represents one USGS stream gauge with its location and elevation.
**Why These Details Matter:**
| Column | What It Tells You | Management Use |
|--------|------------------|----------------|
| **Site Number** | Unique USGS identifier | Data retrieval, cross-referencing |
| **Station Name** | Stream and location | Geographic context |
| **Latitude/Longitude** | Precise location | Mapping, proximity analysis |
| **Elevation** | Land surface height | Topographic position, drainage area |
**How to Read the Table:**
- **Urban vs. rural names**: "Boneyard Creek at Urbana" = urban watershed; "Sangamon River near Oakford" = rural
- **Elevation range**: Higher elevations = headwaters; lower = downstream positions
- **Naming convention**: "at" = specific location; "near" = approximate location
**Expected Pattern:** Mix of urban (small watersheds, flashy response) and rural (large watersheds, base flow dominated) gauges for comprehensive monitoring.
:::
```{python}
# Load site metadata
sites_df = usgs_loader.sites
site_summary = sites_df[[
    'site_no',
    'station_nm',
    'dec_lat_va',
    'dec_long_va',
    'alt_va'
]].copy()
site_summary.columns = [
    'Site Number',
    'Station Name',
    'Latitude',
    'Longitude',
    'Elevation (ft)'
]
site_summary
```
**Network spans**:
- 9 stream gauges across multiple watersheds
- Elevation range ~600-800 ft drives gravitational flow
- Mix of urban (Boneyard Creek) and agricultural watersheds
---
## Part 3: Spatial Coverage Analysis
::: {.callout-note icon=false}
## 📘 What/Why/How: Assessing Spatial Coverage
**What Is Spatial Coverage?**
The percentage of the study area within effective monitoring distance (typically 5km) of a stream gauge.
**Why Does Coverage Matter?**
Sparse coverage creates blind spots:
- **Cannot assess regional patterns**: 3 gauges in 2,300 km² = 1 gauge per 767 km²
- **Cannot validate HTEM**: Need gauges near HTEM grid to correlate resistivity with base flow
- **Cannot detect spatial heterogeneity**: Local stream-aquifer interactions invisible
**How to Calculate Coverage:**
1. **Buffer analysis**: Draw 5km radius around each gauge (effective monitoring area)
2. **Overlay with HTEM**: What % of HTEM area falls within buffers?
3. **Compare to target**: This chapter uses 70% coverage as the target for regional analysis
**How to Interpret:**
| Coverage % | Assessment | Capability | Action |
|-----------|------------|------------|--------|
| **>70%** | Excellent | Regional stream-aquifer analysis | Maintain network |
| **40-70%** | Good | Limited regional analysis | Acceptable |
| **20-40%** | Poor | Point observations only | Expand network |
| **<20%** | Critical failure | Cannot assess regionally | Urgent expansion |
**This Study:** 21.6% coverage = Poor, only just above the 20% critical-failure threshold.
:::
How well does the gauge network cover the study area? For meaningful stream-aquifer analysis, we need gauges distributed across the landscape—not clustered in one watershed. The analysis below assesses spatial coverage relative to the HTEM survey footprint.
```{python}
# Coverage statistics
coverage = pd.Series({
    'number_of_sites': len(sites_df),
    'min_latitude': sites_df['dec_lat_va'].min(),
    'max_latitude': sites_df['dec_lat_va'].max(),
    'min_longitude': sites_df['dec_long_va'].min(),
    'max_longitude': sites_df['dec_long_va'].max(),
    'min_elevation_ft': sites_df['alt_va'].min(),
    'max_elevation_ft': sites_df['alt_va'].max()
})
coverage
```
::: {.callout-warning icon=false}
## Spatial Coverage Gap
**HTEM area**: 2,288 km² (44 km × 52 km)
**Effective stream gauge coverage**: 495 km² (21.6%)
**Only 3 of 9 stations fall within HTEM extent**, all in urban Boneyard Creek watershed (27.8 mi²):
- Urban watersheds: **Over-represented** (3 gauges in 27.8 mi²)
- Agricultural watersheds: **Under-represented** (0 gauges in rest of HTEM)
**Need**: ≥5 additional gauges in agricultural watersheds to achieve 70% coverage target
:::
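The 21.6% figure is the result of a buffer overlay. One way such an overlay could be approximated, assuming gauge positions have already been projected to kilometres within the survey rectangle, is to sample the rectangle on a regular grid and count the cells within 5 km of any gauge. The function name, the grid step, and the gauge coordinates below are hypothetical and do not reproduce the actual network.

```python
import numpy as np

def buffer_coverage_fraction(gauges_km, width_km, height_km,
                             radius_km=5.0, step_km=0.25):
    """Fraction of a width x height rectangle within radius_km of any gauge.

    gauges_km: iterable of (x, y) gauge positions in km, measured from the
    rectangle's lower-left corner.
    """
    # Sample the rectangle at grid-cell centers
    xs = np.arange(step_km / 2, width_km, step_km)
    ys = np.arange(step_km / 2, height_km, step_km)
    grid_x, grid_y = np.meshgrid(xs, ys)
    covered = np.zeros(grid_x.shape, dtype=bool)
    for gx, gy in gauges_km:
        # Mark every cell whose center lies inside this gauge's buffer
        covered |= (grid_x - gx) ** 2 + (grid_y - gy) ** 2 <= radius_km ** 2
    return float(covered.mean())

# Hypothetical gauge positions inside a 44 km x 52 km survey rectangle
hypothetical_gauges = [(22.0, 26.0), (23.0, 27.0), (10.0, 40.0)]
fraction = buffer_coverage_fraction(hypothetical_gauges, 44.0, 52.0)
print(f"Covered: {fraction:.1%} of {44 * 52:,} km²")
```

Because overlapping buffers are counted only once, clustered gauges add little coverage, which is the pattern described for the Boneyard Creek gauges. A production analysis would more likely use projected geometries in a GIS library than a raster approximation.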
---
## Part 4: Historical Records Analysis
::: {.callout-note icon=false}
## 📘 Interpreting Temporal Coverage Metrics
**What Will You See?**
A summary table quantifying the stream gauge monitoring history.
**Why Long Records Matter:**
Temporal depth enables:
- **Trend detection**: 50+ years needed to detect climate change signals
- **Drought/flood context**: Compare current conditions to historical extremes
- **Pre-development baseline**: See aquifer conditions before heavy pumping
- **Seasonal patterns**: Decades of data reveal typical vs. anomalous years
**How to Read the Metrics:**
| Metric | What It Shows | Interpretation Guide |
|--------|--------------|---------------------|
| **Total sites** | Network size | More sites = better spatial coverage |
| **Total measurements** | Data volume | Millions = excellent temporal resolution |
| **First observation** | Historical depth | Pre-1950 = exceptional; 1950-1980 = good; post-1980 = limited |
| **Last observation** | Currency | Recent = currently operational; old = historical archive |
| **Duration** | Record length | >50 years = trend detection possible |
**This Network Strength:** 75+ year records (1948-2025) provide exceptional temporal depth for detecting long-term aquifer changes.
:::
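The duration and continuity metrics reduce to simple date arithmetic. A minimal sketch, assuming a list of daily observation dates; the function `record_continuity` and the synthetic record are hypothetical and are not part of `USGSStreamLoader`:

```python
from datetime import date, timedelta

def record_continuity(observation_dates):
    """Summarize a daily record: span in years and percent of days missing."""
    days = sorted(set(observation_dates))
    expected_days = (days[-1] - days[0]).days + 1  # inclusive span
    missing_days = expected_days - len(days)
    return {
        "first": days[0],
        "last": days[-1],
        "duration_years": expected_days / 365.25,
        "gap_percent": 100.0 * missing_days / expected_days,
    }

# Synthetic record: 1,000 consecutive days with a 30-day outage removed
start = date(2000, 1, 1)
all_days = [start + timedelta(days=i) for i in range(1000)]
observed = all_days[:400] + all_days[430:]
summary = record_continuity(observed)
print(f"{summary['duration_years']:.2f} years, {summary['gap_percent']:.1f}% gaps")
```

For this synthetic record the gap share is 3.0%, which would fall in the "<5% gaps" class used in the network quality table.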
```{python}
# Get temporal coverage
temporal = usgs_loader.get_temporal_coverage()
temporal_summary = pd.DataFrame({
    'metric': [
        'Total Sites',
        'Total Daily Measurements',
        'First Observation',
        'Last Observation',
        'Monitoring Duration (years)'
    ],
    'value': [
        temporal['number_of_sites'],
        f"{temporal['total_measurements']:,}",
        temporal['first_measurement'],
        temporal['last_measurement'],
        f"{temporal['duration_years']:.1f}"
    ]
})
temporal_summary
```
**Value of long records**:
- Records span **75+ years** in some cases
- Captures multiple drought/wet cycles (1988 drought, 1993 flood, 2012 drought)
- Enables climate change impact detection
- Provides pre-development baseline
---
## Part 5: Discharge Analysis
::: {.callout-note icon=false}
## 📘 Understanding Discharge Statistics Framework
**What Is Discharge?**
Stream discharge is the volume of water flowing past a point per unit time, measured in cubic feet per second (cfs) or cubic meters per second (cms).
**Why Use Percentiles?**
Stream flow varies 1000-fold (drought to flood). Percentiles compress this into interpretable metrics:
- **P10 (or Q10)**: Flow exceeded 10% of time = high flow/flood regime
- **P50 (or Q50)**: Flow exceeded 50% of time = median/typical flow
- **P90 (or Q90)**: Flow exceeded 90% of time = low flow/**base flow from aquifer**
**How to Interpret the Statistics Table:**
| Statistic | Physical Meaning | Aquifer Connection | Management Use |
|-----------|-----------------|-------------------|---------------|
| **Mean** | Average flow | Overall water availability | Water supply planning |
| **Median (P50)** | Typical flow | Normal stream condition | Flow targets |
| **Min** | Lowest recorded | Drought of record | Worst-case planning |
| **Max** | Highest recorded | Flood of record | Infrastructure design |
| **P10** | High flow threshold | Flood frequency | Stormwater management |
| **P90** | **Groundwater base flow** | **Aquifer discharge** | **Aquifer health indicator** |
**Key Insight:** P90 is the most important metric for aquifer analysis—it represents sustained groundwater discharge during dry periods.
:::
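The exceedance convention is easy to invert by accident, because "flow exceeded 90% of the time" is the 10th statistical percentile of the flow values, not the 90th. The sketch below makes the mapping explicit with NumPy. The helper `exceedance_flow` and the synthetic flows are assumptions for illustration; whether the loader's `discharge_p10` and `discharge_p90` columns follow the exceedance convention used in this chapter is worth confirming against the loader itself.

```python
import numpy as np

def exceedance_flow(discharge, exceedance_percent):
    """Flow equaled or exceeded `exceedance_percent` of the time (Q-notation)."""
    # Exceeded p% of the time <=> the (100 - p)th percentile of the values
    return float(np.percentile(discharge, 100 - exceedance_percent))

# Synthetic daily flows (cfs): the values 1..101 in shuffled order
rng = np.random.default_rng(0)
flows_cfs = rng.permutation(np.arange(1.0, 102.0))

q10 = exceedance_flow(flows_cfs, 10)  # high flow
q50 = exceedance_flow(flows_cfs, 50)  # median
q90 = exceedance_flow(flows_cfs, 90)  # low flow
print(f"Q10 = {q10:.1f}, Q50 = {q50:.1f}, Q90 = {q90:.1f} cfs")
```

A quick sanity check for any statistics table is that the high-flow value should be larger than the median, which should be larger than the low-flow value. If the order is reversed, the columns use the ordinary percentile convention and the labels need swapping.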
Stream discharge (measured in cubic feet per second, cfs) varies enormously—from trickles during drought to floods during storms. Statistical summaries like percentiles (P10, P50, P90) compress this variability into actionable metrics. Critically, **P90 (low flow) reflects groundwater base flow**—the aquifer's sustained contribution to streams.
```{python}
# Calculate statistics for each site
site_stats = []
for site_no in usgs_loader.get_site_list():
    stats = usgs_loader.get_site_statistics(site_no)
    if stats:
        site_stats.append(stats)
stats_df = pd.DataFrame(site_stats)
discharge_stats = stats_df[[
    'site_no',
    'discharge_count',
    'discharge_mean',
    'discharge_median',
    'discharge_min',
    'discharge_max',
    'discharge_p10',
    'discharge_p90'
]].copy()
discharge_stats.columns = [
    'Site Number',
    'Count',
    'Mean (cfs)',
    'Median (cfs)',
    'Min (cfs)',
    'Max (cfs)',
    'P10 (cfs)',
    'P90 (cfs)'
]
# Round numeric columns
numeric_cols = ['Mean (cfs)', 'Median (cfs)', 'Min (cfs)', 'Max (cfs)', 'P10 (cfs)', 'P90 (cfs)']
for col in numeric_cols:
    if col in discharge_stats.columns:
        discharge_stats[col] = discharge_stats[col].round(2)
discharge_stats
```
**Key insights**:
- **P90**: Low flows, primarily **groundwater contribution**
- **P10**: High flows, flood events
- **Factor difference**: Often 1000× between min and max (extreme variability)
::: {.callout-tip icon=false}
## 🎯 Management Interpretation of Discharge Statistics
**What do P10, P50, P90 tell water managers about the aquifer?**
These three numbers reveal the aquifer's role in sustaining streams:
### P90 (Low Flow) - The Aquifer's Contribution
**Physical meaning**: Flow exceeded 90% of the time = the flow during dry periods when rain has stopped
**Aquifer connection**:
- **P90 > 1 cfs**: Stream stays wet during droughts → Aquifer actively supports stream → **Good aquifer-stream connectivity**
- **P90 = 0.1-1 cfs**: Stream has minimal flow → Weak aquifer connection → **Marginal support during droughts**
- **P90 ≈ 0 cfs**: Stream goes dry → No aquifer connection → **Ephemeral stream, aquifer disconnected**
**Management implication**: If P90 is declining over time, the aquifer is losing storage or connectivity. This is an early warning signal.
### P50 (Median Flow) - Typical Water Availability
**Physical meaning**: Half the time flow is above this, half below = "normal" stream condition
**Use**: Water supply planning, habitat protection, flow targets for stream restoration
### P10 (High Flow) - Flood Regime
**Physical meaning**: Flow exceeded only 10% of time = high flows and floods
**Use**: Bridge/culvert design, floodplain management, stormwater infrastructure sizing
### Example: Stream Aquifer Health Assessment
**Scenario A - Healthy Aquifer Connection**
- P90 = 2.5 cfs, P50 = 8 cfs, P10 = 45 cfs
- **Interpretation**: Stream maintains 2.5 cfs even during droughts → Aquifer provides reliable base flow → Good for water supply, ecology
**Scenario B - Degraded Connection**
- P90 = 0.1 cfs, P50 = 12 cfs, P10 = 200 cfs
- **Interpretation**: Stream nearly dries during droughts (P90 ≈ 0) but floods heavily (P10 >> P50) → Flashy runoff-dominated system → Poor aquifer buffering
**Scenario C - Declining Trend (CRITICAL)**
- 1990s: P90 = 3.0 cfs
- 2020s: P90 = 0.8 cfs
- **Interpretation**: Base flow declining → Aquifer storage declining or stream incising (disconnecting) → **Investigate causes immediately**
**Action**: Compare P90 trends with well water levels and HTEM transmissivity to diagnose aquifer health.
:::
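Scenario C depends on comparing low-flow statistics between periods. A minimal sketch of that comparison, assuming arrays of calendar years and daily flows; the function `q90_by_decade` and the synthetic record are hypothetical and are not drawn from any gauge in this network:

```python
import numpy as np

def q90_by_decade(years, discharge):
    """Q90 (flow exceeded 90% of the time) for each decade in the record."""
    years = np.asarray(years)
    q = np.asarray(discharge, dtype=float)
    decades = (years // 10) * 10
    return {
        int(d): float(np.percentile(q[decades == d], 10))  # Q90 = 10th percentile
        for d in np.unique(decades)
    }

# Synthetic record: the low-flow floor drops between the 1990s and the 2020s
rng = np.random.default_rng(42)
years = np.repeat([1995, 2022], 365)
flows = np.concatenate([
    3.0 + rng.exponential(5.0, 365),  # 1990s: flows never fall below 3.0 cfs
    0.8 + rng.exponential(5.0, 365),  # 2020s: flows never fall below 0.8 cfs
])
decade_q90 = q90_by_decade(years, flows)
for decade, value in decade_q90.items():
    print(f"{decade}s: Q90 = {value:.2f} cfs")
```

With real data, a decline like this would still need to be checked against changes in precipitation and gauge rating before it is attributed to the aquifer.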
---
## Part 6: Flow Duration Curves
::: {.callout-note icon=false}
## Understanding Flow Duration Curves (FDC)
**What Is It?**
A **Flow Duration Curve (FDC)** is a graph showing the percentage of time that stream discharge equals or exceeds a given value. In use since at least the 1930s, the FDC compresses thousands of daily measurements into a single curve that reveals a stream's flow regime. It is essentially the complement of the **cumulative distribution function (CDF)** of streamflow (exceedance probability = 1 − CDF), a statistical signature of watershed hydrology.
**Historical Context**: Foster (1934) introduced flow duration analysis for hydroelectric planning. Today, FDCs are standard tools in hydrology for comparing watersheds and assessing water availability.
**Why Does It Matter?**
The FDC shape reveals fundamental watershed characteristics:
- **Steep slope**: Flashy, runoff-dominated stream (urban, tile-drained agricultural)
- **Gentle slope**: Stable, groundwater-dominated stream (forested, good aquifer connection)
- **Q90 value**: Base flow from aquifer—the "minimum reliable flow"
- **Q10 value**: Flood regime—infrastructure design implications
For aquifer analysis, **FDC slope and Q90 position indicate aquifer buffering capacity**.
**How Does It Work?**
**Step-by-step construction of a Flow Duration Curve:**
1. **Collect daily discharge data**
   - Example: 20 years × 365 days = 7,300 daily measurements
   - Data: [125 cfs, 0.5 cfs, 450 cfs, 2.3 cfs, ...]
2. **Sort flows from highest to lowest**
   - Highest: 450 cfs (flood event)
   - ...
   - Lowest: 0.5 cfs (drought)
3. **Assign rank to each flow**
   - Rank 1 = highest flow (450 cfs)
   - Rank 7,300 = lowest flow (0.5 cfs)
4. **Calculate exceedance probability for each rank**
   ```
   Exceedance % = (Rank / Total days) × 100
   ```
   - Flow 450 cfs → Rank 1 → Exceeded 0.01% of time (rare flood)
   - Flow 2.3 cfs → Rank 6,570 → Exceeded 90% of time (base flow)
5. **Plot on log scale**
   - X-axis: Exceedance probability (0% to 100%)
   - Y-axis: Discharge (cfs), logarithmic scale
   - Log scale reveals low-flow details (base flow range compressed on linear scale)
6. **Mark key percentiles**
   - **Q10**: Flow exceeded 10% of time (high flows, floods)
   - **Q50**: Flow exceeded 50% of time (median flow)
   - **Q90**: Flow exceeded 90% of time (low flows, base flow)
7. **Interpret the curve slope**
   - **Steep slope** (vertical drop): Flashy regime, large flow variability → Urban/tile-drained watershed, poor aquifer buffering
   - **Gentle slope** (gradual decline): Stable regime, low variability → Aquifer-fed stream, good buffering
**What the slope reveals about the aquifer:**
- **Gentle slope**: Aquifer releases water slowly and steadily → Good storage, high transmissivity
- **Steep slope**: Aquifer doesn't buffer flow → Either disconnected or low transmissivity
- **Flat at high flows, steep at low flows**: Aquifer exhausted during droughts → Limited storage
**What Will You See?**
A downward-sloping curve on a log scale. The curve starts high on the left (flows exceeded only rarely, such as floods) and drops to low values on the right (flows exceeded most of the time, dominated by base flow). Red markers highlight Q10, Q50, and Q90.
**How to Interpret**
| FDC Characteristic | Meaning | Aquifer Implication |
|-------------------|---------|-------------------|
| Steep curve | Flashy, high variability | Poor aquifer buffering |
| Gentle curve | Stable, low variability | Strong aquifer buffering |
| Q90 > 1 cfs | Sustained base flow | Good aquifer connection |
| Q90 ≈ 0 cfs | Stream goes dry | Ephemeral, no base flow |
| Q10/Q90 > 100 | Extreme flow range | Urban/tile-drained watershed |
| Q10/Q90 < 10 | Modest flow range | Forested/natural watershed |
| High Q50 | Abundant water | Large contributing area or wet climate |
| Low Q50 | Limited water | Small watershed or dry climate |
**Example Interpretation**:
- **Gaining stream** (aquifer-fed): Q90 = 2 cfs, gentle slope
- **Losing stream** (recharging aquifer): Q90 = 0.1 cfs, steep slope
:::
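The construction steps described in the callout can be sketched in a few lines of NumPy. This is an illustrative implementation under the rank formula given in the steps, not necessarily what `calculate_flow_duration_curve` does internally; the function names and the synthetic record are assumptions.

```python
import numpy as np

def flow_duration_curve(discharge):
    """Return (exceedance_percent, sorted_discharge) for a daily flow record."""
    q = np.sort(np.asarray(discharge, dtype=float))[::-1]  # highest flow first
    rank = np.arange(1, len(q) + 1)                         # rank 1 = highest
    # Rank / total days, as in the construction steps. The Weibull form
    # rank / (n + 1) is a common alternative that never assigns exactly 100%.
    exceedance = 100.0 * rank / len(q)
    return exceedance, q

def flow_at_exceedance(exceedance, q_sorted, percent):
    """Interpolate the curve at an arbitrary exceedance percentage."""
    return float(np.interp(percent, exceedance, q_sorted))

# Synthetic record: flows of 1..100 cfs, one value per day
exceedance, q_sorted = flow_duration_curve(np.arange(1.0, 101.0))
q10, q50, q90 = (flow_at_exceedance(exceedance, q_sorted, p) for p in (10, 50, 90))
print(f"Q10 = {q10:.1f}, Q50 = {q50:.1f}, Q90 = {q90:.1f} cfs")
```

Interpolating with `np.interp` avoids relying on the curve containing rows at exactly 10, 50, and 90 percent, which rank-based exceedance values generally do not.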
```{python}
#| label: fig-flow-duration-curve
#| fig-cap: "Flow duration curve for the longest-record stream gauge. Q10 (high flow), Q50 (median), and Q90 (low/base flow) are marked. The slope of the FDC indicates aquifer buffering capacity."
# Find longest-record site
longest_site = stats_df.loc[stats_df['record_length_years'].idxmax(), 'site_no']
longest_site_name = sites_df.loc[sites_df['site_no'] == longest_site, 'station_nm'].values[0]
print(f"Longest record: {longest_site} - {longest_site_name}")
# Calculate flow duration curve
fdc = usgs_loader.calculate_flow_duration_curve(longest_site)
# Plot FDC
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=fdc['exceedance_probability'],
        y=fdc['discharge_cfs'],
        mode='lines',
        line=dict(color='steelblue', width=3),
        name=longest_site_name,
        hovertemplate='Exceedance: %{x:.0f}%<br>Discharge: %{y:.1f} cfs<extra></extra>'
    )
)
# Mark key percentiles (interpolated, so the curve need not contain rows at exactly 10, 50, 90)
fdc_sorted = fdc.sort_values('exceedance_probability')
key_probs = [10, 50, 90]
for prob in key_probs:
    val = float(np.interp(prob, fdc_sorted['exceedance_probability'], fdc_sorted['discharge_cfs']))
    fig.add_trace(
        go.Scatter(
            x=[prob],
            y=[val],
            mode='markers+text',
            marker=dict(color='red', size=12, symbol='circle'),
            text=[f'Q{prob} = {val:.1f} cfs'],
            textposition='top center',
            showlegend=False
        )
    )
fig.update_layout(
    title=f'Flow Duration Curve: {longest_site_name}<br><sub>Log scale reveals base flow dynamics</sub>',
    xaxis_title='Exceedance Probability (%)',
    yaxis_title='Discharge (cfs)',
    yaxis_type='log',
    height=600,
    template='plotly_white'
)
fig.show()
```
::: {.callout-note icon=false}
## 💻 For Computer Scientists
**Flow Duration Curves (FDC) = Empirical Exceedance Curve of Discharge**
The FDC is the complement of the empirical **cumulative distribution function** of streamflow (exceedance = 1 − CDF), a compact hydrologic signature!
**Why FDC Matters for ML:**
1. **Dimensionality reduction**: 10,000+ daily values → 100 quantiles (100× compression!)
2. **Watershed classification**: Cluster watersheds by FDC shape
3. **Transfer learning**: Similar FDC = similar watershed (transfer models)
4. **Synthetic generation**: Generate realistic hydrographs from FDC + autocorrelation
:::
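The dimensionality-reduction idea can be sketched as follows: each daily record is compressed into a fixed-length vector of log-flows at evenly spaced exceedance levels, and two watersheds are compared by the distance between their vectors. The function names, the choice of log10 scaling, the 0.001 cfs floor, and the lognormal synthetic records are all assumptions for illustration.

```python
import numpy as np

def fdc_signature(discharge, n_quantiles=100):
    """Compress a daily record into log-flows at evenly spaced exceedance levels."""
    exceedance = np.linspace(0.5, 99.5, n_quantiles)      # percent of time exceeded
    flows = np.percentile(discharge, 100.0 - exceedance)  # ordered high flow -> low flow
    return np.log10(np.maximum(flows, 1e-3))              # floor avoids log10(0)

def signature_distance(sig_a, sig_b):
    """Root-mean-square difference between two signatures (log10 units)."""
    return float(np.sqrt(np.mean((sig_a - sig_b) ** 2)))

# Two synthetic 30-year daily records: one flashy, one buffered
rng = np.random.default_rng(1)
flashy = rng.lognormal(mean=1.0, sigma=2.0, size=10_950)
buffered = rng.lognormal(mean=1.0, sigma=0.5, size=10_950)

sig_flashy = fdc_signature(flashy)
sig_buffered = fdc_signature(buffered)
print(sig_flashy.shape, f"{signature_distance(sig_flashy, sig_buffered):.2f}")
```

The resulting 100-element vectors can feed a clustering algorithm directly. Working in log space keeps the low-flow end of the curve, where the groundwater signal sits, from being swamped by flood magnitudes.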
**Reading the FDC**:
- **Q10** (high flow): Flood regime
- **Q50** (median): Typical streamflow
- **Q90** (low flow): **Primarily groundwater base flow**
If the stream maintains flow at Q90, it is likely **connected to the aquifer**. If Q90 approaches zero, the stream is likely **disconnected** (ephemeral).
---
## Part 7: Flow Regime Analysis
::: {.callout-note icon=false}
## Understanding Flow Regime Metrics
**What Are They?**
Flow regime metrics are numerical summaries that characterize stream hydrology. The most important are:
- **Q10, Q50, Q90**: Percentile flows from the FDC
- **Flow Variability Ratio (Q10/Q90)**: Range between high and low flows
- **Base Flow Index (BFI)**: Proportion of streamflow from groundwater
These metrics were standardized by hydrologists in the 1960s-70s to enable comparison across watersheds and regions.
**Why Do They Matter?**
These metrics compress complex flow records into actionable numbers:
- **Q90**: Water availability during droughts (critical for aquatic habitat, irrigation)
- **Q10**: Flood magnitude (bridge/culvert design, floodplain management)
- **Q10/Q90 ratio**: Watershed flashiness (drainage design, aquifer connectivity)
- **BFI**: Aquifer contribution (validates HTEM interpretations)
For aquifer management, **Q90 and BFI directly indicate groundwater discharge to streams**.
**How Do They Work?**
1. **Extract percentiles from FDC**:
   - Q10 = 90th percentile (high flow)
   - Q50 = 50th percentile (median)
   - Q90 = 10th percentile (low flow)
2. **Calculate ratios**:
   ```
   Flow Variability = Q10 / Q90
   Base Flow Index ≈ Q90 / Q50
   ```
   - Q90/Q50 is a quick proxy; a true BFI is the base flow volume from a separation method divided by total flow volume
3. **Classify watershed regime**:
   - High BFI + Low variability = Aquifer-buffered
   - Low BFI + High variability = Flashy, runoff-dominated
**What Will You See?**
A table showing the five key metrics with numerical values. Compare these to the interpretation guide below to classify the stream's flow regime.
**How to Interpret**
| Metric | Value Range | Interpretation | Aquifer Connection |
|--------|------------|---------------|-------------------|
| **Q90** | > 1 cfs | Good base flow | Strong aquifer discharge |
| **Q90** | 0.1-1 cfs | Moderate base flow | Some aquifer connection |
| **Q90** | < 0.1 cfs | Minimal base flow | Weak/no connection |
| **Q10/Q90** | > 100 | Very flashy | Urban or tile-drained |
| **Q10/Q90** | 20-100 | Moderately flashy | Agricultural, some buffering |
| **Q10/Q90** | < 20 | Stable | Forested or strong aquifer |
| **BFI (Q90/Q50)** | > 0.6 | Groundwater-dominated | Excellent connection |
| **BFI (Q90/Q50)** | 0.3-0.6 | Mixed regime | Moderate connection |
| **BFI (Q90/Q50)** | < 0.3 | Runoff-dominated | Poor connection |
**Example**: Stream with Q90=0.5 cfs, Q10/Q90=150, BFI=0.25
- **Interpretation**: Flashy runoff-dominated stream with only moderate base flow
- **Likely cause**: Urban watershed with impervious surfaces or tile-drained agriculture
- **Aquifer connection**: Weak—stream responds to rain, not groundwater
:::
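The interpretation table can be turned into a small classifier. The sketch below encodes the thresholds from the table as written; the function name `classify_flow_regime` and the returned labels are hypothetical, and the handling of exact boundary values (for example a ratio of exactly 20) is an assumption because the table does not specify it.

```python
import math

def classify_flow_regime(q10, q50, q90):
    """Apply the interpretation-table thresholds to three exceedance flows (cfs)."""
    variability = q10 / q90 if q90 > 0 else math.inf  # Q10/Q90
    bfi_proxy = q90 / q50 if q50 > 0 else 0.0         # Q90/Q50

    if q90 > 1:
        base_flow = "good"
    elif q90 >= 0.1:
        base_flow = "moderate"
    else:
        base_flow = "minimal"

    if variability > 100:
        flashiness = "very flashy"
    elif variability >= 20:
        flashiness = "moderately flashy"
    else:
        flashiness = "stable"

    if bfi_proxy > 0.6:
        regime = "groundwater-dominated"
    elif bfi_proxy >= 0.3:
        regime = "mixed"
    else:
        regime = "runoff-dominated"

    return {
        "variability": variability,
        "bfi_proxy": bfi_proxy,
        "base_flow": base_flow,
        "flashiness": flashiness,
        "regime": regime,
    }

# The worked example: Q90 = 0.5 cfs, Q10/Q90 = 150, Q90/Q50 = 0.25
result = classify_flow_regime(q10=75.0, q50=2.0, q90=0.5)
print(result)
```

For the worked example this yields a very flashy, runoff-dominated stream whose Q90 of 0.5 cfs falls in the table's 0.1-1 cfs band. An ephemeral stream with Q90 = 0 is reported with infinite variability instead of raising a division error.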
```{python}
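# Calculate flow-regime metrics from the flow duration curve
# Interpolate so the curve need not contain rows at exactly 10, 50, 90 percent
fdc_sorted = fdc.sort_values('exceedance_probability')
q10, q50, q90 = np.interp(
    [10, 50, 90],
    fdc_sorted['exceedance_probability'],
    fdc_sorted['discharge_cfs']
)
flow_regime = pd.Series({
    'Q10_high_flow_cfs': q10,
    'Q50_median_flow_cfs': q50,
    'Q90_low_flow_cfs': q90,
    # Report an infinite ratio explicitly for ephemeral streams (Q90 = 0)
    'flow_variability_q10_q90_ratio': q10 / q90 if q90 > 0 else np.inf,
    # Q90/Q50 is a quick proxy for BFI, not a separated base flow volume
    'base_flow_index_estimate': q90 / q50 if q50 > 0 else np.nan
})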
flow_regime = flow_regime.round(2)
flow_regime
```
**Interpreting metrics**:
- **Flow Variability (Q10/Q90)**: Ratio >100 = flashy (urban/tile-drained), <10 = stable (aquifer-fed)
- **Base Flow Index (Q90/Q50)**: BFI >0.6 = groundwater-dominated, <0.3 = runoff-dominated
---
## Part 8: Key Findings
::: {.callout-important icon=true}
## 🎯 Critical Findings
### 1. Severe Spatial Coverage Gap
**Evidence**: Only 21.6% of HTEM area within 5km of gauge
**Impact**:
- Regional stream-aquifer analysis infeasible
- Cannot assess spatial heterogeneity
- Urban monitoring bias (3 gauges in 27.8 mi² urban, 0 in 856 mi² agricultural)
**Action**: Install ≥5 additional gauges in agricultural watersheds
### 2. Excellent Temporal Coverage
**Achievement**: 75+ year records, 480,000+ daily measurements
**Value**:
- Detects multi-decadal trends
- Captures full range of drought/flood cycles
- Pre-development baseline available
### 3. Flow Duration Curves Reveal Aquifer Connection
**Tool**: FDC shows aquifer buffering capacity
**Application**: Compare base flow index with HTEM transmissivity—validation of geophysical interpretations
### 4. Urban vs. Agricultural Monitoring Bias
**Problem**: All 3 gauges in HTEM are in urban watershed
**Limitation**: Urban systems (impervious, stormwater) behave fundamentally differently than agricultural (tile drainage, natural base flow)
**Action**: Prioritize agricultural watershed monitoring
:::
---
## Integration Roadmap
Stream gauge data enables:
**Part 2: Spatial Patterns**
- Overlay gauge locations on HTEM grids
- Stream proximity analysis and monitoring gaps
- Delineate watersheds for each gauge
**Part 3: Temporal Dynamics**
- Streamflow variability and trends over time
- Correlate discharge with well water levels
- Event response analysis (droughts, floods)
**Part 4: Data Fusion Insights**
- Stream-aquifer exchange analysis
- Base flow separation (isolate groundwater contribution)
- Recharge estimation from streamflow
- Water balance closure validation
**Part 5: Predictive Operations**
- Water level forecasting using stream data
- Scenario analysis (pumping impacts on streamflow)
- Early warning systems for low-flow conditions
---
## Recommendations
### Immediate (0-6 months)
1. Contact USGS to verify temporal data quality
2. Update analyses to clarify spatial limitations
3. Prioritize gauge site selection in agricultural areas
### Short-term (6-18 months)
4. Install 3-5 new gauges within HTEM extent
5. Target intermediate scale (20-100 mi²) agricultural watersheds
6. Achieve ≥50% spatial coverage
### Long-term (2-5 years)
7. Expand to 8-10 gauges for 70% coverage
8. Co-locate gauges with monitoring wells
9. Implement real-time telemetry
---
## Dependencies & Outputs
- **Data source**: `usgs_stream` (local site metadata + daily values)
- **Loader**: `src.data_loaders.USGSStreamLoader`
- **Outputs**: Flow duration curves, site statistics, optional exports to `outputs/phase-1/usgs/`
To access stream data:
```python
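# Same import path and data root as the setup chunk at the top of this chapter
from src.data_loaders.usgs_stream_loader import USGSStreamLoader
from src.utils import get_data_path
loader = USGSStreamLoader(data_root=get_data_path("usgs_stream"))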
# Load discharge time series
discharge = loader.load_daily_discharge(site_no='03337000')
# Calculate flow duration curve
fdc = loader.calculate_flow_duration_curve(site_no='03337000')
```
---
## Summary
The USGS stream gauge network provides **exceptional temporal depth but limited spatial coverage**:
✅ **75+ years of records** - 480,000+ daily measurements capture full drought/flood cycles
✅ **Pre-development baselines** - Long-term trends reveal aquifer changes over time
✅ **Flow duration curves** - Reveal aquifer buffering capacity through base flow analysis
⚠️ **Spatial bias** - Only 3 gauges within HTEM footprint, all in urban watersheds
⚠️ **Agricultural gap** - No gauges in tile-drained agricultural areas (different hydrology)
**Key Insight**: Stream data provides **temporal calibration** for HTEM snapshots, but 22% spatial coverage limits regional validation. Prioritize agricultural watershed gauge installation.
---
## Related Chapters
- [Well Network Analysis](well-network-analysis.qmd) - Co-location opportunities with wells
- [Weather Station Data](weather-station-data.qmd) - Precipitation-discharge relationships
- [Streamflow Variability](../part-3-temporal/streamflow-variability.qmd) - Temporal analysis of flow patterns
- [Stream-Aquifer Exchange](../part-4-fusion/stream-aquifer-exchange.qmd) - Fusion of stream and groundwater data
## Reflection Questions
- Based on the flow duration curve and flow-regime metrics, would you classify the longest-record gauge as groundwater-dominated, runoff-dominated, or mixed, and why?
- If you were tasked with adding 3–5 new stream gauges, which parts of the HTEM area would you prioritize to reduce spatial bias between urban and agricultural watersheds?
- How would you combine streamflow metrics (like Q90 or base flow index) with well and HTEM data to cross-check interpretations of aquifer transmissivity and connectivity?