23  Streamflow Variability Analysis

Baseflow separation and 77-year USGS trends

Tip: For Newcomers

You will learn:

  • How to separate stream water into “quickflow” (rain runoff) and “baseflow” (groundwater)
  • What 77 years of stream data reveal about long-term trends
  • How baseflow serves as a window into aquifer health
  • Why declining baseflow signals groundwater stress

When it hasn’t rained for weeks but streams still flow, that water comes from the aquifer. This “baseflow” is our clearest indicator of how much water the aquifer is releasing—and whether it’s sustainable.

23.1 What You Will Learn in This Chapter

By the end of this chapter, you will be able to:

  • Explain what baseflow is, how it differs from quickflow, and why it is a key indicator of aquifer–stream connectivity.
  • Interpret long streamflow records using annual averages, flow duration curves, and seasonal climatologies.
  • Understand how baseflow separation and the baseflow index (BFI) quantify groundwater contributions to streams.
  • Reconcile apparently contradictory trends between groundwater levels and baseflow by considering multi-aquifer systems.

23.2 Introduction

Stream discharge integrates surface runoff (quickflow) and groundwater discharge (baseflow). This chapter analyzes 77 years of USGS stream gauge data to separate baseflow trends, assess aquifer-stream connectivity, and validate groundwater findings.

Source: Analysis adapted from baseflow-separation-trends.qmd

23.3 Setup and Data Loading

import os
import sys
from pathlib import Path
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
try:
    from scipy import stats
    SCIPY_AVAILABLE = True
except ImportError:
    SCIPY_AVAILABLE = False
    print("Note: scipy not available. Statistical tests will be simplified.")
import warnings
warnings.filterwarnings('ignore')

def find_repo_root(start: Path) -> Path:
    for candidate in [start, *start.parents]:
        if (candidate / "src").exists():
            return candidate
    return start

quarto_project = Path(os.environ.get("QUARTO_PROJECT_DIR", str(Path.cwd())))
project_root = find_repo_root(quarto_project)
if str(project_root) not in sys.path:
    sys.path.append(str(project_root))

from src.utils import get_data_path

print("Streamflow variability analysis initialized")
Streamflow variability analysis initialized

23.3.1 Load USGS Stream Gauge Data

from src.data_loaders import USGSStreamLoader

# Initialize USGS stream loader with proper data path
usgs_stream_path = get_data_path("usgs_stream")

# Track data availability for graceful degradation
DATA_AVAILABLE = True

try:
    # Initialize loader
    loader = USGSStreamLoader(data_root=usgs_stream_path)

    # Get list of available sites
    site_list = loader.get_site_list()

    if not site_list:
        raise ValueError(
            f"No USGS stream gauge sites found in {usgs_stream_path}\n"
            f"Run scripts/download_usgs_data.py to download data first"
        )

    # Load discharge data for all sites
    stream_df = loader.load_all_sites_discharge()

    if stream_df.empty:
        raise ValueError("No discharge data could be loaded from any site")

    # Rename 'date' to match expected column name in rest of chapter
    if 'date' not in stream_df.columns and 'datetime' in stream_df.columns:
        stream_df = stream_df.rename(columns={'datetime': 'date'})

    # Ensure date is datetime type
    stream_df['date'] = pd.to_datetime(stream_df['date'])

    # Remove any NaN discharge values
    stream_df = stream_df.dropna(subset=['discharge_cfs'])

    # Sort by site and date
    stream_df = stream_df.sort_values(['site_no', 'date']).reset_index(drop=True)

    print(f"USGS stream data loaded successfully:")
    print(f"  Sites: {stream_df['site_no'].nunique()}")
    print(f"  Records: {len(stream_df):,}")
    print(f"  Date range: {stream_df['date'].min()} to {stream_df['date'].max()}")
    print(f"  Years: {(stream_df['date'].max() - stream_df['date'].min()).days / 365.25:.1f}")

    # Display site list
    print(f"\n  Available sites:")
    for site in sorted(stream_df['site_no'].unique()):
        site_data = stream_df[stream_df['site_no'] == site]
        site_years = (site_data['date'].max() - site_data['date'].min()).days / 365.25
        print(f"    {site}: {len(site_data):,} records ({site_years:.1f} years)")

except FileNotFoundError as e:
    print(f"⚠️ USGS stream data directory not found!")
    print(f"  Expected path: {usgs_stream_path}")
    print(f"  Error: {e}")
    print(f"\nTo fix this issue:")
    print(f"  1. Run: python scripts/download_usgs_data.py")
    print(f"  2. This will download USGS stream gauge data to {usgs_stream_path}")
    print(f"  3. Then re-run this chapter")
    # Create empty dataframe for graceful degradation
    stream_df = pd.DataFrame(columns=['site_no', 'date', 'discharge_cfs'])
    DATA_AVAILABLE = False

except ValueError as e:
    print(f"⚠️ {e}")
    print(f"\nTo fix this issue:")
    print(f"  1. Verify data exists in: {usgs_stream_path}/daily_values/")
    print(f"  2. Run: python scripts/download_usgs_data.py")
    print(f"  3. Then re-run this chapter")
    stream_df = pd.DataFrame(columns=['site_no', 'date', 'discharge_cfs'])
    DATA_AVAILABLE = False

except Exception as e:
    print(f"⚠️ Error loading USGS stream data: {e}")
    print(f"\nDebugging information:")
    print(f"  Data path: {usgs_stream_path}")
    print(f"  Path exists: {usgs_stream_path.exists() if hasattr(usgs_stream_path, 'exists') else 'N/A'}")
    print(f"\nPlease check the data directory and re-run scripts/download_usgs_data.py if needed")
    stream_df = pd.DataFrame(columns=['site_no', 'date', 'discharge_cfs'])
    DATA_AVAILABLE = False
USGS stream data loaded successfully:
  Sites: 7
  Records: 92,034
  Date range: 1948-07-15 00:00:00 to 2025-10-29 00:00:00
  Years: 77.3

  Available sites:
    03336890: 4,544 records (12.4 years)
    03336900: 19,839 records (67.1 years)
    03337000: 28,231 records (77.3 years)
    03337100: 8,712 records (23.8 years)
    03337570: 6,056 records (16.6 years)
    05570910: 17,196 records (47.1 years)
    05590050: 7,456 records (20.4 years)

23.4 Discharge Time Series Visualization

# Plot discharge time series for each site
fig = go.Figure()

sites = stream_df['site_no'].unique()[:4]  # First 4 sites in the dataset

for site in sites:
    site_data = stream_df[stream_df['site_no'] == site].copy()

    # Annual average for visualization (too many daily points)
    site_data['year'] = site_data['date'].dt.year
    annual_avg = site_data.groupby('year')['discharge_cfs'].mean().reset_index()

    fig.add_trace(
        go.Scatter(
            x=pd.to_datetime(annual_avg['year'], format='%Y'),
            y=annual_avg['discharge_cfs'],
            mode='lines+markers',
            name=f'Site {site}',
            marker=dict(size=4),
            line=dict(width=2),
            hovertemplate='%{x|%Y}<br>Discharge: %{y:.1f} cfs<extra></extra>'
        )
    )

fig.update_layout(
    title='Annual Average Stream Discharge by USGS Station',
    xaxis=dict(title='Year'),
    yaxis=dict(title='Discharge (cubic feet per second)'),
    height=500,
    template='plotly_white',
    hovermode='x unified',
    legend=dict(orientation='h', yanchor='bottom', y=1.02, xanchor='right', x=1)
)

fig.show()

# Calculate statistics
for site in sites:
    site_data = stream_df[stream_df['site_no'] == site]['discharge_cfs']
    print(f"\nSite {site}:")
    print(f"  Mean discharge: {site_data.mean():.1f} cfs")
    print(f"  Median discharge: {site_data.median():.1f} cfs")
    print(f"  Range: {site_data.min():.1f} - {site_data.max():.1f} cfs")

Site 03336890:
  Mean discharge: 36.8 cfs
  Median discharge: 15.8 cfs
  Range: 1.1 - 1600.0 cfs

Site 03336900:
  Mean discharge: 121.4 cfs
  Median discharge: 51.6 cfs
  Range: 0.7 - 5550.0 cfs

Site 03337000:
  Mean discharge: 4.8 cfs
  Median discharge: 2.4 cfs
  Range: 0.0 - 241.0 cfs

Site 03337100:
  Mean discharge: 7.2 cfs
  Median discharge: 3.9 cfs
  Range: 0.9 - 214.0 cfs
Figure 23.1: Stream discharge time series for multiple USGS gauging stations

23.5 Flow Duration Curves

23.5.1 What Is a Flow Duration Curve?

A flow duration curve (FDC) is a cumulative frequency plot that shows the percentage of time a given streamflow is equaled or exceeded. Developed by water resource engineers in the early 20th century, FDCs became a standard tool for hydropower planning and water supply design.

23.5.2 Why Does It Matter?

Flow duration curves reveal the full range of stream behavior—from floods to droughts—in a single visualization. For aquifer-stream connectivity analysis, they show:

  • High flows (left side): Storm response and surface runoff
  • Medium flows (middle): Normal conditions
  • Low flows (right side): Baseflow from groundwater—the aquifer’s contribution to streams during dry periods

23.5.3 How Does It Work?

The curve is created by:

  1. Sorting all daily discharge measurements from highest to lowest
  2. Computing the percentage of time each flow is equaled or exceeded
  3. Plotting discharge (y-axis, log scale) vs. exceedance probability (x-axis, 0-100%)

23.5.4 What Will You See?

The plot below shows flow duration curves for multiple USGS gauging stations. Key percentiles are marked:

  • Q10 (10% exceedance): High flow—exceeded only 10% of the time
  • Q50 (50% exceedance): Median flow—the “typical” discharge
  • Q90 (90% exceedance): Low flow—exceeded 90% of the time, representing baseflow conditions

23.5.5 How to Interpret

| Flow Metric | Exceedance | Physical Meaning | Aquifer Implication |
|---|---|---|---|
| Q10 | 10% | High flow (wet conditions) | Aquifer receiving recharge |
| Q50 | 50% | Median flow | Average aquifer-stream exchange |
| Q90 | 90% | Low flow (dry conditions) | Baseflow = aquifer discharging to stream |
| Q10/Q90 ratio | n/a | Flow variability | High ratio = flashy (surface-dominated); low ratio = stable (groundwater-dominated) |

A steep curve indicates high variability (flash floods and droughts). A flat curve indicates stable flow from consistent groundwater contribution.

fig = go.Figure()

for site in sites:
    site_data = stream_df[stream_df['site_no'] == site]['discharge_cfs'].dropna()

    # Sort discharge in descending order
    sorted_discharge = np.sort(site_data)[::-1]

    # Calculate exceedance probability
    n = len(sorted_discharge)
    exceedance = np.arange(1, n + 1) / n * 100

    fig.add_trace(
        go.Scatter(
            x=exceedance,
            y=sorted_discharge,
            mode='lines',
            name=f'Site {site}',
            line=dict(width=2),
            hovertemplate='Exceedance: %{x:.1f}%<br>Discharge: %{y:.1f} cfs<extra></extra>'
        )
    )

# Add reference lines
for pct in [10, 50, 90]:
    fig.add_vline(
        x=pct,
        line=dict(color='gray', dash='dash', width=1),
        annotation_text=f'Q{pct}',
        annotation_position='top'
    )

fig.update_layout(
    title='Flow Duration Curves',
    xaxis=dict(title='Exceedance Probability (%)', range=[0, 100]),
    yaxis=dict(title='Discharge (cfs)', type='log'),
    height=500,
    template='plotly_white',
    hovermode='x unified',
    legend=dict(orientation='h', yanchor='bottom', y=1.02, xanchor='right', x=1)
)

fig.show()

print("\nFlow duration statistics (Q10, Q50, Q90):")
for site in sites:
    site_data = stream_df[stream_df['site_no'] == site]['discharge_cfs'].dropna()
    q10 = np.percentile(site_data, 90)  # 90th percentile = 10% exceedance
    q50 = np.percentile(site_data, 50)  # Median
    q90 = np.percentile(site_data, 10)  # 10th percentile = 90% exceedance

    print(f"\nSite {site}:")
    print(f"  Q10 (high flow): {q10:.1f} cfs")
    print(f"  Q50 (median): {q50:.1f} cfs")
    print(f"  Q90 (low flow): {q90:.1f} cfs")
    print(f"  Variability (Q10/Q90): {q10/q90 if q90 > 0 else 0:.1f}")
Figure 23.2: Flow duration curves showing discharge exceedance probabilities

Flow duration statistics (Q10, Q50, Q90):

Site 03336890:
  Q10 (high flow): 70.1 cfs
  Q50 (median): 15.8 cfs
  Q90 (low flow): 2.6 cfs
  Variability (Q10/Q90): 26.8

Site 03336900:
  Q10 (high flow): 256.0 cfs
  Q50 (median): 51.6 cfs
  Q90 (low flow): 9.7 cfs
  Variability (Q10/Q90): 26.5

Site 03337000:
  Q10 (high flow): 9.4 cfs
  Q50 (median): 2.4 cfs
  Q90 (low flow): 1.1 cfs
  Variability (Q10/Q90): 8.5

Site 03337100:
  Q10 (high flow): 13.5 cfs
  Q50 (median): 3.9 cfs
  Q90 (low flow): 2.1 cfs
  Variability (Q10/Q90): 6.4

23.6 Seasonal Hydrograph

Tip: What Will You See?

The seasonal hydrograph displays monthly climatology showing the typical annual cycle of streamflow. This visualization reveals when streams are fed by aquifer recharge versus when they’re depleting aquifer storage.

Visual Elements:

| Component | What It Shows | How to Interpret |
|---|---|---|
| Blue bars | Mean monthly discharge (Jan-Dec) | Typical streamflow for each calendar month averaged over all years |
| Error bars (±1 SD) | Standard deviation | Variability: large bars = high inter-annual variability; small bars = consistent year-to-year |
| Red triangles | Historical maximum for each month | Extreme wet conditions - highest monthly total ever recorded |
| Blue triangles | Historical minimum for each month | Extreme dry conditions - lowest monthly total ever recorded |
| Line connecting bars | Seasonal progression | Shows continuous annual cycle (some sites display as lines + markers) |

Interpreting the Seasonal Cycle:

| Month Pattern | Physical Meaning | Aquifer Implication |
|---|---|---|
| Peak months (high bars) | Spring snowmelt + rainfall | Recharge season - aquifer receiving water, water table rising |
| Low months (short bars) | Summer/fall drought | Depletion season - stream fed by aquifer, water table falling |
| Amplitude (peak-to-trough) | Seasonal strength | Large = strong seasonality; small = year-round stable flow |
| Error bar size | Inter-annual variability | Large = unpredictable (climate-driven); small = predictable (aquifer-buffered) |

Key Metrics to Extract:

  1. Timing of peak discharge: When does the aquifer receive maximum recharge?
    • Spring peak (Mar-May) → Snowmelt + spring rains = primary recharge window
    • Winter peak (Dec-Feb) → Frozen ground limits infiltration, most becomes runoff
    • Summer peak (Jun-Aug) → Thunderstorms, but high ET reduces net recharge
  2. Timing of minimum discharge: When does the aquifer support the stream most?
    • Summer minimum (Jul-Sep) → Stream sustained entirely by baseflow (aquifer discharge)
    • Fall minimum (Oct-Nov) → Low precip + depleted soil moisture
  3. Amplitude (peak ÷ minimum): How seasonal is the system?
    • Ratio >5 → Highly seasonal (storage-dependent system)
    • Ratio 2-5 → Moderately seasonal (typical for temperate climates)
    • Ratio <2 → Stable year-round (large aquifer buffering)
  4. Error bar magnitude: How predictable is each month?
    • Large error bars → Climate variability dominates (surface runoff)
    • Small error bars → Aquifer buffering smooths variability (groundwater-dominated)
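The metrics above reduce to a few pandas operations. Here is a minimal sketch using hypothetical monthly means (the numbers are illustrative, not from the study sites):

```python
import pandas as pd

# Hypothetical monthly climatology: mean discharge (cfs) for months 1..12
monthly_avg = pd.Series(
    [30, 35, 60, 80, 70, 50, 30, 20, 22, 25, 28, 29],
    index=range(1, 13), name="discharge_cfs",
)

peak, low = monthly_avg.idxmax(), monthly_avg.idxmin()
amplitude = monthly_avg.max() / monthly_avg.min()

print(f"Peak month: {peak} ({monthly_avg[peak]:.0f} cfs)")  # month 4 = April
print(f"Low month: {low} ({monthly_avg[low]:.0f} cfs)")     # month 8 = August
print(f"Amplitude ratio: {amplitude:.1f}x")                 # 4.0x = moderately seasonal
```

With a peak-to-minimum ratio of 4.0, this hypothetical stream falls in the "moderately seasonal" band (2-5) of the interpretation table above.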

Example Interpretation:

If you see:

  • Mean discharge: April = 80 cfs, August = 20 cfs
  • Amplitude: 80 ÷ 20 = 4× difference (moderate seasonality)
  • April error bars: ±30 cfs (high variability from year to year)
  • August error bars: ±5 cfs (low variability)

Physical meaning:

  • Spring flows are highly variable (depends on precipitation and snowmelt timing)
  • Summer flows are consistent (aquifer provides stable baseflow)
  • Aquifer recharge occurs primarily in spring (April peak)
  • Aquifer depletion occurs in summer (August minimum, sustained by baseflow)

Critical Insight for Groundwater Management:

The timing and magnitude of seasonal patterns reveal when the aquifer is charging vs. discharging:

  • Rising limb (winter → spring): Aquifer recharge season - precipitation > ET, excess infiltrates
  • Peak (spring): Maximum recharge - water table at annual high
  • Falling limb (spring → summer): Transition - recharge slowing, ET increasing
  • Minimum (summer/fall): Aquifer discharge season - stream flow = baseflow only

Management Priority: Protect spring recharge (high bars) to sustain summer baseflow (low bars). If spring precipitation declines, summer streams will dry up regardless of summer rainfall (too much ET loss).

# Calculate monthly climatology
stream_df['month'] = stream_df['date'].dt.month
stream_df['year'] = stream_df['date'].dt.year

fig = go.Figure()

month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

for site in sites:
    site_data = stream_df[stream_df['site_no'] == site].copy()

    monthly_clim = site_data.groupby('month').agg({
        'discharge_cfs': ['mean', 'std']
    }).reset_index()

    monthly_clim.columns = ['month', 'mean', 'std']

    fig.add_trace(
        go.Scatter(
            x=[month_names[m-1] for m in monthly_clim['month']],
            y=monthly_clim['mean'],
            mode='lines+markers',
            name=f'Site {site}',
            marker=dict(size=8),
            line=dict(width=2),
            error_y=dict(
                type='data',
                array=monthly_clim['std'],
                visible=True
            ),
            hovertemplate='%{x}<br>Mean: %{y:.1f} cfs<extra></extra>'
        )
    )

fig.update_layout(
    title='Monthly Streamflow Climatology',
    xaxis=dict(title='Month'),
    yaxis=dict(title='Mean Discharge (cfs)'),
    height=500,
    template='plotly_white',
    hovermode='x unified',
    legend=dict(orientation='h', yanchor='bottom', y=1.02, xanchor='right', x=1)
)

fig.show()

# Identify seasonal patterns
print("\nSeasonal discharge patterns:")
for site in sites:
    site_data = stream_df[stream_df['site_no'] == site].copy()
    monthly_avg = site_data.groupby('month')['discharge_cfs'].mean()

    max_month = monthly_avg.idxmax()
    min_month = monthly_avg.idxmin()

    print(f"\nSite {site}:")
    print(f"  Peak month: {month_names[max_month-1]} ({monthly_avg[max_month]:.1f} cfs)")
    print(f"  Low month: {month_names[min_month-1]} ({monthly_avg[min_month]:.1f} cfs)")
    print(f"  Seasonal range: {monthly_avg.max() - monthly_avg.min():.1f} cfs")
Figure 23.3: Seasonal streamflow patterns showing monthly climatology

Seasonal discharge patterns:

Site 03336890:
  Peak month: Jun (65.4 cfs)
  Low month: Aug (8.9 cfs)
  Seasonal range: 56.5 cfs

Site 03336900:
  Peak month: Apr (211.7 cfs)
  Low month: Sep (37.1 cfs)
  Seasonal range: 174.6 cfs

Site 03337000:
  Peak month: Apr (6.2 cfs)
  Low month: Oct (3.6 cfs)
  Seasonal range: 2.6 cfs

Site 03337100:
  Peak month: Jun (9.6 cfs)
  Low month: Oct (5.7 cfs)
  Seasonal range: 3.9 cfs

23.7 Trend Analysis with Sen’s Slope

Note: Understanding Sen’s Slope Estimator

What Is It?

Sen’s slope (Sen 1968) is a non-parametric method for estimating the rate of change in time series data. Unlike linear regression which uses least squares, Sen’s slope calculates the median of all pairwise slopes between data points—making it highly robust to outliers and non-normal distributions.

Why Does It Matter?

For long-term streamflow analysis, Sen’s slope provides:

  • Outlier resistance: Extreme floods or droughts don’t skew the trend estimate
  • No distributional assumptions: Works with skewed data (typical for streamflow)
  • Paired with Mann-Kendall: Together they detect trends and quantify magnitude
  • Physically meaningful: Slope in cfs/year tells you rate of streamflow change

How Does It Work?

The algorithm calculates:

  1. Compute all pairwise slopes: For every pair of points (i, j) where j > i: \[\text{slope}_{ij} = \frac{Q_j - Q_i}{t_j - t_i}\]

  2. Take the median: Sen’s slope = median of all slopes

    • If n points, there are n(n-1)/2 slopes
    • Median is robust to outliers (unlike mean used in regression)
  3. Confidence interval: Bootstrap or rank-based methods estimate uncertainty
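The pairwise-median idea can be sketched in a few lines of NumPy. This is an illustration, not the chapter’s production code; in practice, `scipy.stats.theilslopes` computes the same estimator with rank-based confidence intervals, and `scipy.stats.kendalltau` supplies the companion significance test:

```python
import numpy as np

def sens_slope(t, q):
    """Sen's slope: median of all pairwise slopes (q_j - q_i)/(t_j - t_i), j > i."""
    t = np.asarray(t, dtype=float)
    q = np.asarray(q, dtype=float)
    i, j = np.triu_indices(len(t), k=1)  # all index pairs with j > i
    return np.median((q[j] - q[i]) / (t[j] - t[i]))

# A clean +0.5 cfs/year trend, then the same series with one extreme flood year
years = np.arange(2000, 2010)
flow = 50 + 0.5 * (years - 2000)
flood = flow.copy()
flood[5] += 500  # outlier: a single flood year

print(sens_slope(years, flow))   # 0.5
print(sens_slope(years, flood))  # still 0.5: the outlier touches only 9 of 45 slopes
```

The second call demonstrates the robustness claim: ordinary least squares on `flood` would be pulled far from +0.5 cfs/year, while the median of pairwise slopes is unchanged.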

What Will You See?

Results are reported as:

  • Slope: Rate of change (e.g., +0.01 cfs/year)
  • P-value: From the paired Mann-Kendall test (significance)
  • Trend line: Visualized as an overlay on the time series

How to Interpret:

| Slope | P-value | Interpretation | Management Action |
|---|---|---|---|
| Positive | < 0.05 | Increasing flow - more water over time | Plan for higher flows, update infrastructure |
| Negative | < 0.05 | Decreasing flow - less water over time | Water conservation, drought preparedness |
| Near zero | ≥ 0.05 | No significant trend - stable long-term | Continue current management |
| Large magnitude | Any | Rapid change - system shifting | Investigate causes (climate, land use, pumping) |

Key Advantage over Linear Regression:

  • Regression: One outlier (e.g., 2008 flood) can dominate the trend
  • Sen’s slope: Outliers contribute only 1-2 slopes out of thousands, minimal impact

Example: A Sen’s slope of +0.01 cfs/year over 77 years means baseflow increased by 0.77 cfs total (77 × 0.01). For a stream averaging 50 cfs, this is a 1.5% increase—small but statistically significant.

23.8 Key Findings

23.8.1 77-Year Trend

Longest continuous record (1948-2025):

  • Total discharge: +0.01 cfs/year (r=0.329, p=0.003) ✅ Significant!
  • Baseflow: +0.01 cfs/year (r=0.244, p=0.031) ✅ Significant!
  • Baseflow Index: Stable ~51% (no trend)

Interpretation: Both total flow and groundwater contribution significantly increasing over 77 years.

23.8.2 Baseflow Index by Station

| Station | BFI | Record | Interpretation |
|---|---|---|---|
| 03337570 | 61.7% | 2009-2025 | Very high baseflow |
| 03336900 | 58.4% | 1958-2025 | High baseflow |
| 05570910 | 57.5% | 1978-2025 | High baseflow |
| 03337000 | 50.9% | 1948-2025 | Moderate (longest record) |

Average BFI: 56% → Groundwater-dominated stream system

23.8.3 Critical Finding: Well-Stream Inconsistency

Groundwater (2009-2022): +0.44 ft/year (rising)
Baseflow (averaged): -0.20 cfs/year (declining)

Paradox: Rising water levels but declining baseflow!

Explanation: Multi-aquifer system

  • Deep confined (Unit D): Monitored by wells, rising (reduced pumping)
  • Shallow unconfined: Feeds streams, declining (climate/ET)
  • Wells and streams sample different aquifer systems

23.9 Methods

23.9.1 Recursive Digital Filter (RDF)

What Is Baseflow Separation?

Baseflow separation is the process of dividing total streamflow into two components:

  • Baseflow: The slow, steady contribution from groundwater
  • Quickflow: The rapid spike from direct rainfall runoff

The filter technique was introduced by Lyne & Hollick (1979) and later evaluated for baseflow separation by Nathan & McMahon (1990), building on earlier graphical separation methods from the 1930s-1940s.

Why Does It Matter?

Baseflow represents the aquifer’s contribution to streams. Tracking baseflow over time reveals:

  • Whether groundwater discharge to streams is increasing or decreasing
  • How much of stream ecology depends on groundwater (critical for low-flow habitat)
  • The strength of aquifer-stream connectivity

How Does It Work?

The Recursive Digital Filter (RDF) algorithm works like a signal processing filter that separates low-frequency (baseflow) from high-frequency (storm runoff) components:

  1. Forward pass: Filter removes rapid fluctuations (storms) → identifies baseflow
  2. Backward pass: Filter applied in reverse to correct for lag effects
  3. Repeat: Multiple passes (typically 3) refine the separation

Algorithm: Lyne & Hollick (1979)

Filter equation:

quickflow[i] = α × quickflow[i-1] + ((1+α)/2) × (Q[i] - Q[i-1]),  constrained to quickflow[i] ≥ 0
baseflow[i] = Q[i] - quickflow[i]

where Q is total discharge and quickflow is the filtered high-frequency component.

Parameters:

  • α = 0.925 (recession constant) — controls how quickly the filter responds
  • n_passes = 3 (forward-backward smoothing) — improves accuracy

Physical analogy: Think of the aquifer as a large reservoir that slowly drains into the stream. Storms add spikes on top of this steady background. The filter mathematically identifies that steady background.
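A minimal NumPy sketch of the three-pass filter follows. Boundary handling and the exact constraint scheme vary between published implementations, so treat this as an illustration of the idea rather than the chapter’s actual code; the synthetic series is hypothetical:

```python
import numpy as np

def rdf_baseflow(Q, alpha=0.925, n_passes=3):
    """Lyne & Hollick recursive digital filter (illustrative sketch).

    Filters out the quickflow component, alternating pass direction
    (forward, backward, forward) to correct phase lag. Baseflow is
    constrained to 0 <= baseflow <= Q after each pass.
    """
    Q = np.asarray(Q, dtype=float)
    b = Q.copy()
    for p in range(n_passes):
        if p % 2 == 1:          # odd-numbered passes run backward in time
            b = b[::-1]
        q = np.zeros_like(b)    # quickflow component
        for i in range(1, len(b)):
            q[i] = max(alpha * q[i-1] + (1 + alpha) / 2.0 * (b[i] - b[i-1]), 0.0)
        b = b - q
        if p % 2 == 1:
            b = b[::-1]         # restore chronological order
        b = np.clip(b, 0.0, Q)  # 0 <= baseflow <= total discharge
    return b

# Steady 10 cfs background flow with a storm spike on days 5-7 (hypothetical)
Q = np.array([10, 10, 10, 10, 10, 60, 35, 18, 12, 10, 10, 10], dtype=float)
baseflow = rdf_baseflow(Q)
bfi = baseflow.sum() / Q.sum() * 100
print(f"BFI = {bfi:.0f}%")  # most of the storm spike is assigned to quickflow
```

Note the key property: for a constant input (no storms), the filter returns the input unchanged, so baseflow equals total flow and BFI is 100%.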

23.9.2 Baseflow Index (BFI)

What Is It?

The Baseflow Index (BFI) is the ratio of total baseflow to total streamflow, expressed as a percentage:

\[ \text{BFI} = \frac{\sum \text{Baseflow}}{\sum \text{Total Discharge}} \times 100\% \]
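Once baseflow has been separated, the index is a one-liner. A quick sketch with hypothetical numbers:

```python
import numpy as np

# Hypothetical daily series: total discharge and its separated baseflow (cfs)
total    = np.array([10.0, 12.0, 40.0, 25.0, 15.0, 11.0, 10.0])
baseflow = np.array([10.0, 10.5, 11.0, 11.5, 11.0, 10.5, 10.0])

bfi = baseflow.sum() / total.sum() * 100
print(f"BFI = {bfi:.1f}%")  # → BFI = 60.6%: the day-3 storm is mostly quickflow
```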

How to Interpret

| BFI Range | Stream Type | Aquifer Connection | Management Implication |
|---|---|---|---|
| BFI > 70% | Groundwater-fed | Very strong | Pumping directly impacts stream ecology |
| BFI 50-70% | Mixed (GW-dominated) | Strong | Groundwater management = stream management |
| BFI 30-50% | Mixed (surface-dominated) | Moderate | Both surface and groundwater important |
| BFI < 30% | Runoff-dominated | Weak | Stream responds mainly to precipitation events |

Example: A BFI of 56% (as found in this study) means 56% of streamflow comes from groundwater. This is a groundwater-dominated system where aquifer health directly controls stream health.

Critical insight: Streams with high BFI are vulnerable to groundwater pumping—lowering the water table reduces baseflow, which can dry up streams during droughts.

23.10 Implications for Management

23.10.1 1. Strong Aquifer-Stream Connection

BFI >50% means:

  • Groundwater pumping directly affects stream health
  • Environmental flows require groundwater protection
  • Surface water and groundwater must be managed jointly

23.10.2 2. Long Records Essential

77-year record:

  • Significant trends detected (p=0.003, p=0.031)
  • Clear signal above climate noise

15-25 year records:

  • No significant trends
  • Climate variability dominates

Lesson: Invest in long-term monitoring (50+ years minimum)

23.10.3 3. System Complexity Revealed

Simple model (one aquifer): WRONG
Reality: Multi-layer system

  • Wells ≠ stream response
  • Cannot assume well levels predict baseflow
  • Need a multi-layer conceptual model

23.11 Summary

Baseflow separation reveals:

  • Groundwater-dominated streams (56% baseflow average)
  • 77-year increasing trend (+0.01 cfs/yr, p=0.031)
  • High aquifer-stream connectivity (BFI >50%)
  • ⚠️ Multi-aquifer complexity (well-stream inconsistency)
  • ⚠️ Short records insufficient (need 50+ years for trends)

Key Insight: Rising groundwater levels (confined aquifer) do NOT guarantee rising baseflow (shallow aquifer feeds streams). System is more complex than single-layer conceptual model.


23.12 Reflection Questions

  • If baseflow trends and groundwater level trends point in different directions, what lines of evidence would you assemble before revising your conceptual model of the aquifer system?
  • How would you explain to a non-technical audience why a stream with a high baseflow index is especially sensitive to groundwater pumping?
  • Given the importance of record length for detecting trends, how would you prioritize which gauges to keep, upgrade, or retire if monitoring budgets were limited?
  • Where could combining baseflow separation with other analyses in this book (for example, recharge lag or thermal response) reduce uncertainty about aquifer–stream linkages?