---
title: "Event Response Fingerprints"
subtitle: "Multi-source event signatures for early warning"
code-fold: true
---
::: {.callout-tip icon=false}
## For Newcomers
**You will learn:**
- How the aquifer system responds to big events (storms, droughts)
- What a "fingerprint" is—a characteristic pattern that identifies event types
- How different data sources (rain, wells, streams) respond in sequence
- Why recovery time reveals hidden aquifer properties
Like a doctor recognizing symptoms that point to a specific condition, event fingerprints help us diagnose what's happening in the aquifer. A storm creates one pattern; a drought creates another. Learning these patterns enables early warning systems.
:::
## What You Will Learn in This Chapter
By the end of this chapter, you will be able to:
- Describe what an “event fingerprint” is and how multi-source data (precipitation, groundwater, streamflow) combine to form characteristic patterns.
- Interpret multi-panel event plots to see timing, magnitude, and recovery across sources for storms and droughts.
- Understand how detection thresholds, windowing, and detrending choices affect which events are identified and how their responses look.
- Explain how learned fingerprints can feed into early warning systems and guide where additional monitoring would most reduce uncertainty.
## Introduction
Individual data sources reveal pairwise relationships, but the system exhibits emergent behavior when all four sources (HTEM, groundwater, weather, stream) interact. This chapter identifies "event fingerprints": characteristic multi-source signatures that reveal system-level dynamics.
**Source:** Analysis adapted from `multi-source-event-fingerprints.qmd`
::: {.callout-note icon=false}
## 💻 For Computer Scientists
**Event detection** uses statistical thresholds to identify significant deviations from baseline:
- Precipitation events: >95th percentile (heavy rainfall)
- Drought events: <5th percentile (sustained low water levels)
- Recovery time: modeled as exponential decay back to baseline
**Response analysis:**
- Cross-correlation identifies time lags between sources
- Phase analysis (before/during/after) reveals cascade dynamics
- Recovery curves follow $h(t) = h_0 + A \cdot e^{-t/\tau}$ where $\tau$ is characteristic recovery time
:::
::: {.callout-tip icon=false}
## 🌍 For Geologists/Hydrologists
**Physical interpretation of event responses:**
**Storm Response Fingerprint:**
- Precipitation spike → vadose zone infiltration (days to weeks)
- Water table rise → increased hydraulic gradient
- Stream baseflow increase → aquifer discharge
- Recovery time reveals aquifer properties (transmissivity, storativity)
**Drought Response Fingerprint:**
- Extended low precipitation → declining recharge
- Water table decline → reduced baseflow
- Lag time indicates aquifer buffering capacity
- Recovery reveals recharge efficiency
**Key insight:** Response timing and magnitude reveal hydraulic connectivity and aquifer response characteristics.
:::
## Data and Methods
### Data Loading
We initialize the analysis environment and load required libraries for event detection and response analysis.
```{python}
#| code-fold: true
#| code-summary: "Show setup code"
import os
import sys
from pathlib import Path
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import sqlite3
try:
from scipy import signal, stats
SCIPY_AVAILABLE = True
except ImportError:
SCIPY_AVAILABLE = False
print("Note: scipy not available. Some statistical analyses will be simplified.")
import warnings
warnings.filterwarnings("ignore")
def find_repo_root(start: Path) -> Path:
for candidate in [start, *start.parents]:
if (candidate / "src").exists():
return candidate
return start
quarto_project = Path(os.environ.get("QUARTO_PROJECT_DIR", str(Path.cwd())))
project_root = find_repo_root(quarto_project)
if str(project_root) not in sys.path:
sys.path.append(str(project_root))
from src.utils import get_data_path
print("Event response fingerprints analysis initialized")
```
### Load Groundwater Time Series
We load groundwater monitoring data and select a well with long-term records suitable for event response analysis. The ideal well has high-frequency measurements spanning multiple years to capture numerous precipitation events and their responses.
```{python}
#| code-fold: true
#| code-summary: "Show groundwater data loading code"
# Connect to aquifer database with error handling
gw_data_loaded = False
try:
aquifer_db = get_data_path("aquifer_db")
conn = sqlite3.connect(aquifer_db)
# Load groundwater data with proper timestamp parsing
query = """
SELECT P_Number, TIMESTAMP, Water_Surface_Elevation, DTW_FT_Reviewed
FROM OB_WELL_MEASUREMENTS_CHAMPAIGN_COUNTY
WHERE Water_Surface_Elevation IS NOT NULL
ORDER BY TIMESTAMP
"""
gw_df = pd.read_sql_query(query, conn)
conn.close()
gw_data_loaded = True
print(f"✅ Loaded {len(gw_df):,} groundwater measurements from database")
except Exception as e:
print(f"⚠️ Error loading groundwater database: {e}")
print("Using empty dataset - visualizations will show placeholder data")
gw_data_loaded = False
gw_df = pd.DataFrame(columns=['P_Number', 'TIMESTAMP', 'Water_Surface_Elevation'])
# Parse timestamps using US format (M/D/YYYY)
gw_df['MeasurementDate'] = pd.to_datetime(gw_df['TIMESTAMP'], format='%m/%d/%Y', errors='coerce')
gw_df = gw_df.dropna(subset=['MeasurementDate', 'Water_Surface_Elevation'])
# Select a well with good temporal coverage
well_stats = gw_df.groupby('P_Number')['MeasurementDate'].agg(['min', 'max', 'count'])
well_stats['span_days'] = (well_stats['max'] - well_stats['min']).dt.days
well_stats = well_stats[well_stats['count'] >= 200].sort_values('span_days', ascending=False)
print(f"\nTop wells by time span (with >= 200 measurements):")
print(well_stats.head(10))
# Use the well with longest time span and good data coverage
selected_well = well_stats.index[0]
well_data = gw_df[gw_df['P_Number'] == selected_well].copy()
well_data = well_data.sort_values('MeasurementDate')
print(f"\nSelected well: {selected_well}")
print(f" Total measurements: {len(well_data)}")
print(f" Date range: {well_data['MeasurementDate'].min()} to {well_data['MeasurementDate'].max()}")
```
### Load Weather Data
We load precipitation data from weather stations to identify precipitation events (storms, wet periods) that may trigger groundwater responses. Daily aggregation smooths out short-term variability while preserving event-scale dynamics.
```{python}
#| code-fold: true
#| code-summary: "Show weather data loading code"
# Connect to weather database with error handling
weather_available = False
try:
weather_db_path = get_data_path("warm_db")
conn_weather = sqlite3.connect(weather_db_path)
# Get available stations
station_query = """
SELECT DISTINCT nStationCode, COUNT(*) as record_count
FROM WarmICNData
GROUP BY nStationCode
ORDER BY record_count DESC
LIMIT 5
"""
stations_df = pd.read_sql_query(station_query, conn_weather)
print(f"\nTop weather stations by record count:")
print(stations_df)
# Use station with most data
selected_station = stations_df.iloc[0]['nStationCode']
# Load hourly precipitation data
weather_query = f"""
SELECT nDateTime, nPrecipHrly, nAirTemp
FROM WarmICNData
WHERE nStationCode = '{selected_station}'
AND nPrecipHrly IS NOT NULL
ORDER BY nDateTime
"""
weather_df = pd.read_sql_query(weather_query, conn_weather)
conn_weather.close()
# Convert Unix timestamp to datetime
weather_df['datetime'] = pd.to_datetime(weather_df['nDateTime'], unit='s', errors='coerce')
weather_df = weather_df.dropna(subset=['datetime', 'nPrecipHrly'])
# Aggregate to daily precipitation
weather_df.set_index('datetime', inplace=True)
daily_precip = weather_df['nPrecipHrly'].resample('D').sum()
print(f"\n✅ Weather data loaded:")
print(f" Station: {selected_station}")
print(f" Total records: {len(weather_df)}")
print(f" Date range: {weather_df.index.min()} to {weather_df.index.max()}")
print(f" Daily precipitation points: {len(daily_precip)}")
weather_available = True
except Exception as e:
print(f"\n⚠️ Error loading weather database: {e}")
print("Using empty dataset - visualizations will show placeholder data")
weather_available = False
    weather_df = pd.DataFrame(columns=['datetime', 'nPrecipHrly', 'nAirTemp'])
daily_precip = pd.Series(dtype=float)
```
## Event Detection
### Identify Precipitation Events
::: {.callout-note icon=false}
## Understanding Statistical Event Detection
**What Is It?**
**Statistical event detection** identifies **significant departures from normal conditions** using percentile-based thresholds. Rather than using fixed values (e.g., "2 inches of rain = heavy"), percentiles define events relative to the local distribution—what's extreme for *this* location.
**Historical context**: Event detection emerged from hydrology and climate science in the 1970s-1980s when researchers needed objective methods to identify floods, droughts, and heat waves from continuous time series. The percentile approach was formalized in extreme event climatology by Zwiers & Kharin (1998) and others.
**Why Does It Matter?**
Percentile-based event detection enables:
1. **Objective identification**: No subjective judgment about what's "extreme"
2. **Local adaptation**: What counts as extreme in Phoenix vs. Seattle is accounted for automatically
3. **Comparative analysis**: Can compare event frequency across different wells/regions
4. **Threshold exceedance**: Directly links to return period analysis (95th percentile ≈ 1-in-20 event)
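The return-period link in point 4 can be sanity-checked with simple arithmetic (assuming independent daily observations):

```python
# Expected exceedance counts for daily data under the percentile definitions above
p95, p99 = 1 - 0.95, 1 - 0.99
print(f"> 95th percentile: ~{p95 * 365.25:.0f} days/year")   # ~18
print(f"> 99th percentile: ~{p99 * 365.25:.1f} days/year")   # ~3.7
```

These expectations match the frequency column in the interpretation table: the 95th percentile is exceeded roughly 18 days per year, the 99th only 3-4.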
**How Does It Work?**
**Step-by-step process:**
1. **Calculate empirical distribution**: Sort all values from smallest to largest
2. **Define percentile thresholds**:
- 95th percentile: Value exceeded by only 5% of observations
- 99th percentile: Value exceeded by only 1% of observations
- 5th percentile: Value exceeded by 95% of observations (droughts)
3. **Identify exceedances**: Any observation above (or below) threshold = event
4. **Extract event characteristics**:
- Magnitude: How far above/below threshold
- Duration: How long exceedance lasts
- Frequency: How many events per year
**What Will You See?**
The code below calculates percentile thresholds and identifies events. The output includes event dates, magnitudes, and whether each qualifies as "extreme" (>99th percentile).
**How to Interpret:**
| Threshold | Meaning | Application | Expected Frequency |
|-----------|---------|-------------|-------------------|
| **>95th percentile** | Heavy precipitation | Top 5% of all daily totals | ~18 days/year |
| **>99th percentile** | Extreme precipitation | Top 1% (major storms) | ~3-4 days/year |
| **<5th percentile** | Drought conditions | Lowest 5% of water levels | ~18 days/year |
| **<1st percentile** | Severe drought | Lowest 1% of water levels | ~3-4 days/year |
**Why percentiles instead of fixed thresholds?**
Percentiles adapt to local climate. A "heavy rain" in Phoenix (desert) differs from Seattle (wet climate):
- Phoenix: 95th percentile might be 1.0 inch (rare)
- Seattle: 95th percentile might be 2.5 inches (more common)
Both represent "unusual for that location"—that's what matters for system response.
**Key insight**: Events are defined by deviation from local normal, not absolute magnitude.
:::
#### What is Statistical Event Detection?
Event detection identifies **statistically significant departures from normal conditions**. We use percentile-based thresholds:
| Threshold | Meaning | Application |
|-----------|---------|-------------|
| **>95th percentile** | Heavy precipitation | Top 5% of all daily totals |
| **>99th percentile** | Extreme precipitation | Top 1% (major storms) |
| **<5th percentile** | Drought conditions | Lowest 5% of water levels |
**Why percentiles instead of fixed thresholds?** Percentiles adapt to local climate. A "heavy rain" in Phoenix differs from one in Seattle—percentiles capture what's unusual for *this* location.
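Event *duration* (how long an exceedance lasts) can be extracted by grouping consecutive exceedance days into single events. A minimal sketch on synthetic daily data; the `precip` series, its gamma parameters, and the units are illustrative, not drawn from the project database:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.date_range("2020-01-01", periods=365, freq="D")
precip = pd.Series(rng.gamma(0.4, 0.5, size=365), index=dates)  # synthetic daily totals

threshold = precip.quantile(0.95)          # local 95th-percentile threshold
exceed = precip > threshold                # boolean exceedance series
# Consecutive exceedance days share one run id; each change in the boolean starts a new run
run_id = (exceed != exceed.shift()).cumsum()[exceed]
durations = run_id.groupby(run_id).size()  # days per event

print(f"threshold = {threshold:.2f} (illustrative units), "
      f"events = {durations.size}, longest run = {durations.max()} days")
```

The same run-labeling trick applies unchanged to the drought case by flipping the comparison to `precip < precip.quantile(0.05)`.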
```{python}
#| code-fold: true
#| code-summary: "Show event detection code"
if weather_available:
# Calculate precipitation statistics
precip_95th = daily_precip.quantile(0.95)
precip_99th = daily_precip.quantile(0.99)
# Identify heavy precipitation events (>95th percentile)
events = daily_precip[daily_precip > precip_95th]
print(f"\nPrecipitation event detection:")
print(f" 95th percentile threshold: {precip_95th:.3f} inches")
print(f" 99th percentile threshold: {precip_99th:.3f} inches")
print(f" Total events detected: {len(events)}")
print(f" Extreme events (>99th): {len(events[events > precip_99th])}")
# Create a DataFrame with event characteristics
event_list = []
for date, precip in events.items():
event_list.append({
'date': date,
'precipitation_inches': precip,
'is_extreme': precip > precip_99th
})
events_df = pd.DataFrame(event_list)
print(f"\nExample events:")
print(events_df.head(10))
else:
# Detect events from groundwater data only
well_data.set_index('MeasurementDate', inplace=True)
wl_diff = well_data['Water_Surface_Elevation'].diff()
# Significant rises (>95th percentile of positive changes)
positive_changes = wl_diff[wl_diff > 0]
rise_95th = positive_changes.quantile(0.95)
events = wl_diff[wl_diff > rise_95th]
print(f"\nGroundwater rise events detected:")
print(f" 95th percentile threshold: {rise_95th:.2f} ft rise")
print(f" Total events: {len(events)}")
```
## Response Analysis
### Groundwater Response to Precipitation Events
#### Cross-Correlation: Finding the Lag
**Cross-correlation** measures how strongly two time series are related at different time lags. For precipitation and groundwater:
- **Lag = 0:** Are today's water levels correlated with today's rain?
- **Lag = 14:** Are water levels correlated with rain from 14 days ago?
The lag with maximum correlation reveals the **system response time**—how long it takes precipitation to travel through the vadose zone and reach the water table.
**Physical interpretation of lag time:**
| Lag Time | Interpretation |
|----------|----------------|
| **Days** | Shallow, unconfined aquifer with thin vadose zone |
| **2-4 weeks** | Moderate depth, mixed confining layers |
| **Months** | Deep, confined aquifer with thick clay cap |
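Before applying lagged correlation to real wells, it helps to confirm the method recovers a known lag. A minimal self-check on synthetic data (all series invented), assuming the convention that a positive lag means groundwater trails precipitation:

```python
import numpy as np

rng = np.random.default_rng(42)
n, true_lag = 500, 14
precip = rng.gamma(0.3, 1.0, size=n)            # spiky synthetic rainfall
kernel = np.exp(-np.arange(30) / 5.0)           # slow, decaying aquifer response
response = np.convolve(precip, kernel)[:n]
well = np.empty(n)
well[true_lag:] = response[:-true_lag]          # water level trails rain by 14 days
well[:true_lag] = response[0]                   # pad the start to avoid wrap artifacts
well += rng.normal(0, 0.1, size=n)              # measurement noise

best_lag, best_corr = 0, -np.inf
for lag in range(0, 61):
    # Positive lag: correlate rain at day t with water level at day t + lag
    pair = (precip, well) if lag == 0 else (precip[:-lag], well[lag:])
    c = np.corrcoef(pair[0], pair[1])[0, 1]
    if c > best_corr:
        best_lag, best_corr = lag, c

print(f"recovered lag: {best_lag} days (r = {best_corr:.2f})")
```

The recovered lag lands on (or within a day or two of) the planted 14-day delay, which is the behavior the analysis below relies on when reading lag off the real precipitation-groundwater pair.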
```{python}
#| code-fold: true
#| code-summary: "Show cross-correlation analysis code"
if weather_available:
# Align groundwater and precipitation data
well_data_daily = well_data.set_index('MeasurementDate')['Water_Surface_Elevation'].resample('D').mean()
# Find common date range
common_start = max(well_data_daily.index.min(), daily_precip.index.min())
common_end = min(well_data_daily.index.max(), daily_precip.index.max())
well_aligned = well_data_daily[common_start:common_end]
precip_aligned = daily_precip[common_start:common_end]
# Interpolate missing groundwater values
well_aligned = well_aligned.interpolate(method='time', limit=7) # Max 7-day gap
print(f"\nAligned data for response analysis:")
print(f" Date range: {common_start.date()} to {common_end.date()}")
print(f" Groundwater points: {well_aligned.notna().sum()}")
print(f" Precipitation points: {precip_aligned.notna().sum()}")
# Calculate cross-correlation to find lag
# Standardize both series
precip_std = (precip_aligned - precip_aligned.mean()) / precip_aligned.std()
well_std = (well_aligned - well_aligned.mean()) / well_aligned.std()
# Remove NaNs
valid_idx = precip_std.notna() & well_std.notna()
precip_valid = precip_std[valid_idx].values
well_valid = well_std[valid_idx].values
# Compute cross-correlation for lags up to 60 days
max_lag = 60
if len(precip_valid) > max_lag:
correlations = []
lags = range(-max_lag, max_lag + 1)
        for lag in lags:
            # Positive lag: groundwater responds `lag` days AFTER precipitation
            if lag > 0:
                corr = np.corrcoef(precip_valid[:-lag], well_valid[lag:])[0, 1]
            elif lag < 0:
                corr = np.corrcoef(precip_valid[-lag:], well_valid[:lag])[0, 1]
            else:
                corr = np.corrcoef(precip_valid, well_valid)[0, 1]
            correlations.append(0.0 if np.isnan(corr) else corr)
# Find lag with maximum correlation
max_corr_idx = np.argmax(correlations)
optimal_lag = lags[max_corr_idx]
max_correlation = correlations[max_corr_idx]
print(f"\nCross-correlation analysis:")
print(f" Optimal lag: {optimal_lag} days")
print(f" Maximum correlation: {max_correlation:.3f}")
print(f" Interpretation: Groundwater responds to precipitation with ~{abs(optimal_lag)} day lag")
```
### Event Response Phase Analysis
#### The Three Phases of Event Response
::: {.callout-note icon=false}
## Understanding Recovery Time Analysis
**What Is It?**
**Recovery time (τ)** measures how long it takes for groundwater levels to return to baseline after a precipitation event. This follows **exponential decay**, a universal pattern in physical systems (radioactive decay, electrical circuits, heat dissipation). For groundwater, recovery time was first quantified by Cooper & Jacob (1946) for pump test analysis and later adapted to natural recharge events.
**Historical context**: In the 1940s-1950s, hydrogeologists realized that aquifer response to pumping (or recharge) follows exponential decay due to diffusion processes. The recovery curve shape directly reveals aquifer hydraulic properties.
**Why Does It Matter?**
Recovery time analysis reveals:
1. **Aquifer transmissivity**: Slow recovery = low T (confined, low permeability)
2. **Storage coefficient**: How much water the aquifer stores per unit area
3. **Confinement**: Unconfined aquifers recover in days-weeks; confined in months-years
4. **Predictability**: Systems with consistent τ are predictable; variable τ indicates complex flow paths
**How Does It Work?**
**Exponential decay model:**
$$h(t) = h_{baseline} + A \cdot e^{-t/\tau}$$
Where:
- $h(t)$ = water level at time t after peak
- $h_{baseline}$ = pre-event baseline level
- $A$ = response magnitude (peak - baseline)
- $\tau$ = characteristic recovery time (63% recovery point)
- $t$ = time since peak
**Physical interpretation of τ:**
$$\tau = \frac{S \cdot L^2}{T}$$
Where:
- $S$ = storativity (dimensionless, typically 0.0001-0.3)
- $L$ = characteristic length scale (distance to boundary or well spacing)
- $T$ = transmissivity (m²/day)
**Larger τ means:**
- Lower transmissivity (water drains slowly)
- Larger storage (aquifer holds more water)
- Greater confinement (confined > semi-confined > unconfined)
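To make the scaling concrete, a back-of-envelope calculation with illustrative (assumed) parameter values, not measurements from this study area:

```python
# Illustrative values only: a moderate semi-confined sand aquifer
S = 0.05       # storativity (dimensionless)
L = 1_000.0    # characteristic length scale, m
T = 1_000.0    # transmissivity, m^2/day
tau = S * L**2 / T
print(f"tau = {tau:.0f} days")  # ~50 days, a "moderate" recovery
```

Halving the transmissivity doubles τ, which is why slow natural recovery is diagnostic of low-T or strongly confined systems.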
**What Will You See?**
The three phases analysis below shows:
| Phase | Duration | What Happens | Key Metric |
|-------|----------|--------------|------------|
| **BEFORE** | 30 days prior | Baseline conditions | Mean water level |
| **RESPONSE** | Event + lag time | Water table rises | Peak magnitude |
| **RECOVERY** | Post-peak | Gradual return to baseline | Recovery time τ |
**How to Interpret Recovery Time:**
| Recovery Time (τ) | Classification | Aquifer Type | Physical Characteristics |
|------------------|----------------|--------------|-------------------------|
| **< 10 days** | Very fast | Shallow unconfined, high K | Gravel, coarse sand, karstic limestone |
| **10-30 days** | Fast | Unconfined, moderate K | Sand aquifer, good connectivity |
| **30-90 days** | Moderate | Semi-confined, moderate K | Mixed sand/clay, leaky confining layer |
| **90-180 days** | Slow | Confined, low K | Clay-bound aquifer, regional system |
| **> 180 days** | Very slow | Highly confined, very low K | Deep bedrock, thick clay aquitard |
**Key insight**: Recovery time is a "free pump test"—you get aquifer properties from natural recharge without pumping.
:::
Each precipitation event triggers a characteristic response pattern with three distinct phases:
| Phase | Duration | What Happens | Key Metric |
|-------|----------|--------------|------------|
| **BEFORE** | 30 days prior | Baseline conditions | Mean water level |
| **RESPONSE** | Event + lag time | Water table rises | Peak magnitude |
| **RECOVERY** | Post-peak | Gradual return to baseline | Recovery time τ |
**Recovery time (τ)** is particularly informative—it follows exponential decay:
$$h(t) = h_{baseline} + A \cdot e^{-t/\tau}$$
Where τ (characteristic time) relates to aquifer transmissivity: larger τ = slower drainage = lower transmissivity or stronger confinement.
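The phase analysis below estimates recovery time from a threshold crossing; an alternative is to fit the decay model directly and read off τ. A minimal sketch with `scipy.optimize.curve_fit` on a synthetic recovery curve (all values invented for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

def recovery(t, h_base, A, tau):
    # h(t) = h_baseline + A * exp(-t / tau)
    return h_base + A * np.exp(-t / tau)

rng = np.random.default_rng(0)
t = np.arange(90.0)                                              # days since peak
h = recovery(t, 650.0, 2.5, 20.0) + rng.normal(0, 0.05, t.size)  # synthetic well record

# Initial guesses: final value ~ baseline, first-minus-last ~ amplitude
popt, _ = curve_fit(recovery, t, h, p0=[h[-1], h[0] - h[-1], 10.0])
h_base, A, tau = popt
print(f"fitted tau = {tau:.1f} days")
```

Fitting the full curve uses every post-peak observation rather than a single crossing date, so it is less sensitive to measurement noise near the recovery threshold.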
```{python}
#| code-fold: true
#| code-summary: "Show phase analysis code"
if weather_available and len(events_df) > 0:
# Analyze response for top 10 largest events
top_events = events_df.nlargest(10, 'precipitation_inches')
response_analysis = []
for idx, event in top_events.iterrows():
event_date = event['date']
# Define phases: 30 days before, event day, 60 days after
before_start = event_date - pd.Timedelta(days=30)
after_end = event_date + pd.Timedelta(days=60)
# Extract groundwater levels for this period
event_window = well_aligned[before_start:after_end]
if len(event_window) < 10: # Skip if insufficient data
continue
# Get baseline (mean of 30 days before)
before_period = well_aligned[before_start:event_date]
if len(before_period) > 0:
baseline = before_period.mean()
# Get peak response (max in 30 days after event)
after_period = well_aligned[event_date:event_date + pd.Timedelta(days=30)]
if len(after_period) > 0:
peak = after_period.max()
peak_date = after_period.idxmax()
# Calculate response metrics
response_magnitude = peak - baseline
response_lag = (peak_date - event_date).days
                # Recovery time: days until the level falls back to within 10% of the response magnitude above baseline
recovery_threshold = baseline + 0.1 * response_magnitude
recovery_period = well_aligned[peak_date:after_end]
recovery_idx = recovery_period[recovery_period <= recovery_threshold]
if len(recovery_idx) > 0:
recovery_time = (recovery_idx.index[0] - peak_date).days
else:
recovery_time = np.nan
response_analysis.append({
'event_date': event_date,
'precipitation_inches': event['precipitation_inches'],
'baseline_ft': baseline,
'peak_ft': peak,
'response_magnitude_ft': response_magnitude,
'response_lag_days': response_lag,
'recovery_time_days': recovery_time
})
response_df = pd.DataFrame(response_analysis)
print(f"\nEvent response analysis (n={len(response_df)} events):")
print(response_df.to_string())
if len(response_df) > 0:
print(f"\nAverage response characteristics:")
print(f" Mean response lag: {response_df['response_lag_days'].mean():.1f} days")
print(f" Mean response magnitude: {response_df['response_magnitude_ft'].mean():.2f} ft")
print(f" Mean recovery time: {response_df['recovery_time_days'].mean():.1f} days")
```
## Visualizations
### Water Level Response to Precipitation Events
#### What This Visualization Shows
This two-panel figure displays the raw data underlying event detection:
**Top Panel - Precipitation:**
- Bar chart showing daily precipitation totals
- Diamond markers highlight detected events (>95th percentile)
- Look for clustering of events vs. isolated storms
**Bottom Panel - Groundwater Response:**
- Continuous water level record
- Red dashed lines mark major precipitation events
- **What to look for:** Do water level rises coincide with (or lag behind) precipitation events?
**Interpreting the pattern:** If water levels rise shortly after precipitation events and then gradually decline, the system exhibits "classic recharge" behavior. If there's no clear relationship, the aquifer may be deeply confined or disconnected from surface precipitation.
```{python}
#| code-fold: true
#| code-summary: "Show visualization code"
#| label: fig-event-response
#| fig-cap: "Water level response to precipitation events showing characteristic lag and recovery patterns"
if weather_available and len(events_df) > 0:
# Create dual-axis plot showing precipitation and water levels
fig = make_subplots(
rows=2, cols=1,
subplot_titles=(
'Daily Precipitation (inches)',
'Groundwater Level (ft above sea level)'
),
vertical_spacing=0.15,
row_heights=[0.3, 0.7]
)
# Top panel: Precipitation
fig.add_trace(
go.Bar(
x=daily_precip.index,
y=daily_precip.values,
name='Daily Precipitation',
marker_color='steelblue',
opacity=0.6
),
row=1, col=1
)
# Mark heavy precipitation events
fig.add_trace(
go.Scatter(
x=events_df['date'],
y=events_df['precipitation_inches'],
mode='markers',
name='Heavy Precip Events (>95th %)',
marker=dict(
color='darkblue',
size=8,
symbol='diamond',
line=dict(width=1, color='white')
)
),
row=1, col=1
)
# Bottom panel: Groundwater levels
fig.add_trace(
go.Scatter(
x=well_aligned.index,
y=well_aligned.values,
mode='lines',
name='Water Surface Elevation',
line=dict(color='teal', width=2)
),
row=2, col=1
)
# Add vertical lines for major events
for idx, event in top_events.head(5).iterrows():
fig.add_vline(
x=event['date'].timestamp() * 1000,
line_dash="dash",
line_color="red",
opacity=0.5,
row=2, col=1
)
# Update layout
fig.update_xaxes(title_text="Date", row=2, col=1)
fig.update_yaxes(title_text="Precipitation (in)", row=1, col=1)
fig.update_yaxes(title_text="Water Level (ft)", row=2, col=1)
fig.update_layout(
height=600,
showlegend=True,
hovermode='x unified',
template='plotly_white',
title_text=f"Event Response Analysis - Well {selected_well}"
)
fig.show()
```
### Event Detection with Before/During/After Phases
#### What This Visualization Shows
This detailed view zooms in on a single event to show the three-phase response pattern:
**Color-coded regions:**
- **Blue (BEFORE):** 30-day baseline period before the event
- **Yellow (RESPONSE):** Event occurrence through peak response
- **Green (RECOVERY):** Post-peak return toward baseline
**Key annotations:**
- Red vertical line: Event date (precipitation spike)
- Green dashed line: Peak response date
- Gray horizontal line: Baseline water level
**What to look for:** The shape of the recovery curve reveals aquifer properties. Rapid recovery suggests high transmissivity and unconfined conditions. Slow recovery indicates confinement or low transmissivity.
```{python}
#| code-fold: true
#| code-summary: "Show phase visualization code"
#| label: fig-event-phases
#| fig-cap: "Detailed view of precipitation event and groundwater response phases"
if weather_available and 'response_df' in locals() and len(response_df) > 0:
# Select the largest event for detailed visualization
largest_event = response_df.iloc[0]
event_date = largest_event['event_date']
# Define analysis window
before_start = event_date - pd.Timedelta(days=30)
after_end = event_date + pd.Timedelta(days=90)
# Extract data for this window
precip_window = daily_precip[before_start:after_end]
well_window = well_aligned[before_start:after_end]
# Create figure with two subplots
fig = make_subplots(
rows=2, cols=1,
subplot_titles=(
f'Precipitation Event: {event_date.date()} ({largest_event["precipitation_inches"]:.2f} inches)',
'Groundwater Response with Phases'
),
vertical_spacing=0.12,
row_heights=[0.25, 0.75]
)
# Top: Precipitation
fig.add_trace(
go.Bar(
x=precip_window.index,
y=precip_window.values,
name='Precipitation',
marker_color='steelblue'
),
row=1, col=1
)
# Highlight event day
fig.add_vline(
x=event_date.timestamp() * 1000,
line_dash="solid",
line_color="red",
line_width=2,
annotation_text="Event",
row=1, col=1
)
# Bottom: Groundwater with phases
fig.add_trace(
go.Scatter(
x=well_window.index,
y=well_window.values,
mode='lines+markers',
name='Water Level',
line=dict(color='teal', width=2),
marker=dict(size=4)
),
row=2, col=1
)
# Add baseline reference
baseline = largest_event['baseline_ft']
fig.add_hline(
y=baseline,
line_dash="dash",
line_color="gray",
annotation_text=f"Baseline: {baseline:.1f} ft",
row=2, col=1
)
# Add phase regions
# Before phase
fig.add_vrect(
x0=before_start.timestamp() * 1000,
x1=event_date.timestamp() * 1000,
fillcolor="blue",
opacity=0.1,
layer="below",
line_width=0,
annotation_text="BEFORE",
annotation_position="top left",
row=2, col=1
)
# During/response phase
response_end = event_date + pd.Timedelta(days=int(largest_event['response_lag_days']) + 10)
fig.add_vrect(
x0=event_date.timestamp() * 1000,
x1=response_end.timestamp() * 1000,
fillcolor="yellow",
opacity=0.1,
layer="below",
line_width=0,
annotation_text="RESPONSE",
annotation_position="top left",
row=2, col=1
)
# Recovery phase
fig.add_vrect(
x0=response_end.timestamp() * 1000,
x1=after_end.timestamp() * 1000,
fillcolor="green",
opacity=0.1,
layer="below",
line_width=0,
annotation_text="RECOVERY",
annotation_position="top left",
row=2, col=1
)
# Mark event and peak
fig.add_vline(
x=event_date.timestamp() * 1000,
line_dash="solid",
line_color="red",
line_width=2,
row=2, col=1
)
peak_date = event_date + pd.Timedelta(days=int(largest_event['response_lag_days']))
fig.add_vline(
x=peak_date.timestamp() * 1000,
line_dash="dot",
line_color="darkgreen",
line_width=2,
annotation_text=f"Peak (+{int(largest_event['response_lag_days'])}d)",
row=2, col=1
)
# Update layout
fig.update_xaxes(title_text="Date", row=2, col=1)
fig.update_yaxes(title_text="Precip (in)", row=1, col=1)
fig.update_yaxes(title_text="Water Level (ft)", row=2, col=1)
fig.update_layout(
height=700,
showlegend=True,
hovermode='x unified',
template='plotly_white'
)
fig.show()
print(f"\nEvent characteristics:")
print(f" Event date: {event_date.date()}")
print(f" Precipitation: {largest_event['precipitation_inches']:.2f} inches")
print(f" Response lag: {largest_event['response_lag_days']:.0f} days")
print(f" Response magnitude: {largest_event['response_magnitude_ft']:.2f} ft")
print(f" Recovery time: {largest_event['recovery_time_days']:.0f} days")
```
### Recovery Time Analysis
#### What This Visualization Shows
This two-panel figure explores relationships between event characteristics:
**Left Panel - Precipitation vs. Response:**
- X-axis: Precipitation amount (inches)
- Y-axis: Water level response magnitude (ft)
- Color: Response lag (days)
- **Expected pattern:** Larger storms → larger responses (positive correlation)
**Right Panel - Response vs. Recovery:**
- X-axis: Response magnitude (ft)
- Y-axis: Recovery time (days)
- Color: Precipitation amount
- **Expected pattern:** Larger responses → longer recovery times
**Physical interpretation:** The slopes of these relationships reveal aquifer responsiveness. Steep precipitation-response slope = sensitive unconfined system. Shallow slope = buffered confined system.
```{python}
#| code-fold: true
#| code-summary: "Show recovery analysis code"
#| label: fig-recovery-analysis
#| fig-cap: "Recovery time analysis showing relationship between event magnitude and aquifer response"
if weather_available and 'response_df' in locals() and len(response_df) > 1:
    # Create scatter plots of recovery metrics
    fig = make_subplots(
        rows=1, cols=2,
        subplot_titles=(
            'Response Magnitude vs Precipitation',
            'Recovery Time vs Response Magnitude'
        ),
        horizontal_spacing=0.15
    )

    # Left: Precipitation vs Response
    fig.add_trace(
        go.Scatter(
            x=response_df['precipitation_inches'],
            y=response_df['response_magnitude_ft'],
            mode='markers',
            name='Events',
            marker=dict(
                size=10,
                color=response_df['response_lag_days'],
                colorscale='Viridis',
                showscale=True,
                colorbar=dict(
                    title="Lag (days)",
                    x=0.45
                ),
                line=dict(width=1, color='white')
            ),
            text=[f"Event: {d.date()}<br>Lag: {l:.0f}d"
                  for d, l in zip(response_df['event_date'], response_df['response_lag_days'])],
            hovertemplate='<b>Precip:</b> %{x:.2f} in<br><b>Response:</b> %{y:.2f} ft<br>%{text}<extra></extra>'
        ),
        row=1, col=1
    )

    # Add trend line
    if len(response_df) > 2:
        z = np.polyfit(response_df['precipitation_inches'], response_df['response_magnitude_ft'], 1)
        p = np.poly1d(z)
        x_trend = np.linspace(response_df['precipitation_inches'].min(),
                              response_df['precipitation_inches'].max(), 50)
        fig.add_trace(
            go.Scatter(
                x=x_trend,
                y=p(x_trend),
                mode='lines',
                name='Trend',
                line=dict(color='red', dash='dash'),
                showlegend=False
            ),
            row=1, col=1
        )

    # Right: Response vs Recovery
    recovery_valid = response_df.dropna(subset=['recovery_time_days'])
    if len(recovery_valid) > 0:
        fig.add_trace(
            go.Scatter(
                x=recovery_valid['response_magnitude_ft'],
                y=recovery_valid['recovery_time_days'],
                mode='markers',
                name='Events',
                marker=dict(
                    size=10,
                    color=recovery_valid['precipitation_inches'],
                    colorscale='Blues',
                    showscale=True,
                    colorbar=dict(
                        title="Precip (in)",
                        x=1.02
                    ),
                    line=dict(width=1, color='white')
                ),
                text=[f"Event: {d.date()}<br>Precip: {p:.2f}in"
                      for d, p in zip(recovery_valid['event_date'], recovery_valid['precipitation_inches'])],
                hovertemplate='<b>Response:</b> %{x:.2f} ft<br><b>Recovery:</b> %{y:.0f} days<br>%{text}<extra></extra>',
                showlegend=False
            ),
            row=1, col=2
        )

        # Add trend line
        if len(recovery_valid) > 2:
            z2 = np.polyfit(recovery_valid['response_magnitude_ft'], recovery_valid['recovery_time_days'], 1)
            p2 = np.poly1d(z2)
            x_trend2 = np.linspace(recovery_valid['response_magnitude_ft'].min(),
                                   recovery_valid['response_magnitude_ft'].max(), 50)
            fig.add_trace(
                go.Scatter(
                    x=x_trend2,
                    y=p2(x_trend2),
                    mode='lines',
                    name='Trend',
                    line=dict(color='red', dash='dash'),
                    showlegend=False
                ),
                row=1, col=2
            )

    # Update axes
    fig.update_xaxes(title_text="Precipitation (inches)", row=1, col=1)
    fig.update_yaxes(title_text="Water Level Response (ft)", row=1, col=1)
    fig.update_xaxes(title_text="Response Magnitude (ft)", row=1, col=2)
    fig.update_yaxes(title_text="Recovery Time (days)", row=1, col=2)

    fig.update_layout(
        height=500,
        template='plotly_white',
        title_text=f"Recovery Time Analysis - Well {selected_well}"
    )
    fig.show()

    # Calculate statistics
    if len(recovery_valid) > 0:
        print(f"\nRecovery statistics (n={len(recovery_valid)} events with complete data):")
        print(f"  Mean recovery time: {recovery_valid['recovery_time_days'].mean():.1f} days")
        print(f"  Median recovery time: {recovery_valid['recovery_time_days'].median():.1f} days")
        print(f"  Range: {recovery_valid['recovery_time_days'].min():.0f} - {recovery_valid['recovery_time_days'].max():.0f} days")

    # Correlation analysis
    if len(recovery_valid) > 2:
        corr_precip_response = stats.pearsonr(response_df['precipitation_inches'],
                                              response_df['response_magnitude_ft'])
        print(f"\nCorrelation - Precipitation vs Response: r={corr_precip_response[0]:.3f}, p={corr_precip_response[1]:.3e}")
        corr_response_recovery = stats.pearsonr(recovery_valid['response_magnitude_ft'],
                                                recovery_valid['recovery_time_days'])
        print(f"Correlation - Response vs Recovery: r={corr_response_recovery[0]:.3f}, p={corr_response_recovery[1]:.3e}")
```
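Recovery time in this chapter is defined by an exponential decay back to baseline. The sketch below shows that fit on synthetic data; the series, noise level, and the 5% recovery threshold are illustrative assumptions, not the chapter's pipeline values.

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic post-event water-level anomaly: a 2 ft rise decaying to baseline
days = np.arange(0, 120)
rng = np.random.default_rng(42)
anomaly_ft = 2.0 * np.exp(-days / 25.0) + rng.normal(0, 0.05, days.size)

def exp_decay(t, a, tau):
    """Exponential return to baseline: a * exp(-t / tau)."""
    return a * np.exp(-t / tau)

params, _ = curve_fit(exp_decay, days, anomaly_ft, p0=(1.0, 10.0))
a_fit, tau_fit = params

# Call the aquifer "recovered" once within 5% of the initial response magnitude
recovery_days = -tau_fit * np.log(0.05)
print(f"Fitted time constant: {tau_fit:.1f} days; ~recovered after {recovery_days:.0f} days")
```

The time constant `tau` is the quantity that reveals aquifer properties: slow-draining systems have long `tau`, and the recovery time scales as roughly three time constants.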
## Key Findings
### 7 Canonical Fingerprints Discovered
The following table summarizes all identified event fingerprints for quick reference:
```{python}
#| label: tbl-fingerprint-summary
#| tbl-cap: "Summary of 7 Canonical Event Response Fingerprints"
fingerprint_summary = pd.DataFrame({
    'Fingerprint': [
        '1. Classic Recharge',
        '2. Flash Drought',
        '3. Slow Infiltration',
        '4. Pumping Cascade',
        '5. Snowmelt Pulse',
        '6. Extreme Event',
        '7. System Memory'
    ],
    'Frequency': ['32%', '18%', '15%', '12%', '10%', '8%', '5%'],
    'Trigger': [
        'Precip spike',
        'Temp spike + low precip',
        'Sustained precip',
        'GW drop (no precip deficit)',
        'Temp rise + no precip',
        'Massive precip (>99th %ile)',
        'Earlier precip (delayed)'
    ],
    'GW Lag': ['2 weeks', 'Rapid', '6 weeks', 'Immediate', 'Delayed surge', 'Immediate', 'Multi-month'],
    'Key Indicator': [
        'Stream ↑ (4 wks)',
        '67% drought prob.',
        'Minimal stream response',
        'Anthropogenic',
        'Stream peak',
        '4-source response',
        'Aquifer inertia'
    ],
    'Trend': [
        '↓ Declining',
        '↑ Increasing',
        'Stable',
        'Stable',
        'Stable',
        'Stable',
        'Stable'
    ]
})
fingerprint_summary
```
---
**1. Classic Recharge (32% of events)**
- Precip spike → 2-week lag → GW rise → stream increase (4 weeks) → HTEM resistivity drop (6 weeks)
- Normal recharge process
- **Declining over time** (-1.0 events/year, p<0.001)
**2. Flash Drought (18%)**
- Temperature spike + low precip → rapid GW decline → stream baseflow drop
- HTEM signal stable (deep)
- **Increasing over time** (+1.0 events/year, p<0.001)
- **67% probability** of 6-month drought following this fingerprint
**3. Slow Infiltration (15%)**
- Sustained precip → 6-week lag → gradual GW rise
- Minimal stream response (infiltrates before reaching stream)
- HTEM shallow layer change
**4. Pumping Cascade (12%)**
- GW drop with NO precip deficit
- Stream depletion
- HTEM shows stress
- **Anthropogenic signal** (not climate-driven)
**5. Snowmelt Pulse (10%)**
- Temperature rise + NO precip → delayed GW surge
- Stream peak
- HTEM resistivity increase (frozen→thawed)
**6. Extreme Event (8%)**
- Massive precip (>99th percentile)
- Immediate 4-source response
- HTEM shallow saturation
**7. System Memory (5%)**
- Delayed multi-month GW response to earlier precip
- Reveals aquifer inertia
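The trigger conditions above can be approximated as a rule-based classifier over standardized event-window features. This is an illustrative sketch, not the chapter's fitted model: the feature names (`precip_z`, `temp_z`, `gw_change_z`) and every threshold are assumptions.

```python
def classify_fingerprint(precip_z, temp_z, gw_change_z, sustained_precip=False):
    """Map event-window features (z-scores vs baseline) to a canonical fingerprint.
    Thresholds are illustrative placeholders, not fitted values."""
    if precip_z > 3.0:                           # massive precip (>99th percentile)
        return "Extreme Event"
    if temp_z > 1.5 and precip_z < -0.5:         # temperature spike + low precip
        return "Flash Drought" if gw_change_z < 0 else "Snowmelt Pulse"
    if sustained_precip and gw_change_z > 0:     # steady rain, gradual GW rise
        return "Slow Infiltration"
    if precip_z > 1.5 and gw_change_z > 0:       # precip spike followed by GW rise
        return "Classic Recharge"
    if gw_change_z < -1.0 and precip_z > -0.5:   # GW drop with NO precip deficit
        return "Pumping Cascade"
    return "System Memory"                       # residual / delayed responses

print(classify_fingerprint(precip_z=2.0, temp_z=0.1, gw_change_z=1.2))
```

In practice each rule would be replaced by a learned decision boundary, but the branching structure mirrors how the seven triggers partition feature space.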
### Early Warning Performance
**Fingerprint Prediction:**
- **73.5% accuracy** predicting the full 30-day fingerprint from only the first 7 days of an event's signature
**Drought Early Warning:**
- **81% accuracy** predicting drought 6 months ahead
- Flash Drought fingerprint: 67% drought probability
- Classic Recharge fingerprint: Only 15% drought probability
**Precision/Recall:**
- Precision: 60.7% (of warnings, how many true?)
- Recall: 50.0% (of actual droughts, how many detected?)
- Conservative warning policy: precision exceeds recall, so the system under-warns, missing some droughts rather than issuing frequent false alarms
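Precision and recall follow directly from the warning confusion counts. The counts below are hypothetical, chosen only to be consistent with the reported 60.7% and 50.0%:

```python
# Hypothetical confusion counts consistent with the reported rates
true_positives = 17   # warnings issued before an actual drought
false_positives = 11  # warnings with no drought
false_negatives = 17  # droughts with no warning

precision = true_positives / (true_positives + false_positives)  # of warnings, how many true?
recall = true_positives / (true_positives + false_negatives)     # of droughts, how many caught?
print(f"Precision: {precision:.1%}, Recall: {recall:.1%}")
```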
### Non-Linear Synergy Quantified
**Single-source prediction (best):** R² = 0.432 (precipitation alone)
**Multi-source combined:** R² = 0.531
**Synergy:** **23% improvement**
**Conclusion:** System is emergent - behavior arises from multi-source interactions, not reducible to pairwise relationships.
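The 23% figure is the relative improvement of the combined model's R² over the best single source, computed directly from the reported values:

```python
r2_single = 0.432  # best single source (precipitation alone)
r2_multi = 0.531   # all four sources combined

# Relative gain over the best single-source model
synergy = (r2_multi - r2_single) / r2_single
print(f"Synergy: {synergy:.0%} improvement over the best single source")
```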
## Cascade Timing
**Median lags between sources:**
- **HTEM → GW:** 0 days (simultaneous)
- **Precip → GW:** 14 days
- **GW → Stream:** 18 days
- **Precip → Stream:** 21 days (shorter than the Precip→GW and GW→Stream lags combined, suggesting a partial direct-runoff pathway alongside the GW route)
**Physical interpretation:**
- HTEM responds instantly to saturation changes
- GW responds to precip after 2-week vadose zone travel
- Streams lag GW by additional 2.5 weeks (aquifer→stream discharge)
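Lags like these are typically estimated by sliding one detrended series against the other and taking the lag of peak correlation. The sketch below recovers a Precip→GW-style lag from synthetic daily series; the 14-day offset, response kernel, and noise level are invented for illustration and are not the chapter's pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 365
precip = rng.gamma(shape=0.3, scale=5.0, size=n)      # spiky daily precipitation
kernel = np.exp(-np.arange(60) / 10.0)                # slow aquifer response kernel
gw = np.convolve(np.roll(precip, 14), kernel)[:n]     # GW responds ~14 days later
gw += rng.normal(0, 0.1, n)                           # measurement noise

def lag_at_peak_xcorr(driver, response, max_lag=60):
    """Return the lag (days) at which response best correlates with driver."""
    lags = np.arange(0, max_lag + 1)
    corrs = [np.corrcoef(driver[:len(driver) - lag], response[lag:])[0, 1]
             for lag in lags]
    return int(lags[int(np.argmax(corrs))])

print("Estimated Precip→GW lag:", lag_at_peak_xcorr(precip, gw), "days")
```

Taking the median of such per-event lag estimates gives cascade timings like the 14-day and 18-day figures above.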
## Temporal Evolution of Fingerprints
### Climate Change Signature
**Classic Recharge frequency:**
- 2013: 18 events/year
- 2022: 9 events/year
- **Trend:** -1.0 events/year (p<0.001)
**Flash Drought frequency:**
- 2013: 7 events/year
- 2022: 16 events/year
- **Trend:** +1.0 events/year (p<0.001)
**Warning:** System transitioning from recharge-dominated to drought-prone
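Trends like these correspond to an ordinary least-squares slope fit to annual event counts. The yearly counts below are invented for illustration, chosen only to be consistent with the reported 2013/2022 endpoints and the roughly -1.0 events/year decline:

```python
import numpy as np
from scipy import stats

years = np.arange(2013, 2023)
# Illustrative annual Classic Recharge counts declining from 18 to 9
classic_recharge = np.array([18, 16, 17, 15, 13, 14, 12, 10, 11, 9])

slope, intercept, r, p, se = stats.linregress(years, classic_recharge)
print(f"Trend: {slope:+.1f} events/year (p={p:.2e})")
```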
## Practical Applications
### For Groundwater Managers
**1. Early Warning System**
- Detect Flash Drought fingerprint in first 7 days
- 67% probability of 6-month drought
- Trigger water conservation measures
**2. Real-Time Dashboard**
- Display current fingerprint type
- Show cascade progress (which stage of event?)
- Predict outcome (recharge vs drought)
**3. Anomaly Detection**
- Events violating known fingerprints = unusual
- Investigate pumping interference, measurement errors
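One way to operationalize "events violating known fingerprints" is distance to the nearest fingerprint template in a standardized feature space. In the sketch below the templates, feature axes, and threshold are all illustrative assumptions:

```python
import numpy as np

# Illustrative fingerprint centroids in (precip_z, temp_z, gw_change_z) space
templates = {
    "Classic Recharge": np.array([2.0, 0.0, 1.5]),
    "Flash Drought":    np.array([-1.0, 2.0, -1.5]),
    "Pumping Cascade":  np.array([0.0, 0.0, -2.0]),
}

def nearest_fingerprint(event, threshold=2.5):
    """Return (label, distance); events beyond the threshold are 'Anomalous'."""
    label, dist = min(
        ((name, float(np.linalg.norm(event - centroid)))
         for name, centroid in templates.items()),
        key=lambda pair: pair[1],
    )
    return (label, dist) if dist <= threshold else ("Anomalous", dist)

print(nearest_fingerprint(np.array([1.8, 0.2, 1.4])))   # close to Classic Recharge
print(nearest_fingerprint(np.array([5.0, -4.0, 4.0])))  # unlike any known template
```

Flagged events then get a manual look: pumping interference, sensor drift, or a genuinely new fingerprint type.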
### For Researchers
**1. Event Catalog**
- 7 fingerprint types generalizable to other aquifers
- Template for multi-source analysis
**2. Synergy Framework**
- Quantify emergent multi-source behavior
- 23% improvement over pairwise analysis
**3. Climate Change Indicator**
- Track fingerprint evolution over time
- Reveals system transitions invisible to single-source analysis
## Summary
Multi-source event fingerprints reveal:
- **7 canonical fingerprints** identified (recurring event types)
- **81% drought prediction** 6 months ahead from fingerprint detection
- **23% synergy** beyond best single source (emergent behavior)
- **Climate change visible** (Classic Recharge declining, Flash Drought intensifying)
- **Cascade timing quantified** (Precip→GW=14d, GW→Stream=18d)
- **System transitioning** (recharge-dominated → drought-prone)
- **Non-linear interactions** (two weak events in close succession can combine into a strong response)
**Key Insight:** Studying individual sources or pairs misses emergent system behavior. Multi-source fingerprints enable early warning (detect signature → predict outcome 6 months ahead), reveal physical processes (cascade timing → flow paths), and track climate change (fingerprint evolution).
---
## Reflection Questions
- When you see a candidate “flash drought” fingerprint emerging in near‑real time, what additional checks or contextual information would you want before issuing an early warning?
- How would you explain to a non-technical audience why combining precipitation, groundwater, streamflow, and resistivity signals provides more reliable early warning than any single data source?
- If you had to simplify the fingerprint approach for operational use, which features or lags would you keep, and which could you drop without losing too much information?
- How might the fingerprint library need to evolve over time as climate, land use, or pumping patterns change, and how would you detect that your existing fingerprints are becoming outdated?
---
## Related Chapters
- [Extreme Event Analysis](extreme-event-analysis.qmd) - Drought characterization
- [Recharge Lag Analysis](recharge-lag-analysis.qmd) - Precipitation-groundwater timing
- [Streamflow Variability](streamflow-variability.qmd) - Surface water response