---
title: "Precipitation Pattern Analysis"
subtitle: "Weather and climate temporal dynamics from 20M+ records"
code-fold: true
---
::: {.callout-tip icon=false}
## For Newcomers
**You will learn:**
- How to analyze rainfall patterns (frequency, intensity, seasonality)
- Why most days have zero rain but a few days deliver most of the water
- How precipitation trends are changing over time
- What "extreme value analysis" reveals about rare but important events
Not all rain reaches the aquifer—most evaporates or runs off. This chapter examines 20 million weather records to understand when, how much, and how intensely precipitation falls—the first step in understanding what actually recharges groundwater.
:::
## What You Will Learn in This Chapter
By the end of this chapter, you will be able to:
- Describe the key temporal features of precipitation in this region (occurrence, intensity distribution, seasonality, and dry spells).
- Interpret gamma-like precipitation amount distributions and explain why a small fraction of days deliver most of the water.
- Explain how long-term precipitation trends and dry-spell statistics affect recharge opportunities and drought risk.
- Identify which aspects of the precipitation record (mean, extremes, timing, persistence) matter most for groundwater management and modeling.
## Introduction
Precipitation is the primary driver of groundwater recharge. This chapter analyzes temporal precipitation patterns in a dataset of 20+ weather stations and 20 million records to show how climate forcing drives aquifer response.
**Key Questions:**
- What are the dominant precipitation patterns (frequency, intensity, duration)?
- How has precipitation changed over time (trends, extremes)?
- What seasonal and inter-annual cycles exist?
- How do dry spells and wet periods cluster temporally?
::: {.callout-note icon=false}
## 💻 For Computer Scientists
**Precipitation time series challenges:**
1. **Extreme zero-inflation:** 70-80% of days have zero precipitation
2. **Right-skewed amounts:** Wet-day totals are better described by a gamma or exponential distribution than by a normal distribution
3. **Intermittency:** Alternating dry/wet spells (Markov process)
4. **Seasonality:** Winter vs summer precipitation regimes
5. **Non-stationarity:** Climate change affects all moments
**Analysis approaches:**
- **Zero-inflated models:** Separate occurrence (binary) from amount (continuous)
- **Extreme value theory:** GEV for annual maxima, GPD for peaks-over-threshold
- **Run theory:** Analyze dry spell duration distributions
- **Spectral analysis:** Identify dominant periodicities
Standard correlation analysis assumes roughly continuous, symmetric data and performs poorly on series like this; event-based and distributional methods are a better fit.
:::
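The separation of occurrence from amount can be sketched on synthetic data. This is a minimal illustration, not the chapter's method: the transition probabilities and gamma parameters are assumptions chosen for the demo, and the series is simulated rather than read from the weather database.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic daily series (illustrative only, not the chapter's data):
# a two-state Markov chain decides wet/dry, a gamma draw sets wet-day amounts.
P_WET_AFTER_DRY = 0.20  # assumed transition probability
P_WET_AFTER_WET = 0.50  # assumed transition probability
n_days = 5000

wet = np.zeros(n_days, dtype=bool)
for t in range(1, n_days):
    p = P_WET_AFTER_WET if wet[t - 1] else P_WET_AFTER_DRY
    wet[t] = rng.random() < p

amounts = np.where(wet, rng.gamma(shape=0.8, scale=8.0, size=n_days), 0.0)

# Part 1: occurrence (binary). Estimate the transition probabilities back.
prev, curr = wet[:-1], wet[1:]
p_wd = curr[~prev].mean()  # P(wet today | dry yesterday)
p_ww = curr[prev].mean()   # P(wet today | wet yesterday)

# Part 2: amount (continuous), using wet days only.
wet_amounts = amounts[wet]

print(f"Dry-day fraction:       {1 - wet.mean():.2f}")
print(f"P(wet | dry yesterday): {p_wd:.2f}")
print(f"P(wet | wet yesterday): {p_ww:.2f}")
print(f"Mean wet-day amount:    {wet_amounts.mean():.1f} mm")
```

A gap between the two conditional probabilities is what produces clustered wet and dry spells; if they were equal, occurrence would be independent from day to day.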
::: {.callout-tip icon=false}
## 🌍 For Geologists/Hydrologists
**Precipitation → Recharge is not direct:**
**Precipitation** (measured)
↓
**Interception** (trees, buildings: 10-20% loss)
↓
**Infiltration** (soil-dependent, intensity-dependent)
↓
**Runoff vs Percolation** (slope, land use)
↓
**Evapotranspiration** (70% loss in summer, 20% in winter)
↓
**Recharge** (reaches water table)
**Key factors:**
- **Intensity:** High-intensity events → more runoff, less recharge
- **Duration:** Multi-day gentle rain → more recharge than single storm
- **Antecedent moisture:** Dry soil delays recharge by weeks
- **Seasonality:** Cool-season recharge dominates (low ET), especially once frozen ground thaws
Understanding temporal precipitation patterns reveals WHEN recharge occurs, not just HOW MUCH.
:::
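The loss chain above can be turned into a back-of-the-envelope calculation. The sketch below is a deliberate simplification: it treats every loss as a fixed fraction applied in sequence, uses the midpoint of the interception range quoted above, and assumes a 20% runoff fraction that does not come from this chapter. The function name `recharge_mm` is hypothetical.

```python
def recharge_mm(precip_mm, interception=0.15, runoff=0.20, et=0.70):
    """Apply each loss in sequence as a fixed fraction (a deliberate simplification).

    interception: midpoint of the 10-20% range quoted in the text
    runoff:       assumed value, not taken from the chapter
    et:           70% (summer) or 20% (winter), as quoted in the text
    """
    throughfall = precip_mm * (1 - interception)
    infiltrated = throughfall * (1 - runoff)
    return infiltrated * (1 - et)

summer = recharge_mm(100.0, et=0.70)
winter = recharge_mm(100.0, et=0.20)
print(f"Summer: {summer:.1f} mm of 100 mm reaches the water table")
print(f"Winter: {winter:.1f} mm of 100 mm reaches the water table")
```

Under these assumptions the same 100 mm of rain yields roughly 20 mm of recharge in summer and roughly 54 mm in winter, which is why timing matters as much as totals.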
## Data Loading
```{python}
#| code-fold: true
#| code-summary: "Show setup and initialization code"
import os
import sys
from pathlib import Path
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
try:
from scipy import stats
SCIPY_AVAILABLE = True
except ImportError:
SCIPY_AVAILABLE = False
print("Note: scipy not available. Statistical tests will be simplified.")
import warnings
warnings.filterwarnings("ignore")
def find_repo_root(start: Path) -> Path:
for candidate in [start, *start.parents]:
if (candidate / "src").exists():
return candidate
return start
quarto_project = Path(os.environ.get("QUARTO_PROJECT_DIR", str(Path.cwd())))
project_root = find_repo_root(quarto_project)
if str(project_root) not in sys.path:
sys.path.append(str(project_root))
from src.utils import get_data_path
print("Precipitation patterns analysis initialized")
```
### Load Weather Data
```{python}
#| code-fold: true
#| code-summary: "Show weather data loading code"
# Load weather data directly from database
import sqlite3
# Initialize data availability flag
DATA_AVAILABLE = False
precip_daily = pd.DataFrame()
precip_wet = pd.DataFrame()
precip_dry = pd.DataFrame()
try:
weather_db_path = get_data_path("warm_db")
conn = sqlite3.connect(weather_db_path)
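    # Query hourly weather data from the WarmICNData table; daily aggregation happens below.
    # Columns: nDateTime, nPrecipHrly (mm), nAirTemp (C)
    # Note: LIMIT caps the load at 500,000 hourly rows, so the statistics in this
    # chapter describe that subset rather than the full record.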
query = """
SELECT nDateTime as DateTime, nPrecipHrly as Precipitation_mm, nAirTemp as AirTemp_C
FROM WarmICNData
WHERE nPrecipHrly IS NOT NULL
LIMIT 500000
"""
weather_df = pd.read_sql_query(query, conn)
conn.close()
# Parse datetime
weather_df['datetime'] = pd.to_datetime(weather_df['DateTime'], errors='coerce')
weather_df = weather_df.dropna(subset=['datetime'])
# Data is already in mm and Celsius
weather_df['precipitation'] = weather_df['Precipitation_mm']
weather_df['temperature'] = weather_df['AirTemp_C']
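    # Daily aggregation: sum hourly precipitation, average temperature.
    # min_count=1 leaves days with no hourly records as NaN instead of 0, so the
    # dropna below removes data gaps rather than counting them as dry days.
    precip_daily = weather_df.groupby(pd.Grouper(key='datetime', freq='D')).agg(
        precipitation=('precipitation', lambda s: s.sum(min_count=1)),
        temperature=('temperature', 'mean'),
    ).reset_index()
    precip_daily = precip_daily.dropna(subset=['precipitation'])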
if len(precip_daily) > 0:
DATA_AVAILABLE = True
# Pre-calculate wet/dry splits
precip_wet = precip_daily[precip_daily['precipitation'] > 0.1]
precip_dry = precip_daily[precip_daily['precipitation'] <= 0.1]
print(f"Precipitation data:")
print(f" Records: {len(precip_daily):,} days")
print(f" Date range: {precip_daily['datetime'].min()} to {precip_daily['datetime'].max()}")
print(f" Years: {(precip_daily['datetime'].max() - precip_daily['datetime'].min()).days / 365.25:.1f}")
print(f" Total precipitation: {precip_daily['precipitation'].sum():.0f} mm")
print(f" Mean daily: {precip_daily['precipitation'].mean():.2f} mm")
print(f" Wet days (>1mm): {(precip_daily['precipitation'] > 1).sum()} ({(precip_daily['precipitation'] > 1).sum() / len(precip_daily) * 100:.1f}%)")
else:
print("⚠️ No precipitation data found in database")
except Exception as e:
print(f"⚠️ Error loading weather data: {e}")
print("Using empty dataset - visualizations will show placeholder data")
```
## Temporal Distribution Analysis
### Precipitation Occurrence and Amount
::: {.callout-note icon=false}
## Understanding Precipitation Distributions
**What Is It?**
Precipitation data follow a "zero-inflated" distribution: most days have no rain (70-80% of days), and on the days when it does rain, amounts are strongly right-skewed. The gamma distribution is commonly used to model positive-only continuous data like rainfall amounts.
**Why Does It Matter?**
Understanding the distribution shape tells us:
- How often recharge events occur (frequency)
- How intense typical events are (mean precipitation on wet days)
- How important extreme events are (tail behavior)
**How Does It Work?**
The gamma distribution has two parameters:
- **Shape (α)**: Controls how skewed the distribution is (higher = more symmetric)
- **Scale (β)**: Controls the spread (higher = larger typical values)
**What Will You See?**
The visualization shows two panels:
1. **Dry vs Wet Days**: A bar chart showing the proportion of days with/without measurable rain
2. **Amount Distribution**: A histogram of precipitation amounts on wet days, with a fitted gamma curve overlaid in red
**How to Interpret:**
- Most days should be dry (>70% typical for temperate climates)
- The gamma curve should roughly match the histogram shape
- A long right tail indicates that a small number of large events deliver a disproportionate share of the water
:::
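As a rough illustration of the two gamma parameters and of how unevenly water is distributed across wet days, the sketch below fits a gamma distribution to synthetic wet-day amounts. The data and the parameter values used to generate them are assumptions for the demo, not results from this chapter's record. The location is fixed at zero (`floc=0`) so the fit has exactly the shape and scale parameters described above.

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(0)

# Synthetic wet-day amounts in mm (illustrative; generating parameters are assumptions).
wet_amounts = rng.gamma(shape=0.8, scale=8.0, size=3000)

# Fix the location at zero so only shape (alpha) and scale (beta) are estimated.
alpha, loc, beta = gamma.fit(wet_amounts, floc=0)
print(f"shape = {alpha:.2f}, scale = {beta:.2f}, mean = shape * scale = {alpha * beta:.2f} mm")

# Share of total water delivered by the wettest 10% of wet days.
threshold = np.quantile(wet_amounts, 0.90)
top_share = wet_amounts[wet_amounts >= threshold].sum() / wet_amounts.sum()
print(f"Top 10% of wet days deliver {top_share:.0%} of the water")
```

The product of shape and scale is the mean wet-day amount, which gives a quick sanity check on any fitted parameters.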
```{python}
#| code-fold: true
#| code-summary: "Show precipitation distribution analysis code"
#| label: fig-precip-distribution
#| fig-cap: "Distribution of precipitation showing zero-inflation and heavy tail"
# Check if data is available
if len(precip_daily) == 0:
print("⚠️ No precipitation data available for visualization")
print("Showing placeholder figure")
fig = go.Figure()
fig.add_annotation(
text="No precipitation data available",
xref="paper", yref="paper",
x=0.5, y=0.5, showarrow=False,
font=dict(size=16)
)
fig.update_layout(height=400, template='plotly_white')
fig.show()
else:
# Separate zero and non-zero precipitation
precip_wet = precip_daily[precip_daily['precipitation'] > 0.1]
precip_dry = precip_daily[precip_daily['precipitation'] <= 0.1]
dry_pct = len(precip_dry)/len(precip_daily)*100 if len(precip_daily) > 0 else 0
fig = make_subplots(
rows=1, cols=2,
subplot_titles=(
f'Occurrence: {len(precip_dry)} dry ({dry_pct:.1f}%), {len(precip_wet)} wet days',
'Amount Distribution (Wet Days Only)'
),
horizontal_spacing=0.12
)
# Panel A: Dry vs wet days bar chart (instead of pie for subplot compatibility)
fig.add_trace(
go.Bar(
x=['Dry (≤0.1mm)', 'Wet (>0.1mm)'],
y=[len(precip_dry), len(precip_wet)],
marker=dict(color=['#f39c12', '#3498db']),
text=[f'{dry_pct:.1f}%', f'{100-dry_pct:.1f}%'],
textposition='auto',
hovertemplate='%{x}<br>%{y} days<extra></extra>'
),
row=1, col=1
)
# Panel B: Amount distribution (histogram + fit)
if len(precip_wet) > 0:
hist_vals, bin_edges = np.histogram(precip_wet['precipitation'], bins=50)
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
fig.add_trace(
go.Bar(
x=bin_centers,
y=hist_vals,
marker=dict(color='steelblue', line=dict(color='black', width=0.5)),
name='Observed',
hovertemplate='%{x:.1f} mm<br>Count: %{y}<extra></extra>'
),
row=1, col=2
)
# Fit gamma distribution
shape, scale = 0, 0
if SCIPY_AVAILABLE:
from scipy.stats import gamma
            # Fix location at zero so the fit matches the two-parameter (shape, scale) gamma
            shape, loc, scale = gamma.fit(precip_wet['precipitation'], floc=0)
x_fit = np.linspace(0, precip_wet['precipitation'].max(), 200)
gamma_pdf = gamma.pdf(x_fit, shape, loc, scale) * len(precip_wet) * (bin_edges[1] - bin_edges[0])
fig.add_trace(
go.Scatter(
x=x_fit,
y=gamma_pdf,
mode='lines',
line=dict(color='red', width=3),
name='Gamma fit',
                    hovertemplate='%{x:.1f} mm<br>Expected count: %{y:.1f}<extra></extra>'
),
row=1, col=2
)
else:
print("Note: Gamma distribution fit skipped - scipy not available")
# Update axes
fig.update_xaxes(title_text='Daily Precipitation (mm)', row=1, col=2)
fig.update_yaxes(title_text='Frequency', row=1, col=2)
fig.update_layout(
height=500,
showlegend=True,
template='plotly_white',
hovermode='closest'
)
fig.show()
if len(precip_wet) > 0:
print(f"\nDistribution statistics (wet days only):")
print(f" Mean: {precip_wet['precipitation'].mean():.2f} mm")
print(f" Median: {precip_wet['precipitation'].median():.2f} mm")
print(f" 90th percentile: {precip_wet['precipitation'].quantile(0.90):.2f} mm")
print(f" 99th percentile: {precip_wet['precipitation'].quantile(0.99):.2f} mm")
print(f" Maximum: {precip_wet['precipitation'].max():.2f} mm")
if SCIPY_AVAILABLE and shape > 0:
print(f"\nGamma distribution parameters:")
print(f" Shape: {shape:.3f}")
print(f" Scale: {scale:.3f}")
```
## Seasonal Patterns
### Monthly Climatology
::: {.callout-note icon=false}
## Understanding Monthly Climatology
**What Is It?**
Monthly climatology is the long-term average pattern of precipitation across the 12 months of the year, calculated by averaging all January values, all February values, etc., across multiple years. This reveals the "typical" seasonal cycle independent of year-to-year variability.
**Why Does It Matter?**
Seasonal patterns are critical for groundwater recharge because:
- **Recharge efficiency varies by season**: Winter precipitation (low evapotranspiration) contributes more to groundwater than summer precipitation
- **Planning operations**: Agricultural pumping, managed recharge, and water allocation must align with wet/dry seasons
- **Drought risk**: Dry seasons show when the aquifer receives minimal input and depends on storage
- **Infrastructure sizing**: Drainage and storage systems must handle peak seasonal flows
**How Does It Work?**
The analysis:
1. Groups all daily data by calendar month (all Januaries together, etc.)
2. Calculates mean, standard deviation, min, and max for each month
3. Plots the 12-month cycle showing typical seasonal progression
**What Will You See?**
The visualization shows:
- **Blue bars**: Mean monthly precipitation with error bars (±1 standard deviation)
- **Red triangles**: Maximum monthly total ever observed
- **Blue triangles**: Minimum monthly total ever observed
- **Seasonal cycle**: Which months are typically wet vs. dry
**How to Interpret:**
| Season | Pattern | Recharge Implication |
|--------|---------|---------------------|
| **Spring (Mar-May)** | Peak precipitation | **Optimal recharge window** - high precip + low ET = maximum infiltration |
| **Summer (Jun-Aug)** | Moderate-high precip | **Low recharge efficiency** - high ET (70% loss), most water evaporates |
| **Fall (Sep-Nov)** | Declining precip | **Moderate recharge** - cooling temps reduce ET losses |
| **Winter (Dec-Feb)** | Low precip | **Variable recharge** - frozen ground blocks infiltration but low ET when liquid |
**Key Insight:** Total annual precipitation ≠ recharge. Spring rains (low ET) contribute far more to groundwater than equivalent summer rains (high ET). Management must target seasonal recharge windows for maximum efficiency.
:::
```{python}
#| code-fold: true
#| code-summary: "Show seasonal climatology analysis code"
#| label: fig-seasonal-precip
#| fig-cap: "Seasonal precipitation patterns showing spring maximum"
# Initialize seasonal variables for later use
spring_precip = summer_precip = fall_precip = winter_precip = 0.0
monthly_clim = pd.DataFrame()
if not DATA_AVAILABLE:
print("⚠️ No precipitation data available for seasonal analysis")
fig = go.Figure()
fig.add_annotation(text="No precipitation data available", xref="paper", yref="paper",
x=0.5, y=0.5, showarrow=False, font=dict(size=16))
fig.update_layout(height=400, template='plotly_white')
fig.show()
else:
# Extract month and year
precip_daily['month'] = precip_daily['datetime'].dt.month
precip_daily['year'] = precip_daily['datetime'].dt.year
# Monthly totals
monthly_totals = precip_daily.groupby(['year', 'month'])['precipitation'].sum().reset_index()
# Climatology (average by month)
monthly_clim = monthly_totals.groupby('month').agg({
'precipitation': ['mean', 'std', 'min', 'max']
}).reset_index()
monthly_clim.columns = ['month', 'mean', 'std', 'min', 'max']
month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
fig = go.Figure()
# Mean monthly precipitation
fig.add_trace(
go.Bar(
            x=[month_names[m - 1] for m in monthly_clim['month']],  # stay aligned if a month is missing
y=monthly_clim['mean'],
marker=dict(color='steelblue', line=dict(color='black', width=1)),
name='Mean',
error_y=dict(
type='data',
array=monthly_clim['std'],
visible=True,
color='black'
),
hovertemplate='%{x}<br>Mean: %{y:.1f} mm<extra></extra>'
)
)
# Add min-max range as scatter
fig.add_trace(
go.Scatter(
            x=[month_names[m - 1] for m in monthly_clim['month']],  # stay aligned if a month is missing
y=monthly_clim['max'],
mode='markers',
marker=dict(color='red', size=8, symbol='triangle-up'),
name='Maximum',
hovertemplate='%{x}<br>Max: %{y:.1f} mm<extra></extra>'
)
)
fig.add_trace(
go.Scatter(
            x=[month_names[m - 1] for m in monthly_clim['month']],  # stay aligned if a month is missing
y=monthly_clim['min'],
mode='markers',
marker=dict(color='blue', size=8, symbol='triangle-down'),
name='Minimum',
hovertemplate='%{x}<br>Min: %{y:.1f} mm<extra></extra>'
)
)
fig.update_layout(
title='Monthly Precipitation Climatology',
xaxis=dict(title='Month'),
yaxis=dict(title='Monthly Total (mm)'),
height=500,
showlegend=True,
template='plotly_white',
hovermode='x unified'
)
fig.show()
# Identify wet and dry seasons
spring_precip = monthly_clim[monthly_clim['month'].isin([3, 4, 5])]['mean'].sum()
summer_precip = monthly_clim[monthly_clim['month'].isin([6, 7, 8])]['mean'].sum()
fall_precip = monthly_clim[monthly_clim['month'].isin([9, 10, 11])]['mean'].sum()
winter_precip = monthly_clim[monthly_clim['month'].isin([12, 1, 2])]['mean'].sum()
total_precip = spring_precip + summer_precip + fall_precip + winter_precip
if total_precip > 0:
print(f"\nSeasonal precipitation totals:")
print(f" Spring (Mar-May): {spring_precip:.1f} mm ({spring_precip/total_precip*100:.1f}%)")
print(f" Summer (Jun-Aug): {summer_precip:.1f} mm ({summer_precip/total_precip*100:.1f}%)")
print(f" Fall (Sep-Nov): {fall_precip:.1f} mm ({fall_precip/total_precip*100:.1f}%)")
print(f" Winter (Dec-Feb): {winter_precip:.1f} mm ({winter_precip/total_precip*100:.1f}%)")
```
## Long-Term Trends
### Annual Precipitation Trends
::: {.callout-note icon=false}
## Understanding Linear Trend Analysis
**What Is It?**
Linear regression fits a straight line through annual precipitation data to detect long-term trends. The slope tells you if precipitation is increasing, decreasing, or staying constant over time.
**Why Does It Matter?**
Climate change may alter precipitation patterns, affecting:
- Groundwater recharge rates (less rain = less recharge)
- Drought frequency and severity
- Infrastructure design (drainage, storage capacity)
- Long-term water availability
**How Does It Work?**
The analysis calculates:
- **Slope**: Rate of change (mm/year) - positive = increasing, negative = decreasing
- **R²**: How well the line fits (0-1, higher = better fit)
- **P-value**: Statistical significance (p < 0.05 means a slope this large would be unlikely if there were no underlying trend)
**What Will You See?**
A bar chart showing annual totals with:
- Blue bars: Annual precipitation amounts (color intensity shows magnitude)
- Red dashed line: Trend line showing long-term direction
- Green dotted line: Long-term average (baseline for comparison)
**How to Interpret:**
| Slope | P-value | Interpretation |
|-------|---------|----------------|
| Positive | < 0.05 | **Significant increase** - Precipitation rising over time |
| Negative | < 0.05 | **Significant decrease** - Precipitation declining over time |
| Any | ≥ 0.05 | **No significant trend** - The slope cannot be distinguished from year-to-year variability |
:::
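The slope, R², and p-value can be read off a small synthetic example. The sketch below builds annual totals with a known trend plus noise (both values are assumptions for the demo, not properties of this region) and adds a 95% confidence interval for the slope, which carries the same information as the p-value in a more readable form.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic annual totals in mm (illustrative): a built-in trend of +3 mm/yr
# plus year-to-year noise. Both values are assumptions for the demo.
years = np.arange(1990, 2020)
totals = 900 + 3.0 * (years - years[0]) + rng.normal(0, 40, size=years.size)

res = stats.linregress(years, totals)
print(f"slope = {res.slope:+.2f} mm/yr, R2 = {res.rvalue**2:.2f}, p = {res.pvalue:.4f}")

# 95% confidence interval for the slope from its standard error.
t_crit = stats.t.ppf(0.975, df=years.size - 2)
ci_low = res.slope - t_crit * res.stderr
ci_high = res.slope + t_crit * res.stderr
print(f"95% CI for slope: [{ci_low:+.2f}, {ci_high:+.2f}] mm/yr")
```

If the interval excludes zero, the two-sided p-value is below 0.05, and vice versa. Partial first or last years should be dropped before fitting a trend on real data, since an incomplete annual total can pull the line up or down.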
```{python}
#| code-fold: true
#| code-summary: "Show annual trend analysis code"
#| label: fig-annual-trend
#| fig-cap: "Long-term precipitation trend analysis"
# Initialize trend variables for later use
slope = intercept = r_value = p_value = std_err = 0.0
mean_annual = projected_change = 0.0
if not DATA_AVAILABLE or 'year' not in precip_daily.columns:
print("⚠️ No precipitation data available for trend analysis")
fig = go.Figure()
fig.add_annotation(text="No precipitation data available", xref="paper", yref="paper",
x=0.5, y=0.5, showarrow=False, font=dict(size=16))
fig.update_layout(height=400, template='plotly_white')
fig.show()
else:
# Annual totals
annual_totals = precip_daily.groupby('year')['precipitation'].sum().reset_index()
if len(annual_totals) >= 2 and SCIPY_AVAILABLE:
# Linear trend
slope, intercept, r_value, p_value, std_err = stats.linregress(
annual_totals['year'],
annual_totals['precipitation']
)
trend_line = intercept + slope * annual_totals['year']
fig = go.Figure()
# Annual totals
fig.add_trace(
go.Bar(
x=annual_totals['year'],
y=annual_totals['precipitation'],
marker=dict(
color=annual_totals['precipitation'],
colorscale='Blues',
showscale=True,
colorbar=dict(title='Annual Total (mm)'),
line=dict(color='black', width=0.5)
),
name='Annual Total',
hovertemplate='%{x}<br>%{y:.0f} mm<extra></extra>'
)
)
# Trend line
fig.add_trace(
go.Scatter(
x=annual_totals['year'],
y=trend_line,
mode='lines',
line=dict(color='red', width=3, dash='dash'),
name=f'Trend ({slope:.2f} mm/yr, p={p_value:.3f})',
hovertemplate='%{x}<br>Trend: %{y:.0f} mm<extra></extra>'
)
)
# Add mean line
mean_annual = annual_totals['precipitation'].mean()
fig.add_hline(
y=mean_annual,
line=dict(color='green', width=2, dash='dot'),
annotation_text=f'Mean = {mean_annual:.0f} mm',
annotation_position='right'
)
fig.update_layout(
title='Annual Precipitation: Long-Term Trends',
xaxis=dict(title='Year'),
yaxis=dict(title='Annual Total Precipitation (mm)'),
height=500,
showlegend=True,
template='plotly_white',
hovermode='x unified'
)
fig.show()
print(f"\nTrend analysis:")
print(f" Slope: {slope:+.3f} mm/year")
print(f" R²: {r_value**2:.4f}")
print(f" P-value: {p_value:.4f}")
if p_value < 0.05:
if slope > 0:
print(f" → SIGNIFICANT INCREASE (p < 0.05)")
else:
print(f" → SIGNIFICANT DECREASE (p < 0.05)")
else:
print(f" → No significant trend (p ≥ 0.05)")
# Projected change
projection_years = 25
projected_change = slope * projection_years
print(f"\nProjected change (next {projection_years} years): {projected_change:+.1f} mm")
print(f" Current mean: {mean_annual:.0f} mm/year")
        print(f"  Projected mean (after {projection_years} years): {mean_annual + projected_change:.0f} mm/year")
else:
        print("⚠️ Trend analysis skipped (needs at least 2 years of data and scipy)")
fig = go.Figure()
fig.add_annotation(text="Insufficient data for trend analysis", xref="paper", yref="paper",
x=0.5, y=0.5, showarrow=False, font=dict(size=16))
fig.update_layout(height=400, template='plotly_white')
fig.show()
```
## Dry Spell Analysis
### Duration and Frequency of Dry Spells
::: {.callout-warning icon=false}
## ⚠️ Dry Spells ≠ Droughts
This chapter uses "dry spell" to mean **consecutive days with <1mm precipitation** (a meteorological measure).
The [Extreme Event Analysis](extreme-event-analysis.qmd) chapter uses "drought" to mean **water levels below 25th percentile** (a hydrological measure).
**These are different phenomena:**
- A 14-day dry spell (no rain) might NOT cause a drought if the aquifer has long memory
- A drought (low water levels) might persist even after dry spell ends due to slow recovery
**The lag between them reveals aquifer resilience:**
- Short lag (days): Shallow, unconfined - vulnerable to short dry spells
- Long lag (months): Deep, confined - can buffer extended dry periods
:::
::: {.callout-note icon=false}
## Understanding Dry Spell Analysis
**What Is It?**
A **dry spell** is a consecutive sequence of days with minimal precipitation (typically <1mm/day). Dry spell analysis quantifies how often these rainless periods occur, how long they last, and how severe they become. This statistical approach to drought assessment was developed in agricultural climatology in the 1960s-1970s.
**Why Does It Matter?**
Dry spells directly impact groundwater because:
- **Recharge cessation**: No rain = no infiltration = declining water tables
- **Drought propagation**: Extended dry spells (>30 days) propagate from meteorological → agricultural → hydrological drought
- **Recovery time**: Aquifer memory means recovery takes longer than the dry spell itself
- **Risk assessment**: Knowing typical dry spell durations informs drought preparedness and water restrictions
**How Does It Work?**
The analysis:
1. **Define threshold**: Days with <1mm precipitation count as "dry"
2. **Identify runs**: Group consecutive dry days into events
3. **Calculate statistics**: Duration, frequency, severity for each event
4. **Distribution analysis**: Fit probability models to predict rare extremes (e.g., 90th percentile dry spell length)
**What Will You See?**
Two panels:
1. **Histogram**: Distribution of dry spell durations (most are short, few are very long)
2. **ECDF (Cumulative Distribution)**: Shows the fraction of dry spells that last a given duration or less; one minus this value is the probability of exceeding that duration
- 50th percentile = typical dry spell
- 90th percentile = unusually long dry spell
- 99th percentile = extreme dry spell (near-drought)
**How to Interpret:**
| Dry Spell Duration | Frequency | Impact on Aquifer | Management Response |
|-------------------|-----------|-------------------|---------------------|
| **<7 days** | Very common (70-80% of events) | Minimal - aquifer storage buffers | Normal operations |
| **7-14 days** | Common (15-20%) | Slight decline in shallow wells | Monitor soil moisture |
| **14-30 days** | Uncommon (5-10%) | Noticeable water table decline | Voluntary conservation |
| **30-60 days** | Rare (1-3%) | Significant stress, baseflow drops | Mandatory restrictions |
| **>60 days** | Very rare (<1%) | **Drought conditions** - well failures possible | Emergency response |
**Key Metrics:**
- **Mean duration**: Typical dry spell length (usually 3-5 days for temperate climates)
- **90th percentile**: Planning criterion for drought preparedness
- **Maximum observed**: Worst-case historical event
- **Frequency >30 days**: Share of dry spells lasting longer than 30 days, a proxy for moderate drought risk
**Physical Interpretation:**
Long dry spells are more damaging than their duration suggests because:
1. **Soil moisture depletion**: Takes weeks to refill before recharge resumes
2. **ET demands continue**: Vegetation and evaporation keep pulling from groundwater
3. **Cumulative deficit**: Recharge debt accumulates, requiring multiple storms to recover
**Key Insight:** The tail of the distribution (rare long dry spells) matters more than the mean. A single 60-day dry spell can deplete months of recharge, while 12 five-day dry spells have minimal cumulative impact.
:::
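The "identify runs" step can also be written without an explicit loop. The sketch below is one possible vectorized approach; the helper name `dry_spell_lengths` is hypothetical, and the sample values are made up for illustration. It assumes the input has one value per consecutive calendar day, so gaps in the record should be handled before calling it.

```python
import numpy as np

def dry_spell_lengths(precip_mm, threshold_mm=1.0):
    """Return the length of every run of consecutive days below threshold_mm."""
    is_dry = np.asarray(precip_mm) < threshold_mm
    # Pad with False so runs touching either end of the record are closed.
    padded = np.concatenate(([False], is_dry, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)   # first index of each dry run
    ends = np.flatnonzero(edges == -1)    # one past the last index of each run
    return ends - starts

daily = [0.0, 0.2, 5.0, 0.0, 0.0, 0.0, 12.0, 3.0, 0.5]
print(dry_spell_lengths(daily))  # [2 3 1]
```

The padding trick means a dry spell that is still running on the last day of the record is counted, matching the "add final spell" step in the loop-based version.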
::: {.callout-tip icon=false}
## What Will You See?
The dry spell analysis produces a **two-panel visualization** showing the statistical distribution of consecutive rainless periods:
**Panel A: Histogram (Right-Skewed Distribution)**
| Visual Element | What It Shows | How to Interpret |
|----------------|---------------|------------------|
| **Shape** | Right-skewed distribution | **Most dry spells are short** (3-7 days), but long tail extends to rare extremes (60+ days) |
| **Peak** | Mode at 2-5 days | Typical break between rain events is just a few days |
| **Tail** | Long right tail | Rare extreme dry spells (>30 days) occur occasionally but are important |
| **Frequency** | Count of events | How often each duration occurs in the historical record |
**Panel B: ECDF Curve (Cumulative Probability)**
| Visual Element | What It Shows | How to Interpret |
|----------------|---------------|------------------|
| **ECDF Line** | Empirical cumulative distribution function | Curve rising from near 0% to 100%: steep at short durations, flattening in the long tail |
| **50th Percentile** | Median dry spell | Vertical line: **Typical duration** (50% shorter, 50% longer) |
| **90th Percentile** | Unusually long dry spell | Vertical line: **Planning threshold** - only 10% of dry spells exceed this |
| **99th Percentile** | Extreme dry spell | Vertical line: **Near-drought conditions** - rare but critical events |
| **Steep slope** | Rapid probability change | Most dry spells cluster in narrow range (high predictability) |
| **Flat tail** | Slow probability change | Extreme events are rare but variable (low predictability) |
**Interpreting Percentile Values for Drought Management:**
| Percentile | Typical Duration | Probability | Management Implication |
|------------|-----------------|-------------|------------------------|
| **50th** | 3-5 days | 50% of dry spells | Normal operations - aquifer buffers easily |
| **90th** | 15-30 days | 10% of dry spells | **Drought watch** - monitor soil moisture and shallow wells |
| **99th** | 45-90 days | 1% of dry spells | **Drought emergency** - mandatory restrictions, well failures possible |
**How to Read the Two Panels Together:**
1. **Histogram shows frequency**: How common each duration is in absolute terms
2. **ECDF shows cumulative probability**: What fraction of events are shorter/longer than a threshold
3. **Percentile markers connect them**: 90th percentile on ECDF = "only 10% of histogram is to the right of this line"
**Example Interpretation:**
- Histogram peak at 4 days → Most common dry spell length
- 90th percentile at 25 days → Only 1 in 10 dry spells exceeds 25 days
- Maximum at 90 days → Longest dry spell on record lasted 90 consecutive days
**Critical Insight for Recharge:** Dry spell duration determines **soil moisture depletion**. A 5-day dry spell has minimal impact (soil stays wet, next rain infiltrates immediately). A 30-day dry spell creates a "moisture debt"—soil must rewet before recharge resumes, delaying aquifer response by 1-3 weeks after rain returns.
:::
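To make the link between the ECDF and exceedance probability concrete, the sketch below evaluates both on a small set of hypothetical dry spell durations. The values are invented for illustration and the helper name `non_exceedance` is an assumption.

```python
import numpy as np

# Hypothetical dry spell durations in days (illustrative values only).
durations = np.array([1, 1, 2, 2, 3, 3, 4, 5, 7, 9, 12, 18, 25, 40])

def non_exceedance(durations, d):
    """ECDF value: fraction of spells lasting d days or fewer."""
    return np.mean(np.asarray(durations) <= d)

for d in (7, 30):
    p = non_exceedance(durations, d)
    print(f"P(duration <= {d:>2} d) = {p:.2f}   P(duration > {d:>2} d) = {1 - p:.2f}")

# Percentile lookup is the inverse question: which duration bounds 90% of spells?
p90 = np.percentile(durations, 90)
print(f"90th percentile: {p90:.1f} days")
```

Reading the ECDF at a duration gives the share of spells at or below it; reading it at a percentage gives the duration that bounds that share of spells.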
```{python}
#| code-fold: true
#| code-summary: "Show dry spell analysis code"
#| label: fig-dry-spells
#| fig-cap: "Distribution of dry spell durations reveals clustering of drought conditions"
# Initialize dry_spells for later use
dry_spells = np.array([0])
if not DATA_AVAILABLE:
print("⚠️ No precipitation data available for dry spell analysis")
fig = go.Figure()
fig.add_annotation(text="No precipitation data available", xref="paper", yref="paper",
x=0.5, y=0.5, showarrow=False, font=dict(size=16))
fig.update_layout(height=400, template='plotly_white')
fig.show()
else:
# Identify dry spells (consecutive days with <1mm precipitation)
precip_daily['is_dry'] = precip_daily['precipitation'] < 1.0
# Find runs of dry days
dry_spell_list = []
current_spell = 0
for is_dry in precip_daily['is_dry']:
if is_dry:
current_spell += 1
else:
if current_spell > 0:
dry_spell_list.append(current_spell)
current_spell = 0
# Add final spell if ended on dry day
if current_spell > 0:
dry_spell_list.append(current_spell)
dry_spells = np.array(dry_spell_list) if dry_spell_list else np.array([0])
if len(dry_spells) > 1:
# Statistics
fig = make_subplots(
rows=1, cols=2,
subplot_titles=('Dry Spell Duration Distribution', 'Cumulative Distribution'),
horizontal_spacing=0.12
)
# Panel A: Histogram
fig.add_trace(
go.Histogram(
x=dry_spells,
nbinsx=50,
marker=dict(color='#e74c3c', line=dict(color='black', width=0.5)),
name='Dry Spells',
hovertemplate='Duration: %{x} days<br>Count: %{y}<extra></extra>'
),
row=1, col=1
)
# Panel B: ECDF
sorted_spells = np.sort(dry_spells)
ecdf = np.arange(1, len(sorted_spells)+1) / len(sorted_spells)
fig.add_trace(
go.Scatter(
x=sorted_spells,
y=ecdf * 100,
mode='lines',
line=dict(color='#e74c3c', width=2),
name='ECDF',
hovertemplate='Duration: %{x} days<br>Cumulative: %{y:.1f}%<extra></extra>'
),
row=1, col=2
)
# Mark percentiles
for pct in [50, 90, 99]:
threshold = np.percentile(dry_spells, pct)
fig.add_vline(
x=threshold,
line=dict(color='gray', dash='dash'),
annotation_text=f'{pct}th: {threshold:.0f}d',
row=1, col=2
)
fig.update_xaxes(title_text='Dry Spell Duration (days)', row=1, col=1)
fig.update_yaxes(title_text='Frequency', row=1, col=1)
fig.update_xaxes(title_text='Dry Spell Duration (days)', row=1, col=2)
fig.update_yaxes(title_text='Cumulative Probability (%)', row=1, col=2)
fig.update_layout(
height=500,
showlegend=False,
template='plotly_white'
)
fig.show()
print(f"\nDry spell statistics:")
print(f" Total dry spells: {len(dry_spells)}")
print(f" Mean duration: {dry_spells.mean():.1f} days")
print(f" Median duration: {np.median(dry_spells):.1f} days")
print(f" 90th percentile: {np.percentile(dry_spells, 90):.0f} days")
print(f" Maximum: {dry_spells.max()} days")
print(f" Spells >30 days: {(dry_spells > 30).sum()} ({(dry_spells > 30).sum()/len(dry_spells)*100:.1f}%)")
print(f" Spells >60 days: {(dry_spells > 60).sum()} ({(dry_spells > 60).sum()/len(dry_spells)*100:.1f}%)")
else:
print("⚠️ Insufficient data for dry spell analysis")
fig = go.Figure()
fig.add_annotation(text="Insufficient data for dry spell analysis", xref="paper", yref="paper",
x=0.5, y=0.5, showarrow=False, font=dict(size=16))
fig.update_layout(height=400, template='plotly_white')
fig.show()
```
## Key Findings
::: {.callout-important icon=false}
## Interpretation Framework: Connecting Patterns to Physical Meaning
**What Do These Numbers Mean for Groundwater?**
The statistics below aren't just numbers—they reveal how the climate-aquifer system actually works. This interpretation table connects each finding to its physical meaning and management implication.
| Finding Category | Statistical Result | Physical Meaning | Management Action |
|-----------------|-------------------|------------------|-------------------|
| **Occurrence** | 75% dry days, 25% wet days | **Recharge is episodic** - water table rises in pulses, not continuously | Design monitoring to capture event responses, not just monthly averages |
| **Amount Distribution** | Gamma distribution (shape 2-3) | **Heavy tail** - top 10% of storms deliver 40-50% of total water | Infrastructure must handle extreme events, not just mean rainfall |
| **Seasonal Pattern** | Spring peak (35-40% of annual) | **Recharge window is narrow** - most infiltration occurs Mar-May when ET is low | Target managed recharge operations to spring; summer rain is largely lost to ET |
| **Trend** | -2 to +2 mm/yr (site-specific) | **Non-stationary climate** - historical averages may not predict future conditions | Update design standards every 10 years; plan for changing conditions |
| **Dry Spells** | 90th percentile = 20-40 days | **Drought timescale** - system can buffer 2-4 week gaps, beyond that stress begins | Trigger drought restrictions at 30-day threshold; recovery takes 2-3× longer |
**Critical Insight for This Aquifer:**
Three features stand out:
- Zero-inflation (75% dry days)
- Seasonal concentration (spring dominates)
- Moderate dry spell duration (mean ~5 days, tail to 60+ days)
Together they reveal a **storage-dependent system**: the aquifer must buffer frequent dry periods using storage built up during infrequent wet periods. This makes it:
- ✅ **Resilient to short droughts** (days to weeks) - high storage capacity
- ⚠️ **Vulnerable to extended droughts** (>30 days) - storage depletes faster than refill
- ⚠️ **Sensitive to seasonal timing** - missing spring recharge creates year-long deficit
**Management Priority:** Protect spring recharge opportunities. A dry spring cannot be offset by a wet summer, because summer ET losses are too high.
:::
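
The claim that a small share of wet days delivers a large share of the water can be checked directly from wet-day totals. The sketch below shows one way to compute it. The helper `top_share` and the sample values are hypothetical and not taken from the station record.

```python
def top_share(wet_day_amounts_mm, top_fraction=0.10):
    """Fraction of total precipitation delivered by the largest
    `top_fraction` of wet days (at least one day is always included)."""
    amounts = sorted(wet_day_amounts_mm, reverse=True)
    total = sum(amounts)
    if not amounts or total <= 0:
        return 0.0
    n_top = max(1, round(len(amounts) * top_fraction))
    return sum(amounts[:n_top]) / total

# Hypothetical wet-day totals in mm, for illustration only
example_mm = [1, 1, 2, 2, 3, 3, 4, 5, 9, 30]
share = top_share(example_mm)
print(f"Top 10% of wet days deliver {share:.0%} of the total")
```

Applied to the chapter's data, the input would be the wet-day precipitation column (for example `precip_wet['precipitation']`), and the result can be compared against the 40-50% range quoted in the table.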
```{python}
#| code-fold: true
#| code-summary: "Show findings summary code"
if not DATA_AVAILABLE:
    print("⚠️ No data available for summary")
else:
    # Occurrence statistics; max(..., 1) avoids division by zero on an empty record
    wet_count = len(precip_wet)
    dry_count = len(precip_dry)
    total_count = max(len(precip_daily), 1)
    wet_pct = wet_count / total_count * 100
    dry_pct = dry_count / total_count * 100
    # Wet-day amount statistics, defaulting to 0 when there are no wet days
    wet_mean = precip_wet['precipitation'].mean() if wet_count > 0 else 0
    wet_90th = precip_wet['precipitation'].quantile(0.90) if wet_count > 0 else 0
    wet_99th = precip_wet['precipitation'].quantile(0.99) if wet_count > 0 else 0
    # Gamma parameters from the earlier fit, if that cell defined them
    shape_val = shape if 'shape' in dir() and shape > 0 else 0
    scale_val = scale if 'scale' in dir() and scale > 0 else 0
    # Seasonal totals; report "Unknown" if every season is zero
    seasons = [("Spring", spring_precip), ("Summer", summer_precip),
               ("Fall", fall_precip), ("Winter", winter_precip)]
    wettest = max(seasons, key=lambda x: x[1])[0] if max(s[1] for s in seasons) > 0 else "Unknown"
    # Dry-spell statistics, defaulting to 0 when no spells were found
    has_spells = len(dry_spells) > 0
    spell_mean = dry_spells.mean() if has_spells else 0
    spell_90th = np.percentile(dry_spells, 90) if has_spells else 0
    spell_max = dry_spells.max() if has_spells else 0
    findings = f"""
PRECIPITATION TEMPORAL PATTERNS - SUMMARY
{'='*70}
1. OCCURRENCE STATISTICS:
   • Wet days (>0.1mm): {wet_count} ({wet_pct:.1f}%)
   • Dry days: {dry_count} ({dry_pct:.1f}%)
   • Zero-inflation: {dry_pct:.0f}% of days have negligible precip
2. AMOUNT DISTRIBUTION:
   • Mean (wet days): {wet_mean:.1f} mm
   • 90th percentile: {wet_90th:.1f} mm
   • 99th percentile: {wet_99th:.1f} mm
   • Distribution: Gamma (shape={shape_val:.2f}, scale={scale_val:.2f})
3. SEASONAL PATTERNS:
   • Wettest season: {wettest}
   • Spring precipitation: {spring_precip:.0f} mm
   • Dry season variability: High (CV > 50%)
4. LONG-TERM TRENDS:
   • Annual trend: {slope:+.2f} mm/year (p={p_value:.3f})
   • Trend significance: {'YES (p<0.05)' if p_value < 0.05 else 'NO (p≥0.05)'}
   • Projected 25-year change: {projected_change:+.0f} mm
5. DRY SPELLS:
   • Mean duration: {spell_mean:.1f} days
   • 90th percentile: {spell_90th:.0f} days
   • Extended droughts (>30d): {(dry_spells > 30).sum()} events
   • Longest dry spell: {spell_max} days
{'='*70}
"""
    print(findings)
```
## Implications for Groundwater Recharge
### Recharge Windows
```{python}
#| code-fold: true
#| code-summary: "Show recharge windows analysis"
if DATA_AVAILABLE:
    total_precip = spring_precip + summer_precip + fall_precip + winter_precip
    if total_precip > 0:
        # Spring share of the annual total
        spring_pct = spring_precip / total_precip * 100
        print("**High-recharge periods:**")
        print(f"• Spring (Mar-May): {spring_pct:.0f}% of annual precipitation")
        print("• Low ET, saturated soils → high infiltration efficiency")
        print("• Optimal for managed aquifer recharge operations")
        print()
        print("**Low-recharge periods:**")
        print("• Summer (Jun-Aug): High ET (70% loss), low net recharge despite storms")
        print("• Fall-Winter: Frozen ground reduces infiltration")
    else:
        print("⚠️ Insufficient seasonal data for recharge window analysis")
else:
    print("⚠️ No data available for recharge window analysis")
```
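
The recharge-window argument is essentially arithmetic: the same storm total leaves different amounts of water once ET is taken out. The sketch below illustrates this under stated assumptions. The 70% summer loss comes from the output above, while the spring loss fraction, the storm total, and the helper `net_after_et` are hypothetical. Runoff and frozen-ground effects are ignored.

```python
def net_after_et(precip_mm, et_loss_fraction):
    """Water remaining after ET, as a simple proportional loss."""
    if not 0.0 <= et_loss_fraction <= 1.0:
        raise ValueError("et_loss_fraction must be between 0 and 1")
    return precip_mm * (1.0 - et_loss_fraction)

SUMMER_ET_LOSS = 0.70  # figure quoted in this chapter
SPRING_ET_LOSS = 0.30  # hypothetical placeholder, not from the data
storm_mm = 100.0       # hypothetical storm total

spring_net = net_after_et(storm_mm, SPRING_ET_LOSS)
summer_net = net_after_et(storm_mm, SUMMER_ET_LOSS)
print(f"Spring: {spring_net:.0f} mm remain, Summer: {summer_net:.0f} mm remain")
```

A proportional loss is the simplest possible model. A soil-water balance would give more defensible numbers, but the direction of the comparison is what matters for scheduling managed recharge.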
### Climate Change Signal
```{python}
#| code-fold: true
#| code-summary: "Show climate change trend interpretation code"
if not DATA_AVAILABLE:
    print("⚠️ No data available for climate trend analysis")
elif p_value < 0.05:
    # Statistically significant trend: interpret by direction
    if slope > 0:
        print("⚠️ INCREASING PRECIPITATION TREND")
        print(f" Projected increase: {projected_change:+.0f} mm over 25 years")
        print(" → More recharge potential BUT")
        print(" → Higher intensity events → more runoff, less infiltration")
        print(" → Infrastructure must handle increased peak flows")
    else:
        print("⚠️ DECREASING PRECIPITATION TREND")
        print(f" Projected decrease: {projected_change:+.0f} mm over 25 years")
        print(" → Reduced recharge, increased drought risk")
        print(" → Groundwater storage becomes critical buffer")
        print(" → Conservation measures essential")
else:
    print("✓ No significant long-term precipitation trend")
    print(" Climate variability without directional change")
    print(" Manage for historical range of conditions")
```
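
The trend interpretation above relies on a slope in mm/year and a 25-year projection. The sketch below shows how such numbers could be derived with an ordinary least-squares fit. It is not necessarily the method used for `slope` and `projected_change` earlier in the chapter, it omits the significance test behind `p_value`, and the function `ols_slope` and the sample totals are hypothetical.

```python
def ols_slope(years, annual_totals_mm):
    """Ordinary least-squares slope of annual precipitation against year."""
    n = len(years)
    if n < 2 or n != len(annual_totals_mm):
        raise ValueError("need at least two (year, total) pairs of equal length")
    mean_x = sum(years) / n
    mean_y = sum(annual_totals_mm) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, annual_totals_mm))
    return sxy / sxx

# Hypothetical annual totals in mm, for illustration only
years = [2000, 2001, 2002, 2003, 2004]
totals_mm = [900.0, 904.0, 898.0, 910.0, 908.0]

slope_mm_per_year = ols_slope(years, totals_mm)
projected_25yr_mm = slope_mm_per_year * 25  # linear extrapolation, a strong assumption
print(f"Slope: {slope_mm_per_year:+.2f} mm/year, 25-year change: {projected_25yr_mm:+.0f} mm")
```

A five-year series like this one says nothing reliable about climate. Linear extrapolation from a short record is easily dominated by a single wet or dry year, which is why longer records are needed before acting on a trend.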
## Summary
Precipitation temporal analysis reveals:
✅ **Zero-inflation dominant** - 75% of days dry, recharge concentrated in 25% wet days
✅ **Seasonal cycle strong** - Spring maximum drives annual recharge
✅ **Gamma distribution** - Heavy tail, extreme events disproportionately important
✅ **Dry spells cluster** - Extended droughts (>30 days) occur regularly
⚠️ **Trend uncertainty** - Long-term trends require extended records (50+ years)
⚠️ **Intensity matters** - Mean precipitation ≠ recharge (need intensity, duration)
**Key Insight:** Precipitation temporal patterns are highly non-uniform. Recharge occurs during specific "windows" (spring, multi-day gentle rains) rather than in proportion to total annual precipitation. Management must target these windows for maximum efficiency.
---
## Reflection Questions
- Given the strong zero-inflation and heavy tail in the precipitation record, how would you explain to a non-technical stakeholder why “average annual rainfall” can be misleading for understanding recharge?
- If you observed a statistically significant increasing trend in annual precipitation, what additional analyses or data would you want before concluding that recharge potential is improving?
- How would dry-spell statistics (length and frequency) influence your design of groundwater monitoring and drought early-warning systems?
- Which aspect of precipitation patterns (mean, extremes, seasonal timing, or persistence) would you prioritize for improving groundwater models in this region, and why?
---
## Related Chapters
- [Recharge Lag Analysis](recharge-lag-analysis.qmd) - Time delays between precipitation and groundwater
- [Streamflow Variability](streamflow-variability.qmd) - Surface water temporal patterns
- [Extreme Event Analysis](extreme-event-analysis.qmd) - Tail risk from precipitation extremes