---
title: "Bayesian Uncertainty Model"
subtitle: "Rigorous Probabilistic Inference"
code-fold: true
---
::: {.callout-tip icon=false}
## For Newcomers
**You will learn:**
- Why "I don't know" is a valid and valuable answer in science
- How Bayesian statistics quantifies uncertainty (not just point estimates)
- What prior knowledge is and how to incorporate it
- Why prediction intervals matter more than predictions alone
All models are approximations. This chapter shows how to honestly quantify *how wrong* our predictions might be—essential for making decisions under uncertainty.
:::
**Data Sources Fused**: All 4 (with uncertainty quantification)
## What You Will Learn in This Chapter
By the end of this chapter, you will be able to:
- Explain in plain language what Bayesian uncertainty means for groundwater predictions and how it differs from single “best guess” forecasts.
- Interpret prior/posterior plots, prediction intervals, and risk curves in terms of hydrologic decisions (for example, drought risk, flooding risk).
- Describe the role of priors, likelihoods, and posterior predictive distributions in quantifying parameter and forecast uncertainty.
- Reflect on when Bayesian methods add practical value over simpler approaches, and what additional data or constraints they require.
## Overview
All models are wrong, but some are useful, **provided we quantify uncertainty**. Previous chapters built point predictions. This chapter adds **Bayesian inference** to estimate probability distributions over predictions, accounting for:
1. **Data uncertainty**: Measurement errors in all 4 sources
2. **Model uncertainty**: Which relationships matter most?
3. **Parameter uncertainty**: What are the true coefficients?
4. **Scenario uncertainty**: What's the probability of extreme outcomes?
::: {.callout-note icon=false}
## 💻 For Computer Scientists
**Bayesian Framework:**
$$P(\theta | \text{Data}) = \frac{P(\text{Data} | \theta) \cdot P(\theta)}{P(\text{Data})}$$
- $P(\theta)$ = Prior belief about parameters
- $P(\text{Data} | \theta)$ = Likelihood of observations
- $P(\theta | \text{Data})$ = Posterior distribution (what we want)
**Methods:**
- **Markov Chain Monte Carlo (MCMC)**: Sample posterior distribution
- **Variational Inference**: Approximate posterior with simpler distribution
- **Gaussian Processes**: Nonparametric Bayesian regression
**Advantage:** Full probability distribution, not just point estimate
:::
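Bayes' rule above can be demonstrated numerically on a one-dimensional grid; the sketch below uses illustrative numbers (not the chapter's dataset) to show prior × likelihood → posterior:

```python
import numpy as np
from scipy import stats

# Grid over a single parameter theta (think: a standardized regression coefficient)
theta = np.linspace(-3, 3, 601)
dtheta = theta[1] - theta[0]

# Prior P(theta): weakly informative Normal(0, 1)
prior = stats.norm.pdf(theta, 0.0, 1.0)

# Likelihood P(Data | theta) of three illustrative observations, each Normal(theta, 0.5)
data = np.array([0.8, 1.1, 0.9])
likelihood = np.prod(stats.norm.pdf(data[:, None], theta[None, :], 0.5), axis=0)

# Posterior P(theta | Data) ∝ likelihood × prior, normalized on the grid
unnorm = likelihood * prior
posterior = unnorm / (unnorm.sum() * dtheta)

post_mean = (theta * posterior).sum() * dtheta
print(f"Posterior mean ≈ {post_mean:.2f}")  # sits between prior mean 0 and data mean ≈ 0.93
```

Note the denominator $P(\text{Data})$ never has to be computed explicitly: normalizing the grid does the same job.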
::: {.callout-tip icon=false}
## 🌍 For Hydrologists
**Why Bayesian?**
Traditional frequentist statistics: "The parameter is fixed, data vary"
Bayesian statistics: "Data are fixed (observed), parameters have distributions"
**For groundwater:**
- **Prior knowledge**: Incorporate physical constraints (e.g., hydraulic conductivity must be positive)
- **Sequential learning**: Update beliefs as new data arrive
- **Prediction intervals**: "Water level will be 200 ± 5 m with 95% probability"
**Critical for management**: Need to quantify risk, not just predict mean.
:::
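The "sequential learning" point above has a concrete form: with a conjugate normal model, each new measurement tightens the posterior in closed form. A minimal sketch with illustrative numbers (prior, noise level, and observations are all assumed):

```python
import numpy as np  # only for consistency with the rest of the chapter

# Prior belief about a well's water level and an assumed measurement noise std
mu, tau = 200.0, 10.0   # prior: Normal(200, 10) metres
sigma = 2.0             # measurement noise std (assumed known)

# Update beliefs as each new measurement arrives (conjugate normal-normal update)
for obs in [203.1, 202.4, 203.8]:
    prec_prior, prec_obs = 1 / tau**2, 1 / sigma**2
    mu = (prec_prior * mu + prec_obs * obs) / (prec_prior + prec_obs)
    tau = (prec_prior + prec_obs) ** -0.5
    print(f"after {obs}: mean = {mu:.2f} m, std = {tau:.2f} m")  # std shrinks every step
```

Each update is the prior for the next observation, so beliefs sharpen monotonically as data accumulate.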
## Analysis Approach
```{python}
#| code-fold: true
import sys
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from scipy import stats
from sklearn.preprocessing import StandardScaler
import sqlite3
import warnings
warnings.filterwarnings('ignore')
from pathlib import Path
import os
# Add project root to path for imports
def find_repo_root(start: Path) -> Path:
for candidate in [start, *start.parents]:
if (candidate / "src").exists():
return candidate
return start
quarto_project = Path(os.environ.get("QUARTO_PROJECT_DIR", str(Path.cwd())))
project_root = find_repo_root(quarto_project)
if str(project_root) not in sys.path:
sys.path.append(str(project_root))
from src.utils import get_data_path
DATA_AVAILABLE = False
well_df = pd.DataFrame()
selected_well = None
try:
# Load groundwater data directly from database using config-driven path
db_path = get_data_path("aquifer_db")
conn = sqlite3.connect(db_path)
# Query groundwater data with proper timestamp parsing
query = """
SELECT TIMESTAMP, Water_Surface_Elevation, P_Number
FROM OB_WELL_MEASUREMENTS_CHAMPAIGN_COUNTY
WHERE Water_Surface_Elevation IS NOT NULL
AND TIMESTAMP IS NOT NULL
ORDER BY TIMESTAMP
"""
gw_df = pd.read_sql_query(query, conn)
conn.close()
# Parse timestamps in US format (M/D/YYYY) - CRITICAL for correct date interpretation
gw_df['DateTime'] = pd.to_datetime(gw_df['TIMESTAMP'], format='%m/%d/%Y', errors='coerce')
gw_df = gw_df.dropna(subset=['DateTime'])
# Focus on a single well with good data coverage
well_counts = gw_df['P_Number'].value_counts()
selected_well = well_counts.index[0]
well_df = gw_df[gw_df['P_Number'] == selected_well].copy()
# Sort by date and create time features
well_df = well_df.sort_values('DateTime')
well_df['Year'] = well_df['DateTime'].dt.year
well_df['Month'] = well_df['DateTime'].dt.month
well_df['DayOfYear'] = well_df['DateTime'].dt.dayofyear
# Use normalized date for consistent merge
well_df['Date'] = well_df['DateTime'].dt.normalize()
# Load REAL weather data from warm.db - use WarmICNData (hourly) and aggregate to daily
weather_db_path = get_data_path("warm_db")
weather_conn = sqlite3.connect(weather_db_path)
# WarmICNData has hourly data with nPrecipHrly and nAirTemp columns
weather_query = """
SELECT nDateTime as DateTime, nPrecipHrly as Precip, nAirTemp as Temp
FROM WarmICNData
WHERE nPrecipHrly IS NOT NULL AND nAirTemp IS NOT NULL
ORDER BY nDateTime
LIMIT 500000
"""
weather_df = pd.read_sql_query(weather_query, weather_conn)
weather_conn.close()
weather_df['DateTime'] = pd.to_datetime(weather_df['DateTime'], errors='coerce')
weather_df = weather_df.dropna(subset=['DateTime'])
# Use normalized date for consistent merge
weather_df['Date'] = weather_df['DateTime'].dt.normalize()
# Aggregate hourly to daily: sum precip, mean temp
weather_daily = weather_df.groupby('Date').agg({
'Precip': 'sum', # Sum hourly precip to get daily total
'Temp': 'mean' # Average temp across hours
}).reset_index()
# Merge weather with groundwater data
well_df = well_df.merge(weather_daily, on='Date', how='inner')
# Use only recent data for better temporal consistency (weather data starts 2012)
well_df = well_df[well_df['Year'] >= 2015].copy()
well_df = well_df.reset_index(drop=True)
if len(well_df) > 0:
DATA_AVAILABLE = True
print(f"✓ Data loaded successfully")
print(f" Observations: {len(well_df)}")
print(f" Well ID: {selected_well}")
print(f" Date range: {well_df['DateTime'].min().strftime('%Y-%m-%d')} to {well_df['DateTime'].max().strftime('%Y-%m-%d')}")
print(f" Water level range: {well_df['Water_Surface_Elevation'].min():.2f} to {well_df['Water_Surface_Elevation'].max():.2f} m")
else:
print("⚠️ No overlapping data between groundwater and weather records")
print(" Visualizations will show placeholder messages")
except Exception as e:
print(f"⚠️ Error loading groundwater/weather data: {e}")
print(f" Sources: aquifer.db (OB_WELL_MEASUREMENTS_CHAMPAIGN_COUNTY), warm.db (WarmICNData)")
print(" Visualizations will show placeholder messages")
```
## Bayesian Linear Regression (Closed-Form)
::: {.callout-note icon=false}
## 📘 Understanding Bayesian Linear Regression
### What Is It?
**Bayesian linear regression** applies Bayesian inference to linear models, treating regression coefficients (slopes, intercepts) as **probability distributions** rather than single fixed values. Unlike frequentist regression (ordinary least squares) which gives point estimates, Bayesian regression quantifies uncertainty in every parameter.
**Historical Context:** Linear regression dates to Gauss (1809) and Legendre (1805), but Bayesian treatment emerged much later with computational advances. Harold Jeffreys (1939) developed Bayesian regression theory, but practical application awaited computers (1990s+). Now standard in uncertain environments like climate modeling and hydrology.
### Why Does It Matter for Groundwater Management?
Traditional regression: "Water level = 200 + 0.5 × precipitation"
- **Problem**: What if coefficient is really 0.3 or 0.7? How confident are we?
Bayesian regression: "Water level = 200 ± 5 + (0.5 ± 0.1) × precipitation"
- **Benefit**: Uncertainty in coefficients → uncertainty in predictions → quantified risk
**Management Impact:**
- **Drought planning**: "90% probability pumping reduces levels by 2-8m" (not just "5m")
- **Infrastructure design**: Size treatment plant for 95th percentile forecast, not mean
- **Monitoring decisions**: "Collect more data" when credible intervals are wide
### How Does It Work?
**Standard Regression (Frequentist):**
1. Find coefficients that minimize squared error
2. Get single best-fit line
3. No uncertainty in coefficients themselves
**Bayesian Regression:**
1. **Prior on coefficients**: Before seeing data, assume coefficients are normally distributed
- Example: β_precip ~ Normal(0, 10) → "Could be anywhere from -30 to +30"
2. **Likelihood**: How well do coefficient values explain observed water levels?
3. **Posterior on coefficients**: Updated distribution after seeing data
- Example: β_precip ~ Normal(0.5, 0.1) → "Probably 0.3-0.7 with 95% probability"
4. **Posterior predictive**: Predictions incorporate uncertainty in coefficients + noise
**Key Difference:**
- Frequentist: β = 0.5 (a single fixed value; its confidence interval is notoriously easy to misinterpret)
- Bayesian: β ~ Normal(0.5, 0.1) (distribution, credible interval directly interpretable)
### Choosing Appropriate Priors
**Prior selection is critical**—it encodes your pre-data beliefs. Here's practical guidance:
| Prior Type | When to Use | Example for Groundwater |
|-----------|-------------|------------------------|
| **Uninformative** (wide) | No prior knowledge, want data to dominate | β ~ Normal(0, 100) for unknown effect |
| **Weakly informative** | Physical constraints exist but uncertain | β_precip ~ Normal(0, 10) — allows wide range but not extreme |
| **Informative** | Strong prior knowledge from literature | K ~ LogNormal(2.5, 0.5) — hydraulic conductivity from similar aquifers |
| **Regularizing** | Prevent overfitting with many parameters | β ~ Normal(0, 1) — shrinks coefficients toward zero |
**Physical Constraints as Priors:**
- **Hydraulic conductivity (K)**: Must be positive → use LogNormal or Gamma
- **Storativity (S)**: Must be 0-1 → use Beta distribution
- **Precipitation effect**: Expect positive → use HalfNormal or truncated Normal
- **Recharge rate**: Bounded by precipitation → use Uniform(0, max_precip)
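The support constraints above can be verified by sampling the priors; a sketch with scipy (the parameter values are illustrative, echoing the table's K ~ LogNormal(2.5, 0.5) example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hydraulic conductivity: LogNormal(mu=2.5, sigma=0.5) -> strictly positive
K = stats.lognorm.rvs(s=0.5, scale=np.exp(2.5), size=10_000, random_state=rng)

# Storativity: Beta(2, 5) -> confined to (0, 1)
S = stats.beta.rvs(a=2, b=5, size=10_000, random_state=rng)

# Precipitation effect: HalfNormal -> non-negative by construction
beta_precip = stats.halfnorm.rvs(scale=1.0, size=10_000, random_state=rng)

print(f"K      range ({K.min():.2f}, {K.max():.2f})  — always > 0")
print(f"S      range ({S.min():.3f}, {S.max():.3f}) — always in (0, 1)")
print(f"beta_p range ({beta_precip.min():.3f}, {beta_precip.max():.3f}) — never negative")
```

Choosing a family whose support matches the physics means impossible values get zero prior probability, so the posterior can never produce them either.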
**Red Flags in Prior Selection:**
| Problem | Symptom | Fix |
|---------|---------|-----|
| **Prior too narrow** | Posterior = prior (data ignored) | Widen prior variance |
| **Prior too wide** | Slow convergence, extreme samples | Use domain knowledge to constrain |
| **Wrong support** | Negative K values, S > 1 | Use appropriate distribution family |
| **Prior-data conflict** | Bimodal posterior, poor fit | Re-examine assumptions or data |
### What Will You See Below?
- **Prior → Posterior plots**: How data narrows uncertainty (wide prior → sharp posterior)
- **Coefficient distributions**: Histograms showing probability of different coefficient values
- **Prediction intervals**: Bands showing 68% and 95% credible regions
- **Parameter correlations**: How uncertainty in one coefficient affects another
### How to Interpret Bayesian Regression Results
| Result | Interpretation | Management Action |
|--------|----------------|-------------------|
| **β_mean = 0.5, CI [0.4, 0.6]** | Strong positive effect with low uncertainty | Confident prediction—proceed with planning |
| **β_mean = 0.5, CI [-0.2, 1.2]** | Uncertain effect, crosses zero | Collect more data before decisions |
| **Posterior much narrower than prior** | Data highly informative | Trust predictions, data quality good |
| **Posterior ≈ prior** | Data uninformative | Poor data quality or wrong model |
| **95% CI excludes zero** | Effect is real (95% confident) | Include variable in operational models |
| **95% CI includes zero** | Effect unclear | May be noise, consider removing |
**Prediction Intervals:**
- **68% band** (±1σ): Normal operating conditions expected here
- **95% band** (±2σ): Design margins should cover this range
- **Observations outside bands**: Model underestimates uncertainty or wrong model
**Management Example:**
If 95% credible interval for "precipitation effect" is [0.3, 0.7]:
- Plan for range, not just mean (0.5)
- Wet year (high precip) could increase levels by 30-70% of expectation
- Size infrastructure to handle upper bound (0.7), not average
:::
Start simple: Bayesian linear regression with conjugate priors.
```{python}
#| code-fold: true
def bayesian_linear_regression(X, y, alpha_prior=1.0, beta_prior=1.0):
"""
Bayesian linear regression with conjugate normal-inverse-gamma prior.
Parameters:
- X: Design matrix (n x p)
- y: Target vector (n,)
- alpha_prior: Prior precision for coefficients (also reused as the inverse-gamma shape prior on noise variance)
- beta_prior: Prior noise precision (also reused as the inverse-gamma scale prior)
Returns:
- posterior_mean: Posterior mean of coefficients
- posterior_cov: Posterior covariance of coefficients
- posterior_alpha: Shape of the inverse-gamma posterior on the noise variance
- posterior_beta: Scale of the inverse-gamma posterior on the noise variance
"""
n, p = X.shape
# Prior precision matrix
Lambda_prior = alpha_prior * np.eye(p)
# Posterior precision
Lambda_post = Lambda_prior + beta_prior * (X.T @ X)
Sigma_post = np.linalg.inv(Lambda_post)
# Posterior mean
mu_post = beta_prior * Sigma_post @ (X.T @ y)
# Posterior alpha and beta (for predictive distribution)
alpha_post = alpha_prior + n / 2
beta_post = beta_prior + 0.5 * (y.T @ y - mu_post.T @ Lambda_post @ mu_post)
return mu_post, Sigma_post, alpha_post, beta_post
# Initialize variables for later use
X = np.array([])
y = np.array([])
X_scaled = np.array([])
y_scaled = np.array([])
X_design = np.array([])
mu_post = np.array([0, 0, 0])
Sigma_post = np.eye(3)
alpha_post = 1.0
beta_post = 1.0
scaler_X = None
scaler_y = None
if not DATA_AVAILABLE:
print("⚠️ No data available for Bayesian regression analysis")
else:
# Prepare data
X = well_df[['Precip', 'Temp']].values
y = well_df['Water_Surface_Elevation'].values
# Standardize - fit_transform expects 2D array for X, 1D reshaped for y
scaler_X = StandardScaler()
scaler_y = StandardScaler()
# X is already 2D (n_samples, 2 features), so fit_transform works directly
X_scaled = scaler_X.fit_transform(X)
# y needs to be reshaped to 2D for fit_transform, then flattened back to 1D
y_scaled = scaler_y.fit_transform(y.reshape(-1, 1)).flatten()
# Add intercept column to create design matrix
X_design = np.column_stack([np.ones(len(X_scaled)), X_scaled])
# Fit Bayesian linear regression
mu_post, Sigma_post, alpha_post, beta_post = bayesian_linear_regression(
X_design, y_scaled,
alpha_prior=1.0,
beta_prior=1.0
)
print("\nBayesian Linear Regression Results:")
print(f"Posterior coefficient means: {mu_post}")
print(f"Posterior coefficient std: {np.sqrt(np.diag(Sigma_post))}")
```
## Posterior Predictive Distribution
::: {.callout-note icon=false}
## 📘 Understanding Posterior Predictive Distribution
**What Is It?**
The **posterior predictive distribution** is the Bayesian answer to "What will future observations look like?" It incorporates two sources of uncertainty:
1. **Parameter uncertainty**: We're not sure what the true coefficients are (posterior distribution)
2. **Observation noise**: Even with true coefficients, measurements vary randomly
**Mathematical Form:**
$$P(y_{\text{new}} \mid \text{data}) = \int P(y_{\text{new}} \mid \theta)\, P(\theta \mid \text{data})\, d\theta$$
Translation: "Average predictions over all plausible parameter values, weighted by their probability."
**Why Does It Matter?**
Most predictions ignore parameter uncertainty—they plug in point estimates and add noise. This **underestimates uncertainty**, leading to:
- Over-confident forecasts ("water level will be 200±2m" when really 200±8m)
- Infrastructure failures (designed for 95th percentile that's actually 85th)
- Poor risk management (probability estimates too optimistic)
**Posterior predictive fixes this** by propagating parameter uncertainty through to predictions.
**How Does It Work?**
**Standard Prediction (Wrong):**
1. Estimate coefficients: β = 0.5
2. Predict: y = β × x = 0.5 × x
3. Add noise: y ± 2 (standard error of observations)
4. **Problem**: Ignores uncertainty in β!
**Posterior Predictive (Correct):**
1. Sample coefficient from posterior: β ~ Normal(0.5, 0.1)
2. For each β sample, predict: y = β × x + noise
3. Collect all predictions → posterior predictive distribution
4. **Result**: Wider, more honest uncertainty bands
**Intuition:** If you're not sure whether the slope is 0.4 or 0.6, your predictions should reflect that uncertainty—not just pick 0.5 and pretend you're certain.
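The correct procedure above takes only a few lines via Monte Carlo; this sketch assumes a posterior β ~ Normal(0.5, 0.1) and noise std of 1 (illustrative values, not the chapter's fitted model):

```python
import numpy as np

rng = np.random.default_rng(0)
x_new = 5.0        # predictor value to forecast at
n_draws = 20_000

# 1) Sample coefficients from the (assumed) posterior beta ~ Normal(0.5, 0.1)
beta = rng.normal(0.5, 0.1, size=n_draws)

# 2) For each sample, predict and add observation noise
y_draws = beta * x_new + rng.normal(0.0, 1.0, size=n_draws)

# 3) The cloud of draws IS the posterior predictive distribution
lo, hi = np.percentile(y_draws, [2.5, 97.5])
print(f"mean = {y_draws.mean():.2f}, 95% interval = ({lo:.2f}, {hi:.2f})")

# Plugging in the point estimate beta = 0.5 ignores parameter uncertainty:
y_plugin = 0.5 * x_new + rng.normal(0.0, 1.0, size=n_draws)
print(f"plug-in std {y_plugin.std():.2f} < full std {y_draws.std():.2f}")
```

The plug-in spread is always narrower, which is exactly the overconfidence the posterior predictive corrects.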
**How to Interpret Posterior Predictive Results:**
| Observation vs Prediction | Interpretation | Action |
|---------------------------|----------------|--------|
| **In 68% band** | Normal outcome, model working | Continue as planned |
| **In 95% band, outside 68%** | Unusual but plausible | Monitor closely, update model |
| **Outside 95% band** | Extreme event or model failure | Investigate—data error or missing physics? |
| **Many obs outside bands** | Model underestimates uncertainty | Revise model structure or priors |
| **All obs inside 68% band** | Model overestimates uncertainty | Priors too vague or overfitting |
**Calibration Check:**
- Good model: ~68% of observations in 68% band, ~95% in 95% band
- Under-confident: >68% in 68% band (bands too wide)
- Over-confident: <68% in 68% band (bands too narrow)
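The calibration rule above is a one-liner to check: compute the fraction of observations inside each band and compare to the nominal level. A sketch with simulated data from a perfectly calibrated Gaussian model (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate observations from the model the intervals assume: Normal(mu, sigma)
mu, sigma = 200.0, 5.0
y_obs = rng.normal(mu, sigma, size=5_000)

# Empirical coverage should land near the nominal level of each band
for level, z in [(68, 1.0), (95, 1.96)]:
    inside = np.mean((y_obs >= mu - z * sigma) & (y_obs <= mu + z * sigma))
    print(f"{level}% band contains {inside:.1%} of observations")
```

Run the same check against real held-out observations; large gaps from the nominal levels signal the miscalibration patterns tabulated above.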
**Management Example:**
Drought threshold = 195m. Posterior predictive says:
- Mean prediction: 200m
- 95% credible interval: [190m, 210m]
- **P(below 195m) ≈ 16%** (under a normal predictive; a skewed predictive could put even more mass below the threshold)
Action: a 16% drought risk triggers contingency planning (versus the false certainty of a point estimate!).
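This tail probability follows directly from the predictive distribution. A minimal sketch assuming a normal predictive with the numbers above (the exact value depends on the predictive's shape, so treat it as a first approximation):

```python
from scipy import stats

mean, lo95, hi95, threshold = 200.0, 190.0, 210.0, 195.0

# Back out the predictive std from the 95% interval (normal assumption)
sigma = (hi95 - lo95) / (2 * 1.96)

# Probability mass below the drought threshold
p_drought = stats.norm.cdf(threshold, loc=mean, scale=sigma)
print(f"P(level < {threshold:.0f} m) = {p_drought:.0%}")
```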
:::
```{python}
#| code-fold: true
def predict_bayesian(X_new, mu_post, Sigma_post, alpha_post, beta_post):
    """
    Compute posterior predictive distribution.
    Returns mean and variance for each prediction.
    """
    # Predictive mean
    y_pred_mean = X_new @ mu_post
    # Posterior mean of the noise variance from the inverse-gamma(alpha, beta) posterior
    noise_var = beta_post / (alpha_post - 1)
    # Predictive variance (parameter uncertainty + observation noise)
    y_pred_var = np.array([
        x.T @ Sigma_post @ x + noise_var
        for x in X_new
    ])
    return y_pred_mean, y_pred_var
# Initialize prediction variables
y_pred_mean_orig = np.array([])
y_pred_std_orig = np.array([])
y_pred_lower = np.array([])
y_pred_upper = np.array([])
if not DATA_AVAILABLE:
print("⚠️ No data available for posterior prediction")
else:
# Predictions on training data
y_pred_mean, y_pred_var = predict_bayesian(X_design, mu_post, Sigma_post, alpha_post, beta_post)
y_pred_std = np.sqrt(y_pred_var)
# Convert back to original scale
y_pred_mean_orig = scaler_y.inverse_transform(y_pred_mean.reshape(-1, 1)).flatten()
y_pred_std_orig = y_pred_std * scaler_y.scale_
# 95% credible interval
y_pred_lower = y_pred_mean_orig - 1.96 * y_pred_std_orig
y_pred_upper = y_pred_mean_orig + 1.96 * y_pred_std_orig
print(f"\nPredictive uncertainty (mean std): {y_pred_std_orig.mean():.3f} m")
```
## Visualization 1: Prior vs Posterior Distributions
Show how data updates our beliefs about parameters:
```{python}
#| code-fold: true
if not DATA_AVAILABLE:
print("⚠️ No data available for prior/posterior visualization")
fig = go.Figure()
fig.add_annotation(text="No data available for analysis", xref="paper", yref="paper",
x=0.5, y=0.5, showarrow=False, font=dict(size=16))
fig.update_layout(height=400, template='plotly_white')
fig.show()
else:
# Define priors (weakly informative)
prior_mean = np.zeros(3)
prior_std = np.sqrt(10.0) * np.ones(3) # Broad prior
# Generate prior and posterior distributions
param_names = ['β₀ (Intercept)', 'β₁ (Precipitation)', 'β₂ (Temperature)']
x_range = np.linspace(-3, 3, 1000)
fig = make_subplots(
rows=1, cols=3,
subplot_titles=param_names,
horizontal_spacing=0.1
)
for i in range(3):
# Prior distribution
prior_pdf = stats.norm.pdf(x_range, prior_mean[i], prior_std[i])
# Posterior distribution
posterior_pdf = stats.norm.pdf(x_range, mu_post[i], np.sqrt(Sigma_post[i, i]))
# Prior
fig.add_trace(
go.Scatter(
x=x_range,
y=prior_pdf,
mode='lines',
line=dict(color='gray', width=2, dash='dash'),
name='Prior' if i == 0 else None,
showlegend=(i == 0),
legendgroup='prior'
),
row=1, col=i+1
)
# Posterior
fig.add_trace(
go.Scatter(
x=x_range,
y=posterior_pdf,
mode='lines',
line=dict(color='#7c3aed', width=3),
fill='tozeroy',
fillcolor='rgba(124, 58, 237, 0.3)',
name='Posterior' if i == 0 else None,
showlegend=(i == 0),
legendgroup='posterior'
),
row=1, col=i+1
)
# Mark posterior mean
fig.add_vline(
x=mu_post[i],
line_dash='dot',
line_color='#7c3aed',
row=1, col=i+1
)
# Update axes
fig.update_xaxes(title_text='Parameter Value (standardized)', row=1, col=i+1)
fig.update_yaxes(title_text='Density' if i == 0 else '', row=1, col=i+1)
fig.update_layout(
title='Prior vs Posterior Distributions<br><sub>Data narrows uncertainty from broad prior to sharp posterior</sub>',
height=400,
showlegend=True
)
fig.show()
# Print summary
print("\nPrior → Posterior Update:")
for i, name in enumerate(param_names):
reduction = (1 - np.sqrt(Sigma_post[i, i]) / prior_std[i]) * 100
print(f" {name}: Uncertainty reduced by {reduction:.1f}%")
```
## Visualization 2: Prediction Intervals with Uncertainty Bands
```{python}
#| code-fold: true
if not DATA_AVAILABLE:
    print("⚠️ No data available for prediction interval visualization")
    fig = go.Figure()
    fig.add_annotation(text="No data available for analysis", xref="paper", yref="paper",
                       x=0.5, y=0.5, showarrow=False, font=dict(size=16))
    fig.update_layout(height=400, template='plotly_white')
    fig.show()
else:
    # Sample for visibility
    sample_idx = np.linspace(0, len(well_df)-1, min(500, len(well_df)), dtype=int)
    fig = go.Figure()
    # 95% Credible interval
    fig.add_trace(go.Scatter(
        x=well_df['DateTime'].iloc[sample_idx],
        y=y_pred_upper[sample_idx],
        mode='lines',
        line=dict(width=0),
        showlegend=False,
        hoverinfo='skip'
    ))
    fig.add_trace(go.Scatter(
        x=well_df['DateTime'].iloc[sample_idx],
        y=y_pred_lower[sample_idx],
        mode='lines',
        line=dict(width=0),
        fillcolor='rgba(124, 58, 237, 0.2)',
        fill='tonexty',
        name='95% Credible Interval',
        hoverinfo='skip'
    ))
    # 68% Credible interval (±1σ)
    y_pred_lower_68 = y_pred_mean_orig - y_pred_std_orig
    y_pred_upper_68 = y_pred_mean_orig + y_pred_std_orig
    fig.add_trace(go.Scatter(
        x=well_df['DateTime'].iloc[sample_idx],
        y=y_pred_upper_68[sample_idx],
        mode='lines',
        line=dict(width=0),
        showlegend=False,
        hoverinfo='skip'
    ))
    fig.add_trace(go.Scatter(
        x=well_df['DateTime'].iloc[sample_idx],
        y=y_pred_lower_68[sample_idx],
        mode='lines',
        line=dict(width=0),
        fillcolor='rgba(124, 58, 237, 0.4)',
        fill='tonexty',
        name='68% Credible Interval',
        hoverinfo='skip'
    ))
    # Predictive mean
    fig.add_trace(go.Scatter(
        x=well_df['DateTime'].iloc[sample_idx],
        y=y_pred_mean_orig[sample_idx],
        mode='lines',
        line=dict(color='#7c3aed', width=2),
        name='Posterior Mean'
    ))
    # Actual observations
    fig.add_trace(go.Scatter(
        x=well_df['DateTime'].iloc[sample_idx],
        y=y[sample_idx],
        mode='markers',
        marker=dict(size=4, color='black', opacity=0.5),
        name='Observations'
    ))
    fig.update_layout(
        title='Bayesian Prediction Intervals<br><sub>Wider bands = higher uncertainty in predictions</sub>',
        xaxis_title='Date',
        yaxis_title='Water Surface Elevation (m)',
        height=600,
        hovermode='x unified'
    )
    fig.show()
    # Calculate coverage
    in_95_interval = ((y >= y_pred_lower) & (y <= y_pred_upper)).mean() * 100
    in_68_interval = ((y >= y_pred_lower_68) & (y <= y_pred_upper_68)).mean() * 100
    print(f"\nInterval Coverage (should match nominal if model is calibrated):")
    print(f" 95% interval contains {in_95_interval:.1f}% of observations")
    print(f" 68% interval contains {in_68_interval:.1f}% of observations")
```
## Visualization 3: Parameter Correlation Heatmap
::: {.callout-note icon=false}
## 📊 Reading the Parameter Correlation Heatmap
**What This Visualization Shows:**
The correlation heatmap displays dependencies between estimated parameters. When parameters are correlated, uncertainty in one propagates to uncertainty in the other.
**Color Scale Interpretation:**
| Color | Correlation | Physical Meaning |
|-------|-------------|------------------|
| **Dark Red (+1.0)** | Perfect positive | Parameters increase together (rare in practice) |
| **Light Red (+0.3 to +0.7)** | Moderate positive | Some dependency—uncertainty shared |
| **White (0)** | No correlation | Independent—can estimate separately |
| **Light Blue (-0.3 to -0.7)** | Moderate negative | Trade-off—increasing one decreases other |
| **Dark Blue (-1.0)** | Perfect negative | Full trade-off (model identifiability issue) |
**What to Look For:**
1. **Diagonal (always 1.0)**: Each parameter is perfectly correlated with itself
2. **Off-diagonal near zero**: Good! Parameters independently estimable
3. **Off-diagonal > 0.5**: Concern—uncertainty propagates between parameters
4. **Off-diagonal > 0.8**: Problem—parameters confounded (consider reparameterization)
**Common Patterns:**
| Pattern | Cause | Solution |
|---------|-------|----------|
| **Intercept ↔ Slope correlation** | Mean-centering not done | Center predictors around mean |
| **Precip ↔ Temp correlation** | Seasonal confounding | Include explicit seasonal term |
| **All parameters correlated** | Multicollinearity | Reduce predictors or use regularization |
**Management Implication:**
High correlation between β_precip and β_temp means: "If precipitation effect is actually higher, temperature effect must be lower to fit the same data." Your prediction may be okay, but attribution to specific drivers is uncertain.
:::
Understanding parameter dependencies:
```{python}
#| code-fold: true
# Compute correlation matrix from posterior covariance
posterior_corr = np.zeros_like(Sigma_post)
for i in range(3):
for j in range(3):
posterior_corr[i, j] = Sigma_post[i, j] / np.sqrt(Sigma_post[i, i] * Sigma_post[j, j])
param_labels = ['β₀<br>(Intercept)', 'β₁<br>(Precip)', 'β₂<br>(Temp)']
fig = go.Figure(data=go.Heatmap(
z=posterior_corr,
x=param_labels,
y=param_labels,
colorscale='RdBu',
zmid=0,
zmin=-1,
zmax=1,
text=np.round(posterior_corr, 3),
texttemplate='%{text}',
textfont={"size": 14},
colorbar=dict(title='Correlation')
))
fig.update_layout(
title='Posterior Parameter Correlations<br><sub>Reveals dependencies between parameters</sub>',
height=500,
width=600,
xaxis=dict(side='bottom'),
yaxis=dict(autorange='reversed')
)
fig.show()
print("\nParameter Correlation Interpretation:")
print(f" β₀ ↔ β₁: {posterior_corr[0, 1]:.3f}")
print(f" β₀ ↔ β₂: {posterior_corr[0, 2]:.3f}")
print(f" β₁ ↔ β₂: {posterior_corr[1, 2]:.3f}")
print("\nNote: Near-zero correlations mean parameters are independently estimable")
print(" High correlations indicate trade-offs (changing one affects another)")
```
## Markov Chain Monte Carlo
::: {.callout-note icon=false}
## 📘 Understanding MCMC Sampling
### What Is It?
**Markov Chain Monte Carlo** (MCMC) is a computational algorithm that generates samples from complex probability distributions by constructing a Markov chain. Developed during the Manhattan Project (1940s) to simulate neutron diffusion, MCMC enables Bayesian inference for models too complex for closed-form solutions.
**Historical Context:** The Metropolis algorithm (1953) and Metropolis-Hastings generalization (1970) made Bayesian statistics practical. Before MCMC, Bayesian inference was limited to simple models with conjugate priors. MCMC unlocked arbitrarily complex models.
### Why Does It Matter?
For simple models (linear regression), we can compute posteriors directly. For realistic aquifer models (nonlinear, hierarchical, spatial), direct computation is impossible. MCMC provides the only practical way to obtain posterior distributions.
**Enables:**
- Nonlinear relationships (e.g., exponential decay of pumping impact)
- Hierarchical models (well-specific + regional parameters)
- Spatial models (kriging, Gaussian processes)
- Missing data imputation
### How Does It Work?
MCMC explores the posterior distribution by random walk:
**Step 1: Start Anywhere**
- Initialize parameters at random (or educated guess)
**Step 2: Propose New Values**
- Randomly perturb current parameters: θ_new = θ_current + noise
**Step 3: Accept or Reject**
- If new values fit data better → accept
- If worse → sometimes accept anyway (allows exploration)
**Step 4: Repeat Thousands of Times**
- Build up a "cloud" of samples
- Density of cloud = posterior probability
**Intuition:** Like a blind person exploring a landscape by taking random steps. Dense samples where terrain is low (high probability), sparse samples on peaks (low probability). After enough steps, the distribution of visited points matches the posterior.
### What Will You See Below?
- **Trace plots**: Parameter values over iterations (should look like "fuzzy caterpillar")
- **Posterior histograms**: Distribution of sampled parameter values
- **Acceptance rate**: Fraction of proposals accepted (~20-50% is good)
- **Burn-in**: Initial samples discarded (before chain equilibrates)
### How to Interpret MCMC Diagnostics
| Diagnostic | Good | Bad | Fix |
|-----------|------|-----|-----|
| **Trace plot** | Fuzzy caterpillar, no trends | Stuck, trending, spiky | Adjust proposal std, run longer |
| **Acceptance rate** | 20-50% | <10% or >90% | Tune proposal distribution |
| **Burn-in** | Early samples discarded | Early transient kept in estimates | Discard first 20-50% of samples |
| **Autocorrelation** | Decays quickly | Stays high | Thin samples, run longer |
| **Multiple chains** | All converge to same distribution | Diverge | Model may be unidentified |
**Common Issues:**
- **Stuck chain**: Proposal std too small → tiny steps, slow exploration
- **Rejected chain**: Proposal std too large → wild jumps, all rejected
- **Trending chain**: Haven't reached equilibrium → run longer burn-in
### Gelman-Rubin Diagnostic (R̂)
The **Gelman-Rubin statistic** (R̂, "R-hat") is the gold standard for MCMC convergence. It compares variance *within* chains to variance *between* chains.
**Formula:** With $m$ chains of $n$ draws each, let $W$ be the average within-chain variance and $B$ the between-chain variance of the chain means. Then
$$\hat{R} = \sqrt{\frac{\hat{V}}{W}}, \qquad \hat{V} = \frac{n-1}{n}\,W + \frac{1}{n}\,B$$
**Interpretation:**
| R̂ Value | Status | Action |
|---------|--------|--------|
| **< 1.01** | Excellent convergence | Safe to use samples |
| **1.01 - 1.05** | Acceptable | Probably okay, consider more iterations |
| **1.05 - 1.10** | Borderline | Run longer, check trace plots |
| **> 1.10** | Not converged | Do NOT use—run much longer or fix model |
**Why R̂ > 1 Indicates Problems:**
- **R̂ = 1.0**: All chains exploring same region → agreement → convergence
- **R̂ > 1.0**: Chains disagree on posterior location → haven't mixed → not converged
**Practical Guidance:**
1. **Run multiple chains** (≥4): Single chain can fool you
2. **Check R̂ for ALL parameters**: One bad R̂ invalidates all results
3. **Combine with visual checks**: Trace plots should look like "fuzzy caterpillars"
4. **Report R̂ in publications**: Reviewers expect convergence evidence
**If R̂ > 1.10:**
- Increase iterations (try 10× more)
- Adjust proposal distribution
- Reparameterize model (e.g., center predictors)
- Consider simpler model (overparameterization)
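Production work typically relies on a library implementation (for example `arviz.rhat`, which computes the newer rank-normalized split-R̂), but the classic statistic is short enough to sketch for equal-length chains:

```python
import numpy as np

def gelman_rubin(chains: np.ndarray) -> float:
    """R-hat for an (m_chains, n_samples) array of draws of ONE parameter."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()      # average within-chain variance
    B = n * chain_means.var(ddof=1)            # between-chain variance
    V_hat = (n - 1) / n * W + B / n            # pooled variance estimate
    return float(np.sqrt(V_hat / W))

rng = np.random.default_rng(7)

# Four well-mixed chains sampling the same Normal(0, 1) target
good = rng.normal(0.0, 1.0, size=(4, 2_000))
print(f"converged chains: R-hat = {gelman_rubin(good):.3f}")  # ≈ 1.00

# One chain stuck in a different region -> chains disagree
bad = good.copy()
bad[0] += 3.0
print(f"stuck chain:      R-hat = {gelman_rubin(bad):.3f}")   # well above 1.10
```

Apply it per parameter to the `samples_beta` chains from multiple independent MCMC runs before trusting any posterior summary.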
:::
For more complex models, use sampling:
```{python}
#| code-fold: true
def metropolis_hastings_mcmc(X, y, n_iterations=10000, burn_in=2000):
"""
MCMC sampling for Bayesian linear regression.
Uses Metropolis-Hastings algorithm.
"""
n, p = X.shape
# Initialize
beta = np.zeros(p)
sigma2 = 1.0
# Storage
samples_beta = []
samples_sigma2 = []
# Proposal std
proposal_std_beta = 0.1
proposal_std_sigma2 = 0.1
accepted = 0
for i in range(n_iterations):
# Propose new beta
beta_new = beta + np.random.normal(0, proposal_std_beta, p)
# Propose new sigma2
sigma2_new = sigma2 + np.random.normal(0, proposal_std_sigma2)
if sigma2_new <= 0: # Reject if negative
sigma2_new = sigma2
# Compute log-likelihood
residuals = y - X @ beta
residuals_new = y - X @ beta_new
log_lik = -0.5 * n * np.log(2 * np.pi * sigma2) - 0.5 * (residuals ** 2).sum() / sigma2
log_lik_new = -0.5 * n * np.log(2 * np.pi * sigma2_new) - 0.5 * (residuals_new ** 2).sum() / sigma2_new
# Prior (weakly informative)
log_prior = -0.5 * (beta ** 2).sum() / 10 - 1.0 / sigma2
log_prior_new = -0.5 * (beta_new ** 2).sum() / 10 - 1.0 / sigma2_new
        # Acceptance test in log space (avoids overflow in np.exp for large ratios)
        log_alpha = (log_lik_new + log_prior_new) - (log_lik + log_prior)
        # Accept/reject
        if np.log(np.random.rand()) < log_alpha:
            beta = beta_new
            sigma2 = sigma2_new
            accepted += 1
# Store samples (after burn-in)
if i >= burn_in:
samples_beta.append(beta.copy())
samples_sigma2.append(sigma2)
acceptance_rate = accepted / n_iterations
return np.array(samples_beta), np.array(samples_sigma2), acceptance_rate
# Initialize MCMC outputs
samples_beta = np.zeros((1, 3))
samples_sigma2 = np.ones(1)
acceptance_rate = 0.0
if not DATA_AVAILABLE:
print("⚠️ No data available for MCMC analysis")
else:
# Run MCMC
print("\nRunning MCMC (this may take a moment)...")
samples_beta, samples_sigma2, acceptance_rate = metropolis_hastings_mcmc(
X_design, y_scaled,
n_iterations=10000,
burn_in=2000
)
print(f"MCMC complete. Acceptance rate: {acceptance_rate:.2%}")
print(f"Posterior samples: {len(samples_beta)}")
```
## MCMC Diagnostics
```{python}
#| code-fold: true
# Trace plots and posterior distributions
if not DATA_AVAILABLE:
fig = go.Figure()
fig.add_annotation(
text="⚠️ No data available for MCMC diagnostics",
xref="paper", yref="paper", x=0.5, y=0.5,
showarrow=False, font=dict(size=16)
)
fig.update_layout(title='MCMC Diagnostics', height=400)
fig.show()
else:
fig = make_subplots(
rows=3, cols=2,
subplot_titles=(
'β₀ Trace', 'β₀ Posterior',
'β₁ (Precip) Trace', 'β₁ Posterior',
'β₂ (Temp) Trace', 'β₂ Posterior'
)
)
for i in range(3):
# Trace plot
fig.add_trace(
go.Scatter(
y=samples_beta[:, i],
mode='lines',
line=dict(color='steelblue', width=0.5),
showlegend=False
),
row=i+1, col=1
)
# Posterior histogram
fig.add_trace(
go.Histogram(
x=samples_beta[:, i],
nbinsx=50,
marker_color='coral',
showlegend=False
),
row=i+1, col=2
)
# Update axes
fig.update_xaxes(title_text='Iteration', row=i+1, col=1)
fig.update_xaxes(title_text=f'β{i}', row=i+1, col=2)
fig.update_yaxes(title_text=f'β{i}', row=i+1, col=1)
fig.update_yaxes(title_text='Frequency', row=i+1, col=2)
fig.update_layout(
title_text='MCMC Diagnostics: Coefficient Posteriors',
height=1000,
showlegend=False
)
fig.show()
```
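A "fuzzy caterpillar" trace can still hide strong autocorrelation, so it is worth reporting an effective sample size (ESS) alongside R̂. The sketch below is a simplified truncated-autocorrelation estimator (production code should use `arviz.ess`):

```python
import numpy as np

def effective_sample_size(x, max_lag=500):
    """Crude ESS: n / (1 + 2 * sum of positive autocorrelations)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()
    var = x.var()
    rho_sum = 0.0
    for lag in range(1, min(max_lag, n - 1)):
        rho = np.dot(x[:-lag], x[lag:]) / ((n - lag) * var)
        if rho < 0.05:  # truncate once the autocorrelation has died out
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)

rng = np.random.default_rng(1)
iid = rng.normal(size=10_000)
ar1 = np.zeros(10_000)                    # heavily autocorrelated chain
for t in range(1, len(ar1)):
    ar1[t] = 0.9 * ar1[t - 1] + rng.normal()
print(f"ESS (independent draws): {effective_sample_size(iid):.0f}")
print(f"ESS (AR(1), phi=0.9):    {effective_sample_size(ar1):.0f}")
```

Random-walk Metropolis chains like the one above typically have ESS far below the number of stored iterations; credible intervals computed from only a few hundred effective samples will be noticeably noisy.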
## Posterior Summary Statistics
```{python}
#| code-fold: true
# Initialize defaults for variables used later
beta_means = np.array([0.0, 0.0, 0.0])
beta_std = np.array([1.0, 1.0, 1.0])
beta_ci_lower = np.array([-1.0, -1.0, -1.0])
beta_ci_upper = np.array([1.0, 1.0, 1.0])
prob_precip_positive = 0.5
if not DATA_AVAILABLE:
print("⚠️ No data available for posterior summary")
else:
# Compute credible intervals
beta_means = samples_beta.mean(axis=0)
beta_std = samples_beta.std(axis=0)
beta_ci_lower = np.percentile(samples_beta, 2.5, axis=0)
beta_ci_upper = np.percentile(samples_beta, 97.5, axis=0)
posterior_summary = pd.DataFrame({
'Parameter': ['Intercept', 'Precip_cum_30d', 'Temperature'],
'Mean': beta_means,
'Std': beta_std,
'2.5%': beta_ci_lower,
'97.5%': beta_ci_upper
})
print("\nPosterior Summary:")
print(posterior_summary.to_string(index=False))
# Probability that precipitation coefficient is positive
prob_precip_positive = (samples_beta[:, 1] > 0).mean()
print(f"\nP(β_precip > 0) = {prob_precip_positive:.2%}")
```
## Bayesian Model Comparison
Compare models with different feature sets using Bayes factors:
```{python}
#| code-fold: true
def compute_marginal_likelihood(X, y):
"""
Compute marginal likelihood (evidence) for model comparison.
Using Bayesian Information Criterion (BIC) approximation.
"""
n, p = X.shape
# Fit model (frequentist for simplicity)
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
residuals = y - X @ beta_hat
rss = (residuals ** 2).sum()
    # BIC approximation to the log marginal likelihood:
    #   log ML ≈ max log-likelihood - (p/2) * log(n)
    # (the constant -n/2 term cancels in Bayes factors but is kept for completeness)
    log_ml = -n/2 * np.log(2 * np.pi) - n/2 * np.log(rss/n) - n/2 - p/2 * np.log(n)
return log_ml
# Initialize defaults for model_evidence (used in later visualizations)
model_evidence = {
'Full (Precip + Temp)': 0.0,
'Precip only': 0.0,
'Temp only': 0.0
}
full_model_ml = 0.0
if not DATA_AVAILABLE:
print("⚠️ No data available for Bayesian model comparison")
else:
# Compare models
models = {
'Full (Precip + Temp)': ['Precip', 'Temp'],
'Precip only': ['Precip'],
'Temp only': ['Temp']
}
model_evidence = {}
for model_name, features in models.items():
X_model = well_df[features].values
# Use a separate scaler for each model to avoid overwriting the main scaler_X
model_scaler = StandardScaler()
X_model_scaled = model_scaler.fit_transform(X_model)
X_model_design = np.column_stack([np.ones(len(X_model_scaled)), X_model_scaled])
log_ml = compute_marginal_likelihood(X_model_design, y_scaled)
model_evidence[model_name] = log_ml
# Bayes factors (relative to full model)
full_model_ml = model_evidence['Full (Precip + Temp)']
print("\nBayesian Model Comparison:")
for model_name, log_ml in model_evidence.items():
bayes_factor = np.exp(log_ml - full_model_ml)
print(f" {model_name}: log(ML)={log_ml:.1f}, BF={bayes_factor:.2f}")
print("\nInterpretation:")
print(" BF > 1: Model preferred over full model")
print(" BF < 1: Full model preferred")
```
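To make the printed Bayes factors easier to read, they can be mapped onto the conventional Kass and Raftery (1995) evidence scale. A small helper, assuming `log_bf` is the log Bayes factor of a candidate model against the full model:

```python
def interpret_log_bf(log_bf):
    """Kass & Raftery evidence category based on 2*|ln BF|."""
    two_ln_bf = 2 * abs(log_bf)
    if two_ln_bf < 2:
        strength = "barely worth mentioning"
    elif two_ln_bf < 6:
        strength = "positive"
    elif two_ln_bf < 10:
        strength = "strong"
    else:
        strength = "very strong"
    favored = "candidate model" if log_bf > 0 else "full model"
    return favored, strength

print(interpret_log_bf(-4.0))   # ('full model', 'strong')
print(interpret_log_bf(0.3))    # ('candidate model', 'barely worth mentioning')
```

A BF near 1 (log BF near 0) is not evidence *for* the simpler model; it means the data cannot distinguish the candidates.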
## Visualization 2: Model Comparison
```{python}
#| code-fold: true
if not DATA_AVAILABLE:
fig = go.Figure()
fig.add_annotation(
text="⚠️ No data available for model comparison",
xref="paper", yref="paper", x=0.5, y=0.5,
showarrow=False, font=dict(size=16)
)
fig.update_layout(title='Bayesian Model Comparison', height=400)
fig.show()
else:
fig = go.Figure()
models_list = list(model_evidence.keys())
log_ml_values = [model_evidence[m] for m in models_list]
bayes_factors = [np.exp(log_ml - full_model_ml) for log_ml in log_ml_values]
fig.add_trace(go.Bar(
x=models_list,
y=bayes_factors,
marker_color=['green' if bf >= 1 else 'red' for bf in bayes_factors],
text=[f"BF={bf:.2f}" for bf in bayes_factors],
textposition='outside'
))
fig.add_hline(y=1, line_dash='dash', line_color='black', annotation_text='BF=1 (equal)')
fig.update_layout(
title='Bayesian Model Comparison<br><sub>Bayes Factor relative to full model</sub>',
xaxis_title='Model',
yaxis_title='Bayes Factor',
yaxis_type='log',
height=500
)
fig.show()
```
## Probabilistic Forecasting
Use the posterior samples to generate probabilistic forecasts:
```{python}
#| code-fold: true
# Future scenario: Predict next 90 days using historical climatology
future_days = 90
# Initialize forecast variables with defaults (used in later visualizations)
forecast_samples = np.zeros((1, future_days))
forecast_mean = np.zeros(future_days)
forecast_p05 = np.zeros(future_days)
forecast_p25 = np.zeros(future_days)
forecast_p75 = np.zeros(future_days)
forecast_p95 = np.zeros(future_days)
if not DATA_AVAILABLE:
print("⚠️ No data available for probabilistic forecasting")
else:
# Use historical weather data to compute climatological normals for forecasting
# (Average weather for each day-of-year from historical record)
weather_daily['DayOfYear'] = pd.to_datetime(weather_daily['Date']).dt.dayofyear
climatology = weather_daily.groupby('DayOfYear').agg({
'Precip': 'mean',
'Temp': 'mean'
}).reset_index()
# Get day-of-year for forecast period (starting from last observation)
last_doy = well_df['DayOfYear'].iloc[-1]
    # Note the "- 1 ... + 1" so consecutive days map correctly onto 1..365
    future_doy = [(last_doy + i - 1) % 365 + 1 for i in range(1, future_days + 1)]
# Look up climatological values for forecast days
future_precip = np.array([climatology.loc[climatology['DayOfYear'] == doy, 'Precip'].values[0]
if doy in climatology['DayOfYear'].values else climatology['Precip'].mean()
for doy in future_doy])
future_temp = np.array([climatology.loc[climatology['DayOfYear'] == doy, 'Temp'].values[0]
if doy in climatology['DayOfYear'].values else climatology['Temp'].mean()
for doy in future_doy])
X_future = np.column_stack([future_precip, future_temp])
X_future_scaled = scaler_X.transform(X_future)
X_future_design = np.column_stack([np.ones(len(X_future_scaled)), X_future_scaled])
# Sample from posterior predictive distribution
n_posterior_samples = 1000
forecast_samples = []
for i in range(n_posterior_samples):
# Sample parameters from posterior
beta_sample = samples_beta[np.random.randint(len(samples_beta))]
sigma2_sample = samples_sigma2[np.random.randint(len(samples_sigma2))]
# Generate predictions
y_pred = X_future_design @ beta_sample + np.random.normal(0, np.sqrt(sigma2_sample), future_days)
# Convert back to original scale
y_pred_orig = scaler_y.inverse_transform(y_pred.reshape(-1, 1)).flatten()
forecast_samples.append(y_pred_orig)
forecast_samples = np.array(forecast_samples)
# Compute quantiles
forecast_mean = forecast_samples.mean(axis=0)
forecast_p05 = np.percentile(forecast_samples, 5, axis=0)
forecast_p25 = np.percentile(forecast_samples, 25, axis=0)
forecast_p75 = np.percentile(forecast_samples, 75, axis=0)
forecast_p95 = np.percentile(forecast_samples, 95, axis=0)
print(f"\nProbabilistic Forecast ({future_days} days ahead):")
print(f" Mean: {forecast_mean.mean():.2f} m")
print(f" 90% interval: [{forecast_p05.mean():.2f}, {forecast_p95.mean():.2f}] m")
```
## Visualization 3: Probabilistic Forecast
```{python}
#| code-fold: true
# Initialize future_dates for use in later blocks
future_dates = pd.date_range(start='2020-01-01', periods=future_days, freq='D')
if not DATA_AVAILABLE:
fig = go.Figure()
fig.add_annotation(
text="⚠️ No data available for probabilistic forecast visualization",
xref="paper", yref="paper", x=0.5, y=0.5,
showarrow=False, font=dict(size=16)
)
fig.update_layout(title='Bayesian Probabilistic Forecast', height=400)
fig.show()
else:
future_dates = pd.date_range(
start=well_df['DateTime'].iloc[-1],
periods=future_days+1,
freq='D'
)[1:] # Exclude first (overlaps with data)
fig = go.Figure()
# 90% interval
fig.add_trace(go.Scatter(
x=future_dates,
y=forecast_p95,
mode='lines',
line=dict(width=0),
showlegend=False,
hoverinfo='skip'
))
fig.add_trace(go.Scatter(
x=future_dates,
y=forecast_p05,
mode='lines',
line=dict(width=0),
fillcolor='rgba(70, 130, 180, 0.2)',
fill='tonexty',
name='90% Interval',
hoverinfo='skip'
))
# 50% interval
fig.add_trace(go.Scatter(
x=future_dates,
y=forecast_p75,
mode='lines',
line=dict(width=0),
showlegend=False,
hoverinfo='skip'
))
fig.add_trace(go.Scatter(
x=future_dates,
y=forecast_p25,
mode='lines',
line=dict(width=0),
fillcolor='rgba(70, 130, 180, 0.4)',
fill='tonexty',
name='50% Interval',
hoverinfo='skip'
))
# Mean forecast
fig.add_trace(go.Scatter(
x=future_dates,
y=forecast_mean,
mode='lines',
line=dict(color='blue', width=2),
name='Posterior Mean'
))
# Historical data (last 180 days for context)
hist_idx = max(0, len(well_df) - 180)
fig.add_trace(go.Scatter(
x=well_df['DateTime'].iloc[hist_idx:],
y=y[hist_idx:],
mode='lines',
line=dict(color='black', width=1),
name='Historical'
))
fig.update_layout(
title='Bayesian Probabilistic Forecast<br><sub>Uncertainty from parameter posterior + noise</sub>',
xaxis_title='Date',
yaxis_title='Water Level (m)',
height=600,
hovermode='x unified'
)
fig.show()
```
## Risk Assessment
Calculate the probability of crossing critical water-level thresholds:
```{python}
#| code-fold: true
# Initialize risk assessment variables with defaults
threshold_low = 0.0
threshold_high = 1.0
prob_below_low = np.zeros(future_days)
prob_above_high = np.zeros(future_days)
if not DATA_AVAILABLE:
print("⚠️ No data available for risk assessment")
else:
# Define critical thresholds
threshold_low = y.mean() - y.std() # Low water level (concern)
threshold_high = y.mean() + y.std() # High water level (flooding)
# Compute exceedance probabilities
prob_below_low = (forecast_samples < threshold_low).mean(axis=0)
prob_above_high = (forecast_samples > threshold_high).mean(axis=0)
print(f"\nRisk Assessment (90-day forecast):")
print(f" Probability of water level < {threshold_low:.2f} m: {prob_below_low.mean():.1%}")
print(f" Probability of water level > {threshold_high:.2f} m: {prob_above_high.mean():.1%}")
print(f" Days with >50% chance of low level: {(prob_below_low > 0.5).sum()}")
```
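For operational planning it often matters *when* the risk first becomes material, not only its 90-day average. A small helper (the names here are illustrative) that reports the first forecast day on which an exceedance probability crosses a decision level:

```python
import numpy as np

def first_alert_day(exceedance_prob, alert_level=0.5):
    """First 1-based forecast day with P(exceedance) above alert_level,
    or None if the level is never crossed."""
    idx = np.flatnonzero(np.asarray(exceedance_prob) > alert_level)
    return int(idx[0]) + 1 if idx.size else None

probs = [0.10, 0.30, 0.62, 0.71]
print(first_alert_day(probs))          # 3
print(first_alert_day([0.1, 0.2]))     # None
```

Applied to `prob_below_low` from the block above, this would convert the risk curve into a concrete lead time for management action.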
## Visualization 4: Risk Timeline
```{python}
#| code-fold: true
if not DATA_AVAILABLE:
fig = go.Figure()
fig.add_annotation(
text="⚠️ No data available for risk timeline visualization",
xref="paper", yref="paper", x=0.5, y=0.5,
showarrow=False, font=dict(size=16)
)
fig.update_layout(title='Bayesian Risk Assessment', height=400)
fig.show()
else:
fig = make_subplots(
rows=2, cols=1,
subplot_titles=('Exceedance Probabilities', 'Forecast Ensemble')
)
# Risk probabilities
fig.add_trace(
go.Scatter(
x=future_dates,
y=prob_below_low * 100,
mode='lines',
line=dict(color='red', width=2),
name='P(Low Water)',
fill='tozeroy',
fillcolor='rgba(255, 0, 0, 0.2)'
),
row=1, col=1
)
fig.add_trace(
go.Scatter(
x=future_dates,
y=prob_above_high * 100,
mode='lines',
line=dict(color='blue', width=2),
name='P(High Water)',
fill='tozeroy',
fillcolor='rgba(0, 0, 255, 0.2)'
),
row=1, col=1
)
# Threshold line
fig.add_hline(y=50, line_dash='dash', line_color='black', row=1, col=1)
# Ensemble forecast (sample 50 trajectories)
for i in range(min(50, len(forecast_samples))):
fig.add_trace(
go.Scatter(
x=future_dates,
y=forecast_samples[i],
mode='lines',
line=dict(color='gray', width=0.5),
opacity=0.1,
showlegend=False,
hoverinfo='skip'
),
row=2, col=1
)
# Mean
fig.add_trace(
go.Scatter(
x=future_dates,
y=forecast_mean,
mode='lines',
line=dict(color='blue', width=3),
name='Mean Forecast'
),
row=2, col=1
)
# Thresholds
fig.add_hline(y=threshold_low, line_dash='dash', line_color='red',
annotation_text='Low Threshold', row=2, col=1)
fig.add_hline(y=threshold_high, line_dash='dash', line_color='blue',
annotation_text='High Threshold', row=2, col=1)
fig.update_xaxes(title_text='Date', row=2, col=1)
fig.update_yaxes(title_text='Probability (%)', row=1, col=1)
fig.update_yaxes(title_text='Water Level (m)', row=2, col=1)
fig.update_layout(
title_text='Bayesian Risk Assessment',
height=800,
hovermode='x unified'
)
fig.show()
```
## Key Insights
::: {.callout-important icon=false}
## 🔍 Bayesian Uncertainty Findings
**Parameter Uncertainty:**
- **Precipitation effect**: β = `{python} f"{beta_means[1]:.3f} ± {beta_std[1]:.3f}"` (95% CI: `{python} f"[{beta_ci_lower[1]:.3f}, {beta_ci_upper[1]:.3f}]"`)
- **Temperature effect**: β = `{python} f"{beta_means[2]:.3f} ± {beta_std[2]:.3f}"` (95% CI: `{python} f"[{beta_ci_lower[2]:.3f}, {beta_ci_upper[2]:.3f}]"`)
- **Precipitation positive with `{python} f"{prob_precip_positive:.0%}"` probability**
**Model Selection:**
- **Best model**: `{python} max(model_evidence, key=model_evidence.get)`
- **Evidence strength**: BF ≈ `{python} f"{max(np.exp(log_ml - full_model_ml) for log_ml in model_evidence.values()):.1f}"` relative to the full model
**Forecast Uncertainty:**
- **Mean prediction**: `{python} f"{forecast_mean.mean():.2f}"` m
- **90% credible interval width**: `{python} f"{forecast_p95.mean() - forecast_p05.mean():.2f}"` m
- **Risk**: `{python} f"{prob_below_low.mean()*100:.1f}"`% chance of low water in next 90 days
:::
## Advantages of Bayesian Approach
```{python}
#| code-fold: true
print("\n=== Bayesian vs Frequentist ===")
print("\nFrequentist (Classical):")
print(" ✓ Point estimates (single best value)")
print(" ✓ Confidence intervals (hypothetical repeated sampling)")
print(" ✗ No probability statements about parameters")
print(" ✗ Cannot incorporate prior knowledge")
print("\nBayesian:")
print(" ✓ Full posterior distribution (all plausible values)")
print(" ✓ Credible intervals (directly interpretable probability)")
print(f" ✓ Probability statements: P(β > 0) = {prob_precip_positive:.0%}")
print(" ✓ Sequential learning (update as new data arrive)")
print(" ✓ Risk quantification: P(water level < threshold)")
```
## Limitations
1. **Computational cost**: MCMC requires many iterations
2. **Prior sensitivity**: Results depend on prior choice (use weakly informative priors)
3. **Model complexity**: More complex models (nonlinear, hierarchical) require advanced MCMC
4. **Convergence**: Need to check diagnostics (trace plots, Gelman-Rubin statistic)
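Limitation 2 can be probed directly: refit the model under several prior scales and check whether the conclusions move. A minimal conjugate-normal sketch on synthetic data (known noise variance, single coefficient; all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 0.8 * x + rng.normal(size=200)          # true slope 0.8, sigma = 1

ols = (x @ y) / (x @ x)                     # flat-prior (least squares) estimate
for prior_var in [0.01, 0.1, 1.0, 100.0]:
    # Conjugate posterior for the slope with a N(0, prior_var) prior:
    post_prec = x @ x + 1.0 / prior_var     # posterior precision
    post_mean = (x @ y) / post_prec         # shrinks toward the prior mean 0
    print(f"prior var {prior_var:7.2f} -> posterior mean {post_mean:.3f}")
print(f"OLS (flat prior)       -> {ols:.3f}")
```

If the posterior means barely move across reasonable prior scales, the prior is effectively weakly informative for this dataset; large swings signal that the data are not strong enough to overrule the prior.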
## References
- Gelman, A., et al. (2013). *Bayesian Data Analysis* (3rd ed.). CRC Press.
- Kruschke, J. K. (2014). *Doing Bayesian Data Analysis*. Academic Press.
- Cooley, D., & Sain, S. R. (2010). Spatial hierarchical modeling of precipitation extremes. *Journal of Agricultural, Biological, and Environmental Statistics*, 15(3), 381-402.
## Summary
Bayesian uncertainty modeling provides **rigorous probabilistic inference** for aquifer management:
✅ **Full posterior distributions** - Not just point estimates, but probability distributions over parameters
✅ **Credible intervals** - 95% intervals directly interpretable as "95% probability parameter lies here"
✅ **Risk quantification** - P(water level < threshold) enables proactive management
✅ **Prior incorporation** - Leverage domain knowledge (e.g., K must be positive)
✅ **Model comparison** - Bayes factors objectively compare competing hypotheses
**Key Insight:** Uncertainty is not a weakness—it's essential information for decision-making. A wide credible interval says "collect more data before drilling."
---
## Reflection Questions
- When communicating with decision makers, how would you explain the difference between a single deterministic forecast and a Bayesian forecast that includes credible intervals and explicit risk estimates?
- Looking at the posterior summaries and risk plots, what additional monitoring or data (for example, better precipitation records, pumping logs, or structural constraints) would most reduce the uncertainty that matters for your decisions?
- How would you decide whether the priors you use are appropriately “weakly informative” versus overly restrictive or too vague for your aquifer context?
- In which situations would you rely on Bayesian results to support a high‑stakes decision (for example, major infrastructure or policy changes), and when would you treat them as exploratory and push for more data first?
---
## Related Chapters
- [Value of Information](value-of-information.qmd) - Economic benefit of uncertainty reduction
- [Temporal Fusion Engine](temporal-fusion-engine.qmd) - Predictions requiring uncertainty
- [Explainable AI Insights](../part-5-operations/explainable-ai-insights.qmd) - Interpretability