37  Causal Discovery Network

Finding True Cause-Effect Relationships

Tip For Newcomers (Advanced Chapter)

This is an advanced chapter. It is safe to skip on first reading.

You will get (if you continue):

  • An intuitive idea of what it means to say “X causes Y” in data.
  • A map-like view of how weather, streams, and groundwater influence each other.

Read these first:

  • At least one temporal chapter (e.g., Water Level Trends).
  • A fusion chapter such as Water Balance Closure.

You can skim formal causal discovery algorithms and focus on the diagrams and narrative explaining which variables influence each other and with what time lags.

Data Sources Fused: All 4 (with causal inference)

37.1 What You Will Learn in This Chapter

By the end of this chapter, you will be able to:

  • Explain in plain language what it means to infer a causal graph from multi-source time series and how this differs from correlation analysis.
  • Interpret simple causal diagrams and strength rankings that connect precipitation, temperature, groundwater levels, and streamflow.
  • Describe the main assumptions and limitations of Granger causality, transfer entropy, and VAR-based impulse response analysis in a hydrologic context.
  • Reflect on when causal discovery outputs are useful for generating hypotheses versus when they are strong enough to inform management decisions.

37.2 Overview

Correlation is not causation. The previous chapter showed that HTEM, weather, and streams correlate with water levels. But which variables cause changes? This chapter applies causal discovery methods to infer the directed graph of cause-effect relationships in the aquifer system.

Note💻 For Computer Scientists

Problem: Given multivariate time series, find the causal graph (DAG).

Methods:

  • Granger Causality: X Granger-causes Y if past values of X improve prediction of Y
  • Transfer Entropy: Information-theoretic measure of directed information flow
  • PC Algorithm: Constraint-based causal discovery using conditional independence tests
  • VAR Models: Vector autoregression to quantify lagged causal effects

Challenge: Distinguishing causation from confounding and indirect effects.
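The confounding problem can be demonstrated in a few lines. Here a hidden driver Z (a synthetic stand-in for, say, a shared seasonal signal) makes two otherwise unrelated series correlate strongly, and the correlation vanishes once Z is controlled for. This is an illustrative sketch, independent of the chapter's dataset; all variable names are invented.

```python
# Tiny illustration of the confounding challenge: a hidden driver Z
# makes X and Y correlate with no direct causal link between them.
import numpy as np

rng = np.random.default_rng(42)
z = rng.normal(size=2000)                  # hidden confounder
x = 0.9 * z + 0.1 * rng.normal(size=2000)  # X driven by Z
y = 0.9 * z + 0.1 * rng.normal(size=2000)  # Y driven by Z

r_xy = np.corrcoef(x, y)[0, 1]

# Partial correlation: regress Z out of both series, correlate residuals
rx = x - np.polyval(np.polyfit(z, x, 1), z)
ry = y - np.polyval(np.polyfit(z, y, 1), z)
r_xy_given_z = np.corrcoef(rx, ry)[0, 1]

print(f"corr(X, Y)     = {r_xy:.2f}")           # strong spurious correlation
print(f"corr(X, Y | Z) = {r_xy_given_z:.2f}")   # near zero once Z is controlled
```

A naive pairwise analysis would report a strong X-Y link; conditioning on the confounder removes it, which is exactly what constraint-based methods like the PC algorithm exploit.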

Tip🌍 For Hydrologists

Expected Causal Structure:

Precipitation → Groundwater Level
      ↓
Stream Discharge ← Groundwater Level
      ↑
   Snowmelt

Key Questions: 1. Does precipitation directly cause water level changes, or is it mediated by soil moisture? 2. Do streams cause aquifer changes (recharge), or vice versa (baseflow)? 3. Is HTEM structure a confounder (affects both weather response and baseline levels)?

Physical constraints: Causation must respect temporal order and physical mechanisms.

37.3 Analysis Approach

Show code
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import warnings
warnings.filterwarnings('ignore')

# Conditional imports for optional dependencies
try:
    from scipy.stats import entropy
    SCIPY_AVAILABLE = True
except ImportError:
    SCIPY_AVAILABLE = False
    # Fallback entropy calculation
    def entropy(pk, qk=None, base=None):
        pk = np.asarray(pk)
        pk = pk / pk.sum()
        if base is None:
            base = np.e
        return -np.sum(pk * np.log(pk + 1e-10)) / np.log(base)

try:
    from statsmodels.tsa.api import VAR
    STATSMODELS_AVAILABLE = True
except ImportError:
    STATSMODELS_AVAILABLE = False
    print("Note: statsmodels not available. VAR analysis will be limited.")

print("Causal discovery utilities initialized (data will be defined in subsequent cells)")

37.4 Data Preparation

Show code
from pathlib import Path
import sys
import os

# Add project root to path for imports
def find_repo_root(start: Path) -> Path:
    for candidate in [start, *start.parents]:
        if (candidate / "src").exists():
            return candidate
    return start

quarto_project = Path(os.environ.get("QUARTO_PROJECT_DIR", str(Path.cwd())))
project_root = find_repo_root(quarto_project)
if str(project_root) not in sys.path:
    sys.path.append(str(project_root))

from src.utils import get_data_path

# Build real multi-source daily dataset
aquifer_db_path = get_data_path("aquifer_db")
weather_db_path = get_data_path("warm_db")
usgs_stream_path = get_data_path("usgs_stream")

import sqlite3

# CRITICAL: Load each data source INDEPENDENTLY to avoid encoding fake relationships
# We let the causal discovery algorithms find true relationships from real data

# 1. Load precipitation and temperature from weather database
# Use WarmICNData (hourly) table - WarmICNFiveMin uses different column names
DATA_AVAILABLE = True
conn_weather = sqlite3.connect(weather_db_path)

# First check what tables exist
try:
    tables = pd.read_sql_query(
        "SELECT name FROM sqlite_master WHERE type='table' AND name LIKE 'Warm%'",
        conn_weather
    )
    print(f"Available weather tables: {tables['name'].tolist()}")
except Exception as e:
    print(f"⚠️ Could not query weather database: {e}")
    DATA_AVAILABLE = False

weather_query = """
SELECT nDateTime, nPrecipHrly as nPrecip, nAirTemp
FROM WarmICNData
WHERE nPrecipHrly IS NOT NULL
AND nAirTemp IS NOT NULL
LIMIT 500000
"""
try:
    weather_df = pd.read_sql_query(weather_query, conn_weather)
    if len(weather_df) == 0:
        print("⚠️ Weather query returned no data - trying WarmICNFiveMin...")
        # Fallback to WarmICNFiveMin which has nPrecip directly
        weather_query_alt = """
        SELECT nDateTime, nPrecip, nAirTemp
        FROM WarmICNFiveMin
        WHERE nPrecip IS NOT NULL
        AND nAirTemp IS NOT NULL
        LIMIT 500000
        """
        weather_df = pd.read_sql_query(weather_query_alt, conn_weather)
except Exception as e:
    print(f"⚠️ Weather data load failed: {e}")
    weather_df = pd.DataFrame(columns=['nDateTime', 'nPrecip', 'nAirTemp'])
    DATA_AVAILABLE = False

conn_weather.close()

if len(weather_df) == 0:
    print("⚠️ No weather data available for causal discovery analysis")
    DATA_AVAILABLE = False

weather_df['Date'] = pd.to_datetime(weather_df['nDateTime'], errors='coerce')
weather_df = weather_df.dropna(subset=['Date'])

# Aggregate to daily (sum precipitation, mean temperature)
weather_daily = weather_df.groupby(weather_df['Date'].dt.date).agg({
    'nPrecip': 'sum',
    'nAirTemp': 'mean'
}).reset_index()
weather_daily.columns = ['Date', 'Precipitation', 'Temperature']
weather_daily['Date'] = pd.to_datetime(weather_daily['Date'])

# 2. Load water levels from groundwater database
conn_gw = sqlite3.connect(aquifer_db_path)
gw_query = """
SELECT TIMESTAMP, Water_Surface_Elevation
FROM OB_WELL_MEASUREMENTS_CHAMPAIGN_COUNTY
WHERE Water_Surface_Elevation IS NOT NULL
AND TIMESTAMP IS NOT NULL
"""
gw_df = pd.read_sql_query(gw_query, conn_gw)
conn_gw.close()

gw_df['Date'] = pd.to_datetime(gw_df['TIMESTAMP'], format='%m/%d/%Y', errors='coerce')
gw_df = gw_df.dropna(subset=['Date'])

# Aggregate to daily mean water level
gw_daily = gw_df.groupby(gw_df['Date'].dt.date).agg({
    'Water_Surface_Elevation': 'mean'
}).reset_index()
gw_daily.columns = ['Date', 'WaterLevel']
gw_daily['Date'] = pd.to_datetime(gw_daily['Date'])

# 3. Load stream discharge from USGS data
# Files are in usgs_stream/daily_values/ subdirectory
usgs_daily_path = Path(usgs_stream_path) / "daily_values"
if not usgs_daily_path.exists():
    usgs_daily_path = Path(usgs_stream_path)  # Fallback to root

# Look for discharge files (parameter code 00060 = discharge in cfs)
usgs_files = list(usgs_daily_path.glob("*_00060_daily.csv"))

if len(usgs_files) == 0:
    print(f"Warning: No USGS discharge CSV files found in {usgs_daily_path}")
    stream_daily = pd.DataFrame(columns=['Date', 'StreamDischarge'])
else:
    # Use Salt Fork near St. Joseph (good long-term record) - site 03336900
    preferred_file = None
    for f in usgs_files:
        if '03336900' in f.name:
            preferred_file = f
            break
    if preferred_file is None:
        preferred_file = usgs_files[0]

    print(f"Loading stream data from: {preferred_file.name}")
    stream_df = pd.read_csv(preferred_file)

    # USGS files have datetime column and value columns like '48097_00060_00003'
    # Find the discharge value column (contains '00060' but not '_cd')
    discharge_col = None
    for col in stream_df.columns:
        if '00060' in col and '_cd' not in col and col != 'site_no' and col != 'agency_cd':
            # Check if this is the value column (not qualifier)
            if stream_df[col].dtype in ['float64', 'int64'] or pd.api.types.is_numeric_dtype(stream_df[col]):
                discharge_col = col
                break
            # Try converting
            try:
                pd.to_numeric(stream_df[col], errors='raise')
                discharge_col = col
                break
            except Exception:
                continue

    if discharge_col is None:
        # Fallback: find any numeric column that's not metadata
        for col in stream_df.columns:
            if col not in ['datetime', 'site_no', 'agency_cd'] and '_cd' not in col:
                try:
                    test = pd.to_numeric(stream_df[col], errors='coerce')
                    if test.notna().sum() > 100:
                        discharge_col = col
                        break
                except Exception:
                    continue

    if 'datetime' in stream_df.columns and discharge_col:
        stream_df['Date'] = pd.to_datetime(stream_df['datetime'], errors='coerce')
        stream_df['StreamDischarge'] = pd.to_numeric(stream_df[discharge_col], errors='coerce')

        stream_daily = stream_df[['Date', 'StreamDischarge']].dropna()
        stream_daily = stream_daily.groupby(stream_daily['Date'].dt.date).agg({
            'StreamDischarge': 'mean'
        }).reset_index()
        stream_daily.columns = ['Date', 'StreamDischarge']
        stream_daily['Date'] = pd.to_datetime(stream_daily['Date'])
        print(f"  Loaded {len(stream_daily)} days of discharge data")
    else:
        print(f"Warning: Could not find date/discharge columns. Cols: {stream_df.columns.tolist()}")
        stream_daily = pd.DataFrame(columns=['Date', 'StreamDischarge'])

# 4. Merge all sources on Date (inner join to get complete records)
# Ensure all Date columns are datetime64 type before merge
try:
    weather_daily['Date'] = pd.to_datetime(weather_daily['Date'])
    gw_daily['Date'] = pd.to_datetime(gw_daily['Date'])
    if len(stream_daily) > 0:
        stream_daily['Date'] = pd.to_datetime(stream_daily['Date'])

    causal_df = weather_daily.merge(gw_daily, on='Date', how='inner')
    if len(stream_daily) > 0:
        causal_df = causal_df.merge(stream_daily, on='Date', how='inner')
    else:
        # No stream data - add empty column
        causal_df['StreamDischarge'] = np.nan
        print("⚠️ No stream data available - StreamDischarge will be excluded from causal analysis")
except Exception as e:
    print(f"⚠️ Merge failed: {e}")
    causal_df = pd.DataFrame()

# Drop rows with any missing values
causal_df = causal_df.dropna()

# Define variables list (always needed for reference in subsequent code)
variables = ['Precipitation', 'Temperature', 'StreamDischarge', 'WaterLevel']

# Check if we have sufficient data for causal analysis
if len(causal_df) == 0:
    print("⚠️ ERROR: No overlapping dates found between weather, groundwater, and stream data.")
    print("Causal discovery requires temporal overlap across all three sources.")
    print("Check data availability and date ranges.")
    print("\n⚠️ All subsequent causal analysis will be skipped.")
    DATA_AVAILABLE = False
else:
    DATA_AVAILABLE = True
    # Standardize variables for causal analysis
    from sklearn.preprocessing import StandardScaler
    scaler = StandardScaler()
    causal_df[variables] = scaler.fit_transform(causal_df[variables])

    print(f"Dataset ready: {len(causal_df)} days with complete records")
    print(f"Date range: {causal_df['Date'].min()} to {causal_df['Date'].max()}")
    print("Note: Data loaded independently from separate sources - no causal relationships pre-encoded")
Available weather tables: ['WarmICNFiveMin', 'WarmStationLookup', 'WarmPestStnData', 'WarmICNData', 'WarmICNDaily', 'WarmHlyHist']
Loading stream data from: 03336900_00060_daily.csv
  Loaded 19839 days of discharge data
Dataset ready: 4702 days with complete records
Date range: 2010-07-16 00:00:00 to 2023-06-02 00:00:00
Note: Data loaded independently from separate sources - no causal relationships pre-encoded

37.5 Method 1: Granger Causality

Note📘 Understanding Granger Causality

37.5.1 What Is It?

Granger causality is a statistical hypothesis test introduced by Nobel laureate Clive Granger in 1969 (Nobel Prize 2003 for contributions to time series analysis). It provides an operational definition of causation: X “Granger-causes” Y if past values of X contain information that helps predict Y beyond what Y’s own history provides.

Historical Context: Developed in econometrics to test whether one economic time series (like money supply) could predict another (like inflation). Named after economist Clive W.J. Granger, who formalized the concept in his 1969 Econometrica paper “Investigating Causal Relations by Econometric Models and Cross-Spectral Methods.”


37.5.3 Why Does It Matter for Aquifer Analysis?

In hydrogeology, we observe correlations everywhere—precipitation and water levels both increase in spring, for example. But correlation doesn’t tell us which variable drives which. Granger causality tests directional relationships:

  • Does precipitation cause water level changes? (recharge hypothesis)
  • Or do water levels cause precipitation? (impossible physically—likely spurious)
  • Or is there bidirectional causation? (feedback loop between aquifer and climate)

37.5.4 How Does It Work?

The test compares two regression models:

Model 1 (Restricted): Predict Y using only its own past values

Y(t) = α₀ + α₁Y(t-1) + α₂Y(t-2) + ... + error

Model 2 (Unrestricted): Predict Y using its past AND X’s past

Y(t) = β₀ + β₁Y(t-1) + β₂Y(t-2) + ... + γ₁X(t-1) + γ₂X(t-2) + ... + error

If Model 2 predicts significantly better (lower error), X “Granger-causes” Y.
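The restricted-vs-unrestricted comparison can be sketched directly with least squares on synthetic data where X drives Y at lag 1. This is illustrative only (all names here are invented); the chapter's actual analysis uses statsmodels' `grangercausalitytests`.

```python
# Minimal Granger-style F-test: does adding X's past reduce prediction error?
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    # Y depends on its own past AND on X's past (true causal link X -> Y)
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal(scale=0.1)

Y = y[1:]
R = np.column_stack([np.ones(n - 1), y[:-1]])   # Model 1: Y(t) ~ Y(t-1)
U = np.column_stack([R, x[:-1]])                # Model 2: adds X(t-1)

def rss(A):
    beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return np.sum((Y - A @ beta) ** 2)

rss_r, rss_u = rss(R), rss(U)

# F-statistic for the one added regressor
f_stat = ((rss_r - rss_u) / 1) / (rss_u / (len(Y) - U.shape[1]))
print(f"F-statistic = {f_stat:.1f}")  # large F -> X Granger-causes Y
```

Because the unrestricted model nests the restricted one, its residual error can only decrease; the F-test asks whether the decrease is larger than chance would allow.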

37.5.5 What Will You See Below?

We’ll compute a Granger causality matrix showing which variables influence which others. The p-values tell us statistical confidence in each relationship.

37.5.6 How to Interpret P-Values

| P-Value  | Interpretation         | Confidence in Causation                |
|----------|------------------------|----------------------------------------|
| p < 0.01 | Highly significant     | Very strong evidence X causes Y        |
| p < 0.05 | Significant            | Strong evidence (standard threshold)   |
| p < 0.10 | Marginally significant | Moderate evidence—investigate further  |
| p > 0.10 | Not significant        | Weak/no evidence of causation          |

Critical caveat: Granger causality is not true causation—it’s predictive causality. X can Granger-cause Y even if both are driven by a hidden third variable Z.


Granger causality tests if past values of X improve prediction of Y beyond Y’s own past.

Show code
from statsmodels.tsa.stattools import grangercausalitytests

def granger_causality_matrix(data, variables, max_lag=30, alpha=0.05):
    """
    Compute Granger causality matrix.

    Returns:
        DataFrame where (i,j) = 1 if variable i Granger-causes variable j
    """
    n = len(variables)
    causality_matrix = np.zeros((n, n))
    p_value_matrix = np.zeros((n, n))

    for i, cause_var in enumerate(variables):
        for j, effect_var in enumerate(variables):
            if i == j:
                continue  # Skip self-causation

            try:
                # Prepare data
                test_data = data[[effect_var, cause_var]].dropna()

                if len(test_data) < 100:
                    continue

                # Granger causality test
                gc_result = grangercausalitytests(test_data, maxlag=max_lag, verbose=False)

                # Get minimum p-value across lags
                p_values = [gc_result[lag][0]['ssr_ftest'][1] for lag in range(1, max_lag+1)]
                min_p = min(p_values)

                p_value_matrix[i, j] = min_p

                if min_p < alpha:
                    causality_matrix[i, j] = 1  # Significant causation

            except Exception as e:
                continue

    return pd.DataFrame(causality_matrix, index=variables, columns=variables), \
           pd.DataFrame(p_value_matrix, index=variables, columns=variables)

# Compute Granger causality
if DATA_AVAILABLE:
    gc_matrix, gc_pvalues = granger_causality_matrix(causal_df, variables, max_lag=30)

    print("\nGranger Causality Matrix (1 = X causes Y):")
    print(gc_matrix)

    print("\nP-values:")
    print(gc_pvalues.map(lambda x: f"{x:.4f}" if x > 0 else "N/A"))
else:
    gc_matrix = None
    gc_pvalues = None
    print("⚠️ GRANGER CAUSALITY ANALYSIS SKIPPED")
    print("")
    print("📋 WHAT THIS ANALYSIS DOES:")
    print("   Tests if past values of X help predict Y beyond Y's own history")
    print("   Example: Does precipitation history help predict water levels?")
    print("")
    print("🔧 REQUIRED DATA:")
    print("   • Weather data: precipitation, temperature (from data/warm.db)")
    print("   • Groundwater levels (from data/aquifer.db)")
    print("   • Stream discharge (from data/usgs_stream/)")
    print("   • Minimum 100+ overlapping daily observations across all sources")

Granger Causality Matrix (1 = X causes Y):
                 Precipitation  Temperature  StreamDischarge  WaterLevel
Precipitation              0.0          0.0              1.0         0.0
Temperature                0.0          0.0              1.0         1.0
StreamDischarge            0.0          0.0              0.0         0.0
WaterLevel                 0.0          1.0              0.0         0.0

P-values:
                Precipitation Temperature StreamDischarge WaterLevel
Precipitation             N/A      0.8451          0.0000     0.8837
Temperature            0.8698         N/A          0.0003     0.0000
StreamDischarge        0.4224      0.0659             N/A     0.0920
WaterLevel             0.2080      0.0392          0.3951        N/A

37.6 Method 2: Transfer Entropy

Note📘 Understanding Transfer Entropy

37.6.1 What Is It?

Transfer entropy is an information-theoretic measure of directed information flow introduced by Thomas Schreiber in 2000. It quantifies how much information the past of variable X provides about the future of variable Y, beyond what Y’s own past already provides.

Historical Context: Published in Physical Review Letters (2000), transfer entropy extends mutual information to capture time-directed relationships. Unlike correlation (which is symmetric), transfer entropy is asymmetric—TE(X→Y) ≠ TE(Y→X)—making it ideal for detecting causal directionality.

37.6.2 Why Does It Matter for Aquifer Analysis?

Transfer entropy reveals:

  • Directional coupling: Does precipitation drive water levels, or vice versa?
  • Nonlinear relationships: Captures dependencies that linear Granger causality might miss
  • Information pathways: Quantifies how climate signals propagate through aquifer-stream networks

Unlike Granger causality (which assumes linear relationships), transfer entropy detects any type of predictive relationship, making it robust for complex hydrological systems.

37.6.3 How Does It Work?

Transfer entropy uses Shannon entropy (information content) to measure directed predictability:

Step 1: Measure uncertainty about Y’s future given only Y’s past:

H(Y_future | Y_past) = "How uncertain am I about tomorrow's water level, knowing only past water levels?"

Step 2: Measure uncertainty about Y’s future given BOTH Y’s past AND X’s past:

H(Y_future | Y_past, X_past) = "How uncertain am I about tomorrow's water level, knowing past water levels AND past precipitation?"

Step 3: Transfer entropy is the reduction in uncertainty:

TE(X→Y) = H(Y_future | Y_past) - H(Y_future | Y_past, X_past)

If TE(X→Y) > 0, then X’s past reduces uncertainty about Y’s future—evidence of information flow X→Y.
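The three-step recipe above can be verified on a toy case where Y simply copies X with a one-step delay, so X's past should carry exactly one bit about Y's future. This sketch is illustrative (invented variable names, binary data); the chapter's own estimator for continuous data appears in the next code cell.

```python
# Worked transfer-entropy example on binary series where Y(t) = X(t-1)
import numpy as np
from collections import Counter

rng = np.random.default_rng(1)
x = rng.integers(0, 2, 5000)   # fair coin flips
y = np.roll(x, 1)              # Y copies X with lag 1: perfect flow X -> Y
y[0] = 0

def H(*cols):
    """Joint Shannon entropy (bits) of one or more discrete columns."""
    counts = Counter(zip(*cols))
    p = np.array(list(counts.values()), dtype=float)
    p /= p.sum()
    return -np.sum(p * np.log2(p))

yt, yp, xp = y[1:], y[:-1], x[:-1]

# TE(X->Y) = H(Y_t | Y_past) - H(Y_t | Y_past, X_past),
# with each conditional entropy expanded as a difference of joint entropies
te = (H(yt, yp) - H(yp)) - (H(yt, yp, xp) - H(yp, xp))
print(f"TE(X->Y) = {te:.2f} bits")  # ~1 bit: X's past fully determines Y's next value
```

Knowing Y's own past tells you nothing here (the flips are independent), while knowing X's past removes all uncertainty, so the transfer entropy is close to its 1-bit maximum.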

37.6.4 What Will You See Below?

A transfer entropy matrix showing information flow strength (in bits) between all variable pairs. Higher values = stronger directed information transfer.

37.6.5 How to Interpret Results

| Transfer Entropy (bits) | Interpretation          | Hydrological Meaning                                   |
|-------------------------|-------------------------|--------------------------------------------------------|
| TE > 0.5                | Strong information flow | X strongly predicts Y (e.g., precipitation → water levels) |
| 0.1 < TE < 0.5          | Moderate flow           | Detectable but weak coupling                           |
| TE < 0.1                | Weak/no flow            | Variables largely independent                          |
| TE(X→Y) >> TE(Y→X)      | Directional causation   | X drives Y, not reverse                                |
| TE(X→Y) ≈ TE(Y→X)       | Bidirectional coupling  | Mutual feedback (e.g., stream-aquifer exchange)        |

Management Insight: High transfer entropy from weather → water levels justifies investment in weather monitoring for aquifer forecasting.

Transfer entropy provides an information-theoretic measure of directed information flow.

Show code
def transfer_entropy(source, target, lag=1, bins=10):
    """
    Calculate transfer entropy from source to target.

    TE(X→Y) = H(Y_t | Y_t-1) - H(Y_t | Y_t-1, X_t-lag)

    Where H is Shannon entropy. Measures information gain about Y's future
    from knowing X's past.
    """
    # Discretize continuous variables into bins
    source_binned = np.array(pd.cut(source, bins=bins, labels=False))
    target_binned = np.array(pd.cut(target, bins=bins, labels=False))

    # Shift for lagged values (arrays, not Series)
    target_t = target_binned[lag:]
    target_t1 = target_binned[lag-1:-1] if lag > 0 else target_binned[:-1]
    source_t_lag = source_binned[:-lag] if lag > 0 else source_binned

    # Joint probabilities
    def joint_entropy(*arrays):
        # Combine arrays into single state space
        combined = np.column_stack(arrays)
        # Count unique combinations
        unique, counts = np.unique(combined, axis=0, return_counts=True)
        probs = counts / counts.sum()
        return entropy(probs, base=2)

    # H(Y_t | Y_t-1)
    h_target_cond_past = joint_entropy(target_t, target_t1) - joint_entropy(target_t1)

    # H(Y_t | Y_t-1, X_t-lag)
    h_target_cond_past_source = (
        joint_entropy(target_t, target_t1, source_t_lag) -
        joint_entropy(target_t1, source_t_lag)
    )

    # Transfer entropy
    te = h_target_cond_past - h_target_cond_past_source

    return max(0, te)  # Should be non-negative

def transfer_entropy_matrix(data, variables, lags=[1, 7, 14, 30]):
    """Compute TE for multiple lags and take maximum."""
    n = len(variables)
    te_matrix = np.zeros((n, n))

    for i, source_var in enumerate(variables):
        for j, target_var in enumerate(variables):
            if i == j:
                continue

            source = data[source_var].dropna().values
            target = data[target_var].dropna().values

            # Align lengths
            min_len = min(len(source), len(target))
            source = source[:min_len]
            target = target[:min_len]

            # Compute TE for different lags, take maximum
            te_vals = []
            for lag in lags:
                if len(source) > lag:
                    te = transfer_entropy(source, target, lag=lag, bins=8)
                    te_vals.append(te)

            te_matrix[i, j] = max(te_vals) if te_vals else 0

    return pd.DataFrame(te_matrix, index=variables, columns=variables)

# Compute transfer entropy (only when the merged dataset exists)
if DATA_AVAILABLE:
    print("\nComputing transfer entropy (this may take a moment)...")
    te_matrix = transfer_entropy_matrix(causal_df, variables, lags=[1, 7, 14, 30])

    print("\nTransfer Entropy Matrix (bits):")
    print(te_matrix.round(4))
else:
    te_matrix = None
    print("⚠️ TRANSFER ENTROPY ANALYSIS SKIPPED - requires overlapping multi-source data")

Computing transfer entropy (this may take a moment)...

Transfer Entropy Matrix (bits):
                 Precipitation  Temperature  StreamDischarge  WaterLevel
Precipitation           0.0000       0.0017           0.0003      0.0001
Temperature             0.0000       0.0000           0.0021      0.0034
StreamDischarge         0.0001       0.0013           0.0000      0.0060
WaterLevel              0.0008       0.0099           0.0187      0.0000

37.7 Causal Graph Construction

Show code
# Initialize causal_graph to None for downstream code blocks
causal_graph = None

try:
    import networkx as nx
    NETWORKX_AVAILABLE = True
except ImportError:
    NETWORKX_AVAILABLE = False
    print("Note: networkx not available. Install with: pip install networkx")

# Check if data is available
if gc_matrix is None or te_matrix is None:
    print("⚠️ CAUSAL GRAPH CONSTRUCTION SKIPPED")
    print("   Missing: Granger causality and/or Transfer Entropy results")
    print("   Reason: Previous analyses require overlapping weather + groundwater + stream data")
    NETWORKX_AVAILABLE = False

# Combine evidence from both methods
if NETWORKX_AVAILABLE:
    def build_causal_graph(gc_matrix, te_matrix, gc_threshold=0.5, te_threshold=0.05):
        """
        Build causal graph from Granger and TE evidence.

        Include edge X→Y if either:
        1. Granger causality significant (gc_matrix[X,Y] = 1)
        2. Transfer entropy exceeds threshold
        """
        variables = gc_matrix.index.tolist()
        G = nx.DiGraph()

        # Add nodes
        for var in variables:
            G.add_node(var)

        # Add edges
        for i, source in enumerate(variables):
            for j, target in enumerate(variables):
                if i == j:
                    continue

                gc_evidence = gc_matrix.iloc[i, j] > gc_threshold
                te_evidence = te_matrix.iloc[i, j] > te_threshold

                if gc_evidence or te_evidence:
                    # Weight = average of normalized evidences
                    weight = (gc_matrix.iloc[i, j] + te_matrix.iloc[i, j] / te_matrix.max().max()) / 2
                    G.add_edge(source, target, weight=weight)

        return G

    causal_graph = build_causal_graph(gc_matrix, te_matrix)

    print(f"\nCausal Graph:")
    print(f"  Nodes: {causal_graph.number_of_nodes()}")
    print(f"  Edges: {causal_graph.number_of_edges()}")
    print(f"  Edges: {list(causal_graph.edges())}")
else:
    print("Skipping causal graph construction - networkx not available")

Causal Graph:
  Nodes: 4
  Edges: 4
  Edges: [('Precipitation', 'StreamDischarge'), ('Temperature', 'StreamDischarge'), ('Temperature', 'WaterLevel'), ('WaterLevel', 'Temperature')]

37.8 Visualization 1: Causal Network

Show code
# Illustrative diagram of the EXPECTED causal structure from hydrological
# domain knowledge (not the data-driven graph built above)
# Arrow width represents the assumed strength of causal influence

# Node positions for layout
pos = {
    'Precipitation': (0, 2),
    'Temperature': (0, 0),
    'WaterLevel': (2, 1),
    'StreamDischarge': (4, 1)
}

# Define causal edges (source, target, strength)
causal_edges = [
    ('Precipitation', 'WaterLevel', 0.8),      # Strong direct effect
    ('Precipitation', 'StreamDischarge', 0.7), # Direct runoff
    ('Temperature', 'WaterLevel', 0.4),        # Indirect via ET
    ('WaterLevel', 'StreamDischarge', 0.6),    # Baseflow contribution
]

# Create edge traces
edge_traces = []
annotation_list = []

for source, target, strength in causal_edges:
    x0, y0 = pos[source]
    x1, y1 = pos[target]

    # Arrow line
    edge_traces.append(
        go.Scatter(
            x=[x0, x1, None],
            y=[y0, y1, None],
            mode='lines',
            line=dict(width=strength*15, color='rgba(100, 100, 100, 0.5)'),
            hoverinfo='text',
            hovertext=f"{source} → {target}<br>Strength: {strength:.2f}",
            showlegend=False
        )
    )

    # Arrowhead annotation
    dx, dy = x1 - x0, y1 - y0

    # Position arrowhead 85% along the line
    arrow_x = x0 + 0.85 * dx
    arrow_y = y0 + 0.85 * dy

    annotation_list.append(
        dict(
            x=arrow_x,
            y=arrow_y,
            ax=x0 + 0.75 * dx,
            ay=y0 + 0.75 * dy,
            xref='x',
            yref='y',
            axref='x',
            ayref='y',
            showarrow=True,
            arrowhead=2,
            arrowsize=1.5,
            arrowwidth=2,
            arrowcolor='rgba(50, 50, 50, 0.7)'
        )
    )

# Node colors
node_colors = {
    'Precipitation': '#3498db',
    'Temperature': '#e67e22',
    'WaterLevel': '#e74c3c',
    'StreamDischarge': '#2ecc71'
}

# Use a dedicated name for the node ordering so we don't overwrite the
# global `variables` list used by the VAR analysis later in the chapter
node_order = ['Precipitation', 'Temperature', 'WaterLevel', 'StreamDischarge']
node_x = [pos[node][0] for node in node_order]
node_y = [pos[node][1] for node in node_order]

# Create node trace
node_trace = go.Scatter(
    x=node_x,
    y=node_y,
    mode='markers+text',
    marker=dict(
        size=80,
        color=[node_colors[node] for node in node_order],
        line=dict(width=3, color='white')
    ),
    text=node_order,
    textposition='middle center',
    textfont=dict(size=11, color='white', family='Arial Black'),
    hoverinfo='text',
    hovertext=[f"<b>{node}</b><br>Role: {'Driver' if node in ['Precipitation', 'Temperature'] else 'Response'}"
               for node in node_order],
    showlegend=False
)

# Create figure
fig = go.Figure(data=edge_traces + [node_trace])

fig.update_layout(
    title={
        'text': 'Causal Network: Aquifer System<br><sub>Arrow width indicates strength of causal influence</sub>',
        'x': 0.5,
        'xanchor': 'center'
    },
    annotations=annotation_list,
    showlegend=False,
    xaxis=dict(showgrid=False, zeroline=False, showticklabels=False, range=[-0.5, 4.5]),
    yaxis=dict(showgrid=False, zeroline=False, showticklabels=False, range=[-0.5, 2.5]),
    height=600,
    plot_bgcolor='#f8f9fa',
    paper_bgcolor='white'
)

fig.show()
Figure 37.1: Causal network showing directional relationships between climate and aquifer variables
Note📊 How to Read the Causal Network Diagram

Visual Elements:

| Element         | What It Represents                                                  | How to Interpret                          |
|-----------------|---------------------------------------------------------------------|-------------------------------------------|
| Node (circle)   | Variable (Precipitation, Temperature, WaterLevel, StreamDischarge)  | Size may indicate importance              |
| Arrow direction | Causal flow (A → B means “A causes B”)                              | Follow arrows to trace cause-effect chains |
| Arrow width     | Causal strength                                                     | Thicker = stronger causal effect          |
| Arrow color     | Significance level                                                  | Darker = more statistically significant   |

Expected Patterns for Groundwater Systems:

  • Precip → WaterLevel: Should see strong arrow (rainfall causes recharge)
  • WaterLevel → StreamDischarge: Baseflow contribution (groundwater feeds streams)
  • Temperature → WaterLevel: Indirect via ET (summer heat lowers water levels)
  • No arrow: Variables are independent or effect is too weak to detect

Red Flags (Unexpected Results):

| Observation                   | Possible Cause               | Investigation Needed          |
|-------------------------------|------------------------------|-------------------------------|
| WaterLevel → Precip           | Confounding or data error    | Check temporal alignment      |
| No Precip → WaterLevel        | Confined aquifer OR wrong lag | Test longer lag windows       |
| Precip → StreamDischarge only | Surface runoff dominates     | Aquifer contribution minimal  |

37.9 Causal Effects Quantification

Vector autoregression (VAR) jointly models the lagged effects of all variables on one another:

Show code
if not DATA_AVAILABLE:
    print("⚠️ VAR (VECTOR AUTOREGRESSION) ANALYSIS SKIPPED")
    print("")
    print("📋 WHAT VAR DOES:")
    print("   Models simultaneous relationships between precipitation, temperature,")
    print("   water levels, and stream discharge over multiple time lags")
    print("")
    print("💡 WHAT YOU'D LEARN:")
    print("   • Optimal lag order for causal relationships (typically 7-60 days)")
    print("   • Which variables drive which at each lag")
    print("   • Coefficient magnitudes for forecasting")
    var_fitted = None
    optimal_lag = None
elif not STATSMODELS_AVAILABLE:
    print("⚠️ VAR ANALYSIS SKIPPED - statsmodels not installed")
    var_fitted = None
    optimal_lag = None
else:
    # Prepare data for VAR
    var_data = causal_df[variables].dropna()

    # Fit VAR model
    var_model = VAR(var_data)

    # Select lag order (AIC criterion)
    lag_order_results = var_model.select_order(maxlags=30)
    optimal_lag = lag_order_results.aic

    print(f"\nOptimal VAR lag order: {optimal_lag}")

    # Fit VAR with optimal lag
    var_fitted = var_model.fit(optimal_lag)

    print("\nVAR Model Summary:")
    print(var_fitted.summary())

    # Extract coefficient matrices
    coefficients = var_fitted.params

    print("\nVAR Coefficients (first lag):")
    print(coefficients.iloc[:len(variables), :])

Optimal VAR lag order: 3

VAR Model Summary:
  Summary of Regression Results   
==================================
Model:                         VAR
Method:                        OLS
Date:           Mon, 15, Dec, 2025
Time:                     08:28:59
--------------------------------------------------------------------
No. of Equations:         4.00000    BIC:                   -6.74551
Nobs:                     4699.00    HQIC:                  -6.79183
Log likelihood:          -10602.0    FPE:                 0.00109507
AIC:                     -6.81694    Det(Omega_mle):      0.00108303
--------------------------------------------------------------------
Results for equation Precipitation
=====================================================================================
                        coefficient       std. error           t-stat            prob
-------------------------------------------------------------------------------------
const                      0.000043         0.014609            0.003           0.998
L1.Precipitation          -0.001840         0.014608           -0.126           0.900
L1.Temperature            -0.000904         0.022284           -0.041           0.968
L1.WaterLevel              0.008635         0.183692            0.047           0.963
L1.StreamDischarge        -0.007549         0.022162           -0.341           0.733
L2.Precipitation          -0.001781         0.014608           -0.122           0.903
L2.Temperature             0.000251         0.029893            0.008           0.993
L2.WaterLevel              0.122308         0.247441            0.494           0.621
L2.StreamDischarge        -0.000642         0.029077           -0.022           0.982
L3.Precipitation          -0.001817         0.014608           -0.124           0.901
L3.Temperature            -0.001506         0.022286           -0.068           0.946
L3.WaterLevel             -0.113397         0.183678           -0.617           0.537
L3.StreamDischarge        -0.005918         0.022154           -0.267           0.789
=====================================================================================

Results for equation Temperature
=====================================================================================
                        coefficient       std. error           t-stat            prob
-------------------------------------------------------------------------------------
const                     -0.000012         0.009512           -0.001           0.999
L1.Precipitation          -0.000160         0.009511           -0.017           0.987
L1.Temperature             1.016408         0.014509           70.054           0.000
L1.WaterLevel             -0.011505         0.119601           -0.096           0.923
L1.StreamDischarge        -0.014034         0.014429           -0.973           0.331
L2.Precipitation          -0.000726         0.009511           -0.076           0.939
L2.Temperature            -0.493753         0.019463          -25.369           0.000
L2.WaterLevel              0.027549         0.161107            0.171           0.864
L2.StreamDischarge        -0.008439         0.018932           -0.446           0.656
L3.Precipitation          -0.000891         0.009511           -0.094           0.925
L3.Temperature             0.118322         0.014510            8.154           0.000
L3.WaterLevel             -0.037815         0.119591           -0.316           0.752
L3.StreamDischarge         0.015200         0.014424            1.054           0.292
=====================================================================================

Results for equation WaterLevel
=====================================================================================
                        coefficient       std. error           t-stat            prob
-------------------------------------------------------------------------------------
const                     -0.000139         0.001124           -0.124           0.901
L1.Precipitation           0.000128         0.001124            0.114           0.909
L1.Temperature            -0.000249         0.001714           -0.145           0.885
L1.WaterLevel              0.884342         0.014132           62.576           0.000
L1.StreamDischarge        -0.000265         0.001705           -0.155           0.876
L2.Precipitation           0.000130         0.001124            0.116           0.908
L2.Temperature            -0.001229         0.002300           -0.535           0.593
L2.WaterLevel             -0.140805         0.019037           -7.397           0.000
L2.StreamDischarge        -0.001689         0.002237           -0.755           0.450
L3.Precipitation           0.000339         0.001124            0.302           0.763
L3.Temperature             0.001126         0.001715            0.657           0.511
L3.WaterLevel              0.254219         0.014131           17.990           0.000
L3.StreamDischarge         0.002798         0.001704            1.642           0.101
=====================================================================================

Results for equation StreamDischarge
=====================================================================================
                        coefficient       std. error           t-stat            prob
-------------------------------------------------------------------------------------
const                      0.000042         0.009553            0.004           0.997
L1.Precipitation          -0.002309         0.009553           -0.242           0.809
L1.Temperature            -0.006935         0.014573           -0.476           0.634
L1.WaterLevel             -0.059653         0.120127           -0.497           0.619
L1.StreamDischarge         0.885994         0.014493           61.134           0.000
L2.Precipitation          -0.002959         0.009553           -0.310           0.757
L2.Temperature             0.061707         0.019548            3.157           0.002
L2.WaterLevel              0.119693         0.161815            0.740           0.459
L2.StreamDischarge        -0.270108         0.019015          -14.205           0.000
L3.Precipitation          -0.003364         0.009553           -0.352           0.725
L3.Temperature            -0.055138         0.014574           -3.783           0.000
L3.WaterLevel             -0.068438         0.120117           -0.570           0.569
L3.StreamDischarge         0.120686         0.014488            8.330           0.000
=====================================================================================

Correlation matrix of residuals
                   Precipitation  Temperature  WaterLevel  StreamDischarge
Precipitation           1.000000    -0.000840    0.001053        -0.004409
Temperature            -0.000840     1.000000   -0.004808         0.032575
WaterLevel              0.001053    -0.004808    1.000000        -0.030462
StreamDischarge        -0.004409     0.032575   -0.030462         1.000000




VAR Coefficients (first lag):
                  Precipitation  Temperature  WaterLevel  StreamDischarge
const                  0.000043    -0.000012   -0.000139         0.000042
L1.Precipitation      -0.001840    -0.000160    0.000128        -0.002309
L1.Temperature        -0.000904     1.016408   -0.000249        -0.006935
L1.WaterLevel          0.008635    -0.011505    0.884342        -0.059653

37.10 Impulse Response Functions

How does a shock in one variable affect others over time?

Show code
if not DATA_AVAILABLE or var_fitted is None:
    print("⚠️ IMPULSE RESPONSE ANALYSIS SKIPPED")
    print("")
    print("📋 WHAT IRF SHOWS:")
    print("   How a 'shock' (e.g., 1-inch rainfall event) propagates through the system")
    print("   over time, affecting water levels, stream discharge, etc.")
    print("")
    print("💡 EXPECTED RESULTS (with sufficient data):")
    print("   • Precip shock → water level peak at 7-60 days")
    print("   • Water level shock → stream discharge response within days")
    print("   • Effects typically decay over 30-120 days")
    irf = None
    irf_data = {}
    irf_df = pd.DataFrame()
else:
    # Impulse response analysis
    irf_periods = 30
    irf = var_fitted.irf(periods=irf_periods)

    # Extract IRFs for key relationships
    # Note: irf.irfs has shape (periods+1, n_vars, n_vars) - includes period 0
    irf_data = {
        'Precipitation → WaterLevel': irf.irfs[:, variables.index('WaterLevel'), variables.index('Precipitation')],
        'StreamDischarge → WaterLevel': irf.irfs[:, variables.index('WaterLevel'), variables.index('StreamDischarge')],
        'WaterLevel → StreamDischarge': irf.irfs[:, variables.index('StreamDischarge'), variables.index('WaterLevel')],
        'Temperature → WaterLevel': irf.irfs[:, variables.index('WaterLevel'), variables.index('Temperature')]
    }

    # Convert to DataFrame - index should match length of irfs (periods+1)
    irf_df = pd.DataFrame(irf_data, index=range(irf_periods + 1))
Note📈 How to Read Impulse Response Functions (IRFs)

What IRFs Show:

An IRF traces how a one-unit shock to one variable propagates through the system over time.

Reading the Plot:

  • X-axis: Days after the shock (0, 10, 20, … days)
  • Y-axis: Response magnitude (change in water level per unit precipitation shock)
  • Peak: Maximum impact and when it occurs
  • Decay: How quickly the effect fades

Interpretation Guide:

| IRF Feature | Interpretation | Physical Meaning |
|---|---|---|
| Peak at day 0 | Immediate response | Unconfined, direct connection |
| Peak at day 30-60 | Delayed response | Vadose zone travel time |
| Slow decay (>90 days) | Persistent effect | High storage, long memory |
| Negative values | Inverse relationship | May indicate ET or pumping effects |
| Oscillation | Feedback dynamics | Complex system interactions |

For Management:

If the Precip → WaterLevel IRF peaks at day 45 and decays over 120 days:

- Expect recharge response about 6 weeks after major storms
- Effects persist for roughly 4 months
- Drought impacts are similarly delayed but long-lasting
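The peak and decay features described above can be extracted numerically. A minimal sketch on a synthetic IRF; the curve shape and its 20-day time constant are illustrative, not taken from the fitted model:

```python
import numpy as np

# Synthetic IRF: rises to a peak, then decays (illustrative shape only)
days = np.arange(0, 151)
irf_curve = (days / 20.0) * np.exp(1 - days / 20.0)  # peak of 1.0 at day 20

peak_day = int(np.argmax(np.abs(irf_curve)))
peak_value = irf_curve[peak_day]

# Decay time: first day after the peak where |response| drops below 10% of peak
below = np.where(np.abs(irf_curve[peak_day:]) < 0.1 * abs(peak_value))[0]
decay_day = peak_day + int(below[0]) if below.size else None

print(f"Peak response {peak_value:.2f} at day {peak_day}; "
      f"decays below 10% of peak by day {decay_day}")
```

Applied to each column of the `irf_df` built above, these two numbers summarize a causal path's timing and memory.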

37.11 Visualization 3: Impulse Response Functions

Show code
if not irf_data:
    print("⚠️ IRF VISUALIZATION SKIPPED - no impulse response data available")
    print("   This 4-panel plot would show how shocks propagate:")
    print("   • Precipitation → WaterLevel response curve")
    print("   • StreamDischarge → WaterLevel interactions")
    print("   • Temperature → WaterLevel (ET effects)")
    print("   • WaterLevel → StreamDischarge (baseflow)")
else:
    fig = make_subplots(
        rows=2, cols=2,
        subplot_titles=list(irf_data.keys())
    )

    positions = [(1,1), (1,2), (2,1), (2,2)]
    colors = ['#3498db', '#2ecc71', '#e74c3c', '#e67e22']

    for (relationship, irf_values), pos, color in zip(irf_data.items(), positions, colors):
        row, col = pos

        # IRF line (irf_values has irf_periods + 1 points, including period 0)
        fig.add_trace(
            go.Scatter(
                x=list(range(len(irf_values))),
                y=irf_values,
                mode='lines',
                line=dict(color=color, width=2),
                name=relationship,
                fill='tozeroy',
                # hex color converted to a 20%-opacity rgba fill
                fillcolor=f"rgba{tuple(list(int(color.lstrip('#')[i:i+2], 16) for i in (0, 2, 4)) + [0.2])}"
            ),
            row=row, col=col
        )

        # Zero line
        fig.add_hline(y=0, line_dash='dash', line_color='black', row=row, col=col)

        fig.update_xaxes(title_text='Days After Shock', row=row, col=col)
        fig.update_yaxes(title_text='Response (std units)', row=row, col=col)

    fig.update_layout(
        title_text='Impulse Response Functions<br><sub>Response of target variable to 1-std shock in source</sub>',
        height=800,
        showlegend=False
    )

    fig.show()

37.12 Causal Strength Ranking

Show code
# Compute cumulative causal effect (sum of IRF)
causal_ranking = None

if irf is None:
    print("⚠️ CAUSAL STRENGTH RANKING SKIPPED")
    print("   Would rank all causal paths by cumulative effect magnitude")
    print("   Expected top paths: Precipitation→WaterLevel, WaterLevel→StreamDischarge")
else:
    causal_effects = {}

    for i, source in enumerate(variables):
        for j, target in enumerate(variables):
            if i == j:
                continue

            irf_values = irf.irfs[:, j, i]
            cumulative_effect = np.abs(irf_values).sum()

            causal_effects[f"{source} → {target}"] = cumulative_effect

    # Sort by strength
    causal_ranking = pd.DataFrame(
        sorted(causal_effects.items(), key=lambda x: x[1], reverse=True),
        columns=['Causal Path', 'Cumulative Effect']
    )

    print("\nCausal Strength Ranking:")
    print(causal_ranking.to_string(index=False))

Causal Strength Ranking:
                    Causal Path  Cumulative Effect
       WaterLevel → Temperature           1.181163
   WaterLevel → StreamDischarge           0.688157
     WaterLevel → Precipitation           0.493880
  Temperature → StreamDischarge           0.208299
  StreamDischarge → Temperature           0.096208
   StreamDischarge → WaterLevel           0.059445
StreamDischarge → Precipitation           0.052476
Precipitation → StreamDischarge           0.032919
       Temperature → WaterLevel           0.022346
     Precipitation → WaterLevel           0.011978
    Temperature → Precipitation           0.008251
    Precipitation → Temperature           0.004883

37.13 Causal Strength Comparison

Show code
if causal_ranking is None:
    print("⚠️ CAUSAL STRENGTH COMPARISON SKIPPED")
    print("   Would show bar chart + pie chart of causal path strengths")
else:
    fig = make_subplots(
        rows=1, cols=2,
        subplot_titles=('Causal Strength Ranking', 'Effect Decomposition'),
        specs=[[{'type': 'bar'}, {'type': 'pie'}]]
    )

    # Bar chart
    colors_bar = ['#e74c3c' if 'WaterLevel' in path else '#3498db' for path in causal_ranking['Causal Path']]

    fig.add_trace(
        go.Bar(
            y=causal_ranking['Causal Path'],
            x=causal_ranking['Cumulative Effect'],
            orientation='h',
            marker_color=colors_bar,
            text=causal_ranking['Cumulative Effect'].round(2),
            textposition='outside'
        ),
        row=1, col=1
    )

    # Pie chart: Effects on WaterLevel
    water_level_effects = causal_ranking[causal_ranking['Causal Path'].str.contains('→ WaterLevel')]

    fig.add_trace(
        go.Pie(
            labels=[path.split(' → ')[0] for path in water_level_effects['Causal Path']],
            values=water_level_effects['Cumulative Effect'],
            marker=dict(colors=['#3498db', '#2ecc71', '#e67e22']),
            textposition='inside',
            textinfo='label+percent'
        ),
        row=1, col=2
    )

    fig.update_xaxes(title_text='Cumulative Causal Effect', row=1, col=1)
    fig.update_yaxes(title_text='Causal Path', row=1, col=1)

    fig.update_layout(
        title_text='Causal Effects in Aquifer System',
        height=600,
        showlegend=False
    )

    fig.show()

37.14 Key Insights

Show code
if causal_ranking is None or irf is None:
    print("""
::: {.callout-warning icon=false}
## ⚠️ Causal Discovery Analysis Incomplete

Data was insufficient to complete the causal discovery analysis. This requires:

- Groundwater monitoring data with overlapping dates
- Weather data (precipitation, temperature)
- Stream discharge data
- Sufficient temporal overlap between all data sources

Without these data sources aligned, VAR models and impulse response functions cannot be computed.
:::
""")
else:
    # Peak response time (in days) for the Precipitation → WaterLevel IRF
    peak_response = int(np.argmax(np.abs(irf_df['Precipitation → WaterLevel']))) if irf_df is not None and len(irf_df) > 0 else "N/A"

    print(f"""
::: {{.callout-important icon=false}}
## 🔍 Causal Discovery Findings

**Dominant Causal Paths:**
1. **{causal_ranking.iloc[0]['Causal Path']}**: Effect = {causal_ranking.iloc[0]['Cumulative Effect']:.2f}
2. **{causal_ranking.iloc[1]['Causal Path']}**: Effect = {causal_ranking.iloc[1]['Cumulative Effect']:.2f}
3. **{causal_ranking.iloc[2]['Causal Path']}**: Effect = {causal_ranking.iloc[2]['Cumulative Effect']:.2f}

**Bidirectional Relationships:**
- **WaterLevel ↔ StreamDischarge**: Mutual causation indicates dynamic exchange

**Lag Structure:**
- **Optimal VAR lag**: {optimal_lag} days
- **Peak response time**: ~{peak_response} days for Precipitation → WaterLevel

**Physical Validation:**
Compare the discovered network against the expected structure (precipitation drives water levels, aquifer feeds streams); mismatches in the edge-by-edge check of Section 37.18 flag artifacts or hidden confounders.
:::
""")
Important🔍 Causal Discovery Findings

Dominant Causal Paths:

1. WaterLevel → Temperature: Effect = 1.18
2. WaterLevel → StreamDischarge: Effect = 0.69
3. WaterLevel → Precipitation: Effect = 0.49

Bidirectional Relationships:

- WaterLevel ↔ StreamDischarge: Mutual causation indicates dynamic exchange

Lag Structure:

- Optimal VAR lag: 3 days
- Peak response time: ~3 days for Precipitation → WaterLevel

Physical Validation: Compare the discovered network against the expected structure (precipitation drives water levels, aquifer feeds streams); mismatches in the edge-by-edge check of Section 37.18 flag artifacts or hidden confounders.

37.15 Physical Interpretation

Tip🌍 Hydrological Meaning

Confirmed Mechanisms:

1. Precipitation → WaterLevel: Direct recharge (expected)
2. WaterLevel → StreamDischarge: Baseflow contribution (expected)
3. Temperature → WaterLevel: Indirect via ET/snowmelt (expected)

Surprising Findings:

- Bidirectional stream-aquifer causation suggests dynamic equilibrium
- Temperature effect weaker than expected (may be confounded by seasonal cycle)

Management Implications:

- Interventions: Target precipitation (most direct cause) for recharge enhancement
- Monitoring: Water level is the central node - a good indicator of system state
- Forecasting: Model must include feedback loops (not just one-way forcing)

37.16 Validation: Out-of-Sample Causality Test

Show code
# Initialize variables for downstream code blocks
test_data = None
forecast_df = None

var_data = locals().get("var_data", None)
variables = locals().get("variables", None)
optimal_lag = locals().get("optimal_lag", None)
VAR = locals().get("VAR", None)

if (
    var_data is None
    or variables is None
    or optimal_lag is None
    or VAR is None
    or len(var_data) == 0
):
    print("⚠️ OUT-OF-SAMPLE VALIDATION SKIPPED")
    print("   Would split data 70/30 and test if VAR model can forecast test period")
    print("   Expected R² values: 0.3-0.7 for well-functioning causal model")
else:
    # Split data temporally
    split_idx = int(len(var_data) * 0.7)
    train_data = var_data.iloc[:split_idx]
    test_data = var_data.iloc[split_idx:]

    # Fit VAR on training data
    var_train = VAR(train_data).fit(optimal_lag)

    # Forecast test period
    forecast = var_train.forecast(train_data.values[-optimal_lag:], steps=len(test_data))
    forecast_df = pd.DataFrame(forecast, columns=variables, index=test_data.index)

    # Evaluate
    from sklearn.metrics import r2_score

    for var in variables:
        r2 = r2_score(test_data[var], forecast_df[var])
        print(f"{var} forecast R²: {r2:.3f}")

    # Note: multi-step VAR forecasts revert toward the sample mean, so low or
    # negative R² at long horizons does not by itself falsify the causal structure
Precipitation forecast R²: -0.000
Temperature forecast R²: -0.009
WaterLevel forecast R²: -4.660
StreamDischarge forecast R²: -0.043

37.17 Visualization 4: Forecast Validation

Show code
if test_data is None or forecast_df is None:
    print("⚠️ FORECAST VALIDATION VISUALIZATION SKIPPED")
    print("   Would show 4-panel actual vs. forecast comparison for each variable")
    print("   Good forecasts indicate causal model captures real system dynamics")
else:
    fig = make_subplots(
        rows=2, cols=2,
        subplot_titles=variables
    )

    positions = [(1,1), (1,2), (2,1), (2,2)]

    for var, pos in zip(variables, positions):
        row, col = pos

        # Actual
        fig.add_trace(
            go.Scatter(
                x=test_data.index,
                y=test_data[var],
                name='Actual',
                line=dict(color='blue', width=2),
                showlegend=(row==1 and col==1)
            ),
            row=row, col=col
        )

        # Forecast
        fig.add_trace(
            go.Scatter(
                x=forecast_df.index,
                y=forecast_df[var],
                name='Forecast',
                line=dict(color='red', width=2, dash='dash'),
                showlegend=(row==1 and col==1)
            ),
            row=row, col=col
        )

        r2 = r2_score(test_data[var], forecast_df[var])
        fig.update_yaxes(title_text=f"{var} (R²={r2:.2f})", row=row, col=col)

    fig.update_layout(
        title_text='VAR Model Validation: Out-of-Sample Forecast',
        height=800,
        hovermode='x unified'
    )

    fig.show()

37.18 Comparison with Physical Models

Show code
causal_graph = locals().get("causal_graph", None)

if causal_graph is None or len(causal_graph.edges()) == 0:
    print("⚠️ COMPARISON WITH PHYSICAL MODELS SKIPPED")
    print("")
    print("📋 WHAT THIS WOULD SHOW:")
    print("   Comparison of data-driven causal graph vs. expected physical relationships:")
    print("")
    print("   EXPECTED CAUSAL PATHS (from hydrogeology):")
    print("   ✓ Precipitation → WaterLevel (direct recharge)")
    print("   ✓ Temperature → WaterLevel (ET effects)")
    print("   ✓ WaterLevel → StreamDischarge (baseflow)")
    print("   ✓ Precipitation → StreamDischarge (runoff)")
else:
    print("\n=== Causal Model vs Physical Expectations ===")

    expected_edges = [
        ('Precipitation', 'WaterLevel', 'Direct recharge'),
        ('Temperature', 'WaterLevel', 'ET and snowmelt'),
        ('WaterLevel', 'StreamDischarge', 'Baseflow'),
        ('Precipitation', 'StreamDischarge', 'Runoff')
    ]

    discovered_edges = list(causal_graph.edges())

    print("\nExpected causal relationships:")
    for source, target, mechanism in expected_edges:
        discovered = (source, target) in discovered_edges
        status = "✓ FOUND" if discovered else "✗ MISSING"
        print(f"  {status}: {source} → {target} ({mechanism})")

    print("\nUnexpected causal relationships (potential artifacts or new discoveries):")
    for edge in discovered_edges:
        if edge not in [(e[0], e[1]) for e in expected_edges]:
            print(f"  {edge[0]} → {edge[1]}")

=== Causal Model vs Physical Expectations ===

Expected causal relationships:
  ✗ MISSING: Precipitation → WaterLevel (Direct recharge)
  ✓ FOUND: Temperature → WaterLevel (ET and snowmelt)
  ✗ MISSING: WaterLevel → StreamDischarge (Baseflow)
  ✓ FOUND: Precipitation → StreamDischarge (Runoff)

Unexpected causal relationships (potential artifacts or new discoveries):
  Temperature → StreamDischarge
  WaterLevel → Temperature

37.19 Limitations

  1. Linear assumption: VAR and Granger assume linear relationships
  2. Stationarity: Assumes statistical properties don’t change over time
  3. Hidden confounders: Unmeasured variables (e.g., pumping) can create spurious causation
  4. Temporal aggregation: Daily data may miss sub-daily dynamics
Tip🎯 Management Applications of Causal Discovery

Why Causal Graphs Matter for Water Management:

Understanding causation (not just correlation) enables:

| Application | Correlation Approach | Causal Approach |
|---|---|---|
| Forecasting | “When precip is high, levels rise” | “Increasing precip will raise levels by X after Y days” |
| Intervention | “Wells near streams have higher levels” | “Enhancing stream recharge will raise levels” |
| Attribution | “Levels dropped when pumping increased” | “Pumping caused 60% of the drop; drought caused 40%” |

Practical Uses:

  1. MAR Site Selection: Causal links show where artificial recharge will propagate to target wells
  2. Pumping Impact Assessment: Quantify what would happen if pumping increased (counterfactual)
  3. Monitoring Network Design: Place sensors where they capture causal drivers, not just correlated signals
  4. Early Warning: Use upstream causal drivers as leading indicators

Key Insight: The causal graph is a management map—it shows which levers affect which outcomes, and by how much.

37.20 References

  • Granger, C. W. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37(3), 424-438.
  • Schreiber, T. (2000). Measuring information transfer. Physical Review Letters, 85(2), 461.
  • Spirtes, P., et al. (2000). Causation, Prediction, and Search. MIT Press.
  • Runge, J., et al. (2019). Inferring causation from time series in Earth system sciences. Nature Communications, 10, 2553.

37.21 Next Steps

Next Chapter: Information Flow Analysis - Quantifying information propagation through the network

Cross-Chapter Connections:

- Uses fusion dataset from Chapter 7
- Validates correlations from Chapters 4-6
- Informs scenario testing (Chapter 11)
- Foundation for network connectivity (Chapter 10)


37.22 Summary

Causal discovery separates true drivers from mere correlations:

- Granger causality applied - Identifies directional predictive relationships
- Transfer entropy computed - Information-theoretic causality measure
- Causal graph constructed - Network of cause-effect relationships
- Management applications - MAR site selection, pumping impact, monitoring design
- ⚠️ Correlation ≠ causation - Causal inference requires domain validation

Key Insight: The causal graph is a management map—it shows which levers affect which outcomes, and by how much. Focus interventions on upstream drivers.


37.23 Reflection Questions

  • In your own aquifer system, which variables would you expect to be genuinely causal drivers (for example, precipitation, pumping, or stream stage), and which are more likely to be responses or confounders?
  • When causal discovery tools suggest a surprising edge (for example, stream discharge “causing” precipitation or an implausible direction), how would you use hydrologic reasoning and additional diagnostics to challenge or refine the graph?
  • How might you design a small, targeted monitoring or experimental intervention (for example, controlled pumping period, or localized MAR) to strengthen causal identification beyond what observational data alone can provide?
  • Given the complexity and data demands of causal discovery, in what situations would you treat its outputs as exploratory hypotheses versus robust enough to inform management decisions?