---
title: "Information Flow Analysis"
subtitle: "Quantifying Information Propagation Pathways"
code-fold: true
---
::: {.callout-tip icon=false}
## For Newcomers
**You will get:**
- A way of thinking about **how signals move** through the well network (e.g., drought effects propagating over time).
- Intuition for information-based measures (mutual information, transfer entropy) as tools to detect **hidden connectivity**.
- Visuals that show which wells “talk to each other” most strongly.
You can skim the formal information-theory definitions and focus on:
- The maps/graphs of well connectivity,
- The narrative about which connections are strong or weak,
- And how this complements the more physical fusion analyses.
:::
**Data Sources Fused**: Groundwater Wells (Network Analysis)
## What You Will Learn in This Chapter
By the end of this chapter, you will be able to:
- Explain what “information flow” means in a groundwater monitoring network and how it relates to hydraulic connectivity and shared forcing.
- Interpret correlation/information-based heatmaps and network graphs to identify hub wells, clusters, and weakly connected sites.
- Discuss how information-based metrics complement more physical analyses (recharge, stream–aquifer exchange, causal graphs) when designing and optimizing monitoring networks.
- Reflect on the limitations of correlation as a proxy for mutual information and when more advanced metrics or additional data are warranted.
## Overview
Water doesn't just flow through aquifers - **information** flows too. A drought signal propagates from recharge areas to deeper parts of the aquifer. A pumping cone of depression spreads outward. This chapter uses **information theory** to track how signals propagate through the well network, revealing hidden connectivity and flow pathways.
::: {.callout-note icon=false}
## 💻 For Computer Scientists
**Information Theory Metrics:**
- **Mutual Information**: I(X;Y) = how much knowing X reduces uncertainty about Y
- **Transfer Entropy**: TE(X→Y) = directed information flow (predictive; suggestive of causality but not proof of it)
- **Time-Lagged Mutual Information**: TLMI(X,Y,τ) = MI at different time lags
- **Information Bottleneck**: Identify wells that control information flow
**Graph Theory:**
- Nodes = Wells
- Edge weights = Information transfer strength
- Directed edges = Asymmetric information flow
:::
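For reference, the standard discrete forms of these metrics can be written out as follows. The notation is an assumption made for illustration (the chapter does not fix one): $p$ denotes a probability mass function, the subscript $t$ a time step, and the transfer entropy is shown with a history length of one, following Schreiber (2000).

$$
I(X;Y) = \sum_{x,y} p(x,y)\,\log_2 \frac{p(x,y)}{p(x)\,p(y)}
$$

$$
TE_{X \to Y} = \sum_{y_{t+1},\,y_t,\,x_t} p(y_{t+1}, y_t, x_t)\,\log_2 \frac{p(y_{t+1} \mid y_t, x_t)}{p(y_{t+1} \mid y_t)}
$$

$$
TLMI(X,Y,\tau) = I(X_t;\,Y_{t+\tau})
$$

Mutual information is symmetric in $X$ and $Y$, whereas transfer entropy is not, which is why only the latter can give direction to an edge.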
::: {.callout-tip icon=false}
## 🌍 For Hydrologists
**Physical Meaning:**
High information transfer between wells can indicate one or more of the following:
1. **Hydraulic connectivity**: Water flows between locations
2. **Shared aquifer**: Wells tap same geological unit
3. **Common forcing**: Both respond to same recharge/pumping events
**Expected patterns:**
- Wells in same aquifer unit: High MI
- Upgradient → downgradient: Positive time lag
- Confined aquifer: Pressure waves propagate faster than water
:::
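The "positive time lag" pattern listed above can be probed with a lagged correlation scan. The sketch below is illustrative only: `lagged_correlation` is a hypothetical helper (not part of this chapter's pipeline), the data are synthetic, and it assumes two evenly sampled series of equal length with no gaps.

```python
import numpy as np

def lagged_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for lag = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    result = {}
    for lag in range(max_lag + 1):
        a = x[: len(x) - lag]   # earlier values of the upgradient series
        b = y[lag:]             # later values of the downgradient series
        result[lag] = float(np.corrcoef(a, b)[0, 1])
    return result

# Synthetic example: the downgradient series repeats the upgradient one 3 steps later
rng = np.random.default_rng(0)
signal = rng.normal(size=203)
upgradient = signal[3:]
downgradient = signal[:-3]

lags = lagged_correlation(upgradient, downgradient, max_lag=6)
best_lag = max(lags, key=lags.get)
print(f"Lag with strongest correlation: {best_lag} steps")
```

On real monthly water levels, shared seasonality produces high correlation at many lags, so the peak is usually clearer after removing the seasonal cycle.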
## Setup
```{python}
#| code-fold: true
#| label: setup
#| echo: false
import os, sys
from pathlib import Path
import pandas as pd
import numpy as np
import sqlite3
import plotly.graph_objects as go
from plotly.subplots import make_subplots

try:
    from scipy import stats
    SCIPY_AVAILABLE = True
except ImportError:
    SCIPY_AVAILABLE = False
    stats = None
    print("Note: scipy not available. Some statistical analyses will be simplified.")

import warnings
warnings.filterwarnings('ignore')

def find_repo_root(start: Path) -> Path:
    for candidate in [start, *start.parents]:
        if (candidate / "src").exists():
            return candidate
    return start

quarto_project = Path(os.environ.get("QUARTO_PROJECT_DIR", str(Path.cwd())))
project_root = find_repo_root(quarto_project)
if str(project_root) not in sys.path:
    sys.path.append(str(project_root))
from src.utils import get_data_path

# Load well data with real coordinates by joining measurements with OB_LOCATIONS
data_loaded = False
aquifer_db_path = get_data_path("aquifer_db")
try:
    conn = sqlite3.connect(str(aquifer_db_path))
    # Get wells with time series data AND real coordinates
    wells_df = pd.read_sql("""
        SELECT m.P_Number as P_NUMBER,
               COUNT(*) as n_records,
               l.LAT_WGS_84 as Latitude,
               l.LONG_WGS_84 as Longitude
        FROM OB_WELL_MEASUREMENTS_CHAMPAIGN_COUNTY m
        JOIN OB_LOCATIONS l ON m.P_Number = l.P_NUMBER
        WHERE m.Water_Surface_Elevation IS NOT NULL
          AND l.LAT_WGS_84 IS NOT NULL
        GROUP BY m.P_Number
        HAVING COUNT(*) >= 100
        ORDER BY COUNT(*) DESC
        LIMIT 20
    """, conn)
    conn.close()
    if len(wells_df) > 0:
        data_loaded = True
        print(f"Analyzing {len(wells_df)} wells with time series data and real coordinates")
        print(f" Latitude range: {wells_df['Latitude'].min():.4f} to {wells_df['Latitude'].max():.4f}")
        print(f" Longitude range: {wells_df['Longitude'].min():.4f} to {wells_df['Longitude'].max():.4f}")
    else:
        print("⚠️ No wells found with both measurements and coordinates")
except Exception as e:
    print(f"Error loading wells: {e}")
    wells_df = pd.DataFrame()
    data_loaded = False
```
## Correlation Network Construction
::: {.callout-note icon=false}
## 📘 Understanding Mutual Information
### What Is It?
**Mutual information** (MI) is a measure from information theory (Shannon, 1948) that quantifies how much knowing one variable reduces uncertainty about another. It's the information-theoretic equivalent of correlation, but works for **any type of relationship**—linear, nonlinear, or complex.
**Historical Context:** Introduced by Claude Shannon in his foundational 1948 paper "A Mathematical Theory of Communication" that created the field of information theory. Originally developed for telecommunications, now widely used in neuroscience, genetics, and network analysis.
### Why Does It Matter for Groundwater Networks?
In monitoring networks, high mutual information between wells can indicate one or more of the following:
1. **Hydraulic connectivity**: Wells tap the same aquifer flow system
2. **Shared forcing**: Both respond to same recharge/pumping events
3. **Network redundancy**: One well may provide similar information to another
MI reveals **hidden connections** that might not appear in simple distance-based analysis—two distant wells could have high MI if connected by a high-permeability channel.
### How Does It Work?
Mutual information compares joint probability to independent probabilities:
**Step 1:** If wells are **independent**, knowing Well A tells you nothing about Well B:
```
P(A, B) = P(A) × P(B) [No connection]
```
**Step 2:** If wells are **connected**, joint probability differs:
```
P(A, B) ≠ P(A) × P(B) [Connection exists!]
```
**Step 3:** MI quantifies the difference (in bits of information):
```
MI(A; B) = How much uncertainty about B is reduced by knowing A
```
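**Correlation as Proxy:** For this analysis, we use **correlation as a proxy for mutual information**. While true MI captures nonlinear dependencies, correlation is faster to compute and provides similar insights for groundwater networks where relationships are often approximately linear. For jointly Gaussian variables the two are linked exactly, $I = -\tfrac{1}{2}\log_2(1 - r^2)$ bits, so a high |r| implies high MI. The reverse does not hold: a low r can hide a strong nonlinear dependence.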
### What Will You See Below?
- **Correlation matrix**: Pairwise correlations between well water levels (proxy for MI)
- **Network graph**: Wells connected if correlation exceeds threshold
- **Hub wells**: High-connectivity nodes acting as network centers
### How to Interpret Results
| Correlation | MI Interpretation | Monitoring Implications |
|-------------|------------------|------------------------|
| **r > 0.7** | Strong shared information | Wells highly redundant—one could replace the other |
| **0.4 < r < 0.7** | Moderate connection | Complementary monitoring—both provide value |
| **r < 0.4** | Weak/no connection | Independent monitoring—both essential |
| **Hub well** (>6 connections) | Central to network | High-value monitoring site—represents regional conditions |
| **Isolated well** (<3 connections) | Unique local signal | Irreplaceable—captures distinct aquifer behavior |
**Cost-Cutting Guidance:** Wells with r > 0.8 are candidates for consolidation if budget cuts needed. Wells with r < 0.3 to all others are irreplaceable.
:::
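To make the three steps above concrete, here is a minimal sketch of a histogram-based ("plug-in") MI estimate. It is not the method used in this chapter, which relies on correlation; `mutual_information_bits`, the bin count of 8, and the synthetic data are all assumptions chosen for illustration. Plug-in estimates are biased upward for small samples, so treat the numbers as indicative only.

```python
import numpy as np

def mutual_information_bits(x, y, bins=8):
    """Plug-in MI estimate (in bits) from a 2-D histogram of paired samples."""
    joint_counts, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint_counts / joint_counts.sum()      # joint distribution P(A, B)
    p_x = p_xy.sum(axis=1, keepdims=True)         # marginal P(A), shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)         # marginal P(B), shape (1, bins)
    independent = p_x @ p_y                       # P(A) x P(B) under independence
    nonzero = p_xy > 0                            # skip empty cells (0 * log 0 = 0)
    return float(np.sum(p_xy[nonzero] * np.log2(p_xy[nonzero] / independent[nonzero])))

# Synthetic example: a U-shaped dependence that Pearson correlation misses
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=5000)
y_nonlinear = x**2 + 0.05 * rng.normal(size=5000)
y_independent = rng.uniform(-1, 1, size=5000)

r_nonlinear = float(np.corrcoef(x, y_nonlinear)[0, 1])
mi_nonlinear = mutual_information_bits(x, y_nonlinear)
mi_independent = mutual_information_bits(x, y_independent)
print(f"r = {r_nonlinear:.2f}, MI(nonlinear) = {mi_nonlinear:.2f} bits, "
      f"MI(independent) = {mi_independent:.2f} bits")
```

The nonlinear pair has near-zero correlation but clearly nonzero MI, which is the case where the correlation proxy would understate connectivity.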
```{python}
#| code-fold: true
#| label: correlation-network
#| echo: false
# Compute correlation from real time series data
conn = sqlite3.connect(str(aquifer_db_path))

# The setup query already caps the list at 20 wells; the slice is only a safety bound
well_ids = wells_df['P_NUMBER'].values[:30] if data_loaded else []

# Load time series for each well
time_series_data = {}
valid_wells = []

# Parameterized query: the well ID is bound as text instead of pasted into the SQL
query = """
    SELECT TIMESTAMP, Water_Surface_Elevation
    FROM OB_WELL_MEASUREMENTS_CHAMPAIGN_COUNTY
    WHERE P_Number = ?
      AND Water_Surface_Elevation IS NOT NULL
      AND TIMESTAMP IS NOT NULL
    ORDER BY TIMESTAMP
"""
for well_id in well_ids:
    ts_df = pd.read_sql_query(query, conn, params=(str(well_id),))
    if len(ts_df) >= 20:  # Need at least 20 measurements for correlation
        ts_df['Date'] = pd.to_datetime(ts_df['TIMESTAMP'], format='%m/%d/%Y', errors='coerce')
        ts_df = ts_df.dropna(subset=['Date', 'Water_Surface_Elevation'])
        # Aggregate to daily mean first (handles multiple measurements per day)
        daily_mean = ts_df.groupby('Date')['Water_Surface_Elevation'].mean()
        if len(daily_mean) >= 20:
            time_series_data[well_id] = daily_mean
            valid_wells.append(well_id)
conn.close()

# Limit to 15 wells for visualization clarity
valid_wells = valid_wells[:15]
n_wells = len(valid_wells)

# Align time series (unique daily indices) and resample to monthly means
if valid_wells:
    ts_combined = pd.DataFrame({well: time_series_data[well] for well in valid_wells})
    ts_monthly = ts_combined.resample('ME').mean()  # 'ME' = month end (replaces deprecated 'M')
    ts_monthly = ts_monthly.dropna(how='all')  # Drop months with no data
else:
    ts_monthly = pd.DataFrame()  # Nothing to resample when no well qualified

# Compute correlation matrix from real data
DATA_AVAILABLE = False
corr_matrix = None
if len(ts_monthly) > 10 and n_wells >= 2:
    corr_matrix = ts_monthly.corr().values
    # Check for valid correlation matrix
    off_diag = corr_matrix[~np.eye(n_wells, dtype=bool)]
    if len(off_diag) > 0 and not np.all(np.isnan(off_diag)):
        print(f"Correlation network computed from real time series data: {n_wells} wells")
        print(f"Mean correlation: {np.nanmean(off_diag):.3f}")
        print(f"Max correlation: {np.nanmax(off_diag):.3f}")
        DATA_AVAILABLE = True
    else:
        print("⚠️ INSUFFICIENT TEMPORAL OVERLAP for correlation analysis")
        print(" Wells have time series but data doesn't overlap in time")
        print(" Solution: Ensure wells have measurements in the same date range")
        corr_matrix = None
else:
    print("⚠️ INSUFFICIENT DATA for correlation analysis")
    print(f" Time series records: {len(ts_monthly)} (need >10)")
    print(f" Number of wells: {n_wells} (need ≥2)")
    print("")
    print("📋 WHAT THIS ANALYSIS DOES:")
    print(" Computes correlation between all pairs of monitoring wells")
    print(" to identify which wells share information (connected aquifer)")
    print("")
    print("🔧 TO ENABLE:")
    print(" 1. Ensure data/aquifer.db contains well measurements")
    print(" 2. Wells need overlapping time periods (e.g., 2010-2020)")
    print(" 3. Minimum 2 wells with >10 monthly observations each")

if not DATA_AVAILABLE:
    print("\n⚠️ Information flow analysis requires correlation data - subsequent sections will show expected results")
```
## Correlation Heatmap
::: {.callout-note icon=false}
## 📊 How to Read This Correlation Heatmap
**What the Visualization Shows:**
A **correlation matrix** displays pairwise correlations between all wells. Each cell shows how strongly two wells' water levels move together over time.
**Color Interpretation:**
| Color | Correlation Value | Information Meaning | Physical Interpretation |
|-------|------------------|---------------------|------------------------|
| **Dark Red** | r > 0.7 | High shared information | Same aquifer unit, hydraulically connected |
| **Light Red/Orange** | 0.4 < r < 0.7 | Moderate connection | Partially connected, shared forcing |
| **White/Light Blue** | -0.2 < r < 0.4 | Weak/no connection | Different aquifer units or isolated |
| **Dark Blue** | r < -0.2 | Negative correlation | Rare—possibly pumping-induced |
**What to Look For:**
1. **Block patterns (red squares)**: Groups of wells that are highly correlated—likely tap same aquifer
2. **Diagonal dominance**: All diagonal values = 1.0 (wells perfectly correlated with themselves)
3. **Isolated rows/columns**: Wells with mostly light colors are monitoring unique local conditions
4. **Symmetric pattern**: Matrix should be symmetric (r(A,B) = r(B,A))
**Management Interpretation:**
- **Red clusters → Redundancy**: If 5 wells are all r > 0.8, you could potentially remove 4 and still capture the signal
- **Light rows → Irreplaceable**: Wells weakly correlated with all others are capturing unique information
- **Off-diagonal hot spots**: Unexpected connections might indicate hidden flow paths
**Critical Question:** Which wells can we afford to lose? Look for wells with r < 0.3 to all others—those are irreplaceable.
:::
```{python}
#| code-fold: true
#| label: fig-correlation-heatmap
#| fig-cap: "Well Correlation Matrix - Proxy for Information Transfer Strength"
if corr_matrix is None or not DATA_AVAILABLE:
    print("⚠️ CORRELATION HEATMAP SKIPPED")
    print("")
    print("📊 WHAT THIS WOULD SHOW:")
    print(" Color-coded matrix where each cell = correlation between two wells")
    print(" Red = high correlation (wells move together)")
    print(" Blue = low/negative correlation (independent or opposite)")
    print("")
    print("💡 TYPICAL PATTERNS:")
    print(" • Wells in same aquifer unit: r > 0.7 (dark red)")
    print(" • Wells in different units: r < 0.4 (light colors)")
    print(" • Block patterns indicate connected well groups")
else:
    # Create correlation heatmap
    fig = go.Figure(data=go.Heatmap(
        z=corr_matrix,
        x=[f"W{i+1}" for i in range(n_wells)],
        y=[f"W{i+1}" for i in range(n_wells)],
        colorscale='RdBu_r',
        zmid=0,
        colorbar=dict(title='Correlation'),
        text=np.round(corr_matrix, 2),
        texttemplate='%{text}',
        textfont={"size": 8},
        hovertemplate='Well %{y} ↔ Well %{x}<br>Correlation: %{z:.3f}<extra></extra>'
    ))
    fig.update_layout(
        title='Well Correlation Matrix<br><sub>Higher correlation = stronger information flow</sub>',
        xaxis_title='Well ID',
        yaxis_title='Well ID',
        height=600,
        width=650
    )
    fig.show()
```
## Network Graph Construction
```{python}
#| code-fold: true
#| label: network-metrics
#| echo: false
# Initialize variables for downstream code blocks
connectivity = None
hub_indices = None
corr_threshold = 0.5

if corr_matrix is None or not DATA_AVAILABLE:
    print("⚠️ NETWORK CONSTRUCTION SKIPPED")
    print(" Requires correlation matrix from previous step")
    print(" Would compute: connectivity score for each well (number of strong connections)")
else:
    # Build network from correlation matrix
    # Use threshold to create edges
    off_diagonal = corr_matrix[~np.eye(n_wells, dtype=bool)]
    # Check for valid data before computing threshold
    if len(off_diagonal) > 0 and not np.all(np.isnan(off_diagonal)):
        corr_threshold = np.nanpercentile(off_diagonal, 60)  # Top 40%
    else:
        corr_threshold = 0.5  # Default threshold

    # Compute connectivity (degree) for each well. The diagonal is masked
    # explicitly so a NaN self-correlation cannot yield a count of -1.
    strong = corr_matrix > corr_threshold
    np.fill_diagonal(strong, False)
    connectivity = strong.sum(axis=1)

    # Ensure connectivity is not empty
    if len(connectivity) == 0:
        connectivity = np.zeros(n_wells)
        print("Warning: Could not compute connectivity, using zeros")

    # Find hub wells (highest connectivity)
    if len(connectivity) >= 5:
        hub_indices = np.argsort(connectivity)[-5:][::-1]
    else:
        hub_indices = np.argsort(connectivity)[::-1]

    print(f"Correlation threshold: {corr_threshold:.3f}")
    print(f"Average connections per well: {connectivity.mean():.1f}")
    print(f"\nTop {min(5, len(hub_indices))} Hub Wells:")
    for idx in hub_indices:
        print(f" Well {idx+1}: {int(connectivity[idx])} connections")
```
## Information Network Visualization
::: {.callout-note icon=false}
## 📊 How to Read the Information Network Graph
**What the Visualization Shows:**
This network graph translates the correlation matrix into a **spatial network** where wells (nodes) are connected by lines (edges) if their correlation exceeds a threshold.
**Visual Elements:**
| Element | What It Represents | How to Read It |
|---------|-------------------|----------------|
| **Node (circle)** | Individual monitoring well | Position = geographic location |
| **Node size** | Network connectivity | Larger = more connections = hub well |
| **Node color** | Connection count | Yellow/green = high connectivity |
| **Edge (line)** | Strong correlation | Wells connected if r > threshold |
| **Edge density** | Regional connectivity | Many lines = tightly connected region |
**Pattern Recognition:**
| Pattern | What It Indicates | Management Implication |
|---------|------------------|----------------------|
| **Dense cluster** | Tightly connected region | High redundancy—potential to reduce monitoring |
| **Hub well (large node)** | Central to network | Critical monitoring site—don't remove |
| **Isolated well (small node)** | Weakly connected | Captures unique local signal—may be irreplaceable |
| **Bridge well** | Connects two clusters | Important for understanding regional flow |
| **No edges** | Well below threshold for all | Either truly independent OR data quality issue |
**Using This for Network Design:**
1. **Keep all hub wells** (large nodes)—they represent regional conditions
2. **Review isolated wells** (small nodes)—they may capture critical local signals
3. **Evaluate cluster redundancy**—within tight clusters, some wells may be removable
4. **Identify bridges**—wells connecting clusters are strategically important
:::
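Bridge wells are not computed anywhere in this chapter. One way to flag them, sketched below under stated assumptions, is to test whether removing a well splits the thresholded network into more connected components. The helpers `count_components` and `bridge_wells` are hypothetical names, and the 7-well correlation matrix is synthetic.

```python
import numpy as np

def count_components(adjacency, skip=None):
    """Count connected components of a boolean adjacency matrix, ignoring node `skip`."""
    unseen = set(range(adjacency.shape[0])) - {skip}
    components = 0
    while unseen:
        components += 1
        stack = [unseen.pop()]
        while stack:  # depth-first walk over one component
            node = stack.pop()
            neighbors = [int(j) for j in np.flatnonzero(adjacency[node]) if int(j) in unseen]
            unseen.difference_update(neighbors)
            stack.extend(neighbors)
    return components

def bridge_wells(corr, threshold):
    """Indices of wells whose removal increases the number of components."""
    adjacency = np.asarray(corr) > threshold   # NaN compares as False
    np.fill_diagonal(adjacency, False)
    baseline = count_components(adjacency)
    return [i for i in range(adjacency.shape[0])
            if count_components(adjacency, skip=i) > baseline]

# Synthetic example: two tight clusters (wells 0-2 and 4-6) joined only through well 3
corr = np.full((7, 7), 0.1)
corr[:3, :3] = 0.9
corr[4:, 4:] = 0.9
corr[3, :] = 0.9
corr[:, 3] = 0.9
np.fill_diagonal(corr, 1.0)

bridges = bridge_wells(corr, threshold=0.5)
print(f"Bridge wells (0-based index): {bridges}")
```

Isolated and leaf wells are never flagged, because removing them cannot increase the component count.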
```{python}
#| code-fold: true
#| label: fig-info-network
#| fig-cap: "Well Information Flow Network - Node size represents connectivity"
if corr_matrix is None or connectivity is None or not DATA_AVAILABLE:
    print("⚠️ NETWORK VISUALIZATION SKIPPED")
    print("")
    print("📊 WHAT THIS WOULD SHOW:")
    print(" • Nodes = monitoring wells (sized by connectivity)")
    print(" • Edges = strong correlations (r > threshold)")
    print(" • Hub wells appear as large nodes with many connections")
    print(" • Isolated wells appear as small nodes with few edges")
else:
    # Look up coordinates by well ID so they stay aligned with the correlation
    # matrix even if some wells in wells_df were skipped as too sparse
    coords = wells_df.set_index('P_NUMBER').loc[valid_wells, ['Longitude', 'Latitude']]
    lon = coords['Longitude'].to_numpy()
    lat = coords['Latitude'].to_numpy()

    # Create network visualization using scatter plot
    fig = go.Figure()

    # Add edges (lines between correlated wells)
    edge_x = []
    edge_y = []
    for i in range(n_wells):
        for j in range(i+1, n_wells):
            if corr_matrix[i, j] > corr_threshold:
                # Add line from well i to well j
                edge_x.extend([lon[i], lon[j], None])
                edge_y.extend([lat[i], lat[j], None])
    fig.add_trace(go.Scatter(
        x=edge_x, y=edge_y,
        mode='lines',
        line=dict(width=0.5, color='lightgray'),
        hoverinfo='skip',
        showlegend=False
    ))

    # Add nodes (wells colored by connectivity)
    fig.add_trace(go.Scatter(
        x=lon,
        y=lat,
        mode='markers+text',
        marker=dict(
            size=connectivity * 3 + 10,
            color=connectivity,
            colorscale='Viridis',
            showscale=True,
            colorbar=dict(title='Connections'),
            line=dict(width=1, color='white')
        ),
        text=[f"W{i+1}" for i in range(n_wells)],
        textposition='top center',
        textfont=dict(size=8),
        hovertemplate='<b>Well %{text}</b><br>Connections: %{marker.color}<br>Lat: %{y:.4f}<br>Lon: %{x:.4f}<extra></extra>'
    ))
    fig.update_layout(
        title='Information Flow Network<br><sub>Node size and color = connectivity, Lines = high correlation</sub>',
        xaxis_title='Longitude',
        yaxis_title='Latitude',
        height=600,
        showlegend=False,
        hovermode='closest'
    )
    fig.show()
```
## Hub Wells Analysis
Wells with high connectivity act as information hubs: their water levels correlate strongly with those of many other wells in the network.
```{python}
#| code-fold: true
#| label: fig-hub-wells
#| fig-cap: "Hub Wells by Network Connectivity"
if connectivity is None or hub_indices is None or not DATA_AVAILABLE:
    print("⚠️ HUB WELLS ANALYSIS SKIPPED")
    print(" Would identify wells with highest connectivity (most correlated neighbors)")
    print(" Hub wells are critical for network-wide monitoring - never remove them")
else:
    # Create bar chart of hub wells
    fig = go.Figure(data=[
        go.Bar(
            x=[f"W{i+1}" for i in range(n_wells)],
            y=connectivity,
            marker_color=['#d62728' if i in hub_indices else '#1f77b4' for i in range(n_wells)],
            text=connectivity,
            textposition='outside',
            hovertemplate='<b>Well %{x}</b><br>Connections: %{y}<extra></extra>'
        )
    ])
    fig.update_layout(
        title='Well Network Connectivity<br><sub>Red bars indicate top 5 hub wells</sub>',
        xaxis_title='Well ID',
        yaxis_title='Number of Strong Connections',
        height=500,
        showlegend=False
    )
    fig.show()
```
## Key Insights
::: {.callout-important icon=false}
## 🔍 Information Flow Findings
**Network Structure:**
- **Analysis wells**: Up to 15 wells (the cap applied for visualization clarity) with spatial connectivity
- **Mean correlation**: Moderate to strong (0.3-0.7 range)
- **Hub wells**: Wells with more than 6 strong connections act as network hubs
- **Connectivity pattern**: Spatially proximate wells show stronger correlation
**Information Characteristics:**
- Wells closer in space tend to have higher correlation
- Hub wells are critical for network connectivity
- Network shows clustering around geographic regions
**Spatial Patterns:**
High correlation between wells indicates:
- Shared aquifer units (same geological layer)
- Hydraulic connectivity (water flows between locations)
- Common climate forcing (shared recharge/discharge)
:::
## Management Applications
::: {.callout-important icon=false}
## 🎯 Using Information Flow for Network Decisions
**Decision Framework:**
Information flow analysis answers three critical management questions:
**Question 1: Which wells are most valuable?**
| Connectivity Level | Well Type | Decision |
|-------------------|-----------|----------|
| **>6 connections** | Hub well | **NEVER remove**—represents regional conditions |
| **4-6 connections** | Connected well | Keep unless budget critical |
| **2-3 connections** | Peripheral well | Evaluate—may be redundant OR uniquely positioned |
| **0-1 connections** | Isolated well | **Investigate**—either irreplaceable OR data quality issue |
**Question 2: Where should new wells be placed?**
- **Gap regions**: Areas with no nearby hub wells—add monitoring
- **Between clusters**: Bridge positions reveal inter-region connectivity
- **Near isolated wells**: If isolated well shows concerning trends, add nearby well to confirm
**Question 3: How to prioritize maintenance/upgrades?**
| Priority | Criterion | Why |
|----------|-----------|-----|
| **1 (Highest)** | Hub wells | Failure loses network-wide visibility |
| **2** | Bridge wells | Failure disconnects network regions |
| **3** | Cluster members | Some redundancy exists |
| **4 (Lowest)** | Redundant wells | Other wells capture same signal |
**Cost-Benefit Example:**
If budget requires removing 3 of 15 wells:
1. Identify wells with r > 0.8 to multiple neighbors (redundant)
2. Confirm they're not the only well in their geographic area
3. Remove while keeping all hub wells and isolated wells
4. Estimated information loss: <10% if done correctly
**Warning Signs:**
- Removing a well that's r < 0.4 to all neighbors → Likely losing unique information
- Removing multiple wells from same cluster → May create monitoring blind spot
- Removing hub well → Network fragmentation risk
:::
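Steps 1 and 3 of the cost-benefit example can be sketched as a simple screening pass. This is an illustration under assumptions, not a procedure the chapter runs: `redundancy_candidates` is a hypothetical helper, the thresholds mirror the tables above (r > 0.8 for redundancy, more than 6 connections for a hub), and the 5-well matrix is synthetic. Step 2, the geographic check, still has to be done by hand.

```python
import numpy as np

def redundancy_candidates(corr, connectivity, r_redundant=0.8, min_partners=2, hub_degree=6):
    """Indices of non-hub wells that have several near-duplicate neighbors."""
    corr = np.asarray(corr, dtype=float)
    connectivity = np.asarray(connectivity)
    near_duplicate = corr > r_redundant          # NaN compares as False
    np.fill_diagonal(near_duplicate, False)
    partners = near_duplicate.sum(axis=1)        # redundant neighbors per well
    eligible = (partners >= min_partners) & (connectivity <= hub_degree)
    candidates = np.flatnonzero(eligible)
    # Most-duplicated wells first; stable sort keeps index order on ties
    order = np.argsort(-partners[candidates], kind="stable")
    return [int(i) for i in candidates[order]]

# Synthetic example: wells 0-2 track each other closely, wells 3 and 4 are isolated
corr = np.full((5, 5), 0.2)
corr[:3, :3] = 0.9
np.fill_diagonal(corr, 1.0)
connectivity = np.array([2, 2, 2, 0, 0])

candidates = redundancy_candidates(corr, connectivity)
print(f"Consolidation candidates (0-based index): {candidates}")
```

Removing every candidate from one cluster at once would delete the very signal they share, so a cautious workflow removes one well, recomputes the matrix, and screens again.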
### 1. Priority Monitoring Wells
Hub wells with high connectivity are critical for monitoring network-wide conditions:
```{python}
#| code-fold: true
#| echo: false
if connectivity is None or hub_indices is None or not DATA_AVAILABLE:
    print("⚠️ PRIORITY WELLS ANALYSIS SKIPPED")
    print(" Would list top 5 hub wells by connectivity for monitoring priority")
else:
    print("=== Priority Wells for Continued Monitoring ===")
    print("(Hub wells with highest connectivity)\n")
    for idx in hub_indices:
        print(f" Well {idx+1}: {connectivity[idx]} strong connections")
    print("\nThese wells provide maximum information about network-wide conditions")
```
### 2. Network Optimization
Wells with low connectivity share little information with the rest of the network. By the interpretation tables above they are not redundancy candidates (redundancy lives in tightly correlated clusters); review them for unique local signals or data-quality problems before any budget decision:
```{python}
#| code-fold: true
#| echo: false
if connectivity is None or not DATA_AVAILABLE:
    print("⚠️ NETWORK OPTIMIZATION ANALYSIS SKIPPED")
    print(" Would identify wells with lowest connectivity (weakly connected to the network)")
    print(" Low connectivity wells should be reviewed for unique local signal or data quality issues")
else:
    low_conn_indices = np.argsort(connectivity)[:5]
    print("=== Wells with Lowest Connectivity ===")
    print("(Review for unique local signal or data quality issues)\n")
    for idx in low_conn_indices:
        print(f" Well {idx+1}: {connectivity[idx]} strong connections")
    print("\nNote: Low connectivity doesn't mean unimportant - may serve specific local needs")
```
### 3. Sentinel Network Design
Hub wells serve as early warning sentinels - changes in their water levels likely reflect network-wide trends.
## Physical Interpretation
::: {.callout-tip icon=false}
## 🌍 Hydrological Meaning
**High correlation between wells indicates:**
- **Same aquifer unit**: Shared hydraulic properties and response characteristics
- **Connected flow paths**: Water/pressure propagates between locations
- **Common stressors**: Both wells respond to same climate forcing (precipitation, drought)
- **Spatial proximity**: Wells closer together tend to show more similar behavior
**Network structure reveals:**
- **Hub wells**: Central locations that reflect regional aquifer conditions
- **Peripheral wells**: May tap different aquifer units or isolated flow systems
- **Connectivity patterns**: Strong correlations suggest hydraulic connectivity
**Applications:**
- **Monitoring optimization**: Hub wells provide maximum information density
- **Early warning**: Changes in hub wells signal network-wide trends
- **Redundancy analysis**: Tightly correlated clusters are where redundancy exists; low-connectivity wells may serve specialized local needs
:::
## Limitations
1. **Correlation proxy**: Uses correlation as proxy for true mutual information (linear relationships only)
2. **Sample size**: Analysis limited to wells with sufficient temporal data
3. **No temporal dynamics**: Static analysis doesn't capture time-lagged relationships
4. **Computational constraints**: Analysis uses subset of wells for visualization efficiency
5. **Confounding factors**: External forcings (weather, pumping) can inflate correlation
## Next Steps
→ **Chapter 10**: Network Connectivity Map - Physical interpretation of information pathways
**Cross-Chapter Connections:**
- Uses well network from Part 1
- Complements causal analysis (Chapter 8)
- Informs monitoring design (Chapter 13)
- Foundation for connectivity mapping (Chapter 10)
---
## Summary
Information flow analysis reveals **how signals propagate through the monitoring network**:
✅ **Pairwise correlation computed** - Used as a proxy for shared information between wells
✅ **Network graph constructed** - Visualizes information pathways
✅ **Hub wells identified** - High-connectivity wells are critical for network function
✅ **Redundancy analysis** - Tightly correlated clusters contain redundancy; low-connectivity wells may serve specialized local needs
⚠️ **Simplified analysis** - Uses correlation as proxy for true mutual information
**Key Insight**: Information flow analysis guides **monitoring network optimization**—where to add sensors, where redundancy exists, and which wells are irreplaceable.
---
## Reflection Questions
- In your monitoring network, which wells do you suspect are “hubs” based on experience (for example, they seem to move with everything else), and how could an information-flow analysis like this confirm or challenge that intuition?
- How would you balance using network connectivity results to propose removing low-connection wells against the risk that those wells capture unique local behavior that correlation alone might miss?
- What additional data (for example, pumping, local recharge estimates, or HTEM-based structure) would you want to incorporate before using information-flow patterns to redesign the network?
- How could you combine information flow, causal graphs, and physical flow models to prioritize where to add new wells, upgrade sensors, or co-locate instruments (for example, with streams or weather stations)?
---
## Related Chapters
- [Well Network Analysis](../part-1-foundations/well-network-analysis.qmd) - Source well data
- [Causal Discovery Network](causal-discovery-network.qmd) - Causal relationship identification
- [Network Connectivity Map](network-connectivity-map.qmd) - Physical interpretation
- [Well Spatial Coverage](../part-2-spatial/well-spatial-coverage.qmd) - Spatial network analysis