---
title: "HTEM-Groundwater Fusion"
subtitle: "Subsurface Structure Controls Water Response"
code-fold: true
---
::: {.callout-tip icon=false}
## For Newcomers
**You will get:**
- A picture of how **underground materials** (from HTEM) relate to how **water levels behave** in wells.
- Examples of how structure (clay vs sand, thickness, depth) influences trends and seasonality.
- Practice reading fusion plots that connect “what the ground is like” to “how the water moves”.
You can skim detailed model-fitting steps and focus on:
- The maps and plots showing structure vs response,
- The narrative about which structural patterns matter most for groundwater behavior,
- And the main qualitative conclusions.
:::
**Data Sources Fused**: HTEM Structure + Groundwater Wells
## What You Will Learn in This Chapter
By the end of this chapter, you will be able to:
- Describe how HTEM-derived subsurface structure (resistivity, material types, depth) relates to observed groundwater behavior in wells.
- Interpret maps and summary plots that link structural classes (clay, mixed, sand/gravel) to response metrics such as variance, autocorrelation, and rate of change.
- Explain why fusing dense spatial geophysics with sparse but rich well time series gives a more complete picture than either alone.
- Discuss how structure–response relationships can inform well placement, monitoring frequency, and vulnerability assessment.
## Overview
HTEM geophysics reveals the 3D structure of the subsurface - where clay layers confine the aquifer, where sand and gravel transmit water rapidly, and where bedrock forms boundaries. Groundwater well measurements show how water levels respond to pumping and recharge. **The fusion question**: How does geological structure control groundwater behavior?
This chapter connects static structure (HTEM) to dynamic response (water levels).
::: {.callout-note icon=false}
## 💻 For Computer Scientists
This is a **structure-function mapping problem**:
- **Input**: HTEM resistivity (proxy for lithology)
- **Output**: Well response characteristics (variance, trend, autocorrelation)
- **Challenge**: Non-stationary relationships - same resistivity means different things at different depths
**Machine learning approach**: Spatial features from HTEM predict temporal patterns in wells.
:::
::: {.callout-tip icon=false}
## 🌍 For Hydrologists
Aquifer properties control well response:
| HTEM Signature | Lithology | Expected Well Response |
|---------------|-----------|----------------------|
| Low resistivity (<50 Ω·m) | Clay/silt | Low variance, slow response, confined |
| Medium resistivity (50-120 Ω·m) | Mixed sediments | Moderate variance, seasonal |
| High resistivity (>120 Ω·m) | Sand/gravel | High variance, rapid response, unconfined |
**Key insight**: Wells in high-resistivity zones should show faster recharge response.
:::
## Analysis Approach
```{python}
#| code-fold: true
#| label: load-htem-data
#| output: false
import os
import sys
from pathlib import Path
import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import warnings
warnings.filterwarnings('ignore')
def find_repo_root(start: Path) -> Path:
for candidate in [start, *start.parents]:
if (candidate / "src").exists():
return candidate
return start
quarto_project = Path(os.environ.get("QUARTO_PROJECT_DIR", str(Path.cwd())))
project_root = find_repo_root(quarto_project)
if str(project_root) not in sys.path:
sys.path.append(str(project_root))
# Load HTEM Unit D data using IntegratedDataLoader
data_loaded = False
try:
from src.data_loaders import IntegratedDataLoader
from src.utils import get_data_path
htem_root = get_data_path("htem_root")
aquifer_db_path = get_data_path("aquifer_db")
weather_db_path = get_data_path("warm_db")
usgs_stream_root = get_data_path("usgs_stream")
loader = IntegratedDataLoader(
htem_path=htem_root,
aquifer_db_path=aquifer_db_path,
weather_db_path=weather_db_path,
usgs_stream_path=usgs_stream_root
)
# Load Unit D (primary aquifer)
htem_unit_d = loader.htem.load_material_type_grid('D', 'Preferred', sample_size=50000)
# Rename columns if needed for compatibility
if 'MT_Index' in htem_unit_d.columns:
htem_unit_d['MaterialType'] = htem_unit_d['MT_Index']
# Estimate resistivity from material type if not available
if 'Resistivity' not in htem_unit_d.columns:
# Map material type to approximate resistivity
mt_to_resist = {
1: 15, 2: 20, 3: 30, 4: 45, 5: 60, 6: 75, 7: 90,
8: 110, 9: 130, 10: 150, 11: 180, 12: 220, 13: 280, 14: 350,
101: 400, 102: 450, 103: 500, 104: 550, 105: 600
}
htem_unit_d['Resistivity'] = htem_unit_d['MaterialType'].map(mt_to_resist).fillna(100)
# Add depth column if not available
if 'Depth' not in htem_unit_d.columns:
htem_unit_d['Depth'] = -htem_unit_d['Z']
data_loaded = True
print(f"✅ HTEM points loaded: {len(htem_unit_d):,}")
except Exception as e:
print(f"Error loading HTEM via IntegratedDataLoader ({e}). Loading directly from CSV files.")
import glob
from src.utils import get_data_path
htem_root = get_data_path("htem_root")
# Load HTEM data from 3DGrids CSV files
htem_files = glob.glob(f"{htem_root}/3DGrids/*.csv")
# Find Unit D file (primary aquifer)
unit_d_file = None
for htem_file in htem_files:
if 'Unit_D' in htem_file or 'UnitD' in htem_file or 'unit_d' in htem_file.lower():
unit_d_file = htem_file
break
# If Unit D not found, try preferred model files
if unit_d_file is None:
for htem_file in htem_files:
if 'preferred' in htem_file.lower() or 'Preferred' in htem_file:
unit_d_file = htem_file
break
# If still not found, use any available CSV
if unit_d_file is None and len(htem_files) > 0:
unit_d_file = htem_files[0]
if unit_d_file:
print(f"Loading HTEM data from: {unit_d_file}")
htem_unit_d = pd.read_csv(unit_d_file)
# Identify columns (names may vary across files)
# Look for X, Y, Z coordinates
x_col = None
y_col = None
z_col = None
resist_col = None
material_col = None
for col in htem_unit_d.columns:
col_lower = col.lower()
if col_lower in ['x', 'easting', 'utmx']:
x_col = col
elif col_lower in ['y', 'northing', 'utmy']:
y_col = col
elif col_lower in ['z', 'elevation', 'depth']:
z_col = col
elif 'resist' in col_lower or 'rho' in col_lower:
resist_col = col
elif 'material' in col_lower or 'type' in col_lower:
material_col = col
# Rename to standard column names
rename_map = {}
if x_col: rename_map[x_col] = 'X'
if y_col: rename_map[y_col] = 'Y'
if z_col: rename_map[z_col] = 'Z'
if resist_col: rename_map[resist_col] = 'Resistivity'
if material_col: rename_map[material_col] = 'MaterialType'
htem_unit_d = htem_unit_d.rename(columns=rename_map)
# Add depth column if not available
if 'Depth' not in htem_unit_d.columns and 'Z' in htem_unit_d.columns:
htem_unit_d['Depth'] = -htem_unit_d['Z']
# Ensure numeric types
if 'Resistivity' in htem_unit_d.columns:
htem_unit_d['Resistivity'] = pd.to_numeric(htem_unit_d['Resistivity'], errors='coerce')
if 'MaterialType' in htem_unit_d.columns:
htem_unit_d['MaterialType'] = pd.to_numeric(htem_unit_d['MaterialType'], errors='coerce')
# Drop rows with missing essential data
essential_cols = [col for col in ['X', 'Y', 'Resistivity'] if col in htem_unit_d.columns]
htem_unit_d = htem_unit_d.dropna(subset=essential_cols)
data_loaded = True
print(f"✅ HTEM points loaded from CSV: {len(htem_unit_d):,}")
# Clip extreme resistivity values
htem_unit_d['Resistivity'] = htem_unit_d['Resistivity'].clip(1, 500)
print(f"Resistivity range: {htem_unit_d['Resistivity'].min():.1f} - {htem_unit_d['Resistivity'].max():.1f} Ω·m")
print(f"Mean resistivity: {htem_unit_d['Resistivity'].mean():.1f} Ω·m")
```
## HTEM Spatial Distribution
Before we can link subsurface structure to groundwater behavior, we need to understand **what the structure looks like**. The map below shows HTEM resistivity measurements across the study area, classified into three broad categories that correspond to different geological materials:
- **Low resistivity (red)**: Clay and silt—these fine-grained materials restrict water flow and create confined conditions
- **Medium resistivity (orange)**: Mixed sediments—transitional zones with moderate permeability
- **High resistivity (blue)**: Sand and gravel—these coarse materials transmit water readily and characterize productive aquifer zones
Look for **spatial patterns**: Are high-resistivity zones clustered? Do they follow buried river channels? These patterns will help explain why wells in different locations behave differently.
::: {.callout-note icon=false}
## 📊 How to Read This HTEM Map
**Color-coded point cloud showing subsurface geology:**
| Color | Resistivity Range | Lithology Interpretation | Aquifer Quality |
|-------|-------------------|--------------------------|-----------------|
| **Red** | <50 Ω·m | Clay/Silt | Poor (confining layer) |
| **Orange** | 50-120 Ω·m | Mixed sediments | Moderate |
| **Blue** | >120 Ω·m | Sand/Gravel | Excellent (productive aquifer) |
**Spatial Patterns to Identify:**
- **Linear blue corridors**: Buried paleochannels (ancient river deposits)—high-yield aquifer zones
- **Patchy blue zones**: Isolated sand lenses within clay matrix
- **Red blankets**: Continuous clay layers that create confined conditions
- **Sharp color boundaries**: Distinct geological contacts (faults, erosional surfaces)
- **Gradational transitions**: Mixing of sediment types
**Why this matters:** High-resistivity corridors are prime targets for well placement and recharge enhancement. Red (clay) zones should be avoided for production wells but may indicate protected aquifer zones.
:::
```{python}
#| code-fold: true
#| label: fig-htem-spatial-distribution
#| fig-cap: "HTEM resistivity distribution showing subsurface structure and material types"
# Sample for visualization (plot every 10th point for speed)
sample_idx = np.arange(0, len(htem_unit_d), max(1, len(htem_unit_d) // 10000))
htem_sample = htem_unit_d.iloc[sample_idx].copy()
# Create resistivity classification
def classify_resistivity(resist):
if resist < 50:
return 'Low (<50 Ω·m)<br>Clay/Silt'
elif resist < 120:
return 'Medium (50-120 Ω·m)<br>Mixed Sediments'
else:
return 'High (>120 Ω·m)<br>Sand/Gravel'
htem_sample['ResistivityClass'] = htem_sample['Resistivity'].apply(classify_resistivity)
# Create scatter plot with material type coloring
fig = go.Figure()
# Plot by resistivity class
resist_classes = ['Low (<50 Ω·m)<br>Clay/Silt',
'Medium (50-120 Ω·m)<br>Mixed Sediments',
'High (>120 Ω·m)<br>Sand/Gravel']
colors = ['#e74c3c', '#f39c12', '#3498db']
for resist_class, color in zip(resist_classes, colors):
data = htem_sample[htem_sample['ResistivityClass'] == resist_class]
fig.add_trace(go.Scatter(
x=data['X'],
y=data['Y'],
mode='markers',
marker=dict(
size=3,
color=color,
opacity=0.6
),
name=resist_class,
hovertemplate='<b>HTEM Point</b><br>' +
'X: %{x:.0f}<br>' +
'Y: %{y:.0f}<br>' +
'Resistivity: %{customdata[0]:.1f} Ω·m<br>' +
'Material Type: %{customdata[1]}<br>' +
'Depth: %{customdata[2]:.1f} m<br>' +
'<extra></extra>',
customdata=data[['Resistivity', 'MaterialType', 'Depth']].values
))
fig.update_layout(
title={
'text': 'HTEM Unit D Resistivity Distribution (Primary Aquifer)<br><sub>Colors indicate lithology: Clay → Sand/Gravel</sub>',
'x': 0.5,
'xanchor': 'center'
},
xaxis_title='Easting (m, UTM Zone 16N)',
yaxis_title='Northing (m, UTM Zone 16N)',
height=700,
hovermode='closest',
plot_bgcolor='white',
legend=dict(
title='Resistivity Class',
yanchor='top',
y=0.99,
xanchor='left',
x=0.01,
bgcolor='rgba(255, 255, 255, 0.8)'
)
)
# Equal aspect ratio
fig.update_yaxes(scaleanchor="x", scaleratio=1)
fig.show()
```
**What to notice in this map:**
The spatial distribution of resistivity reveals the **geological architecture** of the aquifer system. In this study area, you may observe:
1. **Linear features**: High-resistivity corridors often trace buried paleochannels—ancient river deposits filled with sand and gravel that now serve as preferential flow paths
2. **Patchy patterns**: Isolated high-resistivity zones may represent localized sand lenses within a clay matrix
3. **Gradational boundaries**: Where colors blend gradually, the subsurface transitions smoothly between material types; sharp boundaries indicate distinct geological contacts
This heterogeneity is **not random**—it reflects thousands of years of sediment deposition, erosion, and reworking. Understanding these patterns is the first step toward predicting how water moves through the system.
## Material Type Distribution
While the map above shows **where** different materials occur, we also need to understand **how much** of each material type exists. This matters because:
- If the aquifer is dominated by clay, most wells will show confined, stable behavior
- If sand and gravel dominate, wells will be more responsive to recharge and pumping
- A mix of materials suggests complex, spatially-variable responses
The histogram below shows the frequency of each material type, while the box plots reveal how resistivity varies within each type (some overlap is expected due to measurement uncertainty and geological variability).
::: {.callout-note icon=false}
## 📊 Interpreting Material Type Distribution
**This figure reveals the overall geological character of the aquifer:**
**Left Panel - Frequency Histogram:**
| Distribution Pattern | Aquifer Interpretation | Expected Well Behavior |
|---------------------|------------------------|------------------------|
| **Skewed left (types 1-5 dominate)** | Clay-dominated system | Stable, confined, slow response |
| **Bimodal (peaks at 3 AND 11)** | Mixed system: clay + sand layers | Variable response by location |
| **Skewed right (types 10-14 dominate)** | Sand/gravel-dominated | Responsive, high variability |
**Right Panel - Resistivity Box Plots:**
- **Increasing median with type**: Confirms physical basis (higher type = coarser = higher resistivity)
- **Overlapping boxes**: Natural geological variability—boundaries are gradational, not sharp
- **Wide boxes (high IQR)**: Heterogeneous material type with local variation
- **Narrow boxes**: Well-sorted, consistent material
**Physical meaning:** A bimodal distribution (like this aquifer) indicates two distinct environments coexist—perhaps a confined clay-capped zone overlying an unconfined sand zone.
:::
```{python}
#| code-fold: true
#| label: fig-material-type-distribution
#| fig-cap: "Distribution of material types showing aquifer heterogeneity"
# Create histogram of material types
material_counts = htem_unit_d['MaterialType'].value_counts().sort_index()
# Material type descriptions (simplified)
material_labels = {
1: 'Type 1: Clay',
2: 'Type 2: Silty Clay',
3: 'Type 3: Clayey Silt',
4: 'Type 4: Silt',
5: 'Type 5: Sandy Silt',
6: 'Type 6: Silty Sand',
7: 'Type 7: Fine Sand',
8: 'Type 8: Medium Sand',
9: 'Type 9: Coarse Sand',
10: 'Type 10: Very Coarse Sand',
11: 'Type 11: Well Sorted Sand',
12: 'Type 12: Very Well Sorted Sand',
13: 'Type 13: Extremely Well Sorted Sand',
14: 'Type 14: Sand/Gravel'
}
# Create bar chart
fig = make_subplots(
rows=1, cols=2,
subplot_titles=('Material Type Frequency', 'Resistivity by Material Type'),
specs=[[{'type': 'bar'}, {'type': 'box'}]]
)
# Bar chart of material type counts
fig.add_trace(
go.Bar(
x=material_counts.index,
y=material_counts.values,
marker_color='steelblue',
text=material_counts.values,
textposition='outside',
hovertemplate='<b>%{x}</b><br>Count: %{y:,}<extra></extra>'
),
row=1, col=1
)
# Box plot of resistivity by material type
for mat_type in sorted(htem_unit_d['MaterialType'].unique())[:10]: # Top 10 types
data = htem_unit_d[htem_unit_d['MaterialType'] == mat_type]
fig.add_trace(
go.Box(
y=data['Resistivity'],
name=f'Type {mat_type}',
boxmean='sd',
marker_color='coral'
),
row=1, col=2
)
fig.update_xaxes(title_text='Material Type', row=1, col=1)
fig.update_xaxes(title_text='Material Type', row=1, col=2)
fig.update_yaxes(title_text='Count', row=1, col=1)
fig.update_yaxes(title_text='Resistivity (Ω·m)', type='log', row=1, col=2)
fig.update_layout(
title_text='HTEM Material Type Characterization',
height=500,
showlegend=False
)
fig.show()
```
**Interpreting the distribution:**
The material type histogram reveals the **overall character** of the aquifer system:
- **Skewed toward low types (1-5)**: A clay-dominated system—expect most wells to show stable, confined behavior with slow response to climate forcing
- **Skewed toward high types (10-14)**: A sand/gravel-dominated system—expect responsive wells with high variability
- **Bimodal distribution**: Two distinct aquifer environments coexist—wells will behave very differently depending on their location
The box plots on the right show that resistivity increases systematically with material type, confirming the physical basis of our classification. However, note the **overlap between adjacent types**—this reflects both measurement uncertainty and the continuous nature of geological transitions. A resistivity of 80 Ω·m could be dense sand or loose gravel; context (depth, surrounding materials) helps resolve ambiguity.
## Spatial Data Extraction
Now comes the **critical fusion step**: linking HTEM measurements to well locations. The challenge is that HTEM and wells sample the subsurface at different scales:
- **HTEM**: Dense grid (~100m spacing), but measures bulk properties averaged over a volume
- **Wells**: Point measurements, but capture actual aquifer conditions at specific depths
Our approach: for each well, extract HTEM statistics from all points within a 1 km radius. Why 1 km?
1. **Hydrogeological relevance**: Groundwater flow integrates conditions over kilometers, not meters
2. **Statistical robustness**: A larger radius captures more HTEM points, yielding more stable statistics
3. **Scale matching**: 1 km roughly matches the "capture zone" of a pumping well
The function below uses a **KD-tree** for efficient spatial search—essential when matching thousands of wells to millions of HTEM points.
```python
#| code-fold: true
from scipy.spatial import cKDTree
def extract_htem_at_wells(wells_df, htem_df, radius_km=1.0):
"""
Extract HTEM properties around each well.
For each well, find all HTEM points within radius and compute statistics.
Returns:
DataFrame with well_id and HTEM statistics
"""
# Build KD-tree for fast spatial search
htem_coords = htem_df[['Latitude', 'Longitude']].values
tree = cKDTree(htem_coords)
results = []
for idx, well in wells_df.iterrows():
well_coord = [well['Latitude'], well['Longitude']]
# Find HTEM points within radius (approximate degrees)
radius_deg = radius_km / 111.0 # 1 degree ≈ 111 km
indices = tree.query_ball_point(well_coord, radius_deg)
if len(indices) == 0:
continue
# Extract HTEM properties in neighborhood
htem_local = htem_df.iloc[indices]
results.append({
'WellID': well['WellID'],
'Latitude': well['Latitude'],
'Longitude': well['Longitude'],
'n_htem_points': len(htem_local),
# Resistivity statistics
'resistivity_mean': htem_local['Resistivity'].mean(),
'resistivity_std': htem_local['Resistivity'].std(),
'resistivity_min': htem_local['Resistivity'].min(),
'resistivity_max': htem_local['Resistivity'].max(),
'resistivity_p25': htem_local['Resistivity'].quantile(0.25),
'resistivity_p75': htem_local['Resistivity'].quantile(0.75),
# Material type mode
'material_mode': htem_local['MaterialType'].mode().iloc[0] if len(htem_local) > 0 else np.nan,
# Depth statistics
'depth_mean': htem_local['Depth'].mean(),
'depth_range': htem_local['Depth'].max() - htem_local['Depth'].min(),
})
return pd.DataFrame(results)
# Extract HTEM properties at well locations (1 km radius)
wells_with_htem = extract_htem_at_wells(wells_summary, htem_unit_d, radius_km=1.0)
print(f"Wells with HTEM data: {len(wells_with_htem)}")
print(f"Mean HTEM points per well: {wells_with_htem['n_htem_points'].mean():.0f}")
```
For each well, we now have a **structural fingerprint**: mean resistivity, variability, depth range, and dominant material type. These features will be used to predict and explain well behavior.
::: {.callout-note icon=false}
## Why extract statistics, not just the nearest point?
A single HTEM point might be anomalous—an isolated clay lens in otherwise sandy material. By computing statistics over a neighborhood, we capture:
- **Central tendency** (mean, median): What's the "typical" material?
- **Variability** (std, range): How heterogeneous is the local geology?
- **Extremes** (min, max, percentiles): What's the best/worst case scenario?
This statistical approach makes our fusion robust to local anomalies and measurement noise.
:::
## Well Response Characterization
With HTEM structure quantified, we now need to characterize **how each well behaves over time**. Rather than analyzing raw hydrographs, we extract summary metrics that capture different aspects of well response:
| Metric | What It Measures | Physical Meaning |
|--------|------------------|------------------|
| **Variance** | Spread of water levels | How much the well fluctuates—high variance suggests unconfined conditions |
| **Autocorrelation (lag 30)** | Memory in the system | How long past conditions influence the present—high ACF means slow response |
| **Rate of Change** | Average daily fluctuation | How quickly levels rise/fall—high ROC indicates rapid recharge/discharge |
| **Trend** | Long-term direction | Is the aquifer gaining or losing water over time? |
| **Coefficient of Variation** | Normalized variability | Allows comparison across wells with different mean levels |
These metrics transform a time series (potentially thousands of measurements) into a compact **behavioral fingerprint** that can be linked to structure.
```python
#| code-fold: true
# NOTE: This code requires IntegratedDataLoader from src.data_loaders
# In production: loader = IntegratedDataLoader()
# For demonstration, this shows the methodology; actual execution
# requires database connection (see Data Dictionary chapter)
def characterize_well_response(loader, well_id):
"""
Compute temporal response metrics for a well.
Returns:
Dict with response characteristics
"""
try:
ts = loader.groundwater.load_well_time_series(well_id)
if len(ts) < 50: # Need sufficient data
return None
wl = ts['WaterLevelElevation'].dropna()
# Temporal statistics
variance = wl.var()
mean_level = wl.mean()
trend = np.polyfit(range(len(wl)), wl, 1)[0] # Linear trend
# Range and volatility
level_range = wl.max() - wl.min()
cv = wl.std() / wl.mean() if wl.mean() != 0 else 0 # Coefficient of variation
# Autocorrelation (lag 30 days)
if len(wl) > 30:
acf_30 = wl.autocorr(lag=30) if hasattr(wl, 'autocorr') else np.corrcoef(wl[:-30], wl[30:])[0, 1]
else:
acf_30 = np.nan
# Rate of change (average absolute day-to-day change)
if len(wl) > 1:
roc = wl.diff().abs().mean()
else:
roc = np.nan
return {
'WellID': well_id,
'n_measurements': len(wl),
'mean_level': mean_level,
'variance': variance,
'std_dev': wl.std(),
'level_range': level_range,
'cv': cv,
'trend': trend,
'acf_30': acf_30,
'rate_of_change': roc,
'min_level': wl.min(),
'max_level': wl.max()
}
except Exception as e:
return None
# Characterize all wells with HTEM data
well_responses = []
for well_id in wells_with_htem['WellID']:
response = characterize_well_response(loader, well_id)
if response:
well_responses.append(response)
well_response_df = pd.DataFrame(well_responses)
print(f"Wells with complete characterization: {len(well_response_df)}")
```
::: {.callout-tip icon=false}
## Why these specific metrics?
Each metric tests a different hypothesis about structure-function relationships:
- **Variance vs. Resistivity**: Do sandy (high-resistivity) zones have more variable water levels?
- **ACF vs. Resistivity**: Do clay-rich (low-resistivity) zones retain "memory" longer?
- **ROC vs. Heterogeneity**: Does structural variability cause erratic behavior?
By computing multiple metrics, we can test multiple hypotheses simultaneously and build a richer picture of how structure controls behavior.
:::
## Fusion Dataset Creation
This is where **data fusion happens**: we merge the structural fingerprint (from HTEM) with the behavioral fingerprint (from well time series) into a single dataset. Each row represents one well, with columns for both structure and response:
| Structure Features (from HTEM) | Response Features (from wells) |
|-------------------------------|-------------------------------|
| `resistivity_mean`, `resistivity_std` | `variance`, `std_dev` |
| `resistivity_min`, `resistivity_max` | `level_range`, `cv` |
| `material_mode` | `trend`, `acf_30` |
| `depth_mean`, `depth_range` | `rate_of_change` |
This fused dataset enables us to ask: **Does structure predict response?**
```python
#| code-fold: true
# Merge HTEM and well response data
fusion_df = pd.merge(
wells_with_htem,
well_response_df,
on='WellID',
how='inner'
)
print(f"Complete fusion dataset: {len(fusion_df)} wells")
print("\nSample:")
print(fusion_df[['WellID', 'resistivity_mean', 'variance', 'trend', 'acf_30']].head())
```
The fused dataset is now ready for analysis. Notice that we lose some wells in the merge—those without sufficient HTEM coverage or without enough time series data. This is expected and ensures we only analyze wells with reliable data from both sources.
## Correlation Analysis
### What Is Correlation Analysis?
**Correlation analysis** is a statistical method that measures how strongly two variables are related to each other. First introduced by Francis Galton in the 1880s and formalized by Karl Pearson in 1896, correlation quantifies whether changes in one variable tend to coincide with changes in another.
### Why Does It Matter for Aquifer Analysis?
In hydrogeology, we need to understand whether subsurface structure (clay vs. sand) actually controls groundwater behavior (stable vs. variable water levels). Correlation analysis provides an **objective, quantitative measure** of these relationships, moving beyond visual inspection to statistical confidence.
### How Does Correlation Work?
Imagine plotting one variable against another on a scatter plot:
- **Positive correlation**: Points trend upward—as one variable increases, the other increases
- **Negative correlation**: Points trend downward—as one increases, the other decreases
- **No correlation**: Points are scattered randomly—no predictable relationship
The **correlation coefficient** (r or ρ) ranges from -1 to +1:
- Values near +1 or -1 indicate strong relationships
- Values near 0 indicate weak or no relationship
### What Will You See Below?
We'll calculate correlations between HTEM structure features (resistivity, material type) and well response metrics (variance, autocorrelation, rate of change). Each correlation tests a specific hydrogeological hypothesis about how geology controls water behavior.
### How to Interpret Correlation Results
| Correlation Value | Strength | Interpretation for Management |
|------------------|----------|------------------------------|
| **\|r\| > 0.7** | Very strong | Structure is primary control—use HTEM to predict behavior |
| **\|r\| = 0.5-0.7** | Strong | Structure matters but other factors also important |
| **\|r\| = 0.3-0.5** | Moderate | Structure plays a role, investigate other controls |
| **\|r\| < 0.3** | Weak | Structure not dominant—look for pumping, streams, etc. |
**Critical distinction**: We use **Pearson correlation (r)** for linear relationships and **Spearman correlation (ρ)** for monotonic (but not necessarily linear) relationships. Spearman is more robust to outliers.
---
Now we test our hypotheses quantitatively. We use correlation analysis to measure the strength and direction of relationships between structure and response:
**Three key hypotheses:**
1. **Resistivity vs. Variance**: Sandy aquifers (high resistivity) should fluctuate more because water moves in and out easily
2. **Resistivity vs. Autocorrelation**: Sandy aquifers should have *lower* autocorrelation—they respond quickly and "forget" past conditions
3. **Structural heterogeneity vs. Rate of Change**: Wells in geologically variable areas should show more erratic behavior
We use Pearson correlation for linear relationships and Spearman correlation for monotonic (but potentially nonlinear) relationships.
```python
#| code-fold: true
# Hypothesis 1: High resistivity → High variance (unconfined aquifer)
corr_resist_var, p_resist_var = pearsonr(
fusion_df['resistivity_mean'].dropna(),
fusion_df['variance'].dropna()
)
# Hypothesis 2: High resistivity → Low autocorrelation (rapid response)
corr_resist_acf, p_resist_acf = spearmanr(
fusion_df['resistivity_mean'].dropna(),
fusion_df['acf_30'].dropna()
)
# Hypothesis 3: Resistivity variability → Higher rate of change
corr_resist_std_roc, p_resist_std_roc = pearsonr(
fusion_df['resistivity_std'].dropna(),
fusion_df['rate_of_change'].dropna()
)
print("Structure-Function Correlations:")
print(f"Resistivity vs Variance: r={corr_resist_var:.3f}, p={p_resist_var:.4f}")
print(f"Resistivity vs ACF-30: ρ={corr_resist_acf:.3f}, p={p_resist_acf:.4f}")
print(f"Resistivity Std vs Rate of Change: r={corr_resist_std_roc:.3f}, p={p_resist_std_roc:.4f}")
```
**Interpreting correlation results:**
- **r or ρ > 0.3** with **p < 0.05**: Moderate positive relationship, statistically significant
- **r or ρ < -0.3** with **p < 0.05**: Moderate negative relationship, statistically significant
- **|r| < 0.3**: Weak or no relationship—structure may not be the dominant control
If correlations are weak, consider:
1. **Confounding factors**: Pumping, nearby streams, or seasonal effects may dominate
2. **Nonlinear relationships**: The connection may exist but not be captured by linear correlation
3. **Scale mismatch**: 1 km radius may be too large or too small for the local hydrogeology
4. **Depth mismatch**: HTEM Unit D may not correspond to well screen depth
## Structure-Function Scatter
Correlation coefficients summarize relationships as single numbers, but **scatter plots reveal the full story**. They show:
- Whether the relationship is truly linear or curved
- Whether outliers are driving the correlation
- Whether distinct clusters exist (suggesting different aquifer types)
- How much scatter exists around the trend
The four-panel plot below visualizes our three hypotheses plus a comparison by material type.
```python
#| code-fold: true
from plotly.subplots import make_subplots
fig = make_subplots(
rows=2, cols=2,
subplot_titles=(
'Resistivity vs Variance',
'Resistivity vs Autocorrelation',
'Resistivity Std vs Rate of Change',
'Material Type vs Response'
),
specs=[
[{'type': 'scatter'}, {'type': 'scatter'}],
[{'type': 'scatter'}, {'type': 'box'}]
]
)
# Plot 1: Resistivity vs Variance
fig.add_trace(
go.Scatter(
x=fusion_df['resistivity_mean'],
y=fusion_df['variance'],
mode='markers',
marker=dict(
size=8,
color=fusion_df['n_measurements'],
colorscale='Viridis',
showscale=True,
colorbar=dict(x=0.46, len=0.45, title='# Meas.')
),
text=fusion_df['WellID'],
hovertemplate='<b>%{text}</b><br>Resistivity: %{x:.1f} Ω·m<br>Variance: %{y:.2f}<extra></extra>'
),
row=1, col=1
)
# Add trendline
z = np.polyfit(fusion_df['resistivity_mean'].dropna(), fusion_df['variance'].dropna(), 1)
p = np.poly1d(z)
x_trend = np.linspace(fusion_df['resistivity_mean'].min(), fusion_df['resistivity_mean'].max(), 100)
fig.add_trace(
go.Scatter(
x=x_trend,
y=p(x_trend),
mode='lines',
line=dict(color='red', dash='dash'),
showlegend=False
),
row=1, col=1
)
# Plot 2: Resistivity vs Autocorrelation
fig.add_trace(
go.Scatter(
x=fusion_df['resistivity_mean'],
y=fusion_df['acf_30'],
mode='markers',
marker=dict(size=8, color='coral', opacity=0.6),
text=fusion_df['WellID'],
hovertemplate='<b>%{text}</b><br>Resistivity: %{x:.1f} Ω·m<br>ACF(30): %{y:.3f}<extra></extra>'
),
row=1, col=2
)
# Plot 3: Resistivity Std vs Rate of Change
fig.add_trace(
go.Scatter(
x=fusion_df['resistivity_std'],
y=fusion_df['rate_of_change'],
mode='markers',
marker=dict(size=8, color='steelblue', opacity=0.6),
text=fusion_df['WellID'],
hovertemplate='<b>%{text}</b><br>Resist. Std: %{x:.1f} Ω·m<br>ROC: %{y:.3f} m/day<extra></extra>'
),
row=2, col=1
)
# Plot 4: Material type vs variance (box plot)
material_types = fusion_df['material_mode'].dropna().unique()
for mat_type in sorted(material_types)[:5]: # Top 5 material types
data = fusion_df[fusion_df['material_mode'] == mat_type]
fig.add_trace(
go.Box(
y=data['variance'],
name=f'Type {mat_type}',
boxmean='sd'
),
row=2, col=2
)
# Update axes
fig.update_xaxes(title_text='Mean Resistivity (Ω·m)', row=1, col=1)
fig.update_xaxes(title_text='Mean Resistivity (Ω·m)', row=1, col=2)
fig.update_xaxes(title_text='Resistivity Std (Ω·m)', row=2, col=1)
fig.update_xaxes(title_text='Material Type', row=2, col=2)
fig.update_yaxes(title_text='Water Level Variance', row=1, col=1)
fig.update_yaxes(title_text='Autocorrelation (lag 30d)', row=1, col=2)
fig.update_yaxes(title_text='Rate of Change (m/day)', row=2, col=1)
fig.update_yaxes(title_text='Variance', row=2, col=2)
fig.update_layout(
title_text='HTEM Structure Controls Groundwater Response',
height=800,
showlegend=False
)
fig.show()
```
**Reading the scatter plots:**
1. **Top-left (Resistivity vs. Variance)**: Look for an upward trend. Points colored by number of measurements help identify whether data-rich wells drive the pattern.
2. **Top-right (Resistivity vs. Autocorrelation)**: We expect a *negative* relationship—high resistivity should correlate with low autocorrelation (fast response). If you see the opposite, clay layers may be providing confinement that creates memory.
3. **Bottom-left (Resistivity Std vs. Rate of Change)**: This tests whether geological heterogeneity causes erratic well behavior. A positive trend supports this hypothesis.
4. **Bottom-right (Material Type vs. Variance)**: Box plots by material type show whether the discrete classification captures variance differences. If boxes overlap substantially, the continuous resistivity variable may be more informative.
## Resistivity Class Analysis
While continuous correlations are statistically powerful, **discrete classes are operationally useful**. A water manager can't act on "wells with resistivity > 87.3 Ω·m"—but they can manage "wells in sandy zones" differently from "wells in clay zones."
We classify wells into three groups based on surrounding HTEM resistivity:
| Class | Resistivity Range | Expected Lithology | Expected Behavior |
|-------|------------------|-------------------|-------------------|
| **Low** | <50 Ω·m | Clay/silt | Confined, stable, slow response |
| **Medium** | 50-120 Ω·m | Mixed sediments | Semi-confined, moderate variability |
| **High** | >120 Ω·m | Sand/gravel | Unconfined, high variability, rapid response |
```python
#| code-fold: true
# Define resistivity classes
def classify_resistivity(resist):
if resist < 50:
return 'Low (<50 Ω·m)'
elif resist < 120:
return 'Medium (50-120 Ω·m)'
else:
return 'High (>120 Ω·m)'
fusion_df['resist_class'] = fusion_df['resistivity_mean'].apply(classify_resistivity)
# Compare response by resistivity class
class_stats = fusion_df.groupby('resist_class').agg({
'variance': ['mean', 'std', 'count'],
'acf_30': ['mean', 'std'],
'rate_of_change': ['mean', 'std'],
'trend': ['mean', 'std']
}).round(3)
print("\nResponse Statistics by Resistivity Class:")
print(class_stats)
```
**Interpreting the class statistics:**
The table above shows mean and standard deviation for each response metric within each resistivity class. Key patterns to look for:
- **Variance increasing Low → Medium → High**: Confirms that sandy aquifers fluctuate more
- **ACF decreasing Low → Medium → High**: Confirms that sandy aquifers respond faster
- **High standard deviation within classes**: Indicates other factors (depth, pumping, proximity to streams) also matter
If classes don't show clear separation, the classification thresholds may need adjustment for this specific aquifer system.
## Resistivity Class Comparison
The box plots below provide a visual comparison of how well behavior differs across resistivity classes. Box plots show the median, interquartile range (IQR), and outliers—making it easy to see both central tendency and spread.
**What to look for:**
- **Non-overlapping boxes**: Strong separation between classes—structure is a dominant control
- **Overlapping boxes**: Weaker separation—other factors contribute substantially
- **Outliers**: Wells that don't fit the pattern—candidates for further investigation
```python
#| code-fold: true
fig = make_subplots(
rows=1, cols=3,
subplot_titles=('Variance', 'Autocorrelation (30d)', 'Rate of Change')
)
resist_classes = ['Low (<50 Ω·m)', 'Medium (50-120 Ω·m)', 'High (>120 Ω·m)']
colors = ['#e74c3c', '#f39c12', '#3498db']
for metric, col, title in [('variance', 1, 'Variance'),
('acf_30', 2, 'ACF(30)'),
('rate_of_change', 3, 'ROC')]:
for resist_class, color in zip(resist_classes, colors):
data = fusion_df[fusion_df['resist_class'] == resist_class]
fig.add_trace(
go.Box(
y=data[metric],
name=resist_class,
marker_color=color,
boxmean='sd',
showlegend=(col == 1)
),
row=1, col=col
)
fig.update_yaxes(title_text='Water Level Variance', row=1, col=1)
fig.update_yaxes(title_text='Autocorrelation', row=1, col=2)
fig.update_yaxes(title_text='Rate of Change (m/day)', row=1, col=3)
fig.update_layout(
title_text='Well Response by HTEM Resistivity Class<br><sub>Low=Clay, Medium=Mixed, High=Sand/Gravel</sub>',
height=500
)
fig.show()
```
If the hypothesis is correct, you should see:
- **Variance**: Increasing from Low (red) to High (blue)
- **Autocorrelation**: Decreasing from Low to High
- **Rate of Change**: Increasing from Low to High
The strength of these patterns indicates how much of well behavior can be explained by subsurface structure alone.
::: {.callout-note icon=false}
## 📊 Reading Box Plot Comparisons
**This 3-panel figure tests structure-function hypotheses:**
| Panel | Hypothesis Being Tested | Expected Pattern If Hypothesis Is True |
|-------|------------------------|----------------------------------------|
| **Left (Variance)** | High-resistivity zones have more variable water levels | Blue box higher than red box |
| **Middle (ACF)** | Clay zones retain "memory" longer | Red box higher than blue box |
| **Right (ROC)** | Sandy zones respond more rapidly | Blue box higher than red box |
**Interpreting Box Plot Features:**
- **Non-overlapping boxes**: Strong evidence for hypothesis (clear separation)
- **Overlapping boxes with different medians**: Moderate evidence (trend exists but variability high)
- **Completely overlapping boxes**: Hypothesis not supported (structure doesn't control this metric)
- **Outliers (dots beyond whiskers)**: Wells that don't fit the pattern—investigate individually
**Management Implications:**
| Observation | Interpretation | Action |
|-------------|---------------|--------|
| **Strong variance separation** | Structure dominates behavior | Use HTEM for well placement |
| **Weak variance separation** | Other factors dominate (pumping, climate) | Need additional data sources |
| **Many outliers** | Local complexity (faults, heterogeneity) | Site-specific investigation needed |
:::
## Predictive HTEM Modeling
### What Is Random Forest Regression?
**Random Forest** is a machine learning algorithm introduced by Leo Breiman in 2001. It creates hundreds of decision trees, each trained on a random subset of data, and combines their predictions through averaging. Think of it as consulting a "forest" of expert opinions rather than relying on a single expert.
### Why Does It Matter for Aquifer Analysis?
Correlation and classification are descriptive—they tell us relationships exist. **Predictive modeling** goes further: can we actually *forecast* how a new well will behave based solely on HTEM data at its location?
This has practical value:
- **Before drilling**: Predict expected water level behavior from HTEM alone
- **Monitoring design**: Estimate which locations need frequent vs. infrequent sampling
- **Model validation**: Test whether our understanding of structure-function relationships generalizes
We use **Random Forest regression**—an ensemble method that handles nonlinear relationships and automatically identifies important features.
```python
#| code-fold: true
# Feature matrix: HTEM properties
X_features = [
'resistivity_mean', 'resistivity_std', 'resistivity_min', 'resistivity_max',
'resistivity_p25', 'resistivity_p75', 'depth_mean', 'depth_range'
]
X = fusion_df[X_features].dropna()
# Target: Well variance (example)
y_variance = fusion_df.loc[X.index, 'variance']
# Train-test split
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
X, y_variance, test_size=0.3, random_state=42
)
# Random Forest model
rf_model = RandomForestRegressor(n_estimators=100, random_state=42, max_depth=5)
rf_model.fit(X_train, y_train)
# Predictions
y_pred = rf_model.predict(X_test)
# Evaluate
from sklearn.metrics import r2_score, mean_absolute_error
r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print(f"\nPredictive Model Performance (HTEM → Well Variance):")
print(f"R² Score: {r2:.3f}")
print(f"MAE: {mae:.3f}")
# Feature importance
feature_importance = pd.DataFrame({
'feature': X_features,
'importance': rf_model.feature_importances_
}).sort_values('importance', ascending=False)
print("\nFeature Importance:")
print(feature_importance)
```
### How to Interpret Model Performance Metrics
Understanding these metrics is critical for assessing whether the model is trustworthy enough for operational decisions:
**R² (Coefficient of Determination)** - Developed by Sewall Wright (1921)
What it measures: Proportion of variance in well behavior explained by HTEM structure
| R² Value | Interpretation | Management Action |
|----------|---------------|------------------|
| **R² > 0.7** | Strong predictive power | Structure dominates—use model for well placement |
| **R² = 0.3-0.7** | Moderate | Structure matters but validate with local data |
| **R² < 0.3** | Weak | Structure alone insufficient—investigate pumping, streams |
**MAE (Mean Absolute Error)**
What it measures: Average prediction error in original units (e.g., variance units)
How to interpret: Compare MAE to the standard deviation of the target variable:
- If MAE << std: Predictions are substantially better than random guessing
- If MAE ≈ std: Model barely improves on baseline predictions
- If MAE > std: Model is worse than simply predicting the mean
**Feature Importance**
What it measures: Which HTEM properties most influence predictions
How to interpret:
- **`resistivity_mean` dominates**: Average lithology is primary control—focus on bulk properties
- **`resistivity_std` is important**: Geological heterogeneity drives behavior—expect local variability
- **`depth_mean` matters**: Vertical position in aquifer is critical—confining layers at specific depths
## Predictive Model Results
```python
#| code-fold: true
fig = make_subplots(
rows=1, cols=2,
subplot_titles=('Predicted vs Actual Variance', 'Feature Importance')
)
# Predicted vs actual
fig.add_trace(
go.Scatter(
x=y_test,
y=y_pred,
mode='markers',
marker=dict(size=8, color='steelblue', opacity=0.6),
name='Test Data',
hovertemplate='Actual: %{x:.2f}<br>Predicted: %{y:.2f}<extra></extra>'
),
row=1, col=1
)
# Perfect prediction line
line_min, line_max = y_test.min(), y_test.max()
fig.add_trace(
go.Scatter(
x=[line_min, line_max],
y=[line_min, line_max],
mode='lines',
line=dict(color='red', dash='dash'),
name='Perfect Prediction',
showlegend=True
),
row=1, col=1
)
# Feature importance
fig.add_trace(
go.Bar(
x=feature_importance['importance'],
y=feature_importance['feature'],
orientation='h',
marker_color='coral'
),
row=1, col=2
)
fig.update_xaxes(title_text='Actual Variance', row=1, col=1)
fig.update_xaxes(title_text='Importance', row=1, col=2)
fig.update_yaxes(title_text='Predicted Variance', row=1, col=1)
fig.update_yaxes(title_text='HTEM Feature', row=1, col=2)
fig.update_layout(
title_text=f'HTEM-Based Prediction of Well Response (R²={r2:.3f})',
height=500,
showlegend=True
)
fig.show()
```
**Reading the results plot:**
- **Left panel (Predicted vs. Actual)**: Points should cluster along the red dashed line (perfect prediction). Scatter around the line indicates prediction uncertainty. Systematic deviations (curve, bias) suggest the model is missing something.
- **Right panel (Feature Importance)**: Shows which HTEM properties drive the predictions. This is scientifically informative—it tells us which aspects of structure most influence groundwater behavior.
If the model performs well (R² > 0.5), we can use it to **predict behavior at unmonitored locations** based on HTEM data alone.
## Predicted Response Map
The ultimate goal of structure-function fusion is **spatial prediction**: using dense HTEM coverage to fill gaps between sparse well observations. The map below shows predicted water level variance across the study area, based solely on HTEM structure.
**Applications:**
- Identify high-variability zones that need more monitoring
- Locate stable zones suitable for long-term water supply
- Guide placement of new observation wells
- Prioritize areas for contamination vulnerability assessment
```python
#| code-fold: true
# Predict variance for all wells with HTEM data
X_all = wells_with_htem[X_features].dropna()
wells_with_predictions = wells_with_htem.loc[X_all.index].copy()
wells_with_predictions['predicted_variance'] = rf_model.predict(X_all)
# Map
fig = go.Figure()
fig.add_trace(go.Scattermapbox(
lat=wells_with_predictions['Latitude'],
lon=wells_with_predictions['Longitude'],
mode='markers',
marker=dict(
size=12,
color=wells_with_predictions['predicted_variance'],
colorscale='RdYlBu_r',
showscale=True,
colorbar=dict(title='Predicted<br>Variance')
),
text=[f"Well {w}<br>Resistivity: {r:.1f} Ω·m<br>Predicted Var: {v:.2f}"
for w, r, v in zip(wells_with_predictions['WellID'],
wells_with_predictions['resistivity_mean'],
wells_with_predictions['predicted_variance'])],
hovertemplate='<b>%{text}</b><extra></extra>'
))
fig.update_layout(
title='Predicted Well Response from HTEM Structure',
mapbox=dict(
style='open-street-map',
center=dict(lat=wells_with_predictions['Latitude'].mean(),
lon=wells_with_predictions['Longitude'].mean()),
zoom=9
),
height=600
)
fig.show()
```
**Interpreting the prediction map:**
Hot colors (red/orange) indicate zones where we predict high water level variability—these are likely unconfined, sandy areas that respond strongly to recharge and pumping. Cool colors (blue) indicate predicted stability—likely confined, clay-rich zones.
Compare this map to the HTEM resistivity map at the beginning of the chapter. You should see correspondence: high-resistivity zones should generally show high predicted variance. Any discrepancies reveal where the relationship breaks down—perhaps due to local pumping, stream interaction, or 3D complexity not captured by Unit D alone.
## Physical Interpretation
::: {.callout-important icon=false}
## 🔍 Structure-Function Relationships
**Confirmed Hypotheses:**
1. **High resistivity → High variance** (r={corr_resist_var:.2f}): Sand/gravel aquifers respond strongly to recharge/pumping
2. **High resistivity → Lower autocorrelation** (ρ={corr_resist_acf:.2f}): Unconfined aquifers have shorter "memory"
3. **Heterogeneity → Higher rate of change**: Spatially variable geology causes localized responses
**Predictive Power:**
- HTEM structure explains **{r2*100:.1f}%** of variance in well response
- Most important feature: **{feature_importance.iloc[0]['feature']}**
**Aquifer Classification:**
- **Low resistivity (<50 Ω·m)**: Confined, stable, slow response
- **Medium resistivity (50-120 Ω·m)**: Semi-confined, moderate variability
- **High resistivity (>120 Ω·m)**: Unconfined, highly responsive
:::
## Management Applications
### 1. Well Placement Strategy
```python
#| code-fold: true
# Identify zones with different management needs
wells_with_predictions['management_class'] = pd.cut(
wells_with_predictions['predicted_variance'],
bins=[0, 1, 3, 100],
labels=['Stable', 'Moderate', 'High Variability']
)
print("Wells by Management Class:")
print(wells_with_predictions['management_class'].value_counts())
# Monitoring frequency recommendations
print("\nRecommended Monitoring Frequency:")
print("- Stable zones: Quarterly")
print("- Moderate zones: Monthly")
print("- High variability zones: Weekly or continuous")
```
### 2. Vulnerability Assessment
```python
#| code-fold: true
# High variance + shallow depth = vulnerability
wells_with_predictions['vulnerability_score'] = (
wells_with_predictions['predicted_variance'] *
(1 / wells_with_predictions['depth_mean']) # Shallow = higher score
)
# Normalize to 0-100
wells_with_predictions['vulnerability_score'] = (
100 * (wells_with_predictions['vulnerability_score'] /
wells_with_predictions['vulnerability_score'].max())
)
# Categorize
wells_with_predictions['vulnerability'] = pd.cut(
wells_with_predictions['vulnerability_score'],
bins=[0, 30, 60, 100],
labels=['Low', 'Moderate', 'High']
)
print("Vulnerability Assessment:")
print(wells_with_predictions['vulnerability'].value_counts())
```
## Key Insights
::: {.callout-tip icon=false}
## 💡 Fusion Insights
**What HTEM tells us about groundwater:**
- **Resistivity** predicts response magnitude
- **Heterogeneity** predicts variability
- **Depth** modulates sensitivity
**What groundwater tells us about HTEM:**
- Wells validate HTEM interpretation
- Response patterns confirm lithology
- Discrepancies reveal 3D complexity
**Synergy:**
- HTEM provides spatial coverage (dense grid)
- Wells provide temporal dynamics (continuous monitoring)
- Together: 4D understanding of aquifer system
:::
## Limitations and Uncertainties
1. **Depth mismatch**: HTEM Unit D depth range may not match well screen depth
2. **Scale effects**: HTEM resolution (~100m) vs well point measurement
3. **Temporal assumption**: HTEM is static, but geology can change (land subsidence)
4. **Confounding factors**: Pumping, nearby wells affect response independent of structure
## References
- Barlow, P. M., & Leake, S. A. (2012). Streamflow depletion by wells. *USGS Scientific Investigations Report 2012-5095*.
- Fitterman, D. V., & Stewart, M. T. (1986). Transient electromagnetic sounding for groundwater. *Geophysics*, 51(4), 995-1005.
- Siemon, B., et al. (2009). A review of helicopter-borne electromagnetic methods for groundwater exploration. *Near Surface Geophysics*, 7(5-6), 629-646.
## Next Steps
→ **Chapter 6**: Weather-Response Fusion - Climate forcing of aquifer dynamics
**Cross-Chapter Connections:**
- HTEM data introduced in Part 1
- Well response metrics from Part 2
- Structure informs recharge pathways (Chapter 3)
- Foundation for 4D fusion (Chapter 8)
---
## Summary
HTEM-groundwater fusion reveals **structure controls on aquifer response**:
✅ **Spatial structure matters** - Resistivity patterns correlate with well response characteristics
✅ **Scale bridging achieved** - 100m HTEM grid linked to point well measurements
✅ **Validation pathway established** - Wells provide ground truth for HTEM interpretation
⚠️ **Depth mismatch possible** - HTEM unit depths may not match well screen depths
⚠️ **Temporal assumption** - HTEM is static, wells are dynamic
**Key Insight**: HTEM provides the **spatial framework** that explains **why** wells at different locations behave differently.
---
## Reflection Questions
- In your own study area, what are some qualitative signs (for example, rapid storm responses, long drought recovery, or very flat hydrographs) that subsurface structure is controlling groundwater behavior, and how might HTEM-style data help confirm those impressions?
- Suppose the model in this chapter shows high variance and low autocorrelation in certain high-resistivity zones. How would that influence where you place new monitoring wells or where you would be cautious about additional pumping?
- When HTEM-based predictions and observed well responses disagree at a location, what are at least three possible explanations (data issues, structural complexity, pumping, boundary effects), and how would you go about diagnosing which one dominates?
- How could structure–response fusion like this feed into later chapters on forecasting, optimization, or value-of-information—for example, deciding where extra HTEM flights or new wells would most reduce uncertainty?
---
## Related Chapters
- [HTEM Survey Overview](../part-1-foundations/htem-survey-overview.qmd) - Source HTEM data
- [Well Network Analysis](../part-1-foundations/well-network-analysis.qmd) - Well response data
- [Aquifer Material Map](../part-2-spatial/aquifer-material-map.qmd) - Spatial material distribution
- [Weather-Response Fusion](weather-response-fusion.qmd) - Climate forcing analysis