54  Terminology Translation Guide

The Rosetta Stone for interdisciplinary aquifer data science

55 Why Translation Matters

Environmental data science requires collaboration across disciplines that often use different words for the same concepts, or worse, the same words for different concepts. This chapter serves as a living translation guide to bridge these gaps.

Who this helps:

  • Computer scientists learning hydrogeology terminology
  • Hydrogeologists learning data science methods
  • Statisticians understanding domain context
  • Geophysicists connecting EM theory to analysis
  • Students navigating multiple disciplines

How to use this guide:

  • Search: Use Ctrl+F / Cmd+F to find any term quickly.
  • Read across: Check equivalent concepts in other disciplines.
  • Check confusions: See “Common Confusion” sections for pitfalls.
  • See examples: “In This Project” shows concrete applications.

If you are completely new to groundwater, start with the Plain-Language Basics below. You do not need to memorize definitions; come back here whenever a chapter uses a term you do not recognize.


56 Plain-Language Basics

These are the core ideas that appear throughout the playbook, written for readers with no water background.

| Term | Plain Description | Why It Matters in This Playbook | Real Example |
|---|---|---|---|
| Groundwater | Water stored in pores and cracks of rocks and sediments underground. | It is the main water source we are trying to understand and manage. | When you drill a well 50 meters deep and water fills it to 10 meters below surface, that’s groundwater from the aquifer. |
| Aquifer | A body of rock or sediment that can store and transmit usable amounts of groundwater, like a buried sponge. | Most of the analyses ask: Where is the aquifer? How full is it? How does it respond to weather and pumping? | Unit D (Mahomet Aquifer) is a buried sand/gravel valley 12-96 m deep that stores billions of gallons of water. |
| Confining layer | A layer of clay or rock that does not let much water pass through. | Protects deeper aquifers from quick changes at the land surface, making them respond more slowly. | Unit E (clay layer above Unit D) prevents surface spills from quickly reaching the drinking water aquifer below. |
| Recharge | Water that soaks down from the surface (rain, snowmelt, irrigation) to refill the aquifer. | Links weather and land-surface processes to long-term groundwater levels. | Spring rains in Illinois soak through soil → percolate down through sand → raise water levels in Unit D over weeks to months. |
| Well | A hole drilled into the ground to reach groundwater, often with sensors that measure water level. | Provides direct observations of how the aquifer is behaving at specific locations. | Our 356 observation wells measure water levels every 15 minutes, creating a 1-million-record time series. |
| Water level | The height of groundwater in a well, usually measured relative to a reference point. | Rising or falling water levels tell us if the aquifer is gaining or losing storage. | If water level rises 2 meters in spring, the aquifer gained storage (recharged). If it drops 1 meter in summer, it lost storage. |
| HTEM | Helicopter-borne geophysical survey that measures how the ground resists electrical currents. | Gives us a 3D picture of underground materials without drilling, which we link to aquifer properties. | The 2008 helicopter survey mapped 2,361 km² in weeks; drilling enough wells for the same coverage would take decades and millions of dollars. |
| Resistivity | A measure of how strongly a material resists electric current; clays are low, sands and gravels are high. | Used as a proxy for material type and aquifer quality in HTEM maps. | Clay: 5-30 Ω·m (low), Sand: 100-200 Ω·m (high). High resistivity = good aquifer material. |
| Confined aquifer | An aquifer trapped between confining layers, reacting mainly to pressure changes, not directly to the surface water table. | Explains why some wells show tiny seasonal swings but long-term memory of past conditions. | When Unit D is sealed by clay above/bedrock below, water levels change slowly (±0.5 m) but track multi-year climate patterns. |
| Unconfined aquifer | An aquifer whose top surface is the water table, directly connected to the surface. | Responds quickly to rain and drought with larger seasonal swings. | Shallow aquifers near streams can swing 3-5 meters seasonally, rising quickly after rain, dropping in summer. |
| Hydraulic head | The potential energy of water at a point (combination of elevation and pressure). | Water flows from high head to low head; this determines groundwater flow direction. | If Well A has head of 200 m and Well B has 195 m (5 km away), water flows from A→B at ~1 meter drop per kilometer. |
| Transmissivity | How easily water flows horizontally through the full thickness of an aquifer. | High transmissivity = wells produce more water, aquifer recovers faster from pumping. | Good sand aquifer: T = 1000 m²/day (productive). Clay layer: T = 1 m²/day (poor, can’t supply wells). |
| Storativity | The volume of water an aquifer releases (or stores) per unit area per unit head change. | Determines how much water level drops when you pump, or rises when it rains. | Unconfined: S = 0.15 (15% of aquifer volume drainable). Confined: S = 0.0001 (only 0.01% released by pressure). |

57 Core Concept Translations

57.1 Master Translation Table

| Computer Science | Hydrogeology | Statistics | Geophysics | Unified Meaning |
|---|---|---|---|---|
| Outlier detection | Anomalous water levels | Statistical anomaly | Measurement error | Identifying observations that deviate from expected patterns - requires domain context to interpret |
| Feature engineering | Aquifer properties | Predictor variables | Material parameters | Transforming raw observations into model inputs that capture relevant physics |
| Clustering | Aquifer compartments | Spatial grouping | Material zones | Identifying natural groupings where similar properties occur together |
| Classification | Lithology mapping | Categorical prediction | Material identification | Assigning observations to discrete categories (e.g., sand vs clay) |
| Regression | Empirical relationships | Continuous prediction | Forward modeling | Predicting continuous values (e.g., water level, resistivity) |
| Time series forecasting | Water level prediction | ARIMA/Prophet | - | Extrapolating temporal patterns into the future |
| Dimensionality reduction | Stratigraphic simplification | PCA/Factor analysis | Layer averaging | Reducing complexity while preserving essential information |
| Interpolation | Spatial estimation | Kriging/IDW | Grid generation | Estimating values at unobserved locations from nearby measurements |
| Cross-validation | Independent validation | Model assessment | Test-train split | Evaluating model performance on data not used for training |
| Hyperparameter tuning | Model calibration | Parameter optimization | Inversion tuning | Finding optimal configuration for model performance |
| Supervised learning | Training on known lithology | Labeled data modeling | Constrained inversion | Learning from observations with known outcomes |
| Unsupervised learning | Exploratory analysis | Pattern discovery | Data-driven zonation | Finding structure in data without predefined labels |
| Ensemble methods | Multi-model prediction | Bagging/Boosting | Combined inversions | Combining multiple models to improve predictions |
| Neural networks | Non-linear modeling | Deep learning | Complex mapping | Flexible models that learn hierarchical patterns |
| Gradient descent | Optimization | Iterative minimization | Inversion algorithm | Iteratively improving model by following error gradient |
| Loss function | Misfit function | Error metric | Data residual | Quantifies difference between model predictions and observations |
| Overfitting | Over-parameterization | Poor generalization | Non-unique solution | Model fits training data perfectly but fails on new data |
| Regularization | Parsimony constraint | Penalized regression | Damping/Smoothing | Constraining model complexity to prevent overfitting |
| Batch processing | Bulk analysis | - | Survey-wide processing | Processing multiple records simultaneously for efficiency |
| Pipeline | Workflow | Processing chain | Analysis sequence | Series of automated steps from raw data to results |

58 Spatial Analysis Translations

| Computer Science | Hydrogeology | Statistics | Unified Meaning |
|---|---|---|---|
| Spatial autocorrelation | Aquifer continuity | Tobler’s First Law | Nearby locations are more similar than distant ones |
| Variogram | Spatial structure | Covariance function | How similarity decreases with distance |
| Kriging | Optimal interpolation | BLUE estimation | Best Linear Unbiased Estimator for spatial data |
| Neighborhood search | Zone of influence | Local estimation | Determining which nearby points affect prediction |
| Anisotropy | Directional permeability | Directional correlation | Properties vary differently in different directions |
| Range | Correlation distance | Spatial dependence limit | Maximum distance where spatial correlation exists |
| Sill | Total variance | Asymptotic variance | Variance between points separated by more than the range |
| Nugget | Measurement error | Small-scale variance | Discontinuity at zero distance |
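
Range, sill, and nugget are usually read off an empirical semivariogram. The sketch below shows one way to compute it with NumPy; the coordinates, values, and bin edges are synthetic assumptions for illustration, not project data.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Mean semivariance per distance bin (NaN where a bin has no pairs)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # All unique point pairs (i < j)
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    semivar = 0.5 * (values[i] - values[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for b in range(len(bin_edges) - 1):
        in_bin = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = semivar[in_bin].mean()
    return gamma

# Synthetic field: smooth spatial signal plus small measurement noise
rng = np.random.default_rng(0)
coords = rng.uniform(0, 2000, size=(300, 2))          # metres (assumed)
values = np.sin(coords[:, 0] / 400.0) + 0.1 * rng.standard_normal(300)

bin_edges = np.arange(0, 1100, 100)                   # 100 m lag bins up to 1 km
gamma = empirical_semivariogram(coords, values, bin_edges)
for lo, g in zip(bin_edges[:-1], gamma):
    print(f"{lo:4d}-{lo + 100:4d} m: semivariance = {g:.3f}")
```

Semivariance that starts above zero in the first bin suggests a nugget; the distance at which the curve flattens approximates the range, and the plateau value approximates the sill.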

59 Temporal Analysis Translations

| Computer Science | Hydrogeology | Statistics | Unified Meaning |
|---|---|---|---|
| Autocorrelation | System memory | Temporal dependence | Current values depend on past values |
| Lag | Response time | Time shift | Delay between cause and effect |
| Trend | Long-term change | Systematic component | Non-stationary mean over time |
| Seasonality | Annual cycle | Periodic component | Repeating patterns at fixed intervals |
| Stationarity | Equilibrium | Constant statistics | Statistical properties don’t change over time |
| Differencing | Change analysis | Detrending | Removing non-stationarity by subtracting previous values |
| Decomposition | Component separation | STL/Seasonal | Breaking time series into trend, seasonal, residual |
| Change point | Regime shift | Structural break | Time when system behavior fundamentally changes |
| Wavelet analysis | Multi-scale patterns | Time-frequency | Identifying patterns at multiple timescales |

60 Data Quality Translations

| Computer Science | Hydrogeology | Statistics | Unified Meaning |
|---|---|---|---|
| Missing data | Measurement gaps | NA/NaN values | Observations not recorded or lost |
| Imputation | Gap-filling | Missing value estimation | Estimating missing values from available data |
| Normalization | Unit conversion | Standardization | Scaling variables to common range |
| Filtering | Data cleaning | Outlier removal | Removing erroneous or irrelevant observations |
| Resampling | Time aggregation | Temporal binning | Changing temporal resolution (hourly → daily) |
| Data fusion | Multi-source integration | Data combination | Merging different data types for joint analysis |
| Quality flags | Data codes | Data qualifiers | Indicators of reliability or issues |
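
Several rows of this table (missing data, imputation, resampling, quality flags) can be illustrated in a few lines of pandas. This is a minimal sketch on a synthetic 15-minute record; the series, the gap, and the flag labels are assumptions, not the project's schema.

```python
import numpy as np
import pandas as pd

# Hypothetical 15-minute water level record (metres) covering three days
index = pd.date_range("2024-01-01", periods=4 * 24 * 3, freq="15min")
levels = pd.Series(195.0 + 0.01 * np.arange(len(index)), index=index)
levels.iloc[40:48] = np.nan  # missing data: simulated two-hour sensor outage

# Quality flags: record which values are observed and which will be estimated
flag = np.where(levels.isna(), "estimated", "observed")

# Imputation / gap-filling: linear interpolation weighted by time
filled = levels.interpolate(method="time")

# Resampling / time aggregation: 15-minute values -> daily means
daily = filled.resample("D").mean()
print(daily)
```

Keeping the flag alongside the filled series lets later analyses exclude or down-weight estimated values.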

61 Model Performance Translations

| Computer Science | Statistics | Hydrogeology | Unified Meaning |
|---|---|---|---|
| Accuracy | Correct classification rate | Prediction success | Fraction of predictions that are correct |
| Precision | Positive predictive value | - | Of predicted positives, how many are correct |
| Recall | Sensitivity / TPR | - | Of actual positives, how many were found |
| F1 score | Harmonic mean of precision and recall | - | Balanced measure of model performance |
| RMSE | Root mean squared error | Prediction error | Typical magnitude of prediction errors, with large errors weighted more heavily |
| R² | Coefficient of determination | Variance explained | Proportion of variance captured by model |
| AIC/BIC | Information criterion | Model parsimony | Balances model fit with complexity |
| Confusion matrix | Classification table | Contingency table | Cross-tabulation of predicted vs actual classes |
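
The classification rows above all derive from the four cells of a confusion matrix. A small pure-Python sketch, using made-up sand/clay labels, shows how they relate:

```python
def classification_metrics(actual, predicted, positive="sand"):
    """Accuracy, precision, recall, and F1 for one 'positive' class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical drill-log labels vs model predictions
actual    = ["sand", "sand", "sand", "sand", "clay", "clay", "clay", "clay", "clay", "clay"]
predicted = ["sand", "sand", "sand", "clay", "sand", "clay", "clay", "clay", "clay", "clay"]

metrics = classification_metrics(actual, predicted)
print(metrics)
```

Here one sand location is missed and one clay location is mislabeled as sand, so accuracy is 0.8 while precision, recall, and F1 are each 0.75.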

62 Common Confusion Points

62.1 1. Spatial Autocorrelation

62.1.1 What Each Discipline Says

Computer Science: “Data points that are close together have similar values. This violates the i.i.d. assumption of most ML algorithms.”

Hydrogeology: “Aquifer properties vary smoothly across space due to depositional processes. Tobler’s First Law: Everything is related to everything else, but near things are more related than distant things.”

Statistics: “The covariance structure depends on distance. We model this with variograms and use spatial cross-validation instead of random splits.”

62.1.2 Why It Matters

  • Standard train/test splits fail (nearby points in train and test leak information)
  • Must use spatial CV or block CV
  • Predictions inherit spatial structure from training data
  • Uncertainty quantification requires spatial correlation modeling
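
One common way to implement block CV is to assign each point to a grid cell and keep whole cells together in each fold. The sketch below uses scikit-learn's `GroupKFold` for this; the coordinates and the 2,500 m block size are illustrative assumptions, and a real analysis would pick the block size from the variogram range.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(42)
xy = rng.uniform(0, 10_000, size=(200, 2))   # hypothetical coordinates in metres

# Label each point with the grid cell (block) it falls in
block_size_m = 2_500.0
cell = np.floor(xy / block_size_m).astype(int)
groups = cell[:, 0] * 1000 + cell[:, 1]

# GroupKFold never places the same block in both train and test
splits = list(GroupKFold(n_splits=4).split(xy, groups=groups))
for train_idx, test_idx in splits:
    shared = set(groups[train_idx]) & set(groups[test_idx])
    print(f"train={len(train_idx)} test={len(test_idx)} shared blocks={len(shared)}")
```

Points on either side of a block boundary can still be close together, so some workflows also drop a buffer strip around each test block.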

62.1.3 In This Project

  • HTEM resistivity shows strong spatial autocorrelation (range ~500-1000m)
  • Well measurements are spatially correlated (aquifer continuity)
  • See: Part 2 - Spatial Patterns for variogram analysis

62.2 2. Feature vs Property

62.2.1 What Each Discipline Says

Computer Science: “Features are input variables to a model. We engineer features by transforming raw data (e.g., log transform, polynomial features, interactions).”

Hydrogeology: “Aquifer properties are physical characteristics: transmissivity (T), storativity (S), hydraulic conductivity (K). These come from pumping tests and geological analysis.”

62.2.2 How They Connect

  • CS features ← Derived from → Hydro properties
  • feature_depth = Z-coordinate → Physical: confining_pressure
  • feature_resistivity_log → Physical: clay_content

62.2.3 In This Project

  • HTEM resistivity → Feature for predicting material type
  • Depth, elevation, neighboring values → Features
  • True aquifer properties (K, T, S) are target variables OR constraints

62.3 3. Outlier vs Anomaly

62.3.1 What Each Discipline Says

Computer Science: Outlier = Statistical anomaly (>3σ from mean)

Hydrogeology: Anomaly = Unexpected measurement (could be real or error)

62.3.2 What It Could Be

  1. Measurement error (sensor malfunction) → Remove
  2. Pump test (intentional drawdown) → Flag, don’t remove
  3. Natural event (earthquake, flood) → Keep, study
  4. Contamination plume (localized change) → Key finding!

62.3.3 Decision Rule

Don’t auto-remove outliers! Investigate with domain knowledge.
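
In practice this rule often means flagging instead of deleting. The sketch below applies the >3σ test to a synthetic record and then labels each flagged value using a hypothetical pump-test log; the data, the log, and the label names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
index = pd.date_range("2024-06-01", periods=200, freq="60min")
levels = pd.Series(195.0 + 0.05 * rng.standard_normal(200), index=index)
levels.iloc[100:104] -= 3.0  # synthetic pump-test drawdown

# Hypothetical operations log: (start, end) of known pump tests
pump_tests = [(index[100], index[103])]

# Statistical screen: more than 3 standard deviations from the mean
z = (levels - levels.mean()) / levels.std()
df = pd.DataFrame({"level_m": levels, "is_outlier": z.abs() > 3})

# Domain screen: explain flagged values where possible, keep every row
df["reason"] = ""
df.loc[df["is_outlier"], "reason"] = "unexplained"
for start, end in pump_tests:
    in_test = (df.index >= start) & (df.index <= end)
    df.loc[in_test & df["is_outlier"], "reason"] = "pump_test"

print(df[df["is_outlier"]])
```

Rows labeled "unexplained" are the ones that need a closer look before any decision to remove them.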

62.3.4 In This Project

  • Water level “outliers” often = pump tests (intentional)
  • Resistivity “outliers” often = geological contacts (real)
  • See: Part 1 - Data Quality for outlier handling

62.4 4. Training vs Calibration

62.4.1 What Each Discipline Says

Computer Science: Training = Fitting model parameters to minimize loss function on labeled data

Hydrogeology: Calibration = Adjusting model parameters until model matches field observations

Geophysics: Inversion = Estimating subsurface structure from surface measurements (ill-posed problem)

62.4.2 Similarities

  • All optimize parameters to match observations
  • All risk overfitting

62.4.3 Differences

  • Training: Large labeled dataset, many samples
  • Calibration: Few observations, physics-based model
  • Inversion: Underdetermined (infinitely many solutions fit the data), requires regularization
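
To make the role of regularization concrete, here is a toy linear inverse problem with fewer observations than unknowns, solved by damped (Tikhonov) least squares. The forward operator `G`, the "true" model, and the damping values are all invented for illustration; real HTEM inversion is non-linear and far more involved.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_params = 5, 20                      # fewer observations than unknowns
G = rng.standard_normal((n_obs, n_params))   # hypothetical linear forward operator
m_true = np.zeros(n_params)
m_true[8:12] = 1.0
d = G @ m_true + 0.01 * rng.standard_normal(n_obs)   # noisy observations

def tikhonov(G, d, alpha):
    """Minimise ||G m - d||^2 + alpha * ||m||^2 (damped least squares)."""
    n = G.shape[1]
    return np.linalg.solve(G.T @ G + alpha * np.eye(n), G.T @ d)

m_weak = tikhonov(G, d, alpha=1e-3)    # fits data closely, larger model norm
m_strong = tikhonov(G, d, alpha=10.0)  # smoother/smaller model, larger misfit

for name, m in [("weak", m_weak), ("strong", m_strong)]:
    print(name, "misfit:", np.linalg.norm(G @ m - d), "model norm:", np.linalg.norm(m))
```

The damping term selects one model out of the many that fit the data; increasing `alpha` trades data fit for a simpler model, which is the same trade-off that regularization makes in ML training.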

62.4.4 In This Project

  • HTEM processing = Inversion (measured EM response → resistivity model; lithology is then interpreted from resistivity)
  • Material type classifier = Training (supervised learning)
  • Groundwater model = Calibration (match observed heads)

62.5 5. Prediction vs Forecast

62.5.1 Statistics Perspective

  • Prediction: Estimating unknown values (spatial or temporal)
  • Forecast: Predicting future values (temporal only)
  • Projection: Conditional “what-if” scenarios

62.5.2 Hydrogeology Perspective

  • Prediction: Where to drill for water (spatial)
  • Forecast: Water levels next month (temporal)
  • Projection: Impact of climate change (scenario)

62.5.3 Key Distinction

Forecasts assume trends continue. Projections explore alternatives.

62.5.4 In This Project

  • Well productivity prediction (spatial)
  • 30-day water level forecast (temporal)
  • Climate change projections (scenarios)

62.6 6. Clustering Purposes

62.6.1 Different Goals

Computer Science: Find groups that minimize within-cluster variance

Hydrogeology: Delineate aquifer compartments with similar flow properties

Statistics: Identify distinct statistical populations

62.6.2 Critical Difference

  • CS: k chosen by elbow plot or silhouette score
  • Hydro: k should match expected geological units
  • Stats: k validated by mixture model BIC

62.6.3 In This Project

We constrain k=6 for stratigraphic units (domain knowledge) rather than optimize k statistically.


62.7 7. Depth vs Elevation

62.7.1 Common Confusion

These are NOT interchangeable!

Warning: Critical Beginner Mistake

Many newcomers confuse depth and elevation. This causes serious analysis errors!

The key difference:

  • Depth goes DOWN from where you stand (depth = 0 at surface, increases downward)
  • Elevation goes UP from sea level (elevation increases upward, like a mountain)

They move in opposite directions!

62.7.2 Depth to Water (DTW)

  • What it is: How far down to reach water (feet or meters)
  • Direction: Increases when water level drops (more depth to reach water)
  • Reference: Measured from land surface (where you stand)
  • Used for: Well drilling depth, pumping lift
  • Example: “Water is 10 meters deep” = 10 meters below your feet

62.7.3 Water Surface Elevation (WSE)

  • What it is: Height of water surface above sea level
  • Direction: Decreases when water level drops (surface is lower)
  • Reference: Measured from mean sea level (like mountain heights)
  • Used for: Hydraulic gradients, flow direction
  • Comparable: Can compare between wells at different surface elevations
  • Example: “Water surface elevation is 195 meters” = 195 meters above sea level

62.7.4 Formula

WSE = Land_Surface_Elevation - Depth_To_Water

Note: Concrete Example

Well A:

  • Land surface elevation: 210 meters above sea level
  • Depth to water: 15 meters below surface
  • Water surface elevation: 210 - 15 = 195 meters above sea level

Well B (1 km away):

  • Land surface elevation: 205 meters above sea level
  • Depth to water: 8 meters below surface
  • Water surface elevation: 205 - 8 = 197 meters above sea level

What this tells us:

  • Water flows from B (197 m) to A (195 m) because the water surface elevation is higher at B
  • Depth to water differs by 7 m (15 m vs 8 m), but the water surfaces differ by only 2 m, because the land surface at Well A is 5 m higher
  • You cannot compare depths directly between wells at different elevations!

Why WSE matters: It tells you which direction groundwater flows (high → low elevation), regardless of surface topography.
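
The Well A / Well B arithmetic can be written as a short script. The dictionary layout and function name are illustrative choices, not the project's database schema.

```python
# Values from the Well A / Well B example (metres)
wells = {
    "A": {"land_surface_m": 210.0, "depth_to_water_m": 15.0},
    "B": {"land_surface_m": 205.0, "depth_to_water_m": 8.0},
}

def water_surface_elevation(land_surface_m, depth_to_water_m):
    """WSE = land surface elevation - depth to water."""
    return land_surface_m - depth_to_water_m

wse = {name: water_surface_elevation(**w) for name, w in wells.items()}

# Flow goes from the higher water surface to the lower one
upgradient = max(wse, key=wse.get)
downgradient = min(wse, key=wse.get)
distance_km = 1.0
gradient_m_per_km = (wse[upgradient] - wse[downgradient]) / distance_km

print(wse)  # {'A': 195.0, 'B': 197.0}
print(f"Flow: {upgradient} -> {downgradient}, {gradient_m_per_km} m per km")
```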

62.7.5 In This Project

  • Database stores: DTW (measured directly)
  • Analysis uses: WSE (calculated from formula)
  • Flow direction: Determined by WSE gradients, not DTW
  • See: Data Dictionary for database schema and column definitions

62.8 8. Resistivity vs Conductivity

62.8.1 Geophysics Clarification

Resistance (Ω):

  • Property of a specific object
  • Depends on geometry

Resistivity (Ω·m):

  • Material property (independent of geometry)
  • What HTEM measures
  • Inverse of electrical conductivity

Electrical Conductivity (S/m or mS/m):

  • Inverse of resistivity: σ = 1/ρ
  • Higher in saline water, lower in fresh water

Hydraulic Conductivity (m/day):

  • Completely different! (flow property, not electrical)
  • Can correlate with resistivity (sand = high K, high ρ)

62.8.2 In This Project

  • HTEM measures resistivity (ρ in Ω·m)
  • Clay: 1-10 Ω·m (low resistivity, high electrical conductivity)
  • Sand: 100-1000 Ω·m (high resistivity, high hydraulic conductivity)
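
Because σ = 1/ρ, converting between the two electrical quantities is a one-liner; the only trap is the unit prefix. A minimal sketch (the function name is illustrative):

```python
def resistivity_to_conductivity_ms_per_m(rho_ohm_m):
    """Electrical conductivity in mS/m from resistivity in ohm-metres.

    sigma [S/m] = 1 / rho, and 1 S/m = 1000 mS/m.
    """
    return 1000.0 / rho_ohm_m

for rho in (10.0, 100.0, 1000.0):
    sigma = resistivity_to_conductivity_ms_per_m(rho)
    print(f"rho = {rho:6.0f} ohm-m  ->  sigma = {sigma:5.0f} mS/m")
```

So a 10 Ω·m clay corresponds to 100 mS/m, while a 1000 Ω·m sand corresponds to 1 mS/m. None of this says anything directly about hydraulic conductivity, which has to be estimated separately.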

63 Discipline-Specific Glossaries

63.1 Computer Science → Hydrogeology

When you say…Hydrogeologists mean…

  • “Training data” → Wells with known lithology
  • “Test data” → New wells or blind validation set
  • “Features” → Geophysical measurements + spatial coordinates
  • “Labels” → Material types from drill logs
  • “Model prediction” → Lithology interpretation
  • “Model uncertainty” → Geological uncertainty / non-uniqueness
  • “Hyperparameter tuning” → Model calibration
  • “Feature importance” → Sensitivity analysis
  • “Ensemble model” → Multiple scenarios / realizations
  • “Cross-validation” → Independent validation wells

63.2 Hydrogeology → Computer Science

When you say…Computer scientists mean…

  • “Hydraulic head” → Target variable (regression)
  • “Transmissivity” → Derived feature (from multiple sources)
  • “Aquifer heterogeneity” → High data variance / noise
  • “Anisotropy” → Directional features matter
  • “Boundary conditions” → Model constraints
  • “Calibration” → Training / fitting
  • “Validation” → Test set evaluation
  • “Conceptual model” → Model architecture choice
  • “Uncertainty” → Prediction confidence intervals
  • “Sensitivity analysis” → Feature importance / ablation study

63.3 Statistics → Hydrogeology

When you say…Hydrogeologists mean…

  • “Random variable” → Measured quantity with uncertainty
  • “Probability distribution” → Range of plausible values
  • “Spatial process” → Geological property field
  • “Stochastic simulation” → Multiple equally likely realizations
  • “Bayesian inference” → Updating understanding with new data
  • “Prior distribution” → Geological expectation before data
  • “Likelihood” → Consistency with observations
  • “Posterior distribution” → Updated geological understanding

64 Quick Reference Cards

64.1 For Computer Scientists

64.1.1 Key Concepts

  1. Hydraulic head = Potential energy of water (drives flow)
  2. Darcy’s Law = Q = -K·A·(dh/dl) (groundwater’s Ohm’s Law)
  3. Aquifer types: Confined (pressurized) vs Unconfined (water table)
  4. Transmissivity = How easily water flows horizontally (T = K × b)
  5. Storativity = How much water is stored/released
  6. Recharge = Water entering aquifer (from precipitation)
  7. Discharge = Water leaving aquifer (to wells, streams)
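
Items 2 and 4 can be checked with simple arithmetic. The sketch below assumes a hypothetical aquifer with K = 20 m/day and thickness b = 50 m (chosen so that T matches the 1000 m²/day example in the Plain-Language Basics) and a 1 km wide cross-section; the head drop of 5 m over 5 km comes from the hydraulic head example.

```python
def transmissivity(k_m_per_day, thickness_m):
    """T = K * b, in m^2/day."""
    return k_m_per_day * thickness_m

def darcy_discharge(k_m_per_day, area_m2, head_drop_m, distance_m):
    """Darcy's Law, Q = -K * A * (dh/dl), in m^3/day.

    head_drop_m is how much head falls over distance_m along the flow path,
    so dh/dl is negative and Q is positive in the direction of flow.
    """
    dh_dl = -head_drop_m / distance_m
    return -k_m_per_day * area_m2 * dh_dl

K = 20.0          # m/day, assumed sand/gravel value
b = 50.0          # m, assumed saturated thickness
width = 1000.0    # m, assumed width of the flow cross-section

T = transmissivity(K, b)
Q = darcy_discharge(K, area_m2=b * width, head_drop_m=5.0, distance_m=5000.0)
print(f"T = {T:.0f} m^2/day, Q = {Q:.0f} m^3/day")
```

The Ohm's Law analogy maps head difference to voltage, discharge to current, and 1/K to resistivity.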

64.1.2 Physics Constraints Required

  • Water flows from high to low hydraulic head (down the hydraulic gradient), which is not always downhill at the land surface
  • Conservation of mass (water balance)
  • Properties vary smoothly (geological continuity)
  • Anisotropic (horizontal K ≠ vertical K)

64.2 For Hydrogeologists

64.2.1 Key Concepts

  1. Supervised learning = You provide examples (wells + lithology)
  2. Features = Variables input to model (depth, resistivity, etc.)
  3. Overfitting = Model memorizes training data, fails on new data
  4. Cross-validation = Test on data not used for training
  5. Regularization = Penalizing overly complex models
  6. Ensemble methods = Combining multiple models (like multiple realizations)
  7. Neural networks = Flexible non-linear models (like complex transfer functions)

64.2.2 Common Pitfalls

  • Don’t trust models on data outside training range (extrapolation)
  • Spatial autocorrelation violates independence assumptions
  • More features ≠ better (curse of dimensionality)
  • Correlation ≠ causation (even strong correlations)

64.3 For Statisticians

64.3.1 Key Concepts

  1. Physical constraints limit model flexibility (water flows down the hydraulic gradient)
  2. Geological processes create spatial structure (not random)
  3. Measurement errors are often systematic (sensor drift, calibration)
  4. Missing data is rarely random (wells where water is needed)
  5. Outliers often = interesting phenomena (not errors)
  6. Time scales matter (recharge takes weeks, regional flow takes years)

64.3.2 Statistical Challenges

  • Small sample sizes (expensive to drill wells)
  • High-dimensional but sparse (many features, few samples)
  • Non-stationary processes (climate change, land use)
  • Censored data (detection limits, regulatory thresholds)

65 Project-Specific Examples

65.1 Example 1: K-means HTEM

65.1.1 Computer Science Perspective

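from sklearn.cluster import KMeans

# resistivity_features: array of shape (n_samples, n_features), prepared upstream
kmeans = KMeans(n_clusters=6, n_init=10, random_state=0)  # minimizes within-cluster variance
clusters = kmeans.fit_predict(resistivity_features)  # one cluster label (0-5) per sample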

65.1.2 Hydrogeology Perspective

“We’re grouping similar resistivity values to delineate geological units (A-F). The k=6 is chosen because we expect 6 stratigraphic layers, not from elbow plot optimization.”

65.1.3 Statistics Perspective

“This is mixture modeling with hard assignments: k-means corresponds to a Gaussian mixture with equal, spherical covariances, here with 6 components. Could validate with BIC, but domain knowledge constrains k.”
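
For readers who want to see what "validate with BIC" looks like, here is a sketch using scikit-learn's `GaussianMixture`. The data are synthetic log-resistivity values drawn from three invented populations (not the project's six units), so the numbers only demonstrate the mechanics.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic log10(resistivity) with three well-separated populations (assumed)
log_rho = np.concatenate([
    rng.normal(1.0, 0.1, 300),   # clay-like
    rng.normal(1.8, 0.1, 300),   # mixed
    rng.normal(2.5, 0.1, 300),   # sand-like
]).reshape(-1, 1)

# Fit mixtures with 1..6 components and record BIC (lower is better)
bic = {}
for k in range(1, 7):
    gmm = GaussianMixture(n_components=k, random_state=0).fit(log_rho)
    bic[k] = gmm.bic(log_rho)

best_k = min(bic, key=bic.get)
for k, value in bic.items():
    print(f"k={k}: BIC={value:.1f}")
print("lowest BIC at k =", best_k)
```

If the BIC-preferred k disagrees strongly with the expected number of stratigraphic units, that disagreement is itself worth discussing across disciplines.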


65.2 Example 2: ARIMA Forecasting

65.2.1 Statistics Perspective

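from statsmodels.tsa.arima.model import ARIMA

# water_levels: assumed to be a monthly series, so the seasonal period is 12
model = ARIMA(water_levels, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
results = model.fit()  # parameters must be estimated before forecasting
forecast = results.forecast(steps=30)  # 30 steps ahead, in the series' own time step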

65.2.2 Hydrogeology Perspective

“We’re predicting future water levels accounting for seasonal recharge cycles (12-month period) and short-term trends. The AR(1) component captures aquifer memory.”

65.2.3 Computer Science Perspective

“Time series model that uses past values to predict future. The (p,d,q) and seasonal orders are hyperparameters chosen by AIC/BIC or domain knowledge (12-month cycle).”


65.3 Example 3: Interpolation Choice

65.3.1 When to Use Kriging

  • Need uncertainty estimates (kriging variance)
  • Data follows Gaussian assumptions
  • Spatial autocorrelation is primary pattern
  • Interpretability matters

65.3.2 When to Use ML

  • Non-stationary processes
  • Multiple covariates available
  • Non-linear relationships
  • Large datasets (>100k points)

65.3.3 In This Project

We use both, compare results, and choose based on validation metrics.


66 Visual Concept Map

graph TD
    A[Environmental Data Science] --> B[Computer Science]
    A --> C[Hydrogeology]
    A --> D[Statistics]
    A --> E[Geophysics]

    B --> B1[Algorithms]
    B --> B2[Data Structures]
    B --> B3[Software Engineering]

    C --> C1[Aquifer Properties]
    C --> C2[Flow Systems]
    C --> C3[Water Quality]

    D --> D1[Spatial Statistics]
    D --> D2[Time Series]
    D --> D3[Uncertainty]

    E --> E1[EM Theory]
    E --> E2[Inversion]
    E --> E3[Material Properties]

    B1 -.-> C2
    C1 -.-> D1
    E3 -.-> C1
    D2 -.-> C2

67 Operations & Decision Support Terminology

This section covers terms commonly used in Part 5 (Predictive Operations) that bridge machine learning, optimization, and water management.

67.1 Model Performance Metrics

| Term | What It Means | When to Use | Example |
|---|---|---|---|
| R² (R-squared) | Proportion of variance explained by model (typically 0-1; can be negative on held-out data when the model does worse than predicting the mean). Higher = better fit. | Comparing models, regression tasks | R² = 0.85 means model explains 85% of water level variation |
| RMSE | Root Mean Square Error - typical prediction error in original units, with large errors weighted more heavily | Understanding “how wrong” predictions are | RMSE = 0.3 m means predictions typically off by 0.3 meters |
| MAE | Mean Absolute Error - average absolute prediction error | When a few large errors should not dominate the score (more robust to outliers than RMSE) | MAE = 0.2 m means average absolute error is 0.2 meters |
| Accuracy | Percentage of correct classifications | Classification tasks (sand vs clay) | 86% accuracy = 86 of 100 predictions correct |
| Precision | Of predictions labeled “positive”, how many were correct | When false positives are costly | 90% precision = 9 of 10 “sand” predictions were actually sand |
| Recall | Of actual positives, how many did model find | When false negatives are costly | 80% recall = found 8 of 10 actual sand locations |
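
The three regression metrics can be computed from the same list of errors. A pure-Python sketch with made-up observed and predicted water levels:

```python
import math

def regression_metrics(observed, predicted):
    """MAE, RMSE, and R² for paired observations and predictions."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}

# Hypothetical water levels in metres
observed  = [195.0, 195.4, 196.1, 196.8, 196.2, 195.5]
predicted = [195.2, 195.3, 196.0, 196.4, 196.5, 195.5]

scores = regression_metrics(observed, predicted)
print({name: round(value, 3) for name, value in scores.items()})
```

RMSE is never smaller than MAE; a large gap between the two usually means a few predictions are far worse than the rest.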

67.2 Optimization Terminology

| Term | What It Means | Water Management Context |
|---|---|---|
| Pareto frontier | Set of solutions where improving one objective worsens another | Trade-off between well yield (want high) and drilling cost (want low) |
| Multi-objective optimization | Finding best trade-offs across competing goals | Balancing yield, cost, uncertainty, and sustainability simultaneously |
| Constraint | Hard limit that cannot be violated | “Well must be >500m from contamination source” |
| Objective function | Mathematical formula being optimized | Maximize: 0.35×Yield + 0.25×(1-Cost) + 0.25×Confidence + 0.15×Sustainability |
| Risk-adjusted NPV | Net Present Value accounting for uncertainty | Expected value × probability of success |
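
The weighted objective and the Pareto frontier answer different questions, which a small example makes visible. The weights come from the table; the three candidate sites and their scores (all scaled to 0-1) are invented for illustration.

```python
WEIGHTS = {"yield": 0.35, "cost": 0.25, "confidence": 0.25, "sustainability": 0.15}

def score(site):
    """Weighted objective from the table; cost enters as (1 - cost)."""
    return (WEIGHTS["yield"] * site["yield"]
            + WEIGHTS["cost"] * (1.0 - site["cost"])
            + WEIGHTS["confidence"] * site["confidence"]
            + WEIGHTS["sustainability"] * site["sustainability"])

def pareto_front(sites):
    """Names of sites not dominated on (yield: maximise, cost: minimise)."""
    front = []
    for name, s in sites.items():
        dominated = any(
            o["yield"] >= s["yield"] and o["cost"] <= s["cost"]
            and (o["yield"] > s["yield"] or o["cost"] < s["cost"])
            for other, o in sites.items() if other != name
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical candidate drilling sites, all attributes normalised to 0-1
sites = {
    "S1": {"yield": 0.9, "cost": 0.8, "confidence": 0.7, "sustainability": 0.6},
    "S2": {"yield": 0.6, "cost": 0.3, "confidence": 0.8, "sustainability": 0.7},
    "S3": {"yield": 0.5, "cost": 0.5, "confidence": 0.9, "sustainability": 0.9},
}

front = pareto_front(sites)
best = max(sites, key=lambda name: score(sites[name]))
print("Pareto front (yield vs cost):", front)
print("Best weighted score:", best, round(score(sites[best]), 3))
```

S3 is dominated on yield and cost by S2, so it drops off the two-objective frontier even though its weighted score is higher than S1's. The frontier shows the available trade-offs; the weights encode one stakeholder's preference among them.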

67.3 Explainability & Trust

| Term | What It Means | Why It Matters |
|---|---|---|
| SHAP values | Feature contribution to individual predictions | “This well predicted as sand because: 40% from resistivity, 30% from depth, 20% from location” |
| Feature importance | Global ranking of which inputs matter most | “Across all predictions, resistivity is most important (35%), then depth (25%)” |
| Black box model | Model where internal logic is hidden | Neural networks - accurate but hard to explain to stakeholders |
| Interpretable model | Model with transparent logic | Decision trees - can show exact rules: “If resistivity > 100 AND depth < 50m → Sand” |
| Confidence interval | Range that likely contains the true value of an estimated quantity | “Yield = 135 GPM ± 15 GPM (95% CI)” means the interval 120-150 GPM comes from a method that captures the true value in 95% of repeated samples |
| Prediction interval | Range where future observations likely fall | Wider than a CI because it includes both model uncertainty and the scatter of individual observations |
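
The difference between the last two rows shows up clearly in the simplest case, estimating a mean. The sketch below uses a normal approximation (z ≈ 1.96) and invented yield values; with a sample this small a t-quantile would be more appropriate, so treat the numbers as illustrative only.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical yields (GPM) from eight comparable wells
yields_gpm = [120, 128, 131, 135, 138, 140, 142, 146]
n = len(yields_gpm)
m, s = mean(yields_gpm), stdev(yields_gpm)

z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval

# Confidence interval: uncertainty about the mean yield
ci_half = z * s / math.sqrt(n)
# Prediction interval: uncertainty about the yield of one new well
pi_half = z * s * math.sqrt(1 + 1 / n)

print(f"mean = {m:.1f} GPM")
print(f"95% CI: {m - ci_half:.1f} to {m + ci_half:.1f} GPM")
print(f"95% PI: {m - pi_half:.1f} to {m + pi_half:.1f} GPM")
```

The prediction interval is always the wider of the two, because a single new well varies around the mean in addition to the mean itself being uncertain.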

67.4 Common Confusion Pairs

| Term 1 | Term 2 | The Difference |
|---|---|---|
| Parameter (ML) | Parameter (Hydro) | ML: Weights learned during training. Hydro: Physical properties (transmissivity, storativity) |
| Hyperparameter | Parameter | Hyperparameter: Set before training (e.g., tree depth). Parameter: Learned during training |
| Training | Calibration | Training = ML term. Calibration = Hydro term. Both mean fitting model to data |
| Validation | Verification | Validation: Does model perform well? Verification: Is model coded correctly? |
| Uncertainty | Error | Uncertainty: Range of possible values. Error: Difference between prediction and actual |
| Forecast | Prediction | Forecast: Future values (time-dependent). Prediction: Any estimated value |
| Overfitting | Over-parameterization | Both mean: Model too complex for available data, won’t generalize |

67.5 Autocorrelation Interpretation Guide

When analyzing water level time series, autocorrelation (ACF) values tell you about aquifer “memory”:

| ACF at Lag | Physical Meaning | Aquifer Type Indication |
|---|---|---|
| ACF(1 month) = 0.95 | Very high memory - this month almost completely predicts next month | Confined aquifer, slow response |
| ACF(1 month) = 0.50 | Moderate memory - this month explains ~25% of next month’s variance (0.5² = 0.25) | Semi-confined or deep unconfined |
| ACF(1 month) = 0.20 | Low memory - rapid response, levels change quickly | Shallow unconfined, stream-connected |
| ACF(12 months) = 0.50 | Annual cycle explains ~25% of variance | Strong seasonal forcing (annual recharge pattern) |
| ACF decays slowly | Long-term persistence, multi-year droughts/wet periods | Climate-dominated system, slow recovery |
| ACF decays quickly | Short-term memory only | Weather-dominated system, fast recovery |

Rule of thumb: Confined aquifers typically show ACF(12 months) > 0.3; unconfined aquifers show ACF(12 months) < 0.2.
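
The ACF values in the table can be computed directly. The sketch below defines a simple sample ACF and applies it to three synthetic monthly series: a high-memory AR(1) process, a low-memory one, and a noisy annual cycle. All series and parameters are invented to show how the numbers behave.

```python
import numpy as np

def acf(series, lag):
    """Sample autocorrelation at a positive integer lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    return float(np.sum(x[lag:] * x[:-lag]) / np.sum(x * x))

def simulate_ar1(phi, n, rng):
    """AR(1) process: x[t] = phi * x[t-1] + noise."""
    x = np.zeros(n)
    noise = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x

rng = np.random.default_rng(0)
slow = simulate_ar1(0.95, 2000, rng)   # strong memory, confined-like behaviour
fast = simulate_ar1(0.20, 2000, rng)   # weak memory, shallow-like behaviour

months = np.arange(240)
seasonal = np.sin(2 * np.pi * months / 12) + 0.3 * rng.standard_normal(240)

print("ACF(1), high-memory series:", round(acf(slow, 1), 2))
print("ACF(1), low-memory series: ", round(acf(fast, 1), 2))
print("ACF(12), seasonal series:  ", round(acf(seasonal, 12), 2))
print("ACF(6), seasonal series:   ", round(acf(seasonal, 6), 2))
```

Squaring an ACF value gives the share of variance explained by the lagged value, which is where the 0.5² = 0.25 figure in the table comes from. A strongly negative ACF at 6 months alongside a positive ACF at 12 months is the signature of an annual cycle.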


68 Contributing to This Guide

68.1 How to Contribute

68.1.1 Found a Term That Needs Translation?

Submit a PR or issue with:

  • The term in your discipline
  • How it’s used in context
  • Potential equivalents in other disciplines
  • Example from this project

68.1.2 Disagree with a translation?

Translations are nuanced! Start a discussion:

  • Explain your perspective
  • Provide references if available
  • Suggest alternative phrasing

68.1.3 Want to Add a Discipline?

We welcome additional perspectives:

  • Climate science
  • Ecology
  • Economics
  • Policy/regulation
  • Engineering


69 Further Resources

69.1 Books Bridging Disciplines

  • CS ↔︎ Hydrogeology: “Introduction to Geostatistics: Applications in Hydrogeology” by Kitanidis
  • Stats ↔︎ Spatial: “Statistics for Spatial Data” by Cressie
  • ML ↔︎ Hydrology: “Data-Driven Modeling of Environmental Systems” by Reichstein et al.

69.2 Online Glossaries

69.3 Community Forums


70 Summary

This translation guide serves as a living bridge between disciplines. As the project evolves, so will this guide.

Goal: When a computer scientist, hydrogeologist, statistician, or geophysicist reads the same analysis, they should each understand it in their own terms while appreciating what the other disciplines contribute.

Next Steps:

  1. Bookmark this page for quick reference
  2. Use Ctrl+F to search for terms as needed
  3. Suggest additions via issues/PRs
  4. Share with colleagues from other disciplines

Questions? Open an issue with the terminology label.


Last Updated: November 26, 2025
Contributors: Open to all
License: CC-BY-4.0 (attribution required)