Research & Thesis December 2019

Information-Theoretic Sampling for Geological Image Recovery

Doctoral thesis addressing optimal sensor placement: where to place N measurements to minimize posterior uncertainty in binary random fields. Introduces the AdSEMES algorithm with submodularity-based guarantees. Published in Mathematical Geosciences and Natural Resources Research.

Algorithm: AdSEMES
Theory: Shannon Entropy, Mutual Information
Publications: 3 journal papers
Applications: Drill hole placement, facies recovery
#phd-thesis #information-theory #entropy #geostatistics #optimal-sampling #mining

Business Context

The Optimal Sensor Placement problem is fundamental to mineral exploration, environmental monitoring, and any domain where data collection is expensive. Given a budget of N measurements, where should they be placed to learn as much as possible about an unknown spatial field? For a modest 50×50 field with 20 measurements, the search space exceeds 10^26 possible configurations. The selection problem is NP-hard, so exhaustive search is infeasible, and traditional approaches such as regular grids or random placement ignore spatial information content entirely.

Strategic Value

The thesis introduces AdSEMES (Adaptive Sequential Empirical Maximum Entropy Sampling), which exploits a key mathematical property: entropy maximization in this setting is submodular, which guarantees that greedy sequential selection achieves at least (1-1/e) ≈ 63.2% of the global optimum. This is a provable bound, not an empirical observation. The framework compares six sampling strategies with spatial penalty functions and three reconstruction methods (nearest neighbor, indicator kriging, entropy-weighted inverse distance). It is applied to drill hole placement for mineral resource estimation and ore-waste boundary discrimination. The results were published in Mathematical Geosciences (2019) and Natural Resources Research (2020).

The Challenge

Given a budget of N measurements, where should they be placed to minimize posterior uncertainty? Selecting the best of the C(H×W, N) candidate placements is NP-hard, so exhaustive search is infeasible. Traditional sampling strategies (regular grids, random placement) ignore spatial information content.

Our Approach

The AdSEMES (Adaptive Sequential Empirical Maximum Entropy Sampling) algorithm exploits submodularity to obtain a (1-1/e) approximation to the global optimum. The thesis compares six sampling strategies with spatial penalty functions and three reconstruction methods, and applies them to drill hole placement for mineral resource estimation and ore-waste boundary discrimination.

Key Performance Indicators

KPI | Baseline | Result | Impact
Optimality Guarantee | Heuristic placement | (1-1/e) ≈ 63.2% of global optimum | Provable quality bound
Publications | N/A | 3 journal papers (Math Geosci, NRR) | Peer-reviewed validation

The Question

Given a budget of N measurements, where should you place them to learn as much as possible about an unknown spatial field? This is the Optimal Sensor Placement problem — fundamental to mineral exploration (where to drill next), environmental monitoring (where to install sensors), and any domain where data collection is expensive.

The search space has size C(H×W, N), which grows combinatorially. For a modest 50×50 field with 20 measurements, that’s over 10²⁶ possible configurations. Exhaustive search is infeasible.
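The size of the search space can be checked directly with exact integer arithmetic. The sketch below is illustrative only; `num_placements` is a hypothetical helper name, not something taken from the thesis.

```python
import math


def num_placements(height: int, width: int, n_measurements: int) -> int:
    """Number of ways to choose n_measurements distinct cells on a height x width grid."""
    return math.comb(height * width, n_measurements)


# The 50x50 field with 20 measurements discussed in the text.
count = num_placements(50, 50, 20)
print(f"C(2500, 20) has {len(str(count))} decimal digits")
print("exceeds 10^26:", count > 10**26)
```

Because Python integers have arbitrary precision, the count is exact, which makes it easy to see how quickly the number of candidate placements outgrows anything that could be enumerated.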

The Information-Theoretic Approach

The thesis frames the problem through Shannon entropy and mutual information:

  • H(X) = -Σ p(x) log p(x) — total uncertainty about the unknown field
  • H(X^f | X_f) — residual uncertainty after observing at locations f
  • I(X_f; X^f) = H(X^f) - H(X^f | X_f) — information gained by measuring at f
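The three quantities above can be computed by hand for a tiny example. The following is a minimal sketch, assuming a joint distribution over just two binary cells (one measured, one unmeasured) and logarithms in base 2 so that results are in bits; the function names and the toy probabilities are illustrative assumptions, not values from the thesis.

```python
import math
from collections import defaultdict


def entropy(pmf):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(joint, axis):
    """Marginalize a joint pmf over (a, b) pairs onto one coordinate."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[outcome[axis]] += p
    return dict(out)


def conditional_entropy(joint):
    """H(B | A) = H(A, B) - H(A) for a joint pmf over (a, b) pairs."""
    return entropy(joint) - entropy(marginal(joint, 0))


def mutual_information(joint):
    """I(A; B) = H(B) - H(B | A)."""
    return entropy(marginal(joint, 1)) - conditional_entropy(joint)


# Toy joint pmf (assumed): a = measured cell X_f, b = unmeasured cell X^f.
# The two cells agree 80% of the time, mimicking spatial correlation.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

print("H(X^f)       =", entropy(marginal(joint, 1)))
print("H(X^f | X_f) =", conditional_entropy(joint))
print("I(X_f; X^f)  =", mutual_information(joint))
```

In this toy case the unmeasured cell starts with one bit of uncertainty, and observing its correlated neighbour removes part of it; the removed part is the mutual information.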

The goal is to choose locations that maximize mutual information. The key result is that entropy maximization in this setting satisfies submodularity, a diminishing-returns property: adding a measurement to a small set yields at least as much information gain as adding it to a larger set that contains it. This property guarantees that greedy sequential selection achieves at least (1 - 1/e) ≈ 63.2% of the global optimum. This is a provable bound, not an empirical observation.
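To make the greedy guarantee concrete, here is a minimal sketch of greedy maximum-entropy selection on a toy problem, compared against brute force. It is not the thesis's AdSEMES implementation: it omits the spatial penalty and the adaptive use of observed values, and it assumes a small set of equally likely binary realizations (a 1-D field with a single ore-waste boundary). All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def empirical_joint_entropy(realizations, cells):
    """Empirical joint entropy (bits) of the field values at `cells`."""
    if not cells:
        return 0.0
    counts = Counter(tuple(r[c] for c in cells) for r in realizations)
    total = len(realizations)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def greedy_max_entropy(realizations, n_cells, budget):
    """Pick `budget` cells one at a time, each maximizing the joint entropy so far."""
    chosen = []
    for _ in range(budget):
        candidates = [c for c in range(n_cells) if c not in chosen]
        best = max(
            candidates,
            key=lambda c: empirical_joint_entropy(realizations, chosen + [c]),
        )
        chosen.append(best)
    return chosen


def brute_force_max_entropy(realizations, n_cells, budget):
    """Exhaustively search all subsets of size `budget` (only viable for tiny cases)."""
    return max(
        combinations(range(n_cells), budget),
        key=lambda cells: empirical_joint_entropy(realizations, cells),
    )


# Assumed toy model: 6 cells, ore (1) to the left of a boundary at position k.
fields = [[1] * k + [0] * (6 - k) for k in range(7)]

greedy_cells = greedy_max_entropy(fields, n_cells=6, budget=2)
optimal_cells = brute_force_max_entropy(fields, n_cells=6, budget=2)
greedy_value = empirical_joint_entropy(fields, greedy_cells)
optimal_value = empirical_joint_entropy(fields, optimal_cells)

print("greedy cells:", greedy_cells, "entropy:", greedy_value)
print("optimal cells:", list(optimal_cells), "entropy:", optimal_value)
print("bound holds:", greedy_value >= (1 - 1 / math.e) * optimal_value)
```

The brute-force search is included only to check the bound on a case small enough to enumerate; on realistic grids only the greedy loop is practical.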

The AdSEMES (Adaptive Sequential Empirical Maximum Entropy Sampling) algorithm implements this with spatial penalty functions that prevent clustering and three reconstruction methods (nearest neighbor, indicator kriging, entropy-weighted inverse distance) for recovering the full field from sparse observations.
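Of the three reconstruction methods, nearest neighbor is the simplest to illustrate. The sketch below assumes a binary field on a regular grid, Euclidean distance between cell indices, and ties broken by sample insertion order; the function name and the sample values are hypothetical, and the thesis's actual implementation may differ.

```python
def nearest_neighbor_reconstruct(height, width, samples):
    """Fill every cell with the value of the closest sampled cell.

    `samples` maps (row, col) -> observed value. Distance is squared
    Euclidean distance between grid indices (an assumption of this sketch).
    """
    field = []
    for i in range(height):
        row = []
        for j in range(width):
            nearest = min(
                samples,
                key=lambda rc: (rc[0] - i) ** 2 + (rc[1] - j) ** 2,
            )
            row.append(samples[nearest])
        field.append(row)
    return field


# Assumed observations: ore (1) at the left corners, waste (0) at the right corners.
samples = {(0, 0): 1, (0, 3): 0, (3, 0): 1, (3, 3): 0}
recon = nearest_neighbor_reconstruct(4, 4, samples)
for row in recon:
    print(row)
```

With these four samples the reconstruction splits the grid into an ore half and a waste half, which shows how the recovered ore-waste boundary depends entirely on where the measurements were placed.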

Publications

  1. “Sampling Strategies for Uncertainty Reduction in Categorical Random Fields” — Mathematical Geosciences, 2019
  2. “Optimal Sampling Strategy for Spatial Estimation of Ore-Waste Contacts” — Natural Resources Research, 2020
  3. “Geological Facies Recovery Based on Weighted L1-Regularization” — Mathematical Geosciences, 2019

Technology Stack

Python, NumPy, SciPy, Information Theory, Geostatistics, Indicator Kriging, LaTeX

Technical Diagrams

[Diagrams: AdSEMES; information theory; resolvability; sampling comparison]