Fundamental Radiative Transfer Theory

Content
Home

MCMC vs Optimal Estimation : Nonlinear Example

Some important notations and defintion:

Kernel:

$$A = \frac{\partial \hat{x}}{\partial x^{\text{true}}} = \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right) K^T S_\epsilon^{-1}K$$

Gain Matrix:

$$\hat{S} K^T S_\epsilon^{-1}$$ Therefore the $$\hat{x} = x_a + G (y-Kx)$$

Posterior error covariance:

$$\hat{S} = \left(K^T S_\epsilon^{-1} K + S_a^{-1}\right)^{-1}$$

Goal: Build a clean, self-contained mathematical example (no physics).

1. The Setup

Nonlinear forward model:

$$F(x) = x^2$$

Observation:

$$y = 9.5 ~~(\text{true value is}~~ x_\text{true} = 3, ~~\text{but measurement has noise})$$

Measurement noise:

$$σ_\epsilon = 0.5$$

Prior:

$$x_a = 2.0, \sigma_a = 2.0$$

2. Posterior

$$ P(x|y) \propto P(y|x) · P(x) = exp(−(y − F(x))^2 / 2σ_ε^2) · exp(−(x − x_a)^2 / 2σ_a^2) $$

Plugging in $F(x) = x^2$:

$$ P(x|y) \propto exp(−(9.5 − x^2)^2 / 0.5) · exp(−(x − 2)^2 / 8) $$

Key point: Not Gaussian due to $x^2$ → nonlinear model.

3. Cost Function J(x)

$$ J(x) = (9.5 − x²)² / 0.25 + (x − 2)² / 4 $$

Derivative:

$$ dJ/dx = −16x(9.5 − x²) + 0.5(x − 2) = 0 $$

This is a cubic equation → must solve iteratively.

4. Gauss-Newton Iteration

Linearisation: Linearisation around $x_i$

$$ F(x) \approx F(x_i) + K_i \, (x - x_i)\\ \text{where } \quad K_i = \left.\frac{dF}{dx}\right|_{x_i} = 2x_i ~~~~~~~~~~ \leftarrow \text{the Jacobian} $$

Gauss-Newton update formula:

$$ x_{i+1} = x_a + \left( \frac{K_i^2}{\sigma_\epsilon^2} + \frac{1}{\sigma_a^2} \right)^{-1} \cdot \frac{K_i}{\sigma_\epsilon^2} \cdot \left[ y - F(x_i) + K_i (x_i - x_a) \right] $$

Let's iterate numerically, starting from prior $x_0 = 2.0$:

Iteration 1

$$ x_i = 2.0 \\ F(2.0) = 4.0 \\ K_i = 2 x_i = 4.0 \\ \text{residual} = y - F(x_i) = 9.5 -4.0 = 5.5 \\ \text{Hessian} = \frac{K^2}{\sigma_\epsilon^2} + \frac{1}{\sigma_a^2} = \frac{16}{0.25} + \frac{1}{4} = 64.25 \\ \text{rhs} = \frac{K}{\sigma_\epsilon^2} \times [\text{residual} + K(x_i - x_a)] = \frac{4}{0.35} \times [5.5 + 4\times (2-2)] =88.0 \\ x_{i+1} = x_a + \text{rhs}/\text{Hessian} = 2.0 + 88.0/64.25 = 2.0 + 1.370 = 3.370 $$

Iteration 2

$$ F(3.370) = 11.357 \\ K_i = 2 \times 3.370 = 6.740 \\ \text{residual} = 9.5 - 11.357 = -1.857\\ \text{Hessian} = 6.74^2/0.25 + 0.25 = 181.8\\ \text{rhs} = 6.74/0.25 \times [-1.857 + 6.74\times(3.370-2)] = 26.96 \times [-1.857 + 9.232] = 26.96 \times 7.375 = 198.8 \\ x_{i+1} = 2.0 + 198.8/181.8 = 2.0 + 1.094 = 3.094 $$

Iteration 3

$$ F(3.094) = 9.573 \\ K_i = 6.188 \\ \text{residual} = 9.5 - 9.573 = -0.073 \\ x_{i+1} ≈ 3.081 $$

Iteration 4

$$ x_i = 3.080 \rightarrow x_{i+1} =3.080 \Rightarrow \text{converged} $$

5. Posterior Covariance

At the solution: $\hat{x} = 3.080$, $K=2\times 3.080 = 6.160$ $$ S_x = (K²/σ_ε² + 1/σ_a²)⁻¹ = (151.9 + 0.25)⁻¹ = 0.00657 $$

Standard deviation:

$$σ_{post} = \sqrt{0.00657} = 0.081$$

Result:

$$\hat{x} = 3.080 ± 0.081$$

6. Averaging Kernel

$$ A = S_x \frac{K^2}{\sigma_\epsilon^2} = 0.998 $$

→ Measurement dominates, prior is weak.

$A\approx 1.0$ means the retrieval is almost entirely driven by the measurement, not the prior. The prior $\sigma_a =2.0$ was broad, so the measurement dominates.

7. NOW See Why the Gaussian Assumption is Wrong : True Posterior Shape

OE gave us $\hat{x} = 3.080 \pm 0.081$ -- a symmetric Gaussian.

But look at the true posteriro shape. The cost function is: $$ J(x) = \frac{(9.5 - x²)^2}{0.25} + \frac{(x − 2)^2}{4} $$ We can draw a graph for this function as: Cost function plot

There is a second mode near $x = -3.08$ because $F(x) =x^2$ is symmetric. OE completely misses this -- it found the positive solution. MCMC would find both.

Also $J(x)$ rises much faster on the right $x> 3.08$ than on the left -- the posterior is asymmetric. OE's symmetric $\pm 0.081$ is therefore an approximation.

Important:

Second mode exists at x ≈ −3.08
Posterior is asymmetric
OE misses this completely

8. What MCMC Would Do Differently?

MCMC samples $ P(x|y) \propto \text{exp}\left(-\frac{J(x)}{2}\right)$ directly. It would reveal:

Mode 1: $x \approx +3.08 $ (positive root)
Mode 2: $x \approx -3.08 $ (negative root - missed by OE!)
Asymmetry: Longer tail on left side of each mode

9. Summary — The Two Assumptions Visualised

                TRUE POSTERIOR                                  OE APPROXIMATION
                (what MCMC finds)                               (what OE assumes)

                    │                                           │
                    │  *                                        │    *
                    │ ***   *                                   │   ***
                    │*****  **                                  │  *****
                ────┼──────────── x                         ────┼──────────── x
                -3.08  +3.08                                only +3.08

                Bimodal + asymmetric                        single Gaussian

10. Final Insight

The nonlinear model F(x) = x² causes:

Non-Gaussian posterior -- asymmetric tails
Multiple solutions -- two solutions OE cannot see
Asymmetry

This is exactly what happens in real CO₂ retrieval problems when degeneracies exist.

Reference

Go to Content