Least squares

# Weighting

This chapter presents a classic refinement of the least squares algorithm: giving each measurement a different weight according to its accuracy.

## Non-weighted and weighted least squares

As we saw earlier, the sum of the squares of the residuals is given by the following formula:

##### Sum of the squares of the residuals

###### Non-weighted

\[ \mathcal{S}(p) = \sum_{ i=1 }^{ n } \left( y_i - f(p)_i \right)^2 \]

To refine our process, we might want to give a different weight \(\omega_i\) to each measurement, or to each type of measurement. This can be achieved with a very simple modification of the previous formula:

##### Sum of the squares of the residuals

###### Weighted

\[ \mathcal{S}(p) = \sum_{ i=1 }^{ n } \omega_i \left( y_i - f(p)_i \right)^2 \]

What kind of weight could we use, and why?

A good and typical scheme is to give a bigger weight to measurements that have a better level of accuracy. We do have an idea of the characteristics of the instruments, so we can use this information to give more importance to the highest-quality observations. This is simple common sense. So if the uncertainty of a measurement is known to be \( \sigma_i \), then we can use \(\omega_i = 1/\sigma_i^2 \) as its weight in the algorithm.

##### Sum of the squares of the residuals

###### Weighted according to measurement uncertainties

\[ \omega_i = \frac{1}{\sigma_i^2} \]

\[ \mathcal{S}(p) = \sum_{ i=1 }^{ n } \left( \frac{y_i - f(p)_i}{\sigma_i} \right)^2 \]

This weighting method has the additional advantage that all quantities become dimensionless. The measurement \(y_i\) and its uncertainty \(\sigma_i\) have the same unit, so their ratio has none: even if one quantity was measured in meters and another in seconds, all the terms of the sum are now unitless and can be summed consistently.
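As a quick illustration, here is a minimal sketch of this weighted sum of squared residuals in Python. The measurements, model predictions, and uncertainties below are made-up values chosen only for the example:

```python
import numpy as np

# Hypothetical measurements y_i, model predictions f(p)_i,
# and known measurement uncertainties sigma_i.
y = np.array([1.02, 1.98, 3.10])      # measured values y_i
f_p = np.array([1.00, 2.00, 3.00])    # model predictions f(p)_i
sigma = np.array([0.05, 0.10, 0.20])  # uncertainties sigma_i

# Weights omega_i = 1 / sigma_i^2: each residual is divided by its
# own uncertainty, which makes every term of the sum dimensionless.
omega = 1.0 / sigma**2
S = np.sum(omega * (y - f_p)**2)
```

Note how the most precise measurement (smallest \(\sigma_i\)) contributes the largest weight, so a deviation there costs much more than the same deviation on a noisy measurement.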

Let's now study how the weighting translates in the matrix formulation.

## Linearization and matrix notation (non-weighted)

As a reminder, if we linearize the non-weighted problem, we get the following results:

##### Quadratic approximation of \( \mathcal{S} \)

###### Non-weighted

\[ Q(p) = \sum_{ i=1 }^{ n } { \left( AX - B \right)_i }^2 \]

\[ Q(p) = \,^tX^tA AX - 2\,^t(\,^tAB)X + \,^tBB \]

\[ X_{min} = (\,^tAA)^{-1} \,^tAB \]
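The non-weighted solution \( X_{min} = (\,^tAA)^{-1} \,^tAB \) can be sketched in a few lines of Python. The matrix \(A\) and vector \(B\) below are hypothetical values for a small straight-line fit, chosen only for illustration:

```python
import numpy as np

# Hypothetical design matrix A (fitting y = a + b*x at x = 0, 1, 2)
# and observation vector B.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
B = np.array([1.0, 2.0, 3.1])

# Normal equations: solve (tA A) X = tA B.
# Using np.linalg.solve instead of an explicit matrix inverse is the
# numerically preferred way to apply X_min = (tA A)^{-1} tA B.
X_min = np.linalg.solve(A.T @ A, A.T @ B)
```

The result agrees with a standard least squares solver such as `np.linalg.lstsq`, which solves the same minimization directly from \(A\) and \(B\).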

## Linearization and matrix notation (weighted)

In the case of weighted residuals, the derivation is very similar.

We simply introduce the diagonal weight matrix \(\Pi = diag(\omega_1, ..., \omega_i, ..., \omega_n)\), which with our choice of weights becomes:

##### Weight matrix

###### Diagonal matrix

\[ \Pi = diag\left(\frac{1}{\sigma_1^2}, ..., \frac{1}{\sigma_i^2}, ..., \frac{1}{\sigma_n^2}\right) \]

And the results, after linearization, can then be written as this:

##### Quadratic approximation of \( \mathcal{S} \)

###### Weighted

\[ Q(p) = \sum_{ i=1 }^{ n } { \left( \sqrt{\Pi} AX - \sqrt{\Pi} B \right)_i^2 } \]

\[ Q(p) = \,^tX^tA \Pi AX - 2\,^t(\,^tA \Pi B)X + \,^tB \Pi B \]

\[ X_{min} = (\,^tA \Pi A)^{-1} \,^tA \Pi B\]

This is very similar to the non-weighted variant. In terms of calculation, you can simply replace \(N\) by \(\,^tA \Pi A\), \(S\) by \(\,^tA \Pi B \) and \(\alpha\) by \(\,^tB \Pi B\), and everything will work as usual.
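The substitution described above can be sketched directly in Python. This reuses the same hypothetical \(A\) and \(B\) as in the non-weighted example, with made-up uncertainties \(\sigma_i\); the only change from the non-weighted code is the insertion of \(\Pi\) in the normal equations:

```python
import numpy as np

# Same hypothetical design matrix A and observation vector B as before,
# plus per-measurement uncertainties sigma_i.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
B = np.array([1.0, 2.0, 3.1])
sigma = np.array([0.05, 0.10, 0.20])

# Weight matrix Pi = diag(1/sigma_1^2, ..., 1/sigma_n^2).
Pi = np.diag(1.0 / sigma**2)

# Weighted normal equations: replace tA A by tA Pi A
# and tA B by tA Pi B, then solve as usual.
N = A.T @ Pi @ A
S = A.T @ Pi @ B
X_min = np.linalg.solve(N, S)
```

Equivalently, one can pre-multiply \(A\) and \(B\) by \(\sqrt{\Pi}\) and feed them to any non-weighted least squares solver, which is exactly what the quadratic form \( \left( \sqrt{\Pi} AX - \sqrt{\Pi} B \right) \) expresses.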