The Extended Kalman Filter: An Interactive Tutorial for Non-Experts

Part 14: Sensor Fusion Example

To get a feel for how sensor fusion works, let's restrict ourselves again to a system with just one state value.[15] To simplify things even further, we'll assume we have no knowledge of the state-transition model ($A$ matrix) and so have to rely only on the sensor values. Perhaps we are measuring the temperature outside with two different thermometers. So we'll just set our state-transition matrix to 1: \[\hat{x}_k = A \hat{x}_{k-1} = 1 * \hat{x}_{k-1} = \hat{x}_{k-1}\] Lacking a state-transition model for our thermometer, we just assume that the current state is the same as the previous state.

For sensor fusion we will of course need more than one sensor value in our observation vector $z_k$, which for this example we can treat as the current readings of our two thermometers. We'll assume that both sensors contribute equally to our temperature estimation, so our $C$ matrix is just a pair of 1's: \[ z_k = C x_k + v_k = \begin{bmatrix} 1 \\ 1 \end{bmatrix} x_k + v_k\] We now have two matrices ($A$, $C$) of the three ($A$, $C$, $R$) that we need for the prediction and update equations. So how do we obtain $R$?

Recall that for our single-sensor example, we defined $r$ as the variance of the observation noise signal $v_k$; that is, how much it varies around its mean (average) value. For a system with more than one sensor, $R$ is a matrix containing the covariance between each pair of sensors. The elements on the diagonal of this matrix will be the $r$ value for each sensor, i.e., that sensor's variance with itself. Elements off the diagonal represent how much one sensor's noise varies with another's. For this example, and many real-world applications, we assume that such values are zero. Let's say that we've observed both our thermometers under climate-controlled conditions of steady temperature, and found that their readings fluctuate by an average of 0.8 degrees; i.e., the standard deviation of their readings is 0.8, making the variance 0.8 * 0.8 = 0.64. This gives us the $R$ matrix: \[ R =\begin{bmatrix} 0.64 & 0\\ 0 & 0.64\end{bmatrix} \]
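If you find code easier to follow, here is a minimal Python/NumPy sketch of this construction (the variable names are my own, not part of the tutorial):

```python
import numpy as np

# Standard deviation of each thermometer's readings, observed under
# climate-controlled (steady-temperature) conditions
sigma1 = 0.8
sigma2 = 0.8

# Variance is the standard deviation squared; we assume the two sensors'
# noise is uncorrelated, so the off-diagonal covariances are zero.
R = np.diag([sigma1 ** 2, sigma2 ** 2])

print(R)
# [[0.64 0.  ]
#  [0.   0.64]]
```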

Now we can also see why $P_k$ and $G_k$ must be matrices: as mentioned in a footnote earlier, $P_k$ is the covariance of the estimation process at step $k$; so, like the sensor covariance matrix $R$, $P_k$ is also a matrix. And, since $G_k$ is the gain associated with these matrices at each step, $G_k$ must likewise be a matrix, containing a gain value for each covariance value in these matrices. The sizes of the matrices $P_k$ and $G_k$ are of course determined by what they represent. In our present example, the size of $P_k$ is $1 \times 1$ (i.e., a single value), because it represents the covariance of the single estimated value $\hat{x}_k$ with itself. And the gain $G_k$ is a $1 \times 2$ (one row, two columns) matrix, because it relates the single state estimate $\hat{x}_k$ to the two sensor observations in $z_k$.
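You can verify these sizes numerically; here is a quick sketch using the $R$ from above (the value of $P_k$ is made up just to exercise the shapes):

```python
import numpy as np

R = np.diag([0.64, 0.64])     # 2x2 sensor noise covariance
C = np.array([[1.0], [1.0]])  # 2x1: two sensors each observe the one state
P = np.array([[1.0]])         # 1x1: covariance of the single state estimate

# Gain: G = P C^T (C P C^T + R)^-1
G = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)

print(P.shape, G.shape)  # (1, 1) (1, 2)
```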

With this understanding of sensor fusion, let's set aside our thermometers and return to our airplane example. Putting it all together, we get the following equations for prediction and update for our airplane (using sensor noise covariance values of 200 and 180 feet for the two sensors, comparable to the values we used before):

Predict: \[\hat{x}_k = A \hat{x}_{k-1} = 1 * \hat{x}_{k-1} = \hat{x}_{k-1}\] \[P_k = A P_{k-1} A^T = 1 * P_{k-1} * 1 = P_{k-1}\] Update:

\[G_k = P_k C^T (C P_k C^T + R)^{-1} = P_k \begin{bmatrix} 1 & 1 \end{bmatrix} \left(\begin{bmatrix} 1 \\ 1 \end{bmatrix} P_k \begin{bmatrix} 1 & 1 \end{bmatrix} + \begin{bmatrix} 200 & 0\\ 0 & 180 \end{bmatrix}\right)^{-1}\]

\[\hat{x}_k \leftarrow \hat{x}_k + G_k(z_k - C \hat{x}_k) = \hat{x}_k + G_k\left(z_k - \begin{bmatrix} 1 \\ 1 \end{bmatrix} \hat{x}_k\right)\]

\[P_k \leftarrow (I - G_k C) P_k = \left(1 - G_k\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right) P_k\]
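To make these equations concrete, here is a sketch of a single predict/update cycle in Python/NumPy (the initial estimate, initial covariance, and sensor readings are made-up values for illustration):

```python
import numpy as np

A = np.array([[1.0]])             # trivial state-transition model
C = np.array([[1.0], [1.0]])      # both sensors observe the state directly
R = np.diag([200.0, 180.0])       # sensor noise covariances
I = np.eye(1)

x = np.array([[1000.0]])          # illustrative initial altitude estimate (feet)
P = np.array([[1.0]])             # illustrative initial estimation covariance

z = np.array([[1010.0], [990.0]]) # illustrative readings from the two sensors

# Predict (process noise is not accounted for yet; see below)
x = A @ x                         # x_k = A x_{k-1}
P = A @ P @ A.T                   # P_k = A P_{k-1} A^T

# Update
G = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)  # Kalman gain, shape (1, 2)
x = x + G @ (z - C @ x)           # fold both sensor readings into the estimate
P = (I - G @ C) @ P

print(x[0, 0])                    # fused estimate lies between the two readings
```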

As it turns out, our impoverished state-transition model can get us into trouble if we don't reintroduce something we mentioned earlier: process noise. Recall that our complete equation for the state transition in a single-variable system was \[ x_k = a x_{k-1} + w_k \] where $w_k$ is the process noise at a given time. With our linear algebra knowledge we would now of course write this equation as \[ x_k = A x_{k-1} + w_k \] but the fact remains that we still have not accounted for the process noise in our prediction/update model. Doing this turns out to be pretty easy. Just as we used $R$ to represent the covariance of the observation noise $v_k$, we use $Q$ to represent the covariance of the process noise $w_k$. Then we make a slight modification to our $P_k$ prediction, simply adding in this covariance:

$P_k = A P_{k-1} A^T + Q$

The interesting thing is that even very small values for the nonzero elements of this $Q$ matrix turn out to be very helpful in keeping our estimated state values on track.
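Here is an end-to-end sketch in Python/NumPy that puts the pieces together, using the same bias and noise settings as the demo below (the slow upward drift of the true state is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[1.0]])
C = np.array([[1.0], [1.0]])
R = np.diag([0.64, 0.64])        # each sensor: std dev 0.8, variance 0.64
Q = np.array([[0.05]])           # small process-noise covariance
I = np.eye(1)
bias = np.array([[-1.0], [1.0]]) # sensor 1 reads low, sensor 2 reads high

x = np.array([[0.0]])            # initial state estimate
P = np.array([[1.0]])            # initial estimation covariance

x_true = 0.0
for k in range(100):
    x_true += 0.1                # true state drifts slowly upward

    # Biased, noisy readings from both sensors
    z = x_true + bias + rng.normal(0.0, 0.8, size=(2, 1))

    # Predict, now with the process-noise covariance Q added
    x = A @ x
    P = A @ P @ A.T + Q

    # Update
    G = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
    x = x + G @ (z - C @ x)
    P = (I - G @ C) @ P

print(x_true, x[0, 0])  # the fused estimate tracks the true state closely;
                        # the opposite biases largely cancel out
```

Try setting Q to zero in this sketch: $P_k$ shrinks toward zero over time, the gain vanishes, and the estimate stops tracking the drifting state, which is exactly why even a small $Q$ helps.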

So here, at last, is a little sensor-fusion demo, allowing you to experiment with the values in $R$ and $Q$, and also to change the amount of bias (constant inaccuracy; i.e., the mean value of the noise) in each of the two sensors. As you can see, when the sensors are biased in different directions, sensor fusion can provide a closer approximation to the “true” state of the system than you can get with a single sensor alone.



[Interactive demo: plot of the actual state $x_k$, the two sensor readings $z_{1k}$ and $z_{2k}$, and the estimated state $\hat{x}_k$, with adjustable controls: Sensor 1 Bias = -1, Sensor 2 Bias = 1, R1 = 0.64, R2 = 0.64, Q = 0.05.]



[15] I adapted this material from the example in Antonio Moran's excellent slides on Kalman filtering for sensor fusion. Matlab / Octave users may want to try out the version I've posted on Github, which includes a more general implementation of the Kalman filter.