We briefly discuss the dynamics of the Structure from Motion (SfM) problem. As shown earlier, it is often the case (e.g. in cinematographic post-production, robotics, etc.) that cameras do not teleport around the scene and objects do not move too suddenly. These bodies are governed by physical dynamics, and it thus makes sense to constrain the possible configurations of the camera to change smoothly over a causal time sequence. For instance, we consider the typical dynamic system:
$$x_{t+1} = \Phi \, x_t + \xi_t \tag{9}$$

$$y_t = h(x_t) + \eta_t \tag{10}$$

Here, the observations are the 2D features (in $(u,v)$ coordinates),
which are concatenated into an observation vector $y_t$ for each
moment in time. The observations are caused by the internal state of
the system, $x_t$, which contains the scene's 3D structure, the
relative 3D motion between the camera and the scene and the camera's
internal geometry. The mapping $h$ from $x_t$ to $y_t$ is tricky
in SfM since it is nonlinear (its Jacobian $H_t$ varies with $x_t$)
and $y_t$ is also corrupted by some noise. Here, the noise is
represented as an additive Gaussian (normal, $\mathcal{N}$)
process with zero mean and time-varying covariance $R_t$. The matrix
$R_t$ probabilistically encodes the accuracy of the measured 2D
feature coordinates and can represent features that are missing in
certain frames by assigning large variances to the corresponding
entries of $R_t$.
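
As a rough illustration of how missing features can be folded into $R_t$, the sketch below stacks per-frame $(u,v)$ measurements into an observation vector and builds the diagonal of a noise covariance. The function name `build_observation`, the parameters `sigma` and `missing_sigma`, and the assumption that $R_t$ is diagonal (independent noise on each coordinate) are illustrative choices and are not taken from the text.

```python
from typing import List, Optional, Tuple

# A tracked feature is a (u, v) pair, or None when it is missing in this frame.
Feature = Optional[Tuple[float, float]]


def build_observation(features: List[Feature],
                      sigma: float = 1.0,
                      missing_sigma: float = 1e6) -> Tuple[List[float], List[float]]:
    """Return (y, r_diag): the stacked measurements and the diagonal of R_t.

    Assumption: noise is independent per coordinate, so R_t is diagonal.
    """
    y: List[float] = []
    r_diag: List[float] = []
    for f in features:
        if f is None:
            # Placeholder coordinates; the huge variance makes the filter
            # effectively ignore them.
            y.extend([0.0, 0.0])
            r_diag.extend([missing_sigma ** 2] * 2)
        else:
            y.extend(f)
            r_diag.extend([sigma ** 2] * 2)
    return y, r_diag
```

With a large enough `missing_sigma`, the Kalman gain for the missing coordinates becomes negligible, so those entries contribute almost nothing to the state update.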

In addition, the dynamics of the internal state are constrained. The
3D structure, 3D motion and camera geometry do not vary wildly but are
linearly dependent (via the matrix $\Phi$) on their previous values at
the past time interval plus Gaussian noise. The noise process is
additive with zero mean and covariance $Q$. For generality, we assume
that the motion of the camera through the scene is not known a priori
and thus $\Phi$ is set to identity. Therefore, the internal state
varies only through some Gaussian random noise process. This can be
seen as a 'random walk' type of internal state space. In other words,
the vector $x_t$ varies randomly but smoothly with small deltas from
its past values.
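
The random-walk behaviour can be sketched in a few lines. The example below propagates a state vector under identity dynamics with additive Gaussian noise; the function name `random_walk`, the isotropic noise level `q_std` (that is, $Q = q_{\text{std}}^2 I$) and the fixed seed are assumptions made only for illustration.

```python
import random
from typing import List


def random_walk(x0: List[float], q_std: float, steps: int, seed: int = 0) -> List[List[float]]:
    """Simulate x_{t+1} = x_t + noise, with identity dynamics.

    Assumption: the noise is isotropic, each component drawn from N(0, q_std^2).
    """
    rng = random.Random(seed)
    trajectory = [list(x0)]
    for _ in range(steps):
        previous = trajectory[-1]
        # Each component drifts by a small, independent Gaussian delta.
        trajectory.append([xi + rng.gauss(0.0, q_std) for xi in previous])
    return trajectory
```

A small `q_std` keeps consecutive states close together, which is the smoothness constraint described above; a larger value lets the state drift more freely between frames.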

This dynamic system encodes the causal and dynamic nature of the SfM
problem and allows an elegant integration of multiple frames from
image sequences. It is also a probabilistic framework for representing
uncertainty. These dynamical systems have been extensively studied and
are routinely solved via reliable Kalman Filtering (KF) techniques. In
our nonlinear case, an Extended Kalman Filter (EKF) is utilized, which
linearizes the measurement mapping $h$ about the current state
estimate at each time step.
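
To make the linearization step concrete, here is a minimal scalar sketch of one EKF predict/update cycle under identity dynamics. It is a toy and not the full SfM filter: the state is a single unknown depth, and the measurement function `project` (a point at unit lateral offset seen with unit focal length), the helper names `ekf_step` and `estimate_depth`, and the noise levels `q` and `r` are all hypothetical choices for illustration.

```python
from typing import Callable, Tuple


def ekf_step(x: float, p: float, y: float,
             h: Callable[[float], float],
             h_jac: Callable[[float], float],
             q: float, r: float) -> Tuple[float, float]:
    """One EKF predict/update cycle for a scalar state with identity dynamics."""
    # Predict: identity dynamics keep the mean and inflate the variance by q.
    x_pred = x
    p_pred = p + q
    # Linearize the measurement mapping about the predicted state.
    H = h_jac(x_pred)
    # Update: standard Kalman gain computed from the local linearization.
    s = H * p_pred * H + r
    k = p_pred * H / s
    x_new = x_pred + k * (y - h(x_pred))
    p_new = (1.0 - k * H) * p_pred
    return x_new, p_new


# Toy measurement (hypothetical): a point at unit lateral offset and depth z
# projects to image coordinate 1/z under unit focal length.
def project(z: float) -> float:
    return 1.0 / z


def project_jac(z: float) -> float:
    return -1.0 / (z * z)


def estimate_depth(y: float, z0: float, steps: int = 100) -> float:
    """Repeatedly filter the same noise-free measurement y, starting from z0."""
    z, p = z0, 1.0
    for _ in range(steps):
        z, p = ekf_step(z, p, y, project, project_jac, q=1e-4, r=1e-4)
    return z
```

In a full SfM setting the scalars become vectors and matrices, the gain computation involves a matrix inverse, and the Jacobian is re-evaluated at every frame because it depends on the current state estimate.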

The representation of the measurement vector $y_t$ is simply the
concatenation of the 2D feature point measurements. We now turn our
attention to the representation of the internal state $x_t$ of
the unknowns of the system: the 3D structure, 3D motion and internal
camera geometry. This step is critical since the effectiveness of the
Kalman filtering framework depends strongly on the representation.