Integration of Virtual Objects into Live Video Scenes in Real-Time

The CaTS Camera Tracking System

Problem

Virtual studios are an innovative way of extending the functionality of traditional TV studios: background and props for a TV show are no longer built from real material but generated and rendered within a graphics computer. The camera image showing a real scene is merged with a computer-generated image to achieve the visual effect of real people acting in a virtual environment. The merging of real and virtual worlds is daily practice, e.g., in the TV weather forecast, where the presenter points to a virtual map that does not exist in the studio.

Currently, the camera settings (position, pan, tilt, roll, zoom) are carefully adjusted manually to the settings of the graphics transformation pipeline and have to remain unchanged to ensure proper alignment of the two images. In the case of a moving camera, the real and synthetic images have to be adjusted frame by frame. For real-time applications, the integration has to be done automatically, on the fly, in a time close to the video frame rate (1/25 sec). A satisfactory solution to this problem has not yet been found. The settings of the real camera can be measured using sensors; however, this implies special equipment which is expensive, clumsy, and not available everywhere, for instance outside of studios.

Our approach

In the CaTS system, the settings of the real camera are calculated solely by analyzing the video image. By applying specialized methods of image processing, pattern recognition, and computer vision, given reference points and anchor objects in the video are located and tracked in each frame. These points allow the calculation of the proper perspective projection and a proper alignment of the real and virtual worlds.
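The core geometric step can be illustrated with a minimal sketch: if the reference points lie on a known plane (such as the flip chart), at least four point correspondences determine the planar projection (a homography) via the direct linear transform. This is only an illustration of the principle, not the CaTS implementation; all coordinates below are made up.

```python
import numpy as np

def estimate_homography(world_pts, image_pts):
    """Estimate the 3x3 homography H with image ~ H @ world (DLT, >= 4 points)."""
    A = []
    for (X, Y), (x, y) in zip(world_pts, image_pts):
        A.append([-X, -Y, -1, 0, 0, 0, x * X, x * Y, x])
        A.append([0, 0, 0, -X, -Y, -1, y * X, y * Y, y])
    # The homography is the null vector of A, found via SVD.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def project(H, pt):
    """Apply H to a 2D point in homogeneous coordinates."""
    v = H @ np.array([pt[0], pt[1], 1.0])
    return v[:2] / v[2]
```

Once H is known for a frame, any point on the reference plane can be mapped into the image, which is the basis for drawing a virtual object at the right place.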

Scenario

Virtual props extend the classical props used in theater: they are synthetic objects included in a real scene. They combine the capabilities of real props and computer-controlled items: they can be animated, can grow and move, and can be manipulated by persons within the real scene.

Figure 1.
Real scene with anchor point in the centre of the table
Figure 2.
Virtual cube at position of anchor point
Figure 3.
The virtual cube retains proper position and size after camera move and zoom

Figure 1 shows a table with an anchor point in the centre. The flip chart in the background contains reference points for calculating the projection in use. By picture analysis, the anchor and reference points are found and the projection is calculated. Then a virtual object, e.g., a box, can be inserted on the table at the position of the anchor point under the correct projection (figure 2). Moving, panning, and zooming the camera yields a different view of the table. However, by tracking the reference points in real time, the box always remains at the proper position, in the proper size and view, with respect to the real table (figure 3). Similarly, the real scene could be mixed with an arbitrary three-dimensional background which behaves correctly under all camera changes.
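Rendering a virtual cube like the one in figures 2 and 3 amounts to projecting its 3D corners with the camera model recovered from the reference points. The following sketch uses a hypothetical 3x4 camera matrix P = K[R|t]; the intrinsics, pose, and cube size are invented for illustration and are not values from the CaTS system.

```python
import numpy as np

def project_points(P, pts3d):
    """Project 3D points with a 3x4 camera matrix P into pixel coordinates."""
    pts_h = np.hstack([pts3d, np.ones((len(pts3d), 1))])  # homogeneous coords
    uv = (P @ pts_h.T).T
    return uv[:, :2] / uv[:, 2:3]

# A small cube sitting on the anchor point (table origin), side 0.2 m.
s = 0.2
cube = np.array([[x, y, z] for x in (0, s) for y in (0, s) for z in (0, s)])

# Hypothetical intrinsics and pose, as if recovered from the reference points:
# focal length 800 px, principal point (320, 240), camera 2 m from the table.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [2.0]])])
P = K @ Rt

corners_px = project_points(P, cube)  # where to draw the cube in this frame
```

When the camera moves or zooms, only K, R, and t change; re-estimating them per frame and re-projecting the same 3D corners is what keeps the cube locked to the table.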

CaTS is a contribution to the virtual studio built at GMD. It has been implemented on Silicon Graphics Indigo and Onyx workstations equipped with Galileo and Sirius boards. The project started in 1994 and ended in 1996.

Research Staff: Marko Jahnke, Klaus Kansy, Ralf Ratering, Günther Schmitgen, Peter Wisskirchen


Please contact: Klaus Kansy