Studio camera position and motion may be derived from the camera image by separating out the background and, from a background having a number of areas differing in hue and/or brightness from adjacent areas, deriving estimates of movement from one image to the next. The initial image is used as a reference and amended with a predicted motion value. The amended image is compared with incoming images and the result is used to derive translation and scale change information. Once the proportion of the reference image contained in an incoming image falls below a threshold, a fresh reference image is adopted.

Patent: RE38420
Priority: Aug 12, 1992
Filed: Mar 25, 1998
Issued: Feb 10, 2004
Expiry: Aug 12, 2013
38. A method of deriving a measure of translation and scale change from a camera output signal, the method comprising:
providing a background with a pattern comprising a plurality of areas of a first tone and a plurality of areas of a second tone arranged so that transitions between areas occur along at least two axes, whereby the transitions provide reference points from which measures of both translation and scale change may be determined;
receiving said camera output signal defining a camera image from a camera viewing a scene containing at least a portion of said background;
detecting said first and second tones in the camera output signal by chroma-keying; and
deriving from results of said detecting step a measure of translation and scale change.
57. Apparatus for deriving at least one parameter representing the movement of a camera from a camera image, said camera viewing at least a portion of a background having a pattern comprising a plurality of areas of a first tone and a plurality of areas of a second tone arranged so that transitions between areas occur along at least two axes, whereby the transitions provide reference points from which measures of both translation and scale change may be determined, the apparatus comprising:
means for receiving an output signal from said camera, said signal defining a camera image;
means for detecting said first and second tones in the camera image by chroma-keying; and
means for deriving from results of said detecting means a measure of translation and scale change.
77. Apparatus for deriving at least one parameter representing the movement of a camera from a camera image, said camera viewing at least a portion of a background having a pattern comprising a plurality of areas of a first tone and a plurality of areas of a second tone arranged so that transitions between areas occur along at least two axes, whereby the transitions provide reference points from which measures of both translation and scale change may be determined, the apparatus comprising:
an input receiving an output signal from said camera, said signal defining a camera image; and
a signal processing system that 1) detects said first and second tones in the camera image by chroma-keying; and 2) derives from the results of said detection a measure of translation and scale change.
16. Apparatus for measuring the translation and scale change in a sequence of video images, comprising:
means for acquiring said sequence of images;
storage means for storing a first image in said sequence to form a stored image;
means for forming a prediction of the translation and scale change from said stored image to a further image in said sequence;
means for comparing said stored image to said further image by transforming at least one of said stored image and said further image based on said prediction of the translation and scale change;
means for deriving measurements of translation and scale change between said stored image and said further image from said comparison and said prediction; and
means for replacing said first image with a new incoming image when the image area common to both said first image and said incoming image falls below a given proportion of the whole image area of said incoming image.
31. Apparatus for measuring the translation and scale change in a sequence of video images, comprising:
an image storage unit that stores a first image in said sequence to form a stored image;
a prediction unit that forms a prediction of the translation and scale change from said stored image to a further image in said sequence;
an image transformer that compares said stored image to said further image by transforming at least one of said stored image and said further image based on said prediction of the translation and scale change;
a measurement unit that derives measurements of translation and scale change between said stored image and said further image from said comparison and said prediction; and
a refresh signal generator that replaces said first image with a new incoming image when the image area common to both said first image and said incoming image falls below a given proportion of the whole image area of said incoming image.
1. A method of measuring the translation and scale change in a sequence of video images, the method comprising:
storing a first image in said sequence to form a stored image;
forming a prediction of the translation and scale change from said stored image to a further image in said sequence;
comparing said stored image to said further image by transforming at least one of said stored image and said further image based on said prediction of the translation and scale change;
deriving measurements of translation and scale change between said stored image and said further image from said comparison and said prediction; and
replacing said first image with a new incoming image when the image area common to both said first image and said incoming image falls below a given proportion of the whole image area of said incoming image.
35. Apparatus for measuring the translation and scale change in a sequence of video images derived by a camera, comprising:
an image storage unit that stores a first image of said sequence to form a stored image, said image comprising a single-component signal derived by a camera viewing a scene containing a background of near-uniform color, said background being divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow generation of a key signal by chroma-key techniques;
a signal former that forms said single-component signal from said camera signal so as to accentuate differences in hue and/or brightness of individual areas of the background to enable motion estimation;
a prediction unit that forms a prediction of the translation and scale change from said stored image to a further image in said sequence;
an image transformer that compares said stored image to said further image by transforming at least one of the stored image and said further image based on said prediction of the translation and scale change; and
a measurement unit that derives, from said comparison and said prediction, measurements of translation and scale change between said stored image and said further image.
6. A method of measuring the translation and scale change in a sequence of video images derived by a camera, comprising:
storing a first image of said sequence to form a stored image, said image comprising a single-component signal derived by a camera viewing a scene containing a background of near-uniform color, and said background being divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow generation of a key signal by chroma-key techniques;
forming said single-component signal from said camera signal so as to accentuate differences in hue and/or brightness of individual areas of the background to enable motion estimation;
forming a prediction of the translation and scale change from said stored image to a further image in said sequence;
comparing said stored image to said further image by transforming at least one of said stored image and said further image based on said prediction of the translation and scale change; and
deriving measurements of translation and scale change between said stored image and said further image from said comparison and said prediction.
20. Apparatus for measuring the translation and scale change in a sequence of video images derived by a camera, comprising:
means for storing a first image of said sequence to form a stored image, said image comprising a single-component signal derived by a camera viewing a scene containing a background of near-uniform color, said background being divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow generation of a key signal by chroma-key techniques;
means for forming said single-component signal from said camera signal so as to accentuate differences in hue and/or brightness of individual areas of the background to enable motion estimation;
means for forming a prediction of the translation and scale change from said stored image to a further image in said sequence;
means for comparing said stored image to said further image by transforming at least one of the stored image and said further image based on said prediction of the translation and scale change; and
means for deriving, from said comparison and said prediction, measurements of translation and scale change between said stored image and said further image.
2. The method of claim 1, wherein only the background areas of said video images are used in the measurement of translation and scale change.
3. The method of claim 2, wherein a signal is used to separate foreground and background and said signal is derived using chroma-key techniques.
4. The method of claim 2, wherein a signal is used to separate foreground and background portions of the images and said signal is derived using motion detection methods to identify objects moving with a different motion from that predicted for the background.
5. The method of claim 1, wherein:
each image is a single-component signal derived from a camera viewing a scene containing a background of near-uniform color;
said background is divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow the generation of a key signal by chroma-key techniques; and
wherein said single-component signal is formed from a three-component camera signal so as to accentuate differences in hue and/or brightness of individual areas of said background to enable motion estimation.
7. The method of claim 6, wherein said background is divided into a plurality of areas, each area having one of two hues and/or brightnesses.
8. The method of claim 6, wherein said areas of the background are square.
9. The method of claim 1 or 6, wherein said translation and scale change are predicted by the computation of a number of simultaneous equations, each of which relates said translation and scale change to spatial and temporal gradients at a point in the reference image, and which are solved to yield a least-squares solution for the motion parameters.
10. The method of claim 1 or 6, comprising:
selecting a number of measurement points in said first image for motion estimation; and
replacing said first image with a new incoming image when the number of measurement points which lie in areas of background visible in both said first image and a given incoming image falls below a given proportion of the total number of measurement points.
11. The method of claim 1 or 6, comprising replacing said first image if the scale change between an incoming image and the reference image exceeds a given factor.
12. The method of claim 1 or 6, comprising spatially prefiltering said first image prior to storage and comparison.
13. The method of claim 10, wherein said measurement points lie in a regular array in the image.
14. The method of claim 10 wherein said measurement points are chosen to lie at points of high spatial gradient.
15. The method of claim 1 or 6, comprising storing replaced reference images for later use.
17. The apparatus of claim 16, wherein said means for deriving operates only on the background areas of the images, said means for deriving comprising means for separating foreground and background portions of the images using motion techniques to identify objects moving with a different motion from that predicted for the background.
18. The apparatus of claim 16, wherein said means for deriving operates only on the background areas of the images, and said means for deriving comprising a key generator for generating a chroma-key to separate foreground and background.
19. The apparatus of claim 16, comprising:
means for generating a single-component signal from a camera viewing a scene containing a background of near-uniform color, said background being divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow the generation of a key signal by chroma-key techniques, and wherein the single-component signal is formed from a three-component camera signal so as to accentuate differences in hue and/or brightness of individual areas of the background to enable motion estimation.
21. The apparatus of claim 19 or 20, wherein said background is divided into a plurality of areas, each area having one of two hues and/or brightnesses.
22. The apparatus of claim 19 or 20, wherein said areas of the background are square.
23. The apparatus of claim 19 or 20, comprising:
means for predicting the translation and scale change by the computation of a number of simultaneous equations, each of which relates the translation and scale change to spatial and temporal gradients at a point in the first image, and which are solved to yield a least-squares solution for the motion parameters.
24. The apparatus of claim 19 or 20, wherein the replacing means comprises:
means for selecting a number of measurement points in the first image for motion estimation and for replacing the first image with a new incoming image when the number of measurement points which lie in areas of background visible in both the first image and a given incoming image falls below a given proportion of the total number of measurement points.
25. The apparatus of claim 16 or 20, comprising a spatial filter for filtering the images prior to storage and comparison.
26. The apparatus of claim 16 or 20, comprising a further storage means for storing a replaced reference image for future use.
27. The method of claim 1 wherein said step of comparing said stored image to said further image comprises substeps of:
transforming said stored image by said prediction of the translation and scale change to form a transformed first image; and
comparing a further image in said sequence with said transformed first image.
28. The method of claim 6 wherein said step of comparing said stored image to said further image comprises substeps of:
transforming said stored image by said prediction of the translation and scale change to form a transformed first image; and
comparing a further image in said sequence with said transformed first image.
29. The apparatus of claim 16 wherein said means for comparing said stored image to said further image comprises:
means for transforming said stored image by said prediction of the translation and scale change to form a transformed first image; and
means for comparing a further image in said sequence with said transformed first image.
30. The apparatus of claim 20 wherein said means for comparing said stored image to said further image comprises:
means for transforming said stored image by said prediction of the translation and scale change to form a transformed first image; and
means for comparing a further image in said sequence with said transformed first image.
32. The apparatus of claim 31, wherein said measurement unit operates only on the background areas of the images, said measurement unit separating foreground and background portions of the images using motion techniques to identify objects moving with a different motion from that predicted for the background.
33. The apparatus of claim 31, wherein said measurement unit operates only on the background areas of the images, and said measurement unit comprises a key generator for generating a chroma-key to separate foreground and background.
34. The apparatus of claim 31, comprising:
means for generating a single-component signal from a camera viewing a scene containing a background of near-uniform color, said background being divided into a plurality of areas each having a hue and/or brightness different from the hue and/or brightness of adjacent areas to allow the generation of a key signal by chroma-key techniques, and wherein the single-component signal is formed from a three-component camera signal so as to accentuate differences in hue and/or brightness of individual areas of the background to enable motion estimation.
36. The apparatus of claim 35, wherein said background is divided into a plurality of areas, each area having one of two hues and/or brightnesses.
37. The apparatus of claim 35, wherein said areas of the background are square.
39. A method according to claim 38, wherein the first and second tones differ in hue.
40. A method according to claim 38, wherein the first and second tones differ in brightness.
41. A method according to claim 38, wherein the first and second tones are of substantially equal brightness.
42. A method according to claim 38, wherein the pattern comprises a chequered pattern.
43. A method according to claim 38, wherein said chroma-keying is used to remove foreground objects from the image prior to determination of translation and scale change.
44. A method according to claim 38, wherein the camera output signal is spatially filtered prior to storage and comparison.
45. A method according to claim 38, wherein said measures of translation and scale change are determined using a stored reference image.
46. A method according to claim 45, wherein translation and scale change are determined by transforming the reference image and comparing with the camera image.
47. A method according to claim 46, wherein said transforming is based on a prediction of the translation and scale change.
48. A method according to claim 45, wherein translation and scale change are determined by computation of a number of simultaneous equations, each of which relates the translation and scale change to spatial and temporal gradients at a point in the reference image, and which are solved to yield a least-squares solution for the motion parameters.
49. A method according to claim 45, wherein the stored reference image is replaced when the accumulated translation and scale change exceeds a given threshold.
50. A method according to claim 38, wherein translation and scale change are determined from a sub-set of the pixels in the camera image.
51. A method according to claim 50, wherein the sub-set of pixels lie in a regular array in the camera image.
52. A method according to claim 50, wherein the sub-set of pixels are chosen to lie at points of high spatial gradient.
53. A method according to claim 38, wherein a stored reference image is compared to the camera image after transforming the reference image based on a prediction of translation and scale change, and a measure of translation and scale change is derived from the comparison.
54. A method according to claim 53, wherein the reference image comprises an earlier camera image.
55. A method according to claim 54, wherein the reference image is replaced when the overlap between the reference image and the current camera image falls below a given proportion.
56. A method according to claim 38, wherein data obtained from a sensor on the camera is used in derivation of said measures of translation and scale change.
58. Apparatus according to claim 57, further comprising said background.
59. Apparatus according to claim 57, further comprising said camera.
60. Apparatus according to claim 57, further comprising said background and said camera.
61. Apparatus according to claim 57, wherein the first and second tones differ in hue.
62. Apparatus according to claim 57, wherein the first and second tones differ in brightness.
63. Apparatus according to claim 57, wherein the first and second tones are of substantially equal brightness.
64. Apparatus according to claim 57, wherein the pattern comprises a chequered pattern.
65. Apparatus according to claim 57, wherein said detecting means is arranged to remove foreground objects from the image prior to determination of translation and scale change.
66. Apparatus according to claim 57, comprising means for spatially filtering the camera image prior to storage and comparison.
67. Apparatus according to claim 57, further comprising an image store that stores a reference image for use by said deriving means.
68. Apparatus according to claim 57, wherein the deriving means is configured to determine translation and scale change by transforming the reference image and comparing with the camera image.
69. Apparatus according to claim 68, wherein said transforming is based on a prediction of the translation and scale change.
70. Apparatus according to claim 67, wherein translation and scale change are determined by computation of a number of simultaneous equations, each of which relates the translation and scale change to spatial and temporal gradients at a point in the reference image, and which are solved to yield a least-squares solution for the motion parameters.
71. Apparatus according to claim 67, wherein the stored reference image is replaced when the accumulated translation and scale change exceeds a given threshold.
72. Apparatus according to claim 57, wherein said deriving means is configured to determine translation and scale change from a sub-set of the pixels in the camera image.
73. Apparatus according to claim 72, wherein the sub-set of pixels lie in a regular array in the camera image.
74. Apparatus according to claim 72, wherein the sub-set of pixels are chosen to lie at points of high spatial gradient.
75. Apparatus according to claim 57, further comprising a sensor on the camera that provides data for use in derivation of said measures of translation and scale change.
76. Apparatus according to claim 57, wherein said deriving means comprises digital signal processing apparatus.

This invention relates to the derivation of information regarding the position of a television camera from image data acquired by the camera.

In television production, it is often required to video live action in the studio and electronically superimpose the action on a background image. This is usually done by shooting the action in front of a blue background and generating a 'key' from the video signal to distinguish between foreground and background. In the background areas, the chosen background image can be electronically inserted.

One limitation to this technique is that the camera in the studio cannot move, since this would generate motion of the foreground without commensurate background movement. One way of allowing the camera to move is to use a robotic camera mounting that allows a predefined camera motion to be executed, the same camera motion being used when the background images are shot. However, the need for predefined motion places severe artistic limitations on the production process.

Techniques are currently under development that aim to generate background images electronically so that they can be changed as the camera moves and remain appropriate to the current camera position. Thus a means of measuring the position of the camera in the studio is required. One way in which this can be done is to attach sensors to the camera to determine its position and angle of view; however, the use of such sensors is not always practical.

The problem addressed here is how to derive the position and motion of the camera using only the video signal from the camera, so that the technique can be used on an unmodified camera without special sensors.

The derivation of the position and motion of a camera by analysis of its image signal is a task often referred to as passive navigation; there are many examples of approaches to this problem in the literature, the more pertinent of which are cited in the description below.

The algorithm chosen for measuring global translation and scale change must satisfy the following criteria:

1. The chosen algorithm cannot be too computationally intensive, since it must run in real-time;

2. It must be capable of highly accurate measurements, since measurement errors will manifest themselves as displacement errors between foreground and background;

3. Measurement errors should not accumulate to a significant extent as the camera moves further away from its starting point.

Embodiment 1: Motion Estimation Followed by Global Motion Parameter Determination

An example of one type of algorithm that could be used is one based on the recursive spatio-temporal gradient technique described in reference 4 [Netravali and Robbins 1979]. This kind of algorithm is known to be computationally efficient and able to measure small displacements to a high accuracy. Other algorithms, based on the block matching described in reference 6 [Uomori et al. 1992] or the phase correlation described in reference 5 [Thomas 1987], may also be suitable.

The algorithm may be used to estimate the motion on a sample-by-sample basis between each new camera image and a stored reference image. The reference image is initially that viewed by the camera at the start of shooting, when the camera is in a known position. Before each measurement, the expected translation and scale change are predicted from previous measurements and the reference image is subjected to a translation and scale change of this estimated amount. Thus the motion estimation process need only measure the difference between the actual and predicted motion.

The motion vector field produced is analysed to determine the horizontal and vertical displacement and scale change. This can be done by selecting a number of points in the vector field likely to have accurate vectors (for example in regions having both high image detail and uniform vectors). The scale change can be determined by examining the difference between selected vectors as a function of the spatial separation of the points. The translation can then be determined from the average values of the measured vectors after discounting the effect of the scale change. The measured values are added to the estimated values to yield the accumulated displacement and scale change for the present camera image.
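As a rough illustration, the sketch below (a hypothetical Python fragment, not part of the patent) carries out this two-step analysis: the scale term is estimated from pairwise vector differences as a function of point separation, and the translation from the mean vector after removing the scale contribution. Positions are assumed to be expressed relative to the image centre.

```python
import numpy as np

def analyse_vector_field(points, vectors):
    """Estimate scale change Z and translation T from a sparse vector field,
    assuming each background vector obeys v = (Z - 1) * p + T.

    points  : (N, 2) vector positions, relative to the image centre
    vectors : (N, 2) measured displacements at those positions
    """
    p = np.asarray(points, dtype=float)
    v = np.asarray(vectors, dtype=float)
    # Scale: differences between selected vectors against the spatial
    # separation of the points, v_i - v_j ~ (Z - 1) * (p_i - p_j).
    i, j = np.triu_indices(len(p), k=1)
    dp = (p[i] - p[j]).ravel()
    dv = (v[i] - v[j]).ravel()
    usable = np.abs(dp) > 1.0          # ignore near-zero separations
    z_minus_1 = np.median(dv[usable] / dp[usable])
    # Translation: average vector after discounting the scale change.
    translation = np.mean(v - z_minus_1 * p, axis=0)
    return 1.0 + z_minus_1, translation
```

The median over pairwise differences is one simple way of favouring the consistent vectors that the text suggests selecting; a weighted fit over hand-picked high-detail regions would be closer to the description.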

More sophisticated methods of analysing the vector field could be added in future, for example in conjunction with means for determining the depth of given image points, to extend the flexibility of the system.

As the accumulated translation and scale change get larger, the translated reference image will begin to provide a poor approximation to the current camera image. For example, if the camera is panning to the right, picture material on the right of the current image will not be present in the reference image and so no motion estimate can be obtained for this area. To alleviate this problem, once the accumulated values exceed a given threshold, the reference image is replaced by the present camera image. Each time this happens, however, measurement errors will accumulate.

All the processing will be carried out on images that have been spatially filtered and subsampled. This will reduce the amount of computation required, with no significant loss in measurement accuracy. The filtering process also softens the image; this is known to improve the accuracy and reliability of gradient-type motion estimators. Further computational savings can be achieved by carrying out the processing between alternate fields rather than for every field; this will reduce the accuracy with which rapid acceleration can be tracked but this is unlikely to be a problem since most movements of studio cameras tend to be smooth.
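A minimal sketch of such a prefiltering stage follows; the separable binomial kernel and the subsampling factor are illustrative assumptions, as the text does not specify the filter.

```python
import numpy as np

def prefilter_and_subsample(image, factor=2):
    """Spatially low-pass filter an image with a separable [1, 2, 1]/4
    binomial kernel, then subsample. The deliberate softening improves the
    accuracy and reliability of gradient-type motion estimators."""
    k = np.array([1.0, 2.0, 1.0]) / 4.0
    # Separable convolution: filter each row, then each column.
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, image)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)
    return out[::factor, ::factor]
```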

Software to implement the most computationally-intensive parts of the processing has been written and benchmarked, to provide information to aid the specification and design of the hardware accelerator. The benchmarks showed that the process of filtering and down-sampling the incoming images is likely to use over half of the total computation time.

Embodiment 2: Direct Estimation of Global Motion Parameters

An alternative and preferred method of determining global translation and scale change is to derive them directly from the video signal. A method of doing this is described in reference 7 [Wu and Kittler 1990]. We have extended this method to work using a stored reference image and to use the predicted motion values as a starting point. Furthermore, the technique is applied only at a sub-set of pixels in the image, which we have termed measurement points, in order to reduce the computational load. As in the previous embodiment, the RGB video signal is matrixed to form a single-component signal and spatially low-pass filtered prior to processing. As described previously, only areas identified by a key signal as background are considered.

The method is applied by considering a number of measurement points in each incoming image and the corresponding points in the reference image, displaced according to the predicted translation and scale change. These predicted values may be calculated, for example, by linear extrapolation of the measurements made in the preceding two images. The measurement points may be arranged as a regular array, as shown in FIG. 2. A more sophisticated approach would be to concentrate measurement points in areas of high luminance gradient, to improve the accuracy when a limited number of points are used. We have found that 500-1000 measurement points distributed uniformly yields good results. Points falling in foreground areas (as indicated by the key signal) are discarded, since it is the motion of the background that is to be determined.
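The following sketch lays out measurement points on a regular grid and discards those that the key signal marks as foreground; the grid spacing and margin are hypothetical parameters, chosen here so that a typical image yields a few hundred to a thousand points.

```python
import numpy as np

def select_measurement_points(background_key, spacing=25, margin=8):
    """Return (x, y) measurement points on a regular grid, keeping only
    points that fall in background areas of the image.

    background_key : 2-D boolean array, True where the key signal
                     indicates background
    """
    h, w = background_key.shape
    points = [(x, y)
              for y in range(margin, h - margin, spacing)
              for x in range(margin, w - margin, spacing)
              if background_key[y, x]]
    return np.array(points)
```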

At each measurement point, luminance gradients are calculated as shown in FIG. 3. These may be calculated, for example, by simply taking the difference between pixels either side of the measurement point. Spatial gradients are also calculated for the corresponding point in the reference image, offset by the predicted motion. Sub-pixel interpolation may be employed when calculating these values. The temporal luminance gradient is also calculated; again sub-pixel interpolation may be used in the reference image. An equation is formed relating the measured gradients to the motion values as follows:

Gradients are (approximately) related to displacement and scale changes by the equation

g_x·X + g_y·Y + (Z - 1)·(g_x·x + g_y·y) = g_t

where

g_x = (g_rx + g_cx)/2 and g_y = (g_ry + g_cy)/2

are the horizontal and vertical luminance gradients averaged between the two images;

g_cx, g_cy are the horizontal and vertical luminance gradients in the current image;

g_rx, g_ry are the horizontal and vertical luminance gradients in the reference image;

g_t is the temporal luminance gradient; and

(x, y) is the position of the measurement point.

X and Y are the displacements between the current and reference images and Z is the scale change (over and above those predicted).

An equation is formed for each measurement point and a least-squares solution is calculated to obtain values for X, Y, and Z.

Derivation of the equation may be found in reference 7 [Wu and Kittler 1990] (this reference includes the effect of image rotation; we have omitted rotation since it is of little relevance here, as studio cameras tend to be mounted such that they cannot rotate about the optic axis).

The set of simultaneous linear equations derived in this way (one for each measurement point) is solved using a standard least-squares solution method to yield estimates of the difference between the predicted and the actual translation and scale change. The calculated translation values are then added to the predicted values to yield the estimated translation between the reference image and the current image.
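A sketch of this step in Python is shown below; the variable names and the sign convention for the temporal gradient are assumptions, but the structure (one linear equation per measurement point, solved in the least-squares sense) follows the description above.

```python
import numpy as np

def solve_gradient_equations(g_x, g_y, g_t, xs, ys):
    """Solve the per-point equations
        g_x*X + g_y*Y + (Z - 1)*(g_x*x + g_y*y) = g_t
    for the residual translation (X, Y) and scale change Z.

    g_x, g_y : spatial luminance gradients averaged between the two images
    g_t      : temporal luminance gradients at the measurement points
    xs, ys   : measurement point coordinates relative to the image centre
    """
    g_x, g_y, g_t = map(np.asarray, (g_x, g_y, g_t))
    xs, ys = np.asarray(xs), np.asarray(ys)
    A = np.column_stack([g_x, g_y, g_x * xs + g_y * ys])  # one row per point
    (X, Y, z), *_ = np.linalg.lstsq(A, g_t, rcond=None)
    return X, Y, 1.0 + z   # z is the estimate of Z - 1
```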

Similarly, the calculated and predicted scale changes are multiplied together to yield the estimated scale change. The estimated values thus calculated are then used to derive a prediction for the translation and scale change of the following image.
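Concretely, the update and prediction step might look like the sketch below. The additive/multiplicative combination follows the text; extrapolating the scale change geometrically is one plausible reading of "linear extrapolation" for a multiplicative quantity, so treat that choice as an assumption.

```python
def combine(predicted, measured):
    """Accumulate the estimate for the current image: translations add,
    scale changes multiply (the measured values are residuals over and
    above the prediction)."""
    (px, py, pz), (mx, my, mz) = predicted, measured
    return (px + mx, py + my, pz * mz)

def predict_next(previous, current):
    """Extrapolate from the two most recent estimates: linearly for the
    translations, geometrically for the scale change."""
    (ax, ay, az), (bx, by, bz) = previous, current
    return (2 * bx - ax, 2 * by - ay, bz * bz / az)
```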

As described earlier, the reference image is updated when the camera has moved sufficiently far away from its initial position. This automatic refreshing process may be triggered, for example, when the area of overlap between the incoming and reference images goes below a given threshold. When assessing the area of overlap, the key signal needs to be taken into account, since, for example, an actor who obscured the left half of the background in the reference image might move so that he obscures the right half, leaving no visible background in common between the incoming and reference images. One way of measuring the degree of overlap is to count the number of measurement points that are usable (i.e. that fall in visible background areas of both the incoming and reference images). This number may be divided by the number of measurement points that were usable when the reference image was first used, to obtain a measure of the usable image area as a fraction of the maximum area obtainable with that reference image. If the initial number of usable points in a given reference image was itself below a given threshold, this would indicate that most of the image was taken up with foreground rather than background, and a warning message should be produced.
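One way to express this refresh test, with illustrative (not patent-specified) thresholds:

```python
def reference_needs_refresh(usable_now, usable_at_adoption,
                            min_fraction=0.5, min_initial=100):
    """Trigger a reference refresh when the number of usable measurement
    points (visible background in both incoming and reference images)
    falls below a fraction of the count when the reference was adopted.

    min_fraction and min_initial are illustrative values only.
    """
    if usable_at_adoption < min_initial:
        # Mostly foreground when the reference was adopted: warn the operator.
        print("warning: reference image shows mostly foreground")
    return usable_now < min_fraction * usable_at_adoption
```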

It can also be advantageous to refresh the reference image if the measured scale change exceeds a given range (e.g. if the camera zooms in a long way). Although in this situation the number of usable measurement points may be very high, the resolution of the stored reference image could become inadequate to allow accurate motion estimation.

When the reference image is updated, it can be retained in memory for future use, together with details of its accumulated displacement and scale change. When a decision is made that the current reference image is no longer appropriate, the stored images can be examined to see whether any of them gives a suitable view of the scene. This assessment can be carried out using criteria similar to those explained above. For example, if the camera pans to the left and then back to its starting position, the initial reference image may be re-used as the camera approaches this position. This ensures that the measurements of camera orientation made at the end of the sequence will be as accurate as those made at the beginning.

Referring back to FIG. 1, apparatus for putting each of the two motion estimation methods into practice is shown. A camera 10 derives a video signal from the background 12 which, as described previously, may be patterned in two tones as shown in FIG. 4. The background cloth shown in FIG. 4 shows a two-tone arrangement of squares. Squares 30 of one tone are arranged adjacent to squares 32 of the other tone. Shapes other than squares may be used and it is possible to use more than two different tones. Moreover, the tones may differ in both hue and brightness or in either hue or brightness. At present, it is considered preferable for the brightness to be constant, as variations in brightness might show in the final image.

Although the colour blue is the most common for the backcloth, other colours, for example green or orange, are sometimes used when appropriate. The technique described is not peculiar to any particular background colour; it requires only a slight variation in hue and/or brightness between a number of different areas of the background, such that the areas adjacent to a given area have a brightness and/or hue different from that of the given area. This contrast enables motion estimation from the background to be performed.

Red, green and blue (RGB) colour signals formed by the camera are matrixed into a single colour signal and applied to a spatial low-pass filter (at 14). The low-pass output is applied to an image store 16 which holds the reference image data and whose output is transformed at 18 by applying the predicted motion for the image. The motion-adjusted reference image data is applied, together with the low-pass filtered image, to a unit 20 which measures the net motion in background areas between an incoming image at input I and a stored reference image at input R. The unit 20 applies one of the motion estimation algorithms described. The net motion measurement is performed under the control of a key signal K derived by a key generator 22 from the unfiltered RGB output of the camera 10, to exclude foreground portions of the image from the measurement. The motion prediction signal is updated on the basis of previously measured motion, thus ensuring that the output from the image store 16 is accurately interpolated. When, as discussed previously, the camera has moved sufficiently far away from its initial position, a refresh signal 24 is sent from the net motion measurement unit 20 to the image store 16. On receipt of the refresh signal 24, a fresh image is stored in the image store and used as the basis for future net motion measurements.

The output from the net motion measurement unit 20 is used to derive an indication of current camera position and orientation as discussed previously.

Optionally, sensors 26 mounted on the camera can provide data to the net motion measurement unit 20 which augment or replace the image-derived motion signal.

The image store 16 may comprise a multi-frame store enabling storage of previous reference images as well as the current reference image.

The technique described can also be applied to image signals showing arbitrary picture material instead of just the blue background described earlier. If objects are moving in the scene, these can be segmented out by virtue of their motion rather than by using a chroma-key signal. The segmentation could be performed, for example, by discounting any measurement points for which the temporal luminance gradient (after compensating for the predicted background motion) was above a certain threshold. More sophisticated techniques for detecting motion relative to the predicted background motion can also be used.
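A minimal sketch of this motion-based segmentation, assuming the temporal gradients have already been compensated for the predicted background motion (the threshold value is illustrative):

```python
import numpy as np

def background_point_mask(compensated_g_t, threshold=10.0):
    """Keep measurement points whose motion-compensated temporal luminance
    gradient is small; larger values suggest independently moving
    foreground objects, which are discounted."""
    return np.abs(np.asarray(compensated_g_t)) <= threshold
```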

It will be understood that the techniques described may be implemented either by special purpose digital signal processing equipment, by software in a computer, or by a combination of these methods. It will also be clear that the technique can be applied equally well to any television standard.

Inventor: Thomas, Graham Alexander

References Cited
Patent Priority Assignee Title
4227212, Sep 21 1978 Micron Technology, Inc Adaptive updating processor for use in an area correlation video tracker
4393394, Aug 17 1981 Television image positioning and combining system
4739401, Jan 25 1985 Raytheon Company Target acquisition system and method
4823184, Apr 09 1984 CINTEL INC Color correction system and method with scene-change detection
4984078, Sep 02 1988 North American Philips Corporation Single channel NTSC compatible EDTV system
5018215, Mar 23 1990 Honeywell Inc. Knowledge and model based adaptive signal processor
5049991, Feb 20 1989 Victor Company of Japan, LTD Movement compensation predictive coding/decoding method
5073819, Apr 05 1990 ROADWARE CORPORATION Computer assisted video surveying and method thereof
5103305, Sep 27 1989 Kabushiki Kaisha Toshiba Moving object detecting system
5195144, Nov 15 1988 Thomson-CSF Process for estimating the distance between a stationary object and a moving vehicle and device for using this process
5351086, Dec 31 1991 ZTE Corporation Low-bit rate interframe video encoder with adaptive transformation block selection
5357281, Nov 07 1991 Canon Kabushiki Kaisha Image processing apparatus and terminal apparatus
EP236519,
EP360698,
EP366136,
GB2259625,
JP5793788,
WO8001977,
WO9016131,
WO9120155,
Assignment: Mar 25, 1998, British Broadcasting Corporation (assignment on the face of the patent); Jun 25, 1998, Thomas, Graham Alexander to British Broadcasting Corporation (assignment of assignors interest; reel/frame 009358/0784).