Apparatus and method for processing groups of fields in a video data compression system to encode a single frame as an I-field and a P-field

RE36507

A video compression system which is based on the image data compression system developed by the Moving Picture Experts Group (MPEG) uses various group-of-fields configurations to reduce the number of binary bits used to represent an image composed of odd and even fields of video information, where each pair of odd and even fields defines a frame. According to a first method, each field in the group of fields is predicted using the closest field which has previously been predicted as an anchor field. According to a second method, intra fields (I-fields) and predictive fields (p-fields) are distributed in the sequence so that no two I-fields and/or no two p-fields are at adjacent locations in the sequence. According to a third method, the number of I-fields and p-fields in the encoded sequence is reduced by encoding one field in a given frame as a p-field or a B-field where the other field is encoded as an I-field and encoding one field in a further frame as a B-field where the other field is encoded as a p-field.


Patent RE36507
Priority Oct 21 1997
Filed Oct 21 1997
Issued Jan 18 2000
Expiry Oct 21 2017
Inventors Iu, Siu L.
Assg.orig Matsushita…
Assg.curr Panasonic …
Entity Large
Referenced by 72
References 19
Maint.: all paid

12. Apparatus for automatically encoding a sequence of video image fields comprising:

means for encoding each field in said sequence of video image fields in a predetermined order to produce a sequence of encoded fields, wherein a plurality of the fields in the sequence are bidirectionally predictively encoded;

means for decoding each field in the sequence of encoded fields to produce a sequence of decoded fields; and

means for storing each field in said sequence of decoded fields to provide a sequence of stored fields;

wherein, each field in said sequence of video image fields which is bidirectionally predictively encoded uses data from one of said stored fields which is closest in position in said sequence of video image fields to the field being bidirectionally predictively encoded.

13. A method of automatically encoding a sequence of video image frames, each frame including first and second fields designated odd and even fields, the method comprising the steps of:

encoding the first field of a first frame in the sequence of video image frames using only video information in the first field to produce an I-field;

decoding the encoded I-field to produce a decoded I-field;

predictively encoding the second field of the first frame using only video information in the second field and in the decoded I-field to produce a p-field;

decoding the encoded p-field to produce a decoded p-field;

storing the decoded I-field and the decoded p-field; and

considering the video information in the stored I- and p-fields, which together represent the odd and even fields of the first frame, to produce a p-field of a subsequent frame.

14. Apparatus for automatically encoding a sequence of video image frames, each frame including first and second fields, the apparatus comprising:

means for encoding the first field of a first frame in the sequence of video image frames using only video information in the first field to produce an I-field;

means for decoding the encoded I-field to produce a decoded I-field;

means for predictively encoding the second field of the first frame using only video information in the second field and in the decoded I-field to produce a p-field;

means for decoding the encoded p-field to produce a decoded p-field;

means for storing the decoded I-field and the decoded p-field; and

means for considering the video information in the stored I- and p-fields to produce a next p-field.

15. A method of automatically encoding a sequence of video image frames, each frame including first and second fields designated odd and even fields, the method comprising the steps of:

encoding the first field of a first frame in the sequence of video image frames using video information in the first field exclusive of video information in any other field to produce an I-field;

decoding the encoded I-field to produce a decoded I-field;

predictively encoding the second field of the first frame using video information in the second field and in the decoded I-field exclusive of video information in any other field to produce a p-field;

decoding the encoded p-field to produce a decoded p-field;

storing the decoded I-field and p-field; and

considering the video information in the stored I- and p-fields, which together represent the odd and even fields of the first frame, to produce a p-field of a subsequent frame.

16. Apparatus for automatically encoding a sequence of video image frames, each frame including first and second fields, the apparatus comprising:

means for encoding the first field of a first frame in the sequence of video image frames using video information in the first field exclusive of video information in any other field to produce an I-field;

means for decoding the encoded I-field to produce a decoded I-field;

means for predictively encoding the second field of the first frame using video information in the second field and in the decoded I-field exclusive of video information in any other field to produce a p-field;

means for decoding the encoded p-field to produce a decoded p-field;

means for storing the decoded I-field and the decoded p-field; and

means for considering the video information in the stored I- and p-fields to produce a next p-field.

17. A method of automatically encoding a sequence of video image frames, each frame including first and second fields designated odd and even fields, the method comprising the steps of:

encoding the first field of a first frame in the sequence of video image frames using video information in the first field exclusive of video information in any other field to produce an I-field;

decoding the encoded I-field to produce a decoded I-field;

predictively encoding the second field of the first frame using video information in the second field and in the decoded I-field independently of video information in either field of a frame which immediately follows the first frame in the image sequence to produce a p-field;

decoding the encoded p-field to produce a decoded p-field;

storing the decoded I-field and the decoded p-field; and

considering the video information in the stored I- and p-fields, which together represent the odd and even fields of the first frame, to produce a p-field of a subsequent frame.

18. Apparatus for automatically encoding a sequence of video image frames, each frame including first and second fields, the apparatus comprising:

means for encoding the first field of a first frame in the sequence of video image frames using video information in the first field exclusive of video information in any other field to produce an I-field;

means for decoding the encoded I-field to produce a decoded I-field;

means for predictively encoding the second field of the first frame using video information in the second field and in the decoded I-field independently of video information in either field of a frame which immediately follows the first frame in the image sequence to produce a p-field;

means for decoding the encoded p-field to produce a decoded p-field;

means for storing the decoded I-field and the decoded p-field; and

means for considering the video information in the stored I- and p-fields to produce a next p-field.

1. A method for automatically encoding a sequence of video image fields comprising the steps of:

encoding each field in said sequence of image fields in a predetermined order to produce a sequence of encoded fields wherein a plurality of the fields in the sequence of video image fields are bidirectionally predictively encoded;

decoding each field in the sequence of encoded fields to produce a sequence of decoded fields; and

storing each field in said sequence of decoded fields to produce a sequence of stored fields;

wherein, each field in said sequence of video image fields which is bidirectionally predictively encoded is encoded using data from one of said stored fields which is closest in position in said sequence of video image fields to the field being bidirectionally predictively encoded.

2. The method of claim 1, wherein said sequence of video image fields are interleaved even and odd fields arranged so that each pair of even and odd fields forms a frame, and said method further includes the step of predictively encoding one of the odd and even fields of the one frame as a p-field when the other one of the odd and even fields in the one frame has been encoded as an I-field, using only information in the one other field.

3. A method for automatically encoding sequential fields of video information comprising the steps of:

encoding a first one of said sequential fields using only the video information in the first field to produce an I-field;

predictively encoding a second one of said sequential fields, separated from said first field by a plurality of field intervals, using the video information in the first and second fields to produce a p-field;

predictively encoding a third one of said sequential fields, occupying a position in the sequence between said first field and said second field, using the video information in the third field and in one of the first and second fields to produce a first B-field; and,

predictively encoding a fourth one of said sequential fields, occupying a position in the sequence between said first field and said third field, using the video information in the fourth field and in one of the first, second and third fields to produce a second B-field.

4. A method according to claim 3, wherein said sequential fields of video information are interleaved even and odd fields arranged so that each pair of even and odd fields forms a frame, and said method further includes the step of encoding one of the odd and even fields of one frame as a B-field when the other one of the odd and even fields of the one frame has been encoded as a p-field.

5. A method according to claim 4, wherein the encoded fields are arranged in the same sequence as said sequential fields and each p-field is separated from the next p-field by at least one B-field.

6. A method according to claim 4, wherein the encoded fields are arranged in the same sequence as said sequential fields and each I-field is separated from the next I-field by at least one B-field.

7. The method of claim 3, wherein said sequential fields of video information are interleaved even and odd fields arranged so that each pair of even and odd fields forms a frame, and said method further includes the step of encoding one of the odd and even fields of one frame as a B-field when the other one of the odd and even fields of the one frame has been encoded as an I-field.

8. A method for automatically encoding sequential interleaved even and odd fields of video information wherein each pair of even and odd fields forms a frame of video information, said method comprising the steps of:

encoding one of the even fields of video information predictively using only information in the one even field and in a predecessor field occurring earlier in the sequence; and

encoding the odd field in the same frame as the one even field, bidirectionally predictively using information in the odd field, information in the one even field and information in a successor field occurring later in the sequence than the odd field.

9. The method of claim 8, further including the step of encoding an even field which immediately follows said one even field and said odd field in the sequence bidirectionally predictively using information in the odd field, information in the one even field and information in the successor field.

10. A video data compression system which encodes sequential fields of video information comprising:

means for encoding a first one of said sequential fields using only the video information in the first field to produce an I-field;

means for predictively encoding a second one of said sequential fields, separated from said first field by a plurality of field intervals, using the video information in the first and second fields to produce a p-field;

means for predictively encoding a third one of said sequential fields, occupying a position in the sequence between said first field and said second field, using the video information in the third field and in one of the first and second fields to produce a first B-field; and

means for predictively encoding a fourth one of said sequential fields, occupying a position in the sequence between said second field and said third field, using the video information in the fourth field and in one of the first, second and third fields to produce a second B-field.

11. A video data compression system which encodes sequential interleaved even and odd fields of video information wherein each pair of even and odd fields forms a frame of video information, said system comprising:

means for encoding one of the even fields of video information predictively using only information in the one even field and in a predecessor field occurring earlier in the sequence; and

means for encoding the odd field in the same frame as the one even field, bidirectionally predictively using information in the odd field, information in the one even field and information in a successor field occurring later in the sequence than the odd field.

FIGS. 2-9 illustrate the various methods.

The multi-field memory 24 supplies a signal to an input port of multiplexer 22. This signal is used for processing B-fields. The multiplexer 22 is controlled by a signal MX1 to select either the signal from source 20 or one of the signals provided by the multi-field memory 24, to the plus input port of a subtracter 26. The minus input port of the subtracter 26 is coupled to a multiplexer 34 which may be controlled by a signal MX6 to provide either a zero value or the output signal A from a motion compensator circuit 36, as described below. The exemplary subtracter 26 is actually 256 eight-bit subtracters which are configured to simultaneously subtract four 8 by 8 pixel blocks provided by the motion compensator circuit 36 from four corresponding 8 by 8 pixel blocks provided by the multiplexer 22. The arrangement of four 8 by 8 blocks as one 16 by 16 pixel block is defined in the MPEG standard as a macroblock. In the exemplary embodiment of the invention, all motion compensation is performed on the basis of a macroblock.
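The subtraction performed by the subtracter 26 can be sketched in a few lines. This is only an illustration under stated assumptions: the names `residue`, `split_blocks`, `target`, `zeros` and `prediction` are hypothetical, the pixel values are made up, and the hardware performs the 256 subtractions in parallel rather than in a loop.

```python
MB = 16  # macroblock side: four 8 by 8 blocks arranged as one 16 by 16 block

def residue(target, prediction):
    """Subtract the predicted macroblock from the target, pixel by pixel."""
    return [[t - p for t, p in zip(t_row, p_row)]
            for t_row, p_row in zip(target, prediction)]

def split_blocks(macroblock):
    """Split a 16 by 16 macroblock into its four 8 by 8 blocks."""
    return [[row[c:c + 8] for row in macroblock[r:r + 8]]
            for r in (0, 8) for c in (0, 8)]

# Made-up pixel data for illustration only.
target = [[100 + (r + c) % 4 for c in range(MB)] for r in range(MB)]
zeros = [[0] * MB for _ in range(MB)]          # multiplexer 34 supplying zero
prediction = [[100] * MB for _ in range(MB)]   # a motion-compensated macroblock

intra = residue(target, zeros)        # I-field case: the pixels pass unchanged
inter = residue(target, prediction)   # P- or B-field case: a true residue
blocks = split_blocks(inter)          # four 8 by 8 blocks for later stages
```

With a zero-valued minus input the output equals the input macroblock, which is how the same data path serves both intra-coded and predictively coded fields.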

The motion compensator 36 receives five input signals: a Backward Motion vector (BMv) and a Forward Motion vector (FMv) from a motion estimator 32, the output signal of the multiplexer 22, and macroblocks of pixel data from one or two reconstructed fields which are held in a second multi-field memory 48. These macroblocks are provided via the multiplexers 50 and 52.

When the field being encoded is a B-field, the motion compensator 36 selects pixel values indicated by one or both of the signals FMv and BMv from one of the two macroblocks provided by the multi-field memory 48 or the average values of the two macroblocks. If the field being encoded is a P-field, the compensator selects the pixel values indicated by the signal FMv from the forward macroblock provided by the multi-field memory 48.

These pixel values are applied to the minus input port of the subtracter 26 while the corresponding pixel values from the field to be encoded are applied to the plus input port. The signal provided by the subtracter 26 is the predictive code residue of the input macroblock provided by the multiplexer 22. That is to say, the input macroblock minus the macroblock provided by the multiplexer 34.

The motion vectors, BMv and FMv, are provided by motion estimator circuitry 32 which receives respective input signals, each representing at least a macroblock of pixels, from multiplexer 22, multiplexer 28 and multiplexer 30. These multiplexers, in turn, are coupled to receive signals representing stored video data from the multi-field memory 24. The motion estimator 32 used in this embodiment of the invention simultaneously compares a macroblock of data provided by the multiplexer 22 with corresponding overlapping macroblocks of data from one or two fields held by the multi-field memory 24. The exemplary motion estimator 32 is a high-performance processor which simultaneously compares a target macroblock of 16 by 16 pixels, provided by the multiplexer 22, with 256 overlapping 16 by 16 macroblocks of pixels provided from a single field. A motion estimator suitable for use as the estimator 32 may be constructed from multiple conventional motion estimation chips, for example, the ST-13-220 integrated circuit available from SGS Thomson semiconductors. Each macroblock of pixels processed by the motion estimator 32 represents a possible displacement of the target macroblock of pixels in the previous field. The 256 overlapping macroblocks of pixels define a 48 by 48 pixel block in the anchor field which is centered about the position of the target block and which defines the area that is processed to find a reference macroblock. The macroblock in the search area having, for example, the smallest difference with respect to the target macroblock is selected as the reference to be used to predict the target macroblock.
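The search described above can be illustrated with a sequential full-search block matcher. This is a minimal sketch, not the estimator 32 itself: the function names, the sum-of-absolute-differences cost, the small 4 by 4 block and the synthetic anchor field are all assumptions chosen to keep the example short, and the candidates are visited one at a time rather than simultaneously.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def block_at(field, top, left, size):
    """Cut a size by size block out of a field stored as a list of rows."""
    return [row[left:left + size] for row in field[top:top + size]]

def full_search(target, anchor, top, left, size, reach):
    """Return (dy, dx, cost) of the best match within +/- reach pixels."""
    best = None
    height, width = len(anchor), len(anchor[0])
    for dy in range(-reach, reach + 1):
        for dx in range(-reach, reach + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate falls outside the anchor field
            cost = sad(target, block_at(anchor, y, x, size))
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best

# Synthetic anchor field; the target is a copy of the block displaced by (2, -3).
anchor = [[(r * 37 + c * 101) % 251 for c in range(24)] for r in range(24)]
target = block_at(anchor, 12, 7, 4)
best = full_search(target, anchor, top=10, left=10, size=4, reach=4)
```

Here `best` holds the displacement with the smallest difference, which plays the role of the motion vector for the target block.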

The output signal of subtracter 26 is either a macroblock of pixels from an I-field or a macroblock of residue pixels which represents either the difference between a P-field and its anchor I-field or the difference between a B-field and one or both of its anchor I and P-fields.

The next step in the process is to diagonally (i.e. zigzag) scan each of the four blocks within the macroblock and to transform the diagonally scanned data into DCT coefficients using a Discrete Cosine Transform processor 38. In the exemplary embodiment of the invention, the DCT processor 38 is able to simultaneously process the four blocks of data that make up the macroblock provided by the subtracter 26 to produce four sets of DCT data. Once transformed, the DCT coefficients are quantized in parallel by quantizer 40.
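The two operations named above can be sketched independently for a single 8 by 8 block. The sketch assumes the orthonormal DCT-II and the usual anti-diagonal zigzag order, and it applies the scan order to coefficient positions, which is the arrangement used in MPEG bitstreams; the function names are hypothetical and the loop-based transform is far slower than the parallel processor 38.

```python
import math

N = 8  # block side in pixels

def dct2(block):
    """Two-dimensional orthonormal DCT-II of an N by N block."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = sum(block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                        for x in range(N) for y in range(N))
            out[u][v] = scale(u) * scale(v) * total
    return out

def zigzag_order(n=N):
    """Index pairs along the anti-diagonals, alternating direction."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else reversed(diag))
    return order

flat = [[100] * N for _ in range(N)]   # a block with no spatial detail
coeffs = dct2(flat)                    # all energy lands in coeffs[0][0]
scanned = [coeffs[u][v] for u, v in zigzag_order()]
```

For a flat block only the first scanned value is non-zero, which is why the scan tends to group the significant coefficients at the start of the list.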

The quantizer 40 assigns differing numbers of bits (i.e. uses a different quantization resolution) to represent the magnitude of each of the DCT coefficients, based in part on how people see video information at the frequency represented by the DCT coefficient. Since people are more sensitive to the quantization of image data at low spatial frequencies than to the quantization of data at high spatial frequencies, the coefficients representing the high spatial frequencies may be quantized more coarsely than the coefficients that represent low spatial frequencies.
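A frequency-dependent quantizer of this kind might look like the following sketch. The step rule `base + slope * (u + v)` and its constants are hypothetical and are not taken from the MPEG quantization matrices; they only illustrate that the step size grows with spatial frequency.

```python
def step(u, v, base=8, slope=2):
    """Hypothetical step size that grows with the frequency index u + v."""
    return base + slope * (u + v)

def quantize(coeffs, scale=1):
    """Map each coefficient to an integer level; larger scale means coarser."""
    return [[round(c / (step(u, v) * scale)) for v, c in enumerate(row)]
            for u, row in enumerate(coeffs)]

def dequantize(levels, scale=1):
    """Recover approximate coefficients from the integer levels."""
    return [[q * step(u, v) * scale for v, q in enumerate(row)]
            for u, row in enumerate(levels)]

coeffs = [[50] * 8 for _ in range(8)]   # made-up coefficients, all equal
levels = quantize(coeffs)
recon = dequantize(levels)
low_error = abs(coeffs[0][0] - recon[0][0])    # lowest spatial frequency
high_error = abs(coeffs[7][7] - recon[7][7])   # highest spatial frequency
```

With equal input coefficients the reconstruction error is larger at the high-frequency corner, which matches the coarser treatment described above.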

The output signal of quantizer 40 is applied to Variable Length Coder (VLC) 54 as well as to an inverse quantizer 42. The VLC 54 encodes the quantized DCT coefficients by their amplitudes, at least one of the forward and backward motion vectors (FMv and BMv) and a mode signal provided by the motion compensator 36 for each block. The VLC 54 applies both run-length encoding and a variable-length code, such as a Huffman code, to the block data. The data provided by the VLC 54 is then stored in a first-in-first-out (FIFO) memory device 56 that buffers the data, which may be supplied at varying rates, for transmission to a receiver through a signal conveyor 58.
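The run-length stage can be sketched as follows. This illustration covers only the conversion of a scanned coefficient list into (zero run, level) pairs; the `"EOB"` end marker and the function names are assumptions, and the variable-length (for example Huffman) code that would then be assigned to each pair is omitted.

```python
def run_length_encode(levels):
    """Encode a list of quantized levels as (zero_run, level) pairs."""
    pairs, run = [], 0
    for level in levels:
        if level == 0:
            run += 1                  # extend the current run of zeros
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")               # trailing zeros are implied by the marker
    return pairs

def run_length_decode(pairs, length):
    """Rebuild the list of levels, padding with zeros up to length."""
    out = []
    for item in pairs:
        if item == "EOB":
            break
        run, level = item
        out.extend([0] * run)
        out.append(level)
    out.extend([0] * (length - len(out)))
    return out

levels = [5, 0, 0, -2, 1, 0, 0, 0]    # made-up scanned, quantized levels
pairs = run_length_encode(levels)
restored = run_length_decode(pairs, len(levels))
```

Because coarse quantization drives many high-frequency levels to zero, long zero runs collapse into a few pairs.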

To ensure that the average rate at which data is encoded matches the transmission rate, the FIFO 56 is coupled to a buffer control circuit 60. The circuit 60 monitors the amount of data in the FIFO 56 to change the size of the quantization steps applied by the quantizer 40. If the amount of data in the FIFO 56 is relatively low, then the quantization steps may be relatively fine, reducing any quantization related errors in the decoded video signal. If, however, the FIFO 56 is almost at its capacity, the buffer control 60 conditions the quantizer 40 to coarsely quantize the DCT coefficients, thus reducing the volume of data used to represent an image.
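The feedback from the FIFO 56 to the quantizer 40 can be sketched as a mapping from buffer fullness to a quantizer scale. The linear rule, the scale range of 1 to 31 and the function name are assumptions made for illustration; the source does not specify how the buffer control circuit 60 computes the step size.

```python
def quantizer_scale(fifo_bytes, fifo_capacity, min_scale=1, max_scale=31):
    """Hypothetical rule: a fuller buffer gives a coarser quantizer scale."""
    fullness = fifo_bytes / fifo_capacity
    return min_scale + round(fullness * (max_scale - min_scale))

nearly_empty = quantizer_scale(0, 1000)     # fine quantization steps
half_full = quantizer_scale(500, 1000)
at_capacity = quantizer_scale(1000, 1000)   # coarsest quantization steps
```

Any monotonic rule of this kind gives the behavior described above: fine steps when the buffer is nearly empty and coarse steps as it approaches capacity.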

As described above, the quantized DCT coefficients from the quantizer 40 are also applied to an inverse quantizer circuit 42. This circuit reverses the process performed by the quantizer to recover the DCT coefficients with the precision of the assigned quantization resolution. Once the signal has been dequantized, it is subject to an Inverse Discrete Cosine Transform operation (IDCT) as represented by element 44. This element reverses the process performed by the DCT element 38 to recover macroblocks of image data from the quantized data stream.

If an I-field is being encoded, the data provided by the IDCT circuit 44 represents macroblocks of the signal as it would be reconstructed at the receiver. This signal is summed with zero-valued pixels, as provided by the multiplexer 34, in an adder 46 and stored in the second multi-field memory 48.

If, however, a P or B-field is being encoded, the output signal provided by the IDCT circuit 44 is added, by adder 46, to the selected macroblock of pixels from the anchor field (provided by the multiplexer 34) to produce a reconstructed macroblock of pixels. This macroblock is then stored in the multi-field memory 48 as a portion of a reconstructed version of the P or B-field which is being encoded. As described below, the reconstructed fields of pixels stored in the multi-field memory 48 may be used by the motion compensator 36 and subtracter 26 to generate the residue data for predictively encoding other P and B-fields.
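The reason for decoding inside the encoder can be seen in a small sketch of the loop formed by elements 42, 44 and 46. Here a plain scalar quantizer stands in for the whole DCT, quantizer, inverse quantizer and IDCT chain; that simplification, the 2 by 2 blocks and the names used are assumptions for illustration only.

```python
def code_residue(residue, q):
    """Stand-in for DCT, quantizer 40, inverse quantizer 42 and IDCT 44."""
    return [[q * round(v / q) for v in row] for row in residue]

def reconstruct(target, anchor, q):
    """Return the block as the receiver would rebuild it."""
    if anchor is None:   # I-field: multiplexer 34 supplies zero-valued pixels
        anchor = [[0] * len(target[0]) for _ in target]
    residue = [[t - a for t, a in zip(tr, ar)] for tr, ar in zip(target, anchor)]
    decoded = code_residue(residue, q)
    # Adder 46: decoded residue plus the anchor pixels.
    return [[a + d for a, d in zip(ar, dr)] for ar, dr in zip(anchor, decoded)]

def max_error(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

q = 4
i_target = [[103, 98], [101, 110]]         # made-up I-field pixels
p_target = [[105, 99], [100, 111]]         # made-up P-field pixels
recon_i = reconstruct(i_target, None, q)   # stored in multi-field memory 48
recon_p = reconstruct(p_target, recon_i, q)
```

Because the P-field residue is formed against `recon_i` rather than against the original I-field pixels, the encoder and the receiver predict from identical data and the quantization error stays within half a step instead of accumulating from field to field.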

Turning to the methods of field processing, FIGS. 2 through 9 show exemplary group-of-field configurations for a field-oriented MPEG system. Prior to describing these configurations, a discussion of the notation used in these figures is in order. The vertical lines in these figures represent successive fields of a video signal. The solid lines represent even fields and the broken lines represent odd fields. The letter above each line describes the type of field (i.e. either I, P or B) with the subscript indicating the number of the field in the represented sequence. I and P-field designators are surrounded by squares and circles, respectively.

Fields marked with a square are intra-coded. These fields are encoded using only data in the field. A field to which an arrow points, a target field, is predictively coded. The order in which the fields are encoded is indicated by the vertical position of the arrow or the square. A dot is placed on the arrow where it crosses a field to indicate that the dotted field may be used to predictively code the target field. An arrow having dots on two fields indicates that the target field may be predictively coded using either of the dotted fields as an anchor field. Only one field is selected for prediction, however, based on some measure of difference between the anchor field and the target field. Exemplary difference measures include the absolute magnitude of the differences between the anchor and target fields, and the mean squared magnitude of the differences between the anchor field and the target field.
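The two exemplary difference measures, and the selection of a single anchor from two dotted fields, can be sketched as follows. The function names and the tiny 2 by 2 pixel arrays are hypothetical; the labels `I0` and `I1` simply stand for two candidate anchor fields.

```python
def abs_difference(a, b):
    """Sum of the absolute pixel differences between two blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def mean_squared_difference(a, b):
    """Mean of the squared pixel differences between two blocks."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def pick_anchor(target, candidates, measure=abs_difference):
    """Return the name of the candidate with the smallest difference."""
    return min(candidates, key=lambda name: measure(target, candidates[name]))

target = [[10, 12], [14, 16]]
candidates = {
    "I0": [[10, 12], [14, 20]],   # one large error
    "I1": [[12, 14], [15, 17]],   # several small errors
}
by_abs = pick_anchor(target, candidates)
by_mse = pick_anchor(target, candidates, mean_squared_difference)
```

In this made-up case the two measures disagree: the absolute measure prefers `I0`, while the mean squared measure penalizes its single large error and prefers `I1`.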

A field to which two arrows point is a bidirectionally coded field. In a standard MPEG system, a B-field is coded using a preceding field, a following field or an average of the preceding and following fields as the anchor field. B-fields which have two dotted fields on each arrow indicate that two preceding and two following fields are compared to determine which preceding field and which following field have the smallest measure of difference with the target field. This determination is made on a macroblock basis by the motion estimator 32 which produces the motion vector output signals FMv and BMv. The determined preceding and following fields are then processed according to the MPEG method to predictively encode the target field.

FIGS. 2 and 3 show two commonly used group-of-fields configurations for field-oriented MPEG systems. In these configurations, there are two I-fields, eight P-fields and twenty B-fields in a one-half second interval. Using these configurations, the predictive coding is refreshed at one-half second intervals.

FIG. 2 shows the generation, without prediction, of fields I₀ and I₁, the even and odd fields of the I-frame, respectively. As these fields are encoded, the pixel values from the source 20 are stored into respective field stores in the multi-field memory 24 while pixel values representing reconstructed versions of the image data are stored in the multi-field memory 48.

Next, image data from source 20 which will be encoded as the fields B₂ through B₅ is stored in respectively different field stores of the multi-field memory 24. Then, as represented by the dots and arrows, the fields I₀ and I₁ are used to successively predict the even and odd P-fields, P₆ and P₇, as they are provided by the source 20 and stored into the multi-field memory 24. To calculate the motion vectors for field P₆, for example, the control circuitry 21 conditions the multiplexers 28 and 30 to provide blocks of pixels from field I₀.

The exemplary embodiment of the invention uses a search area of 32 by 32 pixels from the anchor field to locate possible reference macroblocks for a field that is displaced by one frame interval (i.e. two field intervals) from the anchor. Since the search area is referenced to the center pixel of the macroblock, pixels from the reference field which may be used to calculate the residue and, thus, the motion vectors are defined by a 48 by 48 pixel block (i.e. 8+32+8 by 8+32+8).

In the exemplary sequence shown in FIG. 2, each P-field is separated from its anchor I-field by three frame intervals. Thus, the search area for the motion vectors defines a 96 by 96 block of pixels and, to calculate motion vectors for this sequence which cover the same range of motions as is covered by a single frame vector, a block of 12544 pixels (8+96+8=112 by 112) from the anchor field would be required. This scheme would use a relatively large data path and a motion estimator 32 that could simultaneously process a very large number of combinations to achieve equivalent performance to the single frame motion estimation.
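The arithmetic above follows one pattern: the search area grows by 32 pixels per frame interval and half a macroblock (8 pixels) is added on each side. A short sketch, with hypothetical names, reproduces the figures given in the text:

```python
SEARCH_PER_FRAME = 32   # search-area side for one frame interval, per the text
HALF_MACROBLOCK = 8     # half of a 16 by 16 macroblock

def search_side(frame_intervals):
    """Side of the search area for a given anchor distance in frame intervals."""
    return SEARCH_PER_FRAME * frame_intervals

def reference_block_side(frame_intervals):
    """Side of the pixel block that must be fetched from the anchor field."""
    return HALF_MACROBLOCK + search_side(frame_intervals) + HALF_MACROBLOCK

one_frame = reference_block_side(1)       # 8 + 32 + 8
three_frames = reference_block_side(3)    # 8 + 96 + 8
pixels_needed = three_frames ** 2
```

The same rule gives an 80 by 80 block for an anchor two frame intervals away.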

Alternatively, the motion vector may be calculated in steps using a number of methods collectively known as telescoping. By these methods, the motion vector from I₀ to P₆ would be calculated in steps, using the intervening field data in the multi-field memory 24. In an exemplary telescoping scheme, the motion vector from B₄ to P₆ would be calculated and recorded, next, the motion vector from B₂ to B₄ would be calculated and recorded, and finally, the motion vector from I₀ to B₂ would be calculated. All motion vectors are calculated based on a 32 by 32 pixel search area. The equivalent motion vector from I₀ to P₆ may be determined by summing the final vector with the recorded intermediate vectors. This method uses a smaller data path from the multi-field memory 24 to the motion estimator 32, but uses more time to calculate the motion vector since it involves a sequence of steps. Some of this time may be recovered by using pipeline processing to calculate the motion vectors and/or by saving the intermediate motion vectors for use when the motion vectors for the B-fields are calculated.
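The summing step of the telescoping scheme can be sketched directly. The three step vectors below are made-up values standing in for the results of the three one-frame searches; the function name and the (dy, dx) tuple layout are assumptions.

```python
def telescope(step_vectors):
    """Sum per-step (dy, dx) motion vectors into one equivalent vector."""
    dy = sum(v[0] for v in step_vectors)
    dx = sum(v[1] for v in step_vectors)
    return (dy, dx)

# Hypothetical results of the three searches, in the order they are computed.
b4_to_p6 = (1, -2)
b2_to_b4 = (2, -1)
i0_to_b2 = (1, -3)

i0_to_p6 = telescope([b4_to_p6, b2_to_b4, i0_to_b2])
```

Each step stays within the one-frame search range, while the summed vector can describe a displacement up to three times as large.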

As each of the P-fields is encoded, a reconstructed version of the field is stored in the multi-field memory 48. Once the reconstructed even and odd I-fields and the even and odd P-fields have been stored in the multi-field memory 48, the intermediate even and odd B-fields (i.e. B₂, B₃, B₄ and B₅ which are held in the multi-field memory 24) can be predicted using the fields I₀, I₁, P₆ and P₇ as anchor fields. In the exemplary group-of-fields sequence shown in FIG. 2, B₂ and B₄ are predicted from I₀ and P₆ while B₃ and B₅ are predicted from I₁ and P₇.

The circuitry shown in FIG. 1 encodes these fields as follows. The control circuitry 21, via signal M1C, conditions the field memory 24 to provide the stored data for field B₂, one macroblock at a time, to the multiplexer 22. At the same time, the circuitry 21 uses the signals MX2 and MX3 to provide corresponding 48 by 48 pixel blocks from each of the fields I₀ and P₆ to the motion estimator 32 via the multiplexers. Motion vectors from I₀ to B₂ may be calculated by the motion estimator 32 in one step. Motion vectors from B₂ to P₆, on the other hand, may be calculated by at least two methods. First, an 80 by 80 pixel block may be provided to the motion estimator by the multi-field memory 24 and the motion vector may be calculated using conventional methods, over this larger block. Second, the motion vector may be calculated by any one of a number of well known telescoping techniques.

The circuitry 21 uses the signal MX1 to condition the multiplexer 22 to apply the B₂ macroblocks to the subtracter 26, to the motion estimator 32 and to the motion compensator 36. The motion estimator 32 uses the data from fields I₀ and P₆ to calculate the best backward and forward motion vectors (BMv and FMv) for the macroblock that is currently being processed from field B₂.

The motion vectors BMv and FMv are applied to the motion compensator 36 and to the control circuitry 21. Based on these vectors, the circuitry 21 conditions the multi-field memory 48 and the multiplexers 50 and 52 to apply the indicated macroblocks to the motion compensator 36. The motion compensator 36 calculates three residue values, one for forward motion, using I₀ as the anchor field; one for backward motion, using P₆ as the anchor field; and one in which the anchor field is the average of the anchor macroblocks from I₀ and P₆.

Of these three residues, one is selected as the best based on a measure of the entropy of the residue. Exemplary measures include the absolute difference and the mean squared difference between the anchor and target macroblocks. The macroblock which produces the best residue is applied, by the motion compensator 36, to the subtracter 26 via the multiplexer 34. As described above, subtracter 26 generates the residue and applies it to the DCT circuitry 38 and to the quantization circuitry 40 which encodes it. The encoded data is then combined with the motion vectors BMv and FMv provided by the motion estimator 32 in a variable length coder 54. The signal provided by the coder 54 is transmitted by the signal conveyor 58 to a remote destination.
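The three-way choice made by the motion compensator 36 can be sketched as follows. The absolute difference is used as the selection measure, as the text permits; the rounding used when averaging the two anchors, the function names and the 2 by 2 pixel values are assumptions for illustration.

```python
def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def average(a, b):
    """Pixel-wise average of two anchor blocks (rounding up is an assumption)."""
    return [[(x + y + 1) // 2 for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def best_mode(target, forward, backward):
    """Pick the prediction with the smallest difference; return mode and residue."""
    candidates = {
        "forward": forward,
        "backward": backward,
        "average": average(forward, backward),
    }
    mode = min(candidates, key=lambda m: sad(target, candidates[m]))
    prediction = candidates[mode]
    residue = [[t - p for t, p in zip(tr, pr)]
               for tr, pr in zip(target, prediction)]
    return mode, residue

target = [[10, 20], [30, 40]]      # made-up pixels from the field being coded
forward = [[8, 18], [28, 38]]      # one anchor block, slightly darker
backward = [[12, 22], [32, 42]]    # the other anchor block, slightly brighter
mode, residue = best_mode(target, forward, backward)
```

In this made-up case the target lies midway between the two anchors, so the averaged prediction leaves an all-zero residue and is selected.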

In the sequence shown in FIG. 2, the B₂-B₅ fields are not used to generate any other fields. Accordingly, they are not decoded and stored in the multi-field memory 48.

FIG. 3 shows a method which is similar to that shown in FIG. 2 except that, instead of the corresponding odd and even I and P-fields being used to predict other odd and even fields, either the odd I and P-fields or the even I and P-fields may be used as an anchor to predict an individual odd or even predictive field. In the field sequence of FIG. 3, I₀ and I₁ are created the same way as in FIG. 2; however, P₆ is now predicted based on the minimum difference value (or some other criterion) developed using I₀ or I₁ as the anchor field. The same is true for P₇: both the odd and even fields of the I frame, I₀ and I₁, are used to predict the odd field, P₇. Finally, the individual B-fields, odd or even, are predicted using both fields of each of the I and P-frames: B₂, B₃, B₄ and B₅ are all predicted using the best match obtained from I₀, I₁, P₆ and P₇ or from combinations of one of the I-fields and one of the P-fields.

The above configurations derive directly from the frame-oriented MPEG system. The present invention, as illustrated by the group-of-field configurations described below, differs from these schemes by taking advantage of the field-oriented MPEG system to decrease the prediction time interval and the predictive refresh time. In addition, these schemes reduce the number of bits used to convey the image by substituting P-fields for I-fields and B-fields for P-fields where appropriate. So, progressing from the traditional methods of field processing, FIGS. 4 through 9 show new and better group-of-field configurations for image processing.

FIG. 4 shows a configuration which can be characterized as using the available closer fields to do the predictions. As in the configurations shown in FIGS. 2 and 3, I₀ and I₁ are encoded using intrafield processing. Then, these are used as the anchor frame to predict the even field of the next anchor frame, P₆. To predict the odd field, P₇, of the next anchor frame however, I₁ and P₆ are used and not I₀ and I₁. The use of field P₆, instead of field I₀, to predict field P₇ reduces the prediction time span from 7 field intervals to 1 field interval. Thus, it is likely that the prediction of P₇ based on I₁ and P₆ will produce a residue signal that can be encoded in fewer bits than the prediction of P₇ based on I₀ and I₁.

Similarly, this method is applied to the bidirectional B-field prediction. Field B₂ is predicted as shown above in FIG. 3 as the minimum residue of I₀, I₁, P₆ and P₇ or as the residue of the average of one of the I-fields and one of the P-fields if that residue is smaller. Field B₅, however, is calculated as the minimum residue of the fields I₁, B₂, P₆ and P₇. Similarly, B₃ is calculated from fields I₁, B₂, B₅ and P₆ and B₄ is calculated from fields B₂, B₃, B₅ and P₆. In order to avoid the error propagation among B-fields, the use of B-fields for predicting other B-fields is restricted to be within the boundaries of the anchor frames on either side of the B-fields.

The processing of the I₀, I₁, P₆ and P₇ fields is essentially the same as outlined above with reference to FIGS. 2 and 3. The processing for field B₂, however, is different; since this field is later used to predict fields B₃, B₄ and B₅, B₂ is reconstructed and stored in the multi-field memory 48. In addition, the processing of the fields B₃, B₄ and B₅ is different since these fields are encoded with reference to reconstructed B-fields. These fields are also encoded in a different order: B₂, B₅, B₃ and B₄ instead of B₂, B₃, B₄ and B₅. Since the anchor B-field is often the closest in time to the field that is being encoded, it is likely that it will provide better motion compensation than the other anchor field. The inventor has determined that this method significantly reduces the number of bits needed to encode a sequence of video fields compared to the methods described above with reference to FIGS. 2 and 3.
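The need for the changed encoding order can be checked mechanically. The following is a minimal sketch that transcribes the FIG. 4 reference lists stated above into a table and verifies that every field's references have been encoded before the field itself; the names `FIG4_REFS` and `first_unmet_reference` are hypothetical.

```python
# Reference fields for each field of FIG. 4, as listed in the text.
# I0 and I1 are intra-coded and therefore have no references.
FIG4_REFS = {
    "I0": [],
    "I1": [],
    "P6": ["I0", "I1"],
    "P7": ["I1", "P6"],
    "B2": ["I0", "I1", "P6", "P7"],
    "B5": ["I1", "B2", "P6", "P7"],
    "B3": ["I1", "B2", "B5", "P6"],
    "B4": ["B2", "B3", "B5", "P6"],
}


def first_unmet_reference(order, refs):
    """Return (field, missing_reference) for the first field whose
    reference is not yet encoded, or None if the order is valid."""
    available = set()
    for field in order:
        for ref in refs[field]:
            if ref not in available:
                return field, ref
        available.add(field)
    return None


reordered = ["I0", "I1", "P6", "P7", "B2", "B5", "B3", "B4"]
sequential = ["I0", "I1", "P6", "P7", "B2", "B3", "B4", "B5"]

reordered_problem = first_unmet_reference(reordered, FIG4_REFS)
sequential_problem = first_unmet_reference(sequential, FIG4_REFS)
```

With these tables, the order B₂, B₅, B₃, B₄ has no unmet reference, whereas the order B₂, B₃, B₄, B₅ fails at B₃ because B₅ has not yet been encoded.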

In the system shown in FIG. 4, fields B₂, B₃ and B₅ are stored in the memory 48 while none of the B-fields are stored when the group of fields configuration shown in FIGS. 2 and 3 is used. In the configuration shown in FIG. 4, however, field B₂ may overwrite field I₀ and field B₃ may overwrite field I₁. Consequently, only one additional field of storage is used for the configuration shown in FIG. 4 compared to those shown in FIGS. 2 and 3.

Another variation on the methods shown in FIGS. 2 and 3 which has produced a significant increase in video data compression is to distribute the I-fields and the P-fields among the B-fields. FIG. 5 shows an exemplary group of fields configuration in which the P-fields are not grouped in P-frames, as in FIGS. 2, 3 and 4, but occur as single fields separated by intervening B-fields.

The increase in data compression achieved by using this scheme results from a reduction in the prediction time span relative to the configurations shown in FIGS. 2 and 3. In the group-of-fields configuration of FIG. 5, the first predictive field is P₄, the fourth field rather than the sixth field. Thus, the time span for the prediction is three or four field intervals rather than five or six as in the configuration shown in FIG. 3. Furthermore, the second predictive field, P₇, is generated either from field I₁ or from field P₄ depending on which has the smaller residue. As described above, for images of moving objects, especially if the objects do not move by simple translation, the prediction of P₇ based on P₄ will generally produce a smaller residue than the prediction based on I₀.

In addition to shortening the time span over which P-fields are predicted, the configuration shown in FIG. 5 also reduces the time span over which B-fields are predicted. As shown in FIG. 5, fields B₂ and B₃ are predicted from fields I₀, I₁, P₄ and P₇, while fields B₅ and B₆ are predicted from fields I₁, P₄, P₇ and P₁₀.

FIG. 6 shows a configuration in which both the P-fields and I-fields are distributed among the B-fields. In addition to reducing the time span over which P-fields and B-fields are predicted, this scheme refreshes the prediction more frequently and, so, reduces the visibility of any errors that may occur in the prediction process.

In FIG. 6, field P₃ is predicted from field I₀, field P₆ is predicted from I₀ and P₃ and field P₉ is predicted from P₃ and P₆. Fields B₁ and B₂ are predicted from fields I₀, P₃ and P₆, while fields B₄ and B₅ are predicted from fields I₀, P₃, P₆ and P₉. Each of the B-fields may be predicted over a time span of one field interval while each of the P-fields may be predicted over a time span of three field intervals.
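The time spans quoted for FIG. 6 follow directly from the field positions. As a minimal sketch, the shortest available span for each predicted field can be computed from the reference lists given above, taking the subscript of each field as its position in the sequence; `FIG6_REFS`, `position` and `shortest_span` are hypothetical names.

```python
# Reference fields for each predicted field of FIG. 6, as listed in the text.
FIG6_REFS = {
    "P3": ["I0"],
    "P6": ["I0", "P3"],
    "P9": ["P3", "P6"],
    "B1": ["I0", "P3", "P6"],
    "B2": ["I0", "P3", "P6"],
    "B4": ["I0", "P3", "P6", "P9"],
    "B5": ["I0", "P3", "P6", "P9"],
}


def position(name):
    """Field position in the sequence, taken from the numeric subscript."""
    return int(name[1:])


def shortest_span(field, refs):
    """Smallest distance, in field intervals, to any reference of `field`."""
    return min(abs(position(field) - position(ref)) for ref in refs[field])


spans = {field: shortest_span(field, FIG6_REFS) for field in FIG6_REFS}
```

Evaluating `spans` gives one field interval for every B-field and three field intervals for every P-field, which agrees with the figures stated in the text.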

If the prediction refresh time is kept the same as in the sequences shown in FIGS. 2 and 3, the number of I-fields can be decreased by one-half. This results in fewer bits on the average being used to encode a group of fields.

FIG. 7 illustrates a group-of-fields configuration in which P-fields and I-fields are distributed among the B-fields and the closer available field is used to predict B-fields. The processing of the first six fields for this group-of-fields configuration is the same as for the configuration shown in FIG. 6 except for field B₅. In the scheme shown in FIG. 7, this field is predicted from fields P₃, B₄, P₆ and P₉ while in the scheme shown in FIG. 6 it was predicted from fields I₀, P₃, P₆ and P₉. This reduction in the predictive time span for one of the anchor fields from three field intervals to one field interval increases the likelihood of producing a predictive residue that has a relatively small average magnitude.

The group-of-fields configuration shown in FIG. 8 reduces the number of I-fields and P-fields used to represent the image, and at the same time, uses the closer available I, P or B-field to predict each B-field. This scheme reduces the total number of bits needed to encode the image since, in general, P-fields use fewer bits than I-fields and B-fields use fewer bits than P-fields.

In the configuration shown in FIG. 8, field P₁ is predicted from field I₀ and field P₇ is predicted from I₀ and P₁. Field B₆ is predicted from three fields, I₀, P₁ and P₇, while B₂ is predicted from four fields, I₀, P₁, B₆ and P₇. Field B₅ is predicted using fields P₁, B₂, B₆ and P₇. Field B₅ is then used along with fields B₆, P₁ and B₂ to predict field B₃. Finally, field B₄ is predicted entirely from B-fields: B₂, B₃, B₅ and B₆.
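Because each field of FIG. 8 depends on previously reconstructed fields, the reference lists above constrain the order in which the fields can be encoded. The sketch below derives one such order with a topological sort using `graphlib` from the Python standard library (Python 3.9 or later). The order is inferred here from the stated references rather than quoted from the specification, and `FIG8_REFS` is a hypothetical name.

```python
from graphlib import TopologicalSorter

# Each field maps to the set of fields it is predicted from (FIG. 8).
FIG8_REFS = {
    "I0": set(),
    "P1": {"I0"},
    "P7": {"I0", "P1"},
    "B6": {"I0", "P1", "P7"},
    "B2": {"I0", "P1", "B6", "P7"},
    "B5": {"P1", "B2", "B6", "P7"},
    "B3": {"P1", "B2", "B5", "B6"},
    "B4": {"B2", "B3", "B5", "B6"},
}

# static_order() yields every field only after all of its references.
encode_order = list(TopologicalSorter(FIG8_REFS).static_order())
```

For this table the dependencies form a single chain, so the only order that satisfies them is I₀, P₁, P₇, B₆, B₂, B₅, B₃, B₄.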

In this configuration, predictive field P₁ is used in the same manner as the intra field I₁ was used in the configuration shown in FIG. 4 while the field B₆ is used in the same manner as field P₆ in FIG. 4.

A final group-of-fields configuration is shown in FIG. 9. This configuration is an extension of that shown in FIG. 8. Instead of substituting P-fields for I-fields, the configuration shown in FIG. 9 substitutes B-fields for I-fields. This scheme achieves a lower average bit-rate than the scheme shown in FIG. 8 since, on the average, fewer bits are used to encode a B-field than are used to encode a P-field.

In addition to the group-of-fields configurations shown in FIGS. 4-9, it is contemplated that other configurations based on other combinations of the described techniques may be used to efficiently encode images. Furthermore, it is contemplated that several of these group-of-fields configurations could be used to encode a single image sequence by adding a code at the start of a sequence to define the group-of-fields configuration to the receiver. A particular group-of-fields configuration may be automatically selected by an image signal preprocessor, for example, based on the amount and type of motion in an image or upon the level of detail in the image.
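One way such a code might be carried is sketched below. The specification does not define the code's width, values or position beyond "the start of a sequence", so the one-byte header, the numeric values and the names `GofConfig`, `add_header` and `read_header` are all assumptions made for illustration.

```python
from enum import IntEnum


class GofConfig(IntEnum):
    """Hypothetical code values, one per group-of-fields configuration."""
    FIG4 = 4
    FIG5 = 5
    FIG6 = 6
    FIG7 = 7
    FIG8 = 8
    FIG9 = 9


def add_header(config, payload):
    """Prefix the encoded sequence with an assumed one-byte configuration code."""
    return bytes([config]) + payload


def read_header(stream):
    """Split a received stream into (configuration, encoded sequence)."""
    return GofConfig(stream[0]), stream[1:]


stream = add_header(GofConfig.FIG8, b"\x01\x02")
config, payload = read_header(stream)
```

A receiver that reads the code first can then choose the matching prediction structure before decoding the fields that follow.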

INVENTORS:

Iu, Siu L.

ASSIGNMENT RECORDS Assignment records on the USPTO

Executed on	Assignor	Assignee	Conveyance	Frame	Reel	Doc
Oct 21 1997		Matsushita Electric Corporation of America	(assignment on the face of the patent)
Nov 23 2004	Matsushita Electric Corporation of America	Panasonic Corporation of North America	MERGER SEE DOCUMENT FOR DETAILS	016237	0751	pdf

MAINTENANCE FEES AND DATES: Maintenance records on the USPTO

Date	Maintenance Fee Events
Aug 16 2001	M184: Payment of Maintenance Fee, 8th Year, Large Entity.
Jun 30 2005	M1553: Payment of Maintenance Fee, 12th Year, Large Entity.

Date	Maintenance Schedule
Jan 18 2003	4 years fee payment window open
Jul 18 2003	6 months grace period start (w surcharge)
Jan 18 2004	patent expiry (for year 4)
Jan 18 2006	2 years to revive unintentionally abandoned end. (for year 4)
Jan 18 2007	8 years fee payment window open
Jul 18 2007	6 months grace period start (w surcharge)
Jan 18 2008	patent expiry (for year 8)
Jan 18 2010	2 years to revive unintentionally abandoned end. (for year 8)
Jan 18 2011	12 years fee payment window open
Jul 18 2011	6 months grace period start (w surcharge)
Jan 18 2012	patent expiry (for year 12)
Jan 18 2014	2 years to revive unintentionally abandoned end. (for year 12)