Methods for motion estimation with adaptive motion accuracy of the present invention include several techniques for computing motion vectors of high pixel accuracy with a minor increase in computation. One technique uses fast-search strategies in sub-pixel space that smartly search for the best motion vectors. An alternate technique estimates highly accurate motion vectors using different interpolation filters at different stages in order to reduce computational complexity. Yet another technique uses rate-distortion criteria that adapt according to the different motion accuracies to determine both the best motion vectors and the best motion accuracies. Still another technique uses a VLC table that is interpreted differently at different coding units, according to the associated motion vector accuracy.
0. 29. A video processing method comprising:
performing a motion compensation using a motion vector having a fractional accuracy level; and
computing the motion vector and a fractional accuracy level which indicates two or more levels of a fractional accuracy expressed by 1/N pel (N is an arbitrary integer) of the motion vector, wherein
the motion compensation is performed by interpolation with a filter corresponding to the fractional accuracy level,
the fractional accuracy level is set frame-by-frame so that different frames could use different motion accuracies and is computed frame-by-frame,
computing the fractional accuracy level separately from the motion vector by using a variable length code, and
computing the motion vector for each block in a block by block manner.
0. 28. A motion compensated video encoding method comprising:
performing a motion compensation using a motion vector having a fractional accuracy level; and
encoding the motion vector and a fractional accuracy level which indicates two or more levels of a fractional accuracy expressed by 1/N pel (N is an arbitrary integer) of the motion vector, wherein
the motion compensation is performed by interpolation with a filter corresponding to the fractional accuracy level,
the fractional accuracy level is set frame-by-frame so that different frames could use different motion accuracies and is sent frame-by-frame,
encoding a variable length fractional accuracy level which indicates the fractional accuracy level, separately from encoding the motion vector, and
encoding the motion vector for each block in a block by block manner.
0. 27. A motion compensated video encoding apparatus comprising:
a motion compensator that compensates a motion using a motion vector having a fractional accuracy level; and
an encoder that encodes the motion vector and a fractional accuracy level which indicates two or more levels of a fractional accuracy expressed by 1/N pel (N is an arbitrary integer) of the motion vector, wherein
the motion compensation is performed by interpolation with a filter corresponding to the fractional accuracy level,
the fractional accuracy level is set frame-by-frame so that different frames could use different motion accuracies and is sent frame-by-frame,
the encoder encodes a variable length fractional accuracy level which indicates the fractional accuracy level, separately from encoding the motion vector, and
the encoder encodes the motion vector for each block in a block by block manner.
0. 1. A fast-search adaptive motion accuracy search method for estimating motion vectors in motion-compensated video coding by finding a best motion vector for a macroblock, said method comprising the steps of:
(a) searching a first set of motion vector candidates in a grid of sub-pixel resolution of a predetermined square radius centered on V1, to find a best motion vector V2 using a first criteria;
(b) searching a second set of motion vector candidates in a grid of sub-pixel resolution of a predetermined square radius centered on V2 to find a best motion vector V3 using a second criteria;
(c) searching a third set of motion vector candidates in a grid of sub-pixel resolution of a predetermined square radius centered on V3 to find said best motion vector of said macroblock using a third criteria, and
(d) wherein at least one of said first criteria, said second criteria, and said third criteria is a rate-distortion criteria.
0. 2. The method of
0. 3. The method of
0. 4. The method of
0. 5. The method of
0. 6. The method of
0. 7. The method of
0. 8. The method of
(a) searching three candidates of ⅓-pel accuracy V2 and a ½-pel location with the next lowest rate-distortion cost if V2 is at the center;
(b) searching four vector candidates of ⅓-pel accuracy that are closest to V2 if V2 is a corner vector; and
(c) determining which of two corners has lower rate-distortion cost and searching four vector candidates of ⅓-pel accuracy that are closest to a line between said corner with lower rate-distortion cost, if V2 is between two corners vectors.
0. 9. An adaptive motion accuracy search method for estimating motion vectors in motion-compensated video coding by finding a best motion vector for a macroblock, said method comprising the steps of:
(a) searching a first set of motion vector candidates in a grid centered on V1 using a first criteria to find a best motion vector V2 using a first filter to do a first interpolation;
(b) searching a second set of motion vector candidates in a grid centered on V2 using a second criteria to find a best motion vector V3 using a second filter to do a second interpolation; and
(c) searching a third set of motion vector candidates in a grid centered on V3 using a third criteria to find said best motion vector of said macroblock using a third filter to do a third interpolation;
(d) wherein at least one of said first criteria, said second criteria, and said third criteria is a rate-distortion criteria.
0. 10. The method of
0. 11. The method of
0. 12. The method of
0. 13. The method of
0. 14. The method of
0. 15. The method of
0. 16. An adaptive motion accuracy search method for estimating motion vectors in motion-compensated video coding by finding a best motion vector for a macroblock, said method comprising the steps of:
(a) searching at a first motion accuracy for a first best motion vector of said macroblock;
(b) encoding said first best motion vector and said first motion accuracy;
(c) searching for at least one second best motion vector of said macroblock at an at least one second motion accuracy;
(d) encoding said at least one second best motion vector and said at least one second motion accuracy; and
(e) selecting the best motion vector of said first and at least one second best motion vectors using rate-distortion criteria.
0. 17. The method of
0. 18. The method of
0. 19. The method of
0. 20. An adaptive motion accuracy search method for estimating motion vectors in motion-compensated video coding by finding a best motion vector for a macroblock, said method comprising the steps of:
(a) searching at a motion accuracy for a best motion vector of said macroblock using rate-distortion criteria;
(b) encoding said motion accuracy using a code from a VLC table that is interpreted differently at different coding units according to the associated motion vector accuracy; and
(c) encoding said best motion vector in the respective accuracy space.
0. 21. A system for estimating motion vectors in motion-compensated video coding by finding a best motion vector for a macroblock, said system comprising:
(a) a first encoder for searching a first set of motion vector candidates in a grid of sub-pixel resolution of a predetermined square radius centered on V1 using a first criteria to find a best motion vector V2;
(b) a second encoder for searching a second set of motion vector candidates in a grid of sub-pixel resolution of a predetermined square radius centered on V2 using a second criteria to find a best motion vector V3; and
(c) a third encoder for searching a third set of motion vector candidates in a grid of sub-pixel resolution of a predetermined square radius centered on V3 using a third criteria to find said best motion vector of said macroblock;
(d) wherein at least one of said first criteria, said second criteria, and said third criteria is a rate-distortion criteria.
0. 22. The system of
0. 23. A fast-search adaptive motion accuracy search method for estimating motion vectors in motion-compensated video coding by finding a best motion vector for a macroblock, said method comprising the steps of:
(a) searching a first set of motion vector candidates in a grid of sub-pixel resolution of a predetermined square radius centered on V1 to find a best motion vector V2;
(b) searching a second set of motion vector candidates in a grid of sub-pixel resolution of a predetermined square radius centered on V2 to find a best motion vector V3;
(c) searching a third set of motion vector candidates in a grid of sub-pixel resolution of a predetermined square radius centered on V3 to find said best motion vector of said macroblock, and
(d) using V2 as the motion vector for the macroblock if V2 has the smallest rate-distortion cost and skipping step (c).
0. 24. The method of
0. 25. The method of
0. 26. The system of
The technology of the present invention allows the encoder to choose between any set of motion accuracies (for example, ½, ⅓, and ⅙-pel accurate motion vectors) using either a full search strategy or a fast search strategy.
Full-Search AMA Search Strategy
As shown in
A critical issue in the motion vector search is the choice of a measure or criterion for establishing which block is the best match for the given macroblock. In practice, most methods use either the mean squared error (“MSE”) or the mean absolute difference (“MAD”) criterion. The MSE between two blocks is computed by subtracting the pixel values of the two blocks, squaring the pixel differences, and then taking the average. The MAD between two blocks is a similar distortion measure, except that the absolute value of the pixel differences is averaged instead of the squares. If two image blocks are similar to each other, the MSE and MAD values will be small. If, however, the image blocks are dissimilar, these values will be large. Hence, typical video coders find the best match for a macroblock by selecting the motion vector that produces either the smallest MSE or the smallest MAD. In other words, the block associated with the best motion vector is the one closest to the given macroblock in an MSE or MAD sense.
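The two distortion measures described above can be illustrated with a minimal sketch. The function names, the representation of a block as a list of rows, and the `candidates` dictionary keyed by motion vector are assumptions made for illustration only; they do not come from the specification.

```python
def _pixel_differences(block_a, block_b):
    """Flatten two equally sized blocks (lists of rows) into pixel differences."""
    return [a - b
            for row_a, row_b in zip(block_a, block_b)
            for a, b in zip(row_a, row_b)]


def mse(block_a, block_b):
    """Mean squared error: average of the squared pixel differences."""
    diffs = _pixel_differences(block_a, block_b)
    return sum(d * d for d in diffs) / len(diffs)


def mad(block_a, block_b):
    """Mean absolute difference: average of the absolute pixel differences."""
    diffs = _pixel_differences(block_a, block_b)
    return sum(abs(d) for d in diffs) / len(diffs)


def best_match(target, candidates, measure=mad):
    """Return the motion vector whose candidate block minimizes the measure.

    `candidates` is a hypothetical dict mapping a motion vector (x, y)
    to the reference block that the vector points at.
    """
    return min(candidates, key=lambda mv: measure(target, candidates[mv]))


target = [[10, 20], [30, 40]]
candidates = {
    (0, 0): [[10, 20], [30, 46]],   # one large error
    (1, 0): [[12, 22], [32, 42]],   # four small errors
}
print(best_match(target, candidates, mse))
print(best_match(target, candidates, mad))
```

In this toy example the two measures disagree: MSE penalizes the single large error more heavily, while MAD penalizes the four small errors more heavily.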
Unfortunately, the MSE and MAD distortion measures do not take into account the cost in bits of actually encoding the vector. For example, a given motion vector may minimize the MSE, but it may be very costly to encode with bits, so it may not be the best choice from a coding standpoint.
To deal with this, advanced encoders such as those described by Telenor use rate-distortion (“RD”) criteria of the type “distortion+L*Bits” to select the best motion vector. The value of “distortion” is typically the MSE or MAD, “L” is a constant that depends on the compression level (i.e., the quantization step size), and “Bits” is the number of bits required to code the motion vector. In general, any RD criteria of this type would work with the present invention. However, in the present invention “Bits” includes the bits needed for encoding the vector and those for encoding the accuracy of the vector. In fact, some candidates can have several “Bits” values, because they can have several accuracy modes. For example, the candidate at location (½, −½) can be thought of as having ½-pixel or ⅙-pixel accuracy.
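A minimal sketch of an RD criterion of the type “distortion+L*Bits”, in which “Bits” counts both the vector bits and the accuracy bits, might look as follows. The candidate records, their field names, and all numeric values are hypothetical and chosen only to show how the same location can appear with more than one accuracy mode.

```python
from fractions import Fraction


def rd_cost(distortion, vector_bits, accuracy_bits, lagrangian):
    """RD cost of the form distortion + L * Bits, with Bits covering
    both the motion vector and its accuracy mode."""
    return distortion + lagrangian * (vector_bits + accuracy_bits)


def select_best(candidates, lagrangian):
    """Pick the candidate (vector plus accuracy mode) with the lowest RD cost."""
    return min(
        candidates,
        key=lambda c: rd_cost(c["distortion"], c["vector_bits"],
                              c["accuracy_bits"], lagrangian),
    )


half = Fraction(1, 2)
candidates = [
    # The same location (1/2, -1/2) coded in two accuracy modes (made-up bit counts).
    {"mv": (half, -half), "accuracy": "1/2", "distortion": 100,
     "vector_bits": 5, "accuracy_bits": 1},
    {"mv": (half, -half), "accuracy": "1/6", "distortion": 100,
     "vector_bits": 9, "accuracy_bits": 2},
    # A finer location that only exists on the 1/6-pel grid.
    {"mv": (Fraction(2, 3), -half), "accuracy": "1/6", "distortion": 85,
     "vector_bits": 9, "accuracy_bits": 2},
]

print(select_best(candidates, lagrangian=2)["accuracy"])
print(select_best(candidates, lagrangian=10)["accuracy"])
```

With a small L the lower-distortion ⅙-pel candidate wins; with a large L the cheaper ½-pel representation wins, which is the trade-off the criterion is meant to capture.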
Fast-Search AMA Search Strategy
As shown in
Experimental data has shown that, on average, this simple fast search strategy checks the RD cost of about eighteen locations in sub-pixel space (ten more than Telenor's search strategy), and hence the overall computational complexity is only moderately increased.
The experimental data discussed below in connection with
Alternate embodiments of the invention replace one or more of the steps 108-120. These embodiments have also been effective and have further reduced the number of motion vector candidates to check in the sub-pixel velocity space.
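The idea of refining a vector in stages, each centered on the best vector of the previous stage, can be illustrated with a simplified sketch. This is not the specific candidate selection of steps 108-120: it assumes a full square neighborhood of radius one at each stage, assumes step sizes of ½ and ⅙ pel, and takes a hypothetical `cost` callable standing in for the RD cost of a vector.

```python
from fractions import Fraction


def refine(center, step, radius, cost):
    """Search a square grid of the given step and radius around `center`
    and return the candidate with the lowest cost."""
    cx, cy = center
    grid = [(cx + i * step, cy + j * step)
            for i in range(-radius, radius + 1)
            for j in range(-radius, radius + 1)]
    return min(grid, key=cost)


def staged_search(v1, cost, steps=(Fraction(1, 2), Fraction(1, 6))):
    """Coarse-to-fine search: each stage is centered on the previous best vector."""
    best = v1
    for step in steps:
        best = refine(best, step, 1, cost)
    return best


def toy_cost(v):
    """Stand-in cost with its minimum at (2/3, -1/6); a real encoder would
    evaluate distortion plus L times bits here."""
    return (v[0] - Fraction(2, 3)) ** 2 + (v[1] + Fraction(1, 6)) ** 2


print(staged_search((0, 0), toy_cost))
```

The first stage lands on the nearest ½-pel location and the second stage reaches the finer location from there, so only a small neighborhood is examined at the finest resolution.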
Computation And Memory Savings
Because step 108 checks only motion vector candidates of ½-pixel accuracy, the computation and memory requirements for the hardware or software implementation are significantly reduced. To be specific, in a smart implementation of this fast search, the reference frame is interpolated by 2×2 in order to obtain the RD costs for the ½-pel vector candidates. A significant amount of fast (or cache) memory for a hardware or software encoder is saved as compared to Telenor's approach, which needed to interpolate the reference frame by 3×3. In comparison to the Telenor encoder, this is a cache memory savings of 9/4, or a factor of 2.25. The few additional interpolations can be done later on a block-by-block basis.
Additionally, since the interpolations in step 108 are used to direct the search towards the lower values of the RD cost function, a complex filter is not needed for these interpolations. Accordingly, computation power may be saved by using a simple bilinear filter for step 108.
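As an illustration of how inexpensive a bilinear filter is, the following sketch interpolates a frame onto a ½-pel grid. The function name, the list-of-rows frame representation, and the choice to produce a (2H−1)×(2W−1) output without border extension are assumptions for this example.

```python
def upsample_half_pel_bilinear(frame):
    """Interpolate a frame (list of rows) onto a half-pel grid.

    Even output coordinates coincide with the original pixels; odd
    coordinates are the average of the two or four nearest pixels.
    """
    height, width = len(frame), len(frame[0])
    out = [[0.0] * (2 * width - 1) for _ in range(2 * height - 1)]
    for y in range(2 * height - 1):
        for x in range(2 * width - 1):
            y0, x0 = y // 2, x // 2
            y1 = y0 + (y % 2)   # same row for even y, next row for odd y
            x1 = x0 + (x % 2)   # same column for even x, next column for odd x
            out[y][x] = (frame[y0][x0] + frame[y0][x1] +
                         frame[y1][x0] + frame[y1][x1]) / 4.0
    return out


frame = [[0, 4],
         [8, 12]]
for row in upsample_half_pel_bilinear(frame):
    print(row)
```

Each interpolated sample costs only a few additions and a division by four, which is why such a filter is adequate for steering the search before a more complex filter is applied to the final candidates.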
Also, other key coding decisions such as selecting the mode of a macroblock (e.g., 16×16, four-8×8, etc.) can be done using the ½-pel vectors because such decisions do not benefit significantly from using higher accuracies. Then, the encoder can use a more complex cubic filter to interpolate the required sub-pixel values for the few additional vector candidates to check in the remaining steps. Since the macroblock mode has already been chosen, these final interpolations only need to be done for the chosen mode.
Using multiple filters yielded computation savings of over twenty percent in running time on a Sparc Ultra 10 Workstation in comparison to Telenor's approach, which uses cubic interpolation all the time. Additionally, the fast-memory requirements were reduced by nearly half. Also, there was little or no loss in compression performance. Compared with one preferred embodiment of the fast search, Benzler's technique requires about 70 interpolations per pixel in the Telenor encoder, whereas the present invention requires only about 7 interpolations per pixel.
Coding The Motion Vector And Accuracies With Bits
Once the best motion vector and accuracy are determined, the encoder encodes both the motion vector and accuracy values with bits. One approach is to encode the motion vector with a given accuracy (e.g., half-pixel accuracy) and then add some extra bits for refining the vector to the higher motion accuracy. This is the strategy suggested by B. Girod, but it is sub-optimal in a rate-distortion sense.
In one preferred embodiment of the present invention, the accuracy of the motion vector for a macroblock is first encoded using a simple code such as the one given in Table 1. Any other table with code lengths {1, 2, 2} could be used as well. The bit rate could be further reduced using a typical DPCM approach.
TABLE 1
VLC table to indicate the accuracy mode for a given macroblock.
Code | Motion Accuracy |
1 | ½-pel |
01 | ⅓-pel |
11 | ⅙-pel |
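Signaling the accuracy mode per macroblock with a short variable length code can be sketched as follows. The code assignment used here (1, 01, 00) is a hypothetical choice with code lengths {1, 2, 2}, which the text says may be used in place of Table 1; it was picked for this example because no codeword is a prefix of another, so a decoder can read the stream bit by bit. The mode labels and function names are likewise assumptions.

```python
# Hypothetical prefix-free assignment with code lengths {1, 2, 2}.
ACCURACY_CODES = {"1/2": "1", "1/3": "01", "1/6": "00"}


def encode_accuracies(modes):
    """Concatenate the accuracy codes of consecutive macroblocks."""
    return "".join(ACCURACY_CODES[mode] for mode in modes)


def decode_accuracies(bits):
    """Read the bit string from left to right, emitting a mode whenever
    the bits collected so far form a complete codeword."""
    inverse = {code: mode for mode, code in ACCURACY_CODES.items()}
    modes, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            modes.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return modes


bits = encode_accuracies(["1/2", "1/6", "1/3", "1/2"])
print(bits)
print(decode_accuracies(bits))
```

Giving the shortest code to the mode expected to occur most often keeps the average signaling overhead close to one bit per macroblock.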
Next, the value of the vector(s) in the respective accuracy space is encoded. These bits can be obtained from entries of a single VLC table such as the one used in the H26L codec. The key idea is that these bits are interpreted differently depending on the motion accuracy for the macroblock. For example, if the motion accuracy is ⅓ and the code bits for the X component of the difference motion vector are 00001 (observe that this code is the fourth entry (code number 3) of H26L's VLC table in [6]), the X component of the vector is Vx=⅔. If the accuracy is ½, the same code corresponds to Vx=1.
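The way a single table entry is reinterpreted under different accuracies can be sketched as below. The mapping from code number to a signed count of increments (0, +1, −1, +2, −2, ...) is an assumption chosen to be consistent with the example in the text, where code number 3 yields ⅔ at ⅓-pel accuracy and 1 at ½-pel accuracy; the function names are hypothetical.

```python
from fractions import Fraction


def code_number_to_steps(code_number):
    """Assumed signed mapping: 0 -> 0, 1 -> +1, 2 -> -1, 3 -> +2, 4 -> -2, ..."""
    if code_number == 0:
        return 0
    magnitude = (code_number + 1) // 2
    return magnitude if code_number % 2 == 1 else -magnitude


def decode_component(code_number, accuracy_denominator):
    """Interpret one table entry as a vector component.

    The entry counts increments in the sub-pixel space, and each increment
    is 1/accuracy_denominator pel (2 for 1/2-pel, 3 for 1/3-pel, and so on).
    """
    return Fraction(code_number_to_steps(code_number), accuracy_denominator)


for denominator in (2, 3, 6):
    print(denominator, decode_component(3, denominator))
```

Because only the increment size changes, the same table serves every accuracy mode and no per-accuracy code tables are needed.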
Compared to the Benzler method for encoding the motion vectors with a variable length code (“VLC”) table that could be used for encoding ½- and ¼-pixel accurate vectors, the method of the present invention can be used for encoding vectors of any motion accuracy, and the table can be interpreted differently at each frame and macroblock. Further, the general method of the present invention can be used for any motion accuracy, not necessarily those that are multiples of each other or those that are of the type 1/n (with n an integer). The number of increments in the given sub-pixel space is simply counted and the bits in the associated entry of the table are used as the code.
From the decoder's viewpoint, once the motion accuracy is decoded, the motion vector can also be easily decoded. After that, the associated block in the previous frame is reconstructed using a typical 4-tap cubic interpolator. There is a different 4-tap filter for each motion accuracy.
The AMA does not increase decoding complexity, because the number of operations needed to reconstruct the predicted block is the same, regardless of the motion accuracy.
Experimental Results
TABLE 2
Description of the Experiments
Video sequence | FIG. # | Resolution | Frame rate |
Container | FIG. 8 | QCIF | 10 |
News | FIG. 9 | QCIF | 10 |
Mobile | FIG. 10 | QCIF | 10 |
Mobile | FIG. 11 | SIF | 15 |
Garden | FIG. 12 | QCIF | 15 |
Tempete | FIG. 13 | SIF | 15 |
Tempete | FIG. 14 | QCIF | 15 |
Paris Shaked | FIG. 15 | QCIF | 10 |
The video sequences are commonly used by the video coding community, except for “Paris Shaked.” The latter is a synthetic sequence obtained by shifting the well-known sequence “Paris” by a motion vector whose X and Y components take a random value within [−1,1]. This synthetic sequence simulates small movements caused by a hand-held camera in a typical video phone scene.
Comparison Of Full-Search And Fast-Search AMA
The experimental results shown in
Combining AMA And Multiple Reference Frames
In the plot shown in
The experiments show that the gains with AMA add to those obtained using multiple reference frames. The gain from AMA in the one-reference case can be measured by comparing the curve labeled with a “+” (Telenor AMA+c+1r) with the curve labeled with an “x” (Telenor ⅓+1r), and the gain in the five-reference case can be measured by comparing the curve labeled with a “diamond” (Telenor AMA+c+5r) with the curve labeled with a “*” (Telenor ⅓+5r).
It should be noted that the present invention may be implemented at the frame level so that different frames could use different motion accuracies, but within a frame all motion vectors would use the same accuracy. Preferably, in this embodiment the motion vector accuracy would then be signaled only once at the frame layer. Experiments have shown that using the best fixed motion accuracy for the whole frame should also produce compression gains similar to those presented here for the macroblock-adaptive case.
In another frame-based embodiment, the encoder could do motion compensation on the entire frame with the different vector accuracies and then select the best accuracy according to the RD criteria. This approach is not suitable for pipelined, one-pass encoders, but it could be appropriate for software-based or more complex encoders. In still another frame-based embodiment, the encoder could use previous statistics and/or formulas to predict what will be the best accuracy for a given frame (e.g., the formulas set forth in the Ribas work or a variation thereof can be used). This approach would be well-suited for one-pass encoders, although the performance gains would depend on the precision of the formulas used for the prediction.
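The multi-pass frame-based selection can be sketched as follows, assuming that a trial motion-compensation pass has already produced a total distortion and a total bit count for each accuracy. The function name, the dictionary layout, and the numbers are hypothetical.

```python
def choose_frame_accuracy(frame_results, lagrangian):
    """Select the accuracy with the lowest frame-level RD cost.

    `frame_results` maps an accuracy label to a (total_distortion, total_bits)
    pair measured by motion-compensating the whole frame at that accuracy.
    """
    def cost(accuracy):
        distortion, bits = frame_results[accuracy]
        return distortion + lagrangian * bits

    return min(frame_results, key=cost)


# Made-up measurements for one frame.
frame_results = {
    "1/2": (1000, 200),
    "1/3": (900, 260),
    "1/6": (880, 330),
}
print(choose_frame_accuracy(frame_results, lagrangian=1))
print(choose_frame_accuracy(frame_results, lagrangian=5))
```

Since the chosen accuracy applies to every vector in the frame, it needs to be signaled only once at the frame layer.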
The terms and expressions which have been employed in the foregoing specification are used therein as terms of description and not of limitation, and there is no intention, in the use of such terms and expressions, of excluding equivalents of the features shown and described or portions thereof, it being recognized that the scope of the invention is defined and limited only by the claims that follow.
Ribas-Corbera, Jordi, Shen, Jiandong
Patent | Priority | Assignee | Title |
4864393, | Jun 09 1987 | Sony Corp. | Motion vector estimation in television images |
4937666, | Dec 04 1989 | SHINGO LIMITED LIABILITY COMPANY | Circuit implementation of block matching algorithm with fractional precision |
5105271, | Sep 29 1989 | Victor Company of Japan, LTD | Motion picture data coding/decoding system having motion vector coding unit and decoding unit |
5408269, | May 29 1992 | Sony Corporation | Moving picture encoding apparatus and method |
5489949, | Feb 08 1992 | Samsung Electronics Co., Ltd. | Method and apparatus for motion estimation |
5610658, | Jan 31 1994 | Sony Corporation | Motion vector detection using hierarchical calculation |
5623313, | Sep 22 1995 | France Brevets | Fractional pixel motion estimation of video signals |
5682205, | Aug 19 1994 | Eastman Kodak Company | Adaptive, global-motion compensated deinterlacing of sequential video fields with post processing |
5694179, | Dec 23 1994 | PENDRAGON ELECTRONICS AND TELECOMMUNICATIONS RESEARCH LLC | Apparatus for estimating a half-pel motion in a video compression method |
5754240, | Oct 04 1995 | Matsushita Electric Industrial Co., Ltd. | Method and apparatus for calculating the pixel values of a block from one or two prediction blocks |
5767907, | Oct 11 1994 | Hitachi America, Ltd. | Drift reduction methods and apparatus |
5844616, | Jun 01 1993 | Thomson multimedia S.A. | Method and apparatus for motion compensated interpolation |
5987181, | Oct 12 1995 | Dolby Laboratories Licensing Corporation | Coding and decoding apparatus which transmits and receives tool information for constructing decoding scheme |
6005509, | Jul 15 1997 | Deutsches Zentrum fur Luft-und Raumfahrt e.V. | Method of synchronizing navigation measurement data with S.A.R radar data, and device for executing this method |
6205176, | Jul 28 1997 | JVC Kenwood Corporation | Motion-compensated coder with motion vector accuracy controlled, a decoder, a method of motion-compensated coding, and a method of decoding |
6249318, | Sep 12 1997 | VID SCALE, INC | Video coding/decoding arrangement and method therefor |
6269174, | Oct 28 1997 | HANGER SOLUTIONS, LLC | Apparatus and method for fast motion estimation |
6275532, | Mar 18 1995 | Dolby Laboratories Licensing Corporation | Video coding device and video decoding device with a motion compensated interframe prediction |
6714593, | Oct 21 1997 | Robert Bosch GmbH | Motion compensating prediction of moving image sequences |
6968008, | Jul 27 1999 | Sharp Kabushiki Kaisha | Methods for motion estimation with adaptive motion accuracy |
7224733, | Jul 15 1997 | Robert Bosch GmbH | Interpolation filtering method for accurate sub-pixel motion assessment |
20010017889, | |||
DE19730305, | |||
EP420653, | |||
EP1073276, | |||
GB2305569, | |||
JP1042295, | |||
JP1146364, | |||
JP1155673, | |||
JP2001189934, | |||
JP201135928, | |||
JP201275175, | |||
JP4264889, | |||
JP795585, | |||
JP8116532, | |||
JP9153820, | |||
RE44012, | Jul 27 1999 | Sharp Kabushiki Kaisha | Methods for motion estimation with adaptive motion accuracy |
RE45014, | Jul 27 1999 | Sharp Kabushiki Kaisha | Methods for motion estimation with adaptive motion accuracy |
WO9841011, | |||
WO9904574, |
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Jan 31 2014 | Sharp Kabushiki Kaisha | (assignment on the face of the patent) | / |
Date | Maintenance Schedule |
Jul 04 2020 | 4 years fee payment window open |
Jan 04 2021 | 6 months grace period start (w surcharge) |
Jul 04 2021 | patent expiry (for year 4) |
Jul 04 2023 | 2 years to revive unintentionally abandoned end. (for year 4) |
Jul 04 2024 | 8 years fee payment window open |
Jan 04 2025 | 6 months grace period start (w surcharge) |
Jul 04 2025 | patent expiry (for year 8) |
Jul 04 2027 | 2 years to revive unintentionally abandoned end. (for year 8) |
Jul 04 2028 | 12 years fee payment window open |
Jan 04 2029 | 6 months grace period start (w surcharge) |
Jul 04 2029 | patent expiry (for year 12) |
Jul 04 2031 | 2 years to revive unintentionally abandoned end. (for year 12) |