A method of intra mode prediction takes as inputs a block of pixels, the block's horizontal (Hpos) and vertical (Vpos) pixel positions, and the adjacent horizontal and vertical pixels within an input picture frame signal. From these it selects the intra mode with the lowest Sum of Absolute Transformed Differences (SATD) among the horizontal, vertical, and steady state (DC) intra modes for use in advanced video coding algorithms, such as MPEG-4 Part 10 and H.264/AVC. An associated cost is calculated for each of the intra modes, and these costs are used to select and output the best intra mode. The method reduces the unimproved computational cost of three 2D 4×4 Hadamard transformations (equivalent to 24 1D 4 point transformations) to just 4 1D 4 point transformations, a significant computational improvement. As horizontal and vertical panning is frequently used in video imagery, this improvement may reduce encoder processing by 80%.
1. A system for selecting an intra mode in a video image encoder, comprising:
a computer processor; and
programming executable on said computer processor for:
(a) selecting a lowest sum of Absolute Transformed Differences (SATD) intra mode among intra modes using inputs [x], Hpos, Vpos, {right arrow over (h)}, and {right arrow over (v)};
(b) wherein [x] is a 4×4 block of pixels within a picture and
(c) wherein Hpos is a horizontal pixel position of the 4×4 block within the image;
(d) wherein Vpos is a vertical pixel position of the 4×4 block within the image;
(e) wherein {right arrow over (h)} is a horizontal vector immediately left of the 4×4 block [x], defined as {right arrow over (h)}≡(x0,−1, x1,−1, x2,−1, x3,−1)T relative to the indexing of the elements of [x];
(f) wherein {right arrow over (v)} is a horizontal vector immediately above the 4×4 block [x], defined as {right arrow over (v)}≡(x−1,0, x−1,1, x−1,2, x−1,3)T relative to the indexing of the elements of [x]; and
(g) wherein the lowest SATD intra mode is determined among a group comprising:
i. a horizontal intra mode;
ii. a vertical intra mode; and
iii. a steady state (DC) intra mode; and
(h) outputting the lowest SATD intra mode to a non-transitory computer readable medium.
13. A system for selecting an intra mode in a video image encoder, comprising:
a computer processor; and
programming executable on said computer processor for:
(a) selecting a lowest sum of Absolute Transformed Differences (SATD) intra mode among intra modes using inputs [x], Hpos, Vpos, {right arrow over (h)}, and {right arrow over (v)};
(b) wherein [x] is a p×q block of pixels within a picture with indices xi,j for i∈{0, 1, 2, . . . , p−1}, j∈{0, 1, 2, . . . , q−1}; and
(c) wherein Hpos is a horizontal pixel position of the p×q block within the image;
(d) wherein Vpos is a vertical pixel position of the p×q block within the image;
(e) wherein {right arrow over (h)} is a horizontal vector immediately left of the p×q block [x], defined as {right arrow over (h)}=(x0,−1, x1,−1, . . . , xp−1,−1)T relative to the indexing of the elements of [x];
(f) wherein {right arrow over (v)} is a horizontal vector immediately above the p×q block [x], defined as {right arrow over (v)}≡(x−1,0, x−1,1, . . . , x−1,q−1)T relative to the indexing of the elements of [x];
(g) wherein the lowest SATD intra mode is determined among a group comprising:
i. a horizontal intra mode;
ii. a vertical intra mode; and
iii. a steady state (DC) intra mode; and
(h) outputting the lowest SATD intra mode to a non-transitory computer readable medium.
2. The system of
(a) calculating a horizontal predictor {right arrow over (H)}≡(H0,H1,H2,H3)T, a vertical predictor {right arrow over (V)}≡(V0,V1,V2,V3)T, and a steady state (DC) predictor D;
(b) calculating a horizontal cost precursor Chs and a vertical cost precursor Cvs using the horizontal predictor {right arrow over (H)}, the vertical predictor {right arrow over (V)}, and the steady state (DC) predictor D; and
(c) calculating a horizontal intra mode cost CH, a vertical intra mode cost CV, and a steady state (DC) intra mode cost CD using the horizontal cost precursor Chs and the vertical cost precursor Cvs.
3. The system of
if Hpos≠0 and Vpos≠0 then:
i. setting {right arrow over (H)}≡(H0,H1,H2,H3)T=[T4]{right arrow over (h)},
where {right arrow over (h)}≡(h0,h1,h2,h3)T≡(x0,−1, x1,−1,x2,−1,x3,−1)T;
ii. setting {right arrow over (V)}≡(V0,V1,V2,V3)T=[T4]{right arrow over (v)},
where {right arrow over (v)}≡(v0,v1,v2,v3)≡(x−1,0,x−1,1,x−1,2,x−1,3)T; and
iii. setting D=(H0+V0)/2.
4. The system of
if Hpos=0 and Vpos≠0 then:
i. setting {right arrow over (H)}=(2^15−1,0,0,0)T;
ii. setting {right arrow over (V)}≡(V0,V1,V2,V3)T=[T4]{right arrow over (v)},
where {right arrow over (v)}≡(v0,v1,v2,v3)≡(x−1,0, x−1,1,x−1,2,x−1,3)T; and
iii. setting D=V0.
5. The system of
if Hpos≠0 and Vpos=0 then:
i. setting {right arrow over (H)}≡(H0,H1,H2,H3)T=[T4]{right arrow over (h)},
where {right arrow over (h)}≡(h0,h1,h2,h3)T≡(x0,−1, x1,−1,x2,−1,x3,−1)T;
ii. setting {right arrow over (V)}=(2^15−1,0,0,0)T; and
iii. setting D=H0.
6. The system of
if Hpos=0 and Vpos=0 then:
i. setting {right arrow over (H)}=(2^15−1,0,0,0)T;
ii. setting {right arrow over (V)}=(2^15−1,0,0,0)T; and
iii. setting D=128×16.
7. The system of
(a) calculating the values Xi,0, X0,i for i∈{0, 1, 2, 3} using the relationships
(i) wherein {right arrow over (u)}T=[1 1 1 1], and
(ii) wherein
(b) calculating the horizontal cost precursor
(c) calculating the vertical cost precursor
8. The system of
calculating
9. The system of
calculating
10. The system of
calculating CD=|D−X0,0|+Chs+Cvs.
11. The system of
selecting the lowest SATD intra mode with a lowest associated intra mode cost among the group consisting of: the horizontal intra mode cost CH, the vertical intra mode cost CV, and the steady state (DC) intra mode cost CD.
14. The system of
(a) calculating a horizontal predictor {right arrow over (H)}≡(H0,H1, . . . ,Hp−1)T, a vertical predictor {right arrow over (V)}≡(V0,V1, . . . ,Vq−1)T, and a steady state (DC) predictor D;
(b) calculating a horizontal cost precursor Chs and a vertical cost precursor Cvs using the horizontal predictor {right arrow over (H)}, the vertical predictor {right arrow over (V)}, and the steady state (DC) predictor D; and
(c) calculating a horizontal intra mode cost CH, a vertical intra mode cost CV, and a steady state (DC) intra mode cost CD using the horizontal cost precursor Chs and the vertical cost precursor Cvs.
15. The system of
(a) if Hpos≠0 and Vpos≠0 then:
i. setting {right arrow over (H)}=[Tpq]{right arrow over (h)},
wherein {right arrow over (h)}≡(h0,h1, . . . ,hp−1)T≡(x0,0, x1,0, . . . ,xp−1,0)T;
ii. setting {right arrow over (V)}=[Tpq]{right arrow over (v)},
wherein {right arrow over (v)}≡(v0,v1, . . . ,vq−1)≡(x0,0,x0,1, . . . ,x0,q−1)T; and
iii. setting D=(H0+V0)/2;
(b) if Hpos=0 and Vpos≠0 then:
i. setting {right arrow over (H)}=(2^15−1, 0, 0, 0)T;
ii. setting {right arrow over (V)}=[Tpq]{right arrow over (v)},
wherein {right arrow over (v)}≡(v0,v1, . . . ,vq)≡(x0,0,x0,1, . . . ,x0,q)T; and
iii. setting D=V0;
(c) if Hpos≠0 and Vpos=0 then:
i. setting {right arrow over (H)}=[Tpq]{right arrow over (h)},
wherein {right arrow over (h)}≡(h0, h1, . . . , hp−1)T≡(x0,0, x1,0, . . . , xp−1,0)T;
ii. setting {right arrow over (V)}=(2^15−1,0,0,0)T,
wherein n is defined as 2^n>q×p×q×2^(b+1); and
iii. setting D=H0;
(d) if Hpos=0 and Vpos=0 then:
i. setting {right arrow over (H)}=(2^15−1,0,0,0)T;
ii. setting {right arrow over (V)}=(2^15−1,0,0,0)T; and
iii. setting D=128×16.
16. The system of
(a) calculating the values Xi,0, X0,j for i∈{0, 1, 2, . . . , p−1}, j∈{0, 1, 2, . . . , q−1} using the relationship [X]=[Tpq][x][Tpq]T, where
(b) calculating the horizontal cost precursor
(c) calculating the vertical cost precursor
17. The system of
calculating
18. The system of
calculating
19. The system of
calculating CD=|D−X0,0|+Chs+Cvs.
20. The system of
selecting the lowest SATD intra mode with a lowest associated intra mode cost among the group consisting of: the horizontal intra mode cost CH, the vertical intra mode cost CV, and the steady state (DC) intra mode cost CD.
21. The system of
(a) the p dimension is selected from a group of dimensions consisting of:
2, 4, 8, 16, 32, 64, and 128; and
(b) the q dimension is selected from a group of dimensions consisting of:
2, 4, 8, 16, 32, 64, and 128.
1. Field of the Invention
This invention pertains generally to video encoding, and more particularly to intra mode decisions within advanced video encoding (such as H.264/AVC or MPEG-4 Part 10) standards.
2. Description of Related Art
H.264/AVC, alternatively known as MPEG-4 Part 10 and by several other monikers, is representative of improved data compression algorithms. Improved data compression, however, comes at the price of greatly increased computational requirements during the encoding processing phase.
In particular, H.264/AVC encoding contains a very computationally expensive section in which an optimal intra mode is calculated that yields the lowest sum of absolute transformed differences (SATD) between an input 4×4 pixel block and its compressed version. Examples of prior attempts to reduce this computational cost include A. C. Yu, G. R. Martin, and H. Park, “A Frequency Domain Approach to Intra Mode Selection in H.264/AVC”, incorporated herein by reference in its entirety; U.S. patent publication US 2006/0209948 A1, incorporated herein by reference in its entirety; and U.S. patent publication US 2006/0251330 A1, incorporated herein by reference in its entirety.
Accordingly, one aspect of the invention is a method of selecting an intra mode in a video image encoder, comprising:
(a) selecting a lowest Sum of Absolute Transformed Differences (SATD) intra mode among intra modes while using [x], {right arrow over (h)}, {right arrow over (v)}, Hpos, Vpos;
(b) wherein [x] is a 4×4 block of pixels within a picture and
(c) wherein Hpos is a horizontal pixel position of the 4×4 block within the image;
(d) wherein Vpos is a vertical pixel position of the 4×4 block within the image;
(e) wherein the lowest SATD intra mode is determined among a group comprising: (i) a horizontal intra mode; (ii) a vertical intra mode; and (iii) a steady state (DC) intra mode; and
(f) outputting the lowest SATD intra mode.
The outputting step may comprise:
(a) calculating a horizontal predictor {right arrow over (H)}≡(H0,H1,H2,H3)T, a vertical predictor {right arrow over (V)}≡(V0,V1,V2,V3)T, and a steady state (DC) predictor D;
(b) calculating a horizontal cost precursor Chs and a vertical cost precursor Cvs using the horizontal predictor {right arrow over (H)}, the vertical predictor {right arrow over (V)}, and the steady state (DC) predictor D; and
(c) calculating a horizontal intra mode cost CH, a vertical intra mode cost CV, and a steady state (DC) intra mode cost CD using the horizontal cost precursor Chs and the vertical cost precursor Cvs.
The calculating of the horizontal predictor {right arrow over (H)}, the vertical predictor {right arrow over (V)}, and the steady state (DC) predictor D may comprise:
(a) if Hpos≠0 and Vpos≠0 then:
(b) if Hpos=0 and Vpos≠0 then:
(c) if Hpos≠0 and Vpos=0 then:
(d) if Hpos=0 and Vpos=0 then:
A relatively inefficient calculation of the horizontal cost precursor Chs and the vertical cost precursor Cvs may begin with steps comprising: (a) calculating the values Xi,0, X0,i for i∈{0, 1, 2, 3} using the relationship
This method is computationally inefficient, as only the values Xi,0 and X0,i for i∈{0, 1, 2, 3} are required. The first column and first row of [X] share the element X0,0, so only 7 distinct values are needed and 9 of the 16 values are not required.
In another aspect of the invention, a more computationally efficient calculation of the values Xi,0, X0,i for i∈{0, 1, 2, 3} may comprise: providing the input [x] matrix and the [T4] matrix; then calculating:
where {right arrow over (u)}T=[1 1 1 1].
After calculating the values Xi,0, X0,i for i∈{0, 1, 2, 3}, the horizontal cost precursor
and vertical cost precursor
may be calculated.
The horizontal intra mode cost CH may then be calculated by
The vertical intra mode cost CV may be calculated by
The steady state (DC) intra mode cost CD may be calculated by CD=|D−X0,0|+Chs+Cvs.
The calculating may further comprise: selecting the lowest SATD intra mode with a lowest associated intra mode cost among the group consisting of: the horizontal intra mode cost CH, the vertical intra mode cost CV, and the steady state (DC) intra mode cost CD. This calculating process is typically achieved through a series of logical operations.
Another aspect of the invention is a computer readable medium comprising a programming executable capable of performing on a computer the method described above.
A still further aspect of the invention is an advanced video encoder comprising the method described above. This advanced video encoder may take the form of a specially designed video encoder signal processor embedded within a consumer device, such as a hand held high definition television (HDTV) device.
Yet another aspect of the invention is a method of intra mode prediction, that comprises:
(a) providing a 4×4 block of pixels [x] within a picture, wherein
(b) providing 4 pixels
immediately to the left of the block [x] in the picture when available;
(c) providing 4 pixels
immediately above the block [x] in the picture when available;
(d) providing a horizontal pixel position (Hpos) of the 4×4 block within the picture;
(e) providing a vertical pixel position (Vpos) of the 4×4 block within the picture; and
(f) outputting a lowest Sum of Absolute Transformed Differences (SATD) intra mode among intra modes while using [x], {right arrow over (h)}, {right arrow over (v)}, Hpos, Vpos, wherein the lowest SATD intra mode is selected from the group consisting of: i. a horizontal intra mode; ii. a vertical intra mode; and iii. a steady state (DC) intra mode.
The 4×4 pixel block [x] is typically oriented in the same sense as the original picture it is taken from, i.e. the indices increase in the same directions both horizontally and vertically. (Note that in the H.264 standard, the orientation is that pixel coordinate 0,0 is at the upper left of the picture.) In this orientation, the {right arrow over (h)} is immediately adjacent to the left side of the 4×4 pixel block [x], and the {right arrow over (v)} is immediately adjacent the top of the 4×4 pixel block [x].
It is readily apparent that the methods above may be easily modified for {right arrow over (h)} aligned on the right edge of the 4×4 pixel block [x], and the {right arrow over (v)} may be aligned with the bottom of the 4×4 pixel block [x]. Similarly, the picture or the 4×4 pixel block [x] indices may be readily reoriented with the orientation of pixel 0,0 being in any quadrant, while still performing the methods described here.
Additionally, the methods described herein may be readily applied to other H.264 pixel blocks, such as the 16×16 macroblock, or indeed generalized to block of size p×q where p and q may be independently selected from a group consisting of 2, 4, 8, 16, 32, 64, and 128.
In another aspect of the invention, the aforementioned steps may be incorporated into a more generalized method. The resultant generalized method of selecting an intra mode in a video image encoder may comprise:
(a) selecting a lowest Sum of Absolute Transformed Differences (SATD) intra mode among intra modes using inputs [x], Hpos, Vpos, {right arrow over (h)}, and {right arrow over (v)};
(b) wherein [x] is a p×q block of pixels within a picture with indices xi,j for i∈{0, 1, 2, . . . , p−1}, j∈{0, 1, 2, . . . , q−1}; and
(c) wherein Hpos is a horizontal pixel position of the p×q block within the image;
(d) wherein Vpos is a vertical pixel position of the p×q block within the image;
(e) wherein {right arrow over (h)} is a horizontal vector immediately left of the p×q block [x], defined as {right arrow over (h)}≡(x0,−1, x1,−1, . . . , xp−1,−1)T relative to the indexing of the elements of [x];
(f) wherein {right arrow over (v)} is a horizontal vector immediately above the p×q block [x], defined as {right arrow over (v)}≡(x−1,0, x−1,1, . . . , x−1,q−1)T relative to the indexing of the elements of [x];
(g) wherein b is the bit depth of a pixel in the picture;
(h) wherein the lowest SATD intra mode is determined among a group comprising: i. a horizontal intra mode; ii. a vertical intra mode; and iii. a steady state (DC) intra mode; and
(i) outputting the lowest SATD intra mode to a computer readable medium.
The selecting step may comprise:
(a) calculating a horizontal predictor {right arrow over (H)}≡(H0, H1, . . . , Hp−1)T, a vertical predictor {right arrow over (V)}≡(V0, V1, . . . , Vq−1)T, and a steady state (DC) predictor D;
(b) calculating a horizontal cost precursor Chs and a vertical cost precursor Cvs using the horizontal predictor {right arrow over (H)}, the vertical predictor {right arrow over (V)}, and the steady state (DC) predictor D; and
(c) calculating a horizontal intra mode cost CH, a vertical intra mode cost CV, and a steady state (DC) intra mode cost CD using the horizontal cost precursor Chs and the vertical cost precursor Cvs.
In the method above, the calculating a horizontal predictor {right arrow over (H)}, a vertical predictor {right arrow over (V)}, and a steady state (DC) predictor D may comprise:
(a) if Hpos≠0 and Vpos≠0 then: (i) setting {right arrow over (H)}=[Tpq]{right arrow over (h)} wherein {right arrow over (h)}≡(h0, h1, . . . , hp−1)T≡(x0,0, x1,0, . . . , xp−1,0)T (ii) setting {right arrow over (V)}=[Tpq]{right arrow over (v)} wherein {right arrow over (v)}≡(v0, v1, . . . , vq−1)≡(x0,0, x0,1, . . . , x0,q−1)T; and (iii) setting D=(H0+V0)/2;
(b) if Hpos=0 and Vpos≠0 then: (i) setting {right arrow over (H)}=(2^15−1,0,0,0)T wherein m is defined as 2^m≧p×p×q×2^(b+1); (ii) setting {right arrow over (V)}=[Tpq]{right arrow over (v)} wherein {right arrow over (v)}≡(v0, v1, . . . , vq)≡(x0,0, x0,1, . . . , x0,q)T; and (iii) setting D=V0;
(c) if Hpos≠0 and Vpos=0 then: (i) setting {right arrow over (H)}=[Tpq]{right arrow over (h)} wherein {right arrow over (h)}≡(h0, h1, . . . , hp−1)T≡(x0,0, x1,0, . . . , xp−1,0)T; (ii) setting {right arrow over (V)}=(2^15−1,0,0,0)T wherein n is defined as 2^n≧q×p×q×2^(b+1); and (iii) setting D=H0; and
(d) if Hpos=0 and Vpos=0 then: (i) setting {right arrow over (H)}=(2^15−1,0,0,0)T wherein m is defined as 2^m≧p×p×q×2^(b+1); (ii) setting {right arrow over (V)}=(2^15−1,0,0,0)T wherein n is defined as 2^n≧q×p×q×2^(b+1); and (iii) setting D=128×16.
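As a worked check of the exponent definitions above, the following Python sketch finds the smallest exponent satisfying the bound. It rests on two assumptions that are not stated in this text: the flattened bound is read as 2^m ≥ p×p×q×2^(b+1), and the smallest such m is the one intended. The function name and the `leading` parameter (p for m, q for n) are hypothetical.

```python
def sentinel_exponent(p, q, b, leading):
    """Smallest exponent e with 2**e >= leading * p * q * 2**(b + 1).

    p, q: block dimensions; b: pixel bit depth.
    leading: p when computing m, q when computing n (assumed reading).
    """
    bound = leading * p * q * 2 ** (b + 1)
    e = 0
    while 2 ** e < bound:
        e += 1
    return e

# 4x4 block with 8-bit pixels: 4 * 4 * 4 * 2**9 = 2**15, so m = n = 15.
m = sentinel_exponent(4, 4, 8, leading=4)
n = sentinel_exponent(4, 4, 8, leading=4)
```

Under this reading, the 4×4, 8-bit case gives an exponent of 15, which agrees with the value 2^15−1 used in the boundary branches.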
The horizontal cost precursor Chs and the vertical cost precursor Cvs may be calculated by steps comprising:
(a) calculating the values Xi,0, X0,j for i∈{0, 1, 2, . . . , p−1}, j∈{0, 1, 2, . . . , q−1} using the relationship [X]=[Tpq][x][Tpq]T where
(b) calculating the horizontal cost precursor
(c) calculating the vertical cost precursor
The horizontal intra mode cost CH may be calculated using steps comprising: calculating
Similarly, the vertical intra mode cost CV may be calculated with steps comprising: calculating
Finally, the steady state (DC) intra mode cost CD may be calculated with steps comprising: calculating CD=|D−X0,0|+Chs+Cvs.
In the method above, the calculating means may comprise: selecting the lowest SATD intra mode with a lowest associated intra mode cost among the group consisting of: the horizontal intra mode cost CH, the vertical intra mode cost CV, and the steady state (DC) intra mode cost CD.
In another aspect of the invention, a computer readable medium comprising a programming executable may be capable of performing on a computer the methods described above.
In still another aspect of the invention, the methods of intra mode prediction above may be generalized for: (a) where the p dimension is selected from a group of dimensions consisting of: 2, 4, 8, 16, 32, 64, and 128; and (b) where the q dimension is selected from a group of dimensions consisting of: 2, 4, 8, 16, 32, 64, and 128. Still higher dimensions are also possible; however, they are likely to be computationally prohibitive because of the increasingly complex calculations.
Further aspects of the invention will be brought out in the following portions of the specification, wherein the detailed description is for the purpose of fully disclosing preferred embodiments of the invention without placing limitations thereon.
The invention will be more fully understood by reference to the following drawings which are for illustrative purposes only:
Referring more specifically to the drawings, for illustrative purposes the present invention is embodied in the method generally depicted in
Introduction
In the H.264/AVC video compression standard, the 4×4 intra mode decision is made by performing Lagrangian rate-distortion optimization on each of the 9 intra mode possibilities. The intra mode selected is the one that provides the lowest distortion (SATD) between the original 4×4 input block, and the reconstructed block for a particular intra mode. This process is extremely computationally expensive.
The process described below reduces the computational cost of the 4×4 intra mode decision significantly, from 24 1D 4 point Hadamard transforms to merely 4 of them.
Definitions
“Computer” means any device capable of performing the steps, methods, or producing signals as described herein, including but not limited to: a microprocessor, a microcontroller, a video processor, a digital state machine, a field programmable gate array (FPGA), a digital signal processor, a collocated integrated memory system with microprocessor and analog or digital output device, a distributed memory system with microprocessor and analog or digital output device connected by digital or analog signal protocols.
“Computer readable medium” means any source of organized information that may be processed by a computer to perform the steps described herein to result in, store, perform logical operations upon, or transmit, a flow or a signal flow, including but not limited to: random access memory (RAM), read only memory (ROM), a magnetically readable storage system; optically readable storage media such as punch cards or printed matter readable by direct methods or methods of optical character recognition; other optical storage media such as a compact disc (CD), a digital versatile disc (DVD), a rewritable CD and/or DVD; electrically readable media such as programmable read only memories (PROMs), electrically erasable programmable read only memories (EEPROMs), field programmable gate arrays (FPGAs), flash random access memory (flash RAM); and information transmitted by electromagnetic or optical methods including, but not limited to, wireless transmission, copper wires, and optical fibers.
“SATD” means the Sum of Absolute Transformed Differences, which is a widely used video quality metric used for block-matching in motion estimation for video compression. It works by taking a frequency transform, usually a Hadamard transform, of the differences between the pixels in the original block and the corresponding pixels in the block being used for comparison. The transform itself is often of a small block rather than the entire macroblock. For example, in H.264/AVC, a series of 4×4 blocks are transformed rather than doing a more processor-intensive 16×16 transform.
“DCT” (Discrete Cosine Transform) means a transform that converts a block of pixel values from the spatial domain into frequency domain coefficients, the lowest-frequency of which (the DC coefficient) represents the average level, such as the luminance, of the block of pixels used to form an image. This process is typically used in MPEG and JPEG image compression.
“Hadamard Transforms” are defined below. Although the Hadamard transform has numerous applications to signal processing and analysis, this application describes specific applications of the Hadamard Transform in the context of video processing, without limitation of other, more general, applications.
Hadamard transforms are used in the intra mode estimation in the Joint Model (JM) of the Advanced Video Coding (AVC) encoder. Refer now to
The Hadamard Vector Transform of a 4 pixel input may mathematically be defined in the following manner. Let {right arrow over (s)}=[s0,s1,s2,s3]T be a vector consisting of 4 elements. The Hadamard transform [T4] of {right arrow over (s)} is defined as {right arrow over (S)}=[S0,S1,S2,S3]T=[T4]{right arrow over (s)} where
The Hadamard Transformation of a 4×4 Block is defined as follows.
Let [y] be a 4×4 block of pixels such that
The Hadamard transform H4×4(y) of [y] is defined as
where
and {right arrow over (u)}T=[1 1 1 1].
Referring back to
Similarly, selection vector {right arrow over (u)}T is used with the input 4×4 pixel block [y], to produce the row zero Hadamard transform of the input 4×4 pixel block [y], which results in [Y0,0 Y0,1 Y0,2 Y0,3].
The combination of the row zero block Hadamard transform, and the column zero block Hadamard transform is denoted as TRC, indicating that only a row and column of the block transformation is to be processed. This may also be referred to as a “pruned” Hadamard transformation.
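As an illustration of the pruned transform TRC, the following Python sketch computes only column zero and row zero of the block transform. It assumes the conventional unnormalised 4 point Hadamard matrix in sequency order for [T4], since the matrix itself is not reproduced in this text, and the function names are hypothetical. Because the first row of that matrix equals the selection vector {right arrow over (u)}T, column zero of the block transform is the 1D transform of the row sums of [y], and row zero is the 1D transform of its column sums, so two 1D transforms suffice for the block.

```python
# Assumed 4 point Hadamard matrix (unnormalised, sequency order); the exact
# entries and scaling of the source's [T4] are not shown in this text.
T4 = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]

def hadamard4(s):
    """1D 4 point Hadamard transform of a length-4 sequence."""
    return [sum(T4[i][k] * s[k] for k in range(4)) for i in range(4)]

def pruned_hadamard_4x4(y):
    """Return (column zero, row zero) of T4 * y * T4^T using two 1D transforms."""
    row_sums = [sum(y[i]) for i in range(4)]                       # y * u
    col_sums = [sum(y[i][j] for i in range(4)) for j in range(4)]  # u^T * y
    col0 = hadamard4(row_sums)   # Y[i][0] for i = 0..3
    row0 = hadamard4(col_sums)   # Y[0][j] for j = 0..3
    return col0, row0
```

The two outputs share their first element, Y0,0, which is the sum of all 16 pixels under this unnormalised convention.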
Process Overview
Referring now to
The 4×1 pixels 208 (the 4 top elements immediately above the input 4×4 block of pixels 202) are used as input into a T4 Hadamard transform 210 to produce a vertical prediction Hadamard transform output 212.
Similarly, the left 1×4 pixels 214 (the 4 left elements immediately left of the input 4×4 block of pixels 202) are used as input into a T4 Hadamard transform 216 to produce a horizontal prediction Hadamard transform output 218.
T4 Hadamard vertical 212 and horizontal 218 predictions are used to estimate the DC prediction 222.
The following inputs are compared 224 to determine the optimal intra mode prediction 226: 1) the pruned 4×4 Hadamard block transform output 206; 2) the vertical prediction Hadamard transform output 212; 3) the horizontal prediction Hadamard transform output 218; and 4) the DC prediction 222.
Only the horizontal 218, vertical 212, and DC 222 predictions are used in estimating the intra DCT coefficients for the mode decision. The intra predictions are performed in the frequency domain.
Refer now to
Estimate the Intra Macroblock DCT Coefficients
To reduce computation, only horizontal, vertical, and DC predictions are used in the estimation of the intra DCT coefficients. In particular, the intra predictions are computed in the frequency domain, and the DC prediction is derived from the horizontal and vertical predictions. Finally, the prediction residue with the minimal SATD is selected as the intra mode.
Refer now to
For convenience, the 4×1 column vector to the left of vector (x0,0,x1,0,x2,0,x3,0)T with elements (x0,−1,x1,−1,x2,−1,x3,−1)T is denoted as {right arrow over (h)}=(h0,h1,h2,h3)T 306. The Hadamard transform of {right arrow over (h)} is denoted as {right arrow over (H)}=(H0,H1,H2,H3)T 308.
Similarly, the elements of the 1×4 row vector above the 4×4 block [x] 302 are (x−1,0,x−1,1,x−1,2,x−1,3), which for convenience are denoted 310 as {right arrow over (v)}=(v0,v1,v2,v3). The Hadamard transform of {right arrow over (v)} 312 is denoted {right arrow over (V)}=(V0,V1,V2,V3)T.
Compute Frequency Domain Predictors for the Intra Vertical, Horizontal, and DC Prediction Modes
This process may be followed more readily by referring to
First, input scalar index positions (Hpos,Vpos) of the top left pixels of a 4×4 pixel block [x] in a picture that begins with pixels 0,0 (the upper left corner of the picture in the H.264 design specification) and continues to pixel position values m,n. Also input the pixel block [x] 402.
Next, the 4 pixels immediately to the left of the 4×4 pixel block [x] and the 4 pixels immediately above it are denoted 404 as {right arrow over (h)}=(h0,h1, h2, h3)T, {right arrow over (v)}=(v0,v1,v2,v3).
At this point, now calculate the horizontal predictor {right arrow over (H)}=[H0,H1,H2,H3]T, the vertical predictor {right arrow over (V)}=[V0,V1,V2,V3]T, and the steady state (DC) predictor D as follows:
If Hpos≠0 (406) and Vpos≠0 (408), then:
{right arrow over (H)}=[T4]{right arrow over (h)}
{right arrow over (V)}=[T4]{right arrow over (v)}
D=(H0+V0)/2
at (410).
If Hpos=0 (e.g., not Hpos≠0 at 406) and Vpos≠0 (at 412), then:
{right arrow over (H)}=[2^15−1,0,0,0]T
{right arrow over (V)}=[T4]{right arrow over (v)}
D=V0
at (414).
If Hpos≠0 (406) and Vpos=0 (e.g., not Vpos≠0 at 408), then:
{right arrow over (H)}=[T4]{right arrow over (h)}
{right arrow over (V)}=[2^15−1,0,0,0]T
D=H0
at (416).
If Hpos=0 (e.g., not Hpos≠0 at 406) and Vpos=0 (e.g., not Vpos≠0 at 412), then:
{right arrow over (H)}=[2^15−1,0,0,0]T
{right arrow over (V)}=[2^15−1,0,0,0]T
D=128×16
at (418).
Here, it is assumed that each pixel carries 8 bits of information. In particular, the DC predictor D=128×16 appearing in block 418 corresponds to the DC prediction for 8 bits per pixel. The predictor {right arrow over (H)}=[2^15−1,0,0,0]T in 414, 418, and the predictor {right arrow over (V)}=[2^15−1,0,0,0]T in 416, 418, are selected so that they have a sufficiently large intra prediction cost for 8 bits per pixel, and consequently the corresponding prediction mode will not be selected as the minimal cost intra prediction mode in
Regardless of which calculation branch was taken from 410, 414, 416, or 418, next the cost is calculated 420.
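The four branches above can be sketched in Python as follows. This is a minimal sketch, not the encoder's implementation: the function names are hypothetical, [T4] is assumed to be the unnormalised sequency-ordered Hadamard matrix, and the factor of 4 applied to the transformed neighbours is an assumption chosen so that D=(H0+V0)/2 lands on the same scale as the boundary value 128×16 (the normalisation of [T4] cannot be recovered from this text).

```python
# Assumed unnormalised 4 point Hadamard matrix (sequency order).
T4 = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
    [1, -1,  1, -1],
]

SENTINEL = 2 ** 15 - 1  # large first coefficient for an unavailable neighbour

def scaled_hadamard4(s, scale=4):
    """1D Hadamard transform times an assumed scale factor (see lead-in)."""
    return [scale * sum(T4[i][k] * s[k] for k in range(4)) for i in range(4)]

def compute_predictors(hpos, vpos, h, v):
    """Return (H, V, D) for a 4x4 block whose top-left pixel is at (hpos, vpos).

    h: the 4 pixels immediately left of the block (ignored when hpos == 0).
    v: the 4 pixels immediately above the block (ignored when vpos == 0).
    """
    H = scaled_hadamard4(h) if hpos != 0 else [SENTINEL, 0, 0, 0]
    V = scaled_hadamard4(v) if vpos != 0 else [SENTINEL, 0, 0, 0]
    if hpos != 0 and vpos != 0:
        D = (H[0] + V[0]) // 2   # both neighbours available (410)
    elif vpos != 0:
        D = V[0]                 # left neighbour missing (414)
    elif hpos != 0:
        D = H[0]                 # top neighbour missing (416)
    else:
        D = 128 * 16             # no neighbours, 8-bit pixels (418)
    return H, V, D
```

With flat neighbours of value 128 on both sides, this scaling gives D=2048, the same value as the boundary case 128×16.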
Compute Intra Prediction Cost
Refer now to
Cost precursors are then formed 506
Finally, the costs are calculated 508, where the cost of the horizontal prediction is
the cost of the vertical prediction is
and the cost of the DC prediction is CD=|D−X0,0|+Chs+Cvs.
Once the predicted costs are determined, the appropriate intra mode is selected from the group of Horizontal Prediction, Vertical Prediction, and DC Prediction.
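For illustration, the cost step can be sketched in Python as follows. Only CD=|D−X0,0|+Chs+Cvs is legible in this text; the expressions used below for Chs, Cvs, CH, and CV are a reconstruction chosen to be consistent with that formula, not a quotation of the source, and the function name is hypothetical. Here `col0` holds Xi,0 and `row0` holds X0,j, with the predictors assumed to be on the same scale as the block transform.

```python
def intra_costs(col0, row0, H, V, D):
    """Return (CH, CV, CD) from the pruned transform and the predictors.

    Assumed forms (reconstruction, see lead-in):
      Chs = sum of |X[i][0]| for i >= 1
      Cvs = sum of |X[0][j]| for j >= 1
      CH  = sum of |H[i] - X[i][0]| + Cvs
      CV  = sum of |V[j] - X[0][j]| + Chs
      CD  = |D - X[0][0]| + Chs + Cvs   (as given in the text)
    """
    chs = sum(abs(c) for c in col0[1:])
    cvs = sum(abs(c) for c in row0[1:])
    ch = sum(abs(hi - xi) for hi, xi in zip(H, col0)) + cvs
    cv = sum(abs(vj - xj) for vj, xj in zip(V, row0)) + chs
    cd = abs(D - col0[0]) + chs + cvs
    return ch, cv, cd
```

In this form each cost reuses the two precursors, so no further transforms are needed once column zero and row zero are available.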
Minimal Cost Intra Mode Prediction
The intra mode with the minimal cost is selected as the intra prediction mode. In particular:
If CH≦CV and CH≦CD, then select Horizontal Prediction 510;
If CH≦CV and CH>CD, then select DC Prediction 512;
If CH>CV and CV≦CD, then select Vertical Prediction 514; and finally,
If CH>CV and CV>CD, select DC Prediction 516.
The minimal cost prediction among CH, CV, and CD is then output as the appropriate associated predicted intra mode. From this point, the selected intra mode is used within the advanced video encoder to compress the 4×4 block.
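The four comparisons above amount to a two-level decision, sketched here in Python. The function name and the string labels are hypothetical; the tie-breaking follows the ≦ and > conditions as listed.

```python
def select_intra_mode(ch, cv, cd):
    """Pick the minimal cost mode using the comparisons listed above."""
    if ch <= cv:
        # Horizontal beats or ties vertical; compare it with DC (510, 512).
        return "horizontal" if ch <= cd else "dc"
    # Vertical is strictly cheaper than horizontal; compare it with DC (514, 516).
    return "vertical" if cv <= cd else "dc"
```

When all three costs are equal, this ordering selects Horizontal Prediction, since both of its conditions use ≦.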
Discussion
As previously described, in the H.264/AVC video compression standard the 4×4 intra mode decision is made by performing Lagrangian rate-distortion optimization on each of the 9 intra mode possibilities. The intra mode selected is the one that provides the lowest SATD between the original 4×4 input block, and the reconstructed block for a particular intra mode, where SATD is directly computed as the sum of absolute value of the transform of the difference. This process is extremely computationally expensive.
For selecting among the horizontal, vertical, and DC intra modes with the direct method, each SATD of these intra modes is separately computed with a 4×4 Hadamard block transform for a total of three 2D Hadamard transforms.
Each 2D 4×4 Hadamard transform is computationally equivalent to 8 of the 1D 4 point Hadamard transforms when row and column decomposition is used for the 2D transform.
In this process, the SATD is computed as the sum of the absolute values of the differences of the transform coefficients, and only the horizontal, vertical, and DC transform coefficients are computed, using 1D 4 point Hadamard transforms. Thus, this process reduces the number of equivalent 1D Hadamard transforms for the horizontal, vertical, and DC intra modes from 24 to 4 by performing the intra mode prediction in the frequency domain, while providing the same intra mode selection as direct calculations of SATD.
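The equivalence claimed above can be checked numerically with a short Python sketch. It compares the direct method (three 2D transforms, each built from 8 1D transforms) with the frequency domain costs (four 1D transforms) on random 8-bit data. All names are hypothetical, and several choices are assumptions rather than details taken from this text: the unnormalised sequency-ordered [T4], the factor of 4 on the transformed neighbours, the reconstructed forms of CH and CV, and an unrounded DC value so that the identity is exact (an encoder's rounded integer DC prediction could differ slightly).

```python
import random
from fractions import Fraction

T4 = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, -1, 1], [1, -1, 1, -1]]

def had1d(s):
    """1D 4 point Hadamard transform (assumed unnormalised)."""
    return [sum(T4[i][k] * s[k] for k in range(4)) for i in range(4)]

def had2d(b):
    """Full 2D transform T4 * b * T4^T via 8 one-dimensional transforms."""
    rows = [had1d(r) for r in b]                                      # 4 row transforms
    cols = [had1d([rows[i][j] for i in range(4)]) for j in range(4)]  # 4 column transforms
    return [[cols[j][i] for j in range(4)] for i in range(4)]

def direct_costs(x, h, v):
    """SATD of the horizontal, vertical and DC residuals (24 1D transforms)."""
    dc = Fraction(sum(h) + sum(v), 8)    # unrounded DC value (assumption)
    preds = (
        [[h[i]] * 4 for i in range(4)],  # horizontal: each row repeats its left pixel
        [list(v) for _ in range(4)],     # vertical: each row repeats the pixels above
        [[dc] * 4 for _ in range(4)],    # DC: constant block
    )
    costs = []
    for p in preds:
        resid = [[x[i][j] - p[i][j] for j in range(4)] for i in range(4)]
        costs.append(sum(abs(c) for row in had2d(resid) for c in row))
    return costs

def pruned_costs(x, h, v):
    """Frequency domain costs from four 1D transforms."""
    col0 = had1d([sum(r) for r in x])                                 # X[i][0]
    row0 = had1d([sum(x[i][j] for i in range(4)) for j in range(4)])  # X[0][j]
    H = [4 * c for c in had1d(h)]   # factor 4 is an assumption
    V = [4 * c for c in had1d(v)]
    D = (H[0] + V[0]) // 2
    chs = sum(abs(c) for c in col0[1:])
    cvs = sum(abs(c) for c in row0[1:])
    ch = sum(abs(a - b) for a, b in zip(H, col0)) + cvs
    cv = sum(abs(a - b) for a, b in zip(V, row0)) + chs
    cd = abs(D - col0[0]) + chs + cvs
    return [ch, cv, cd]

def offsets_agree(trials=200, seed=0):
    """True if direct and pruned costs differ by one common offset per block."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [[rng.randrange(256) for _ in range(4)] for _ in range(4)]
        h = [rng.randrange(256) for _ in range(4)]
        v = [rng.randrange(256) for _ in range(4)]
        d, p = direct_costs(x, h, v), pruned_costs(x, h, v)
        if len({a - b for a, b in zip(d, p)}) != 1:
            return False
    return True
```

Under these assumptions, the three direct SATD values exceed the three pruned costs by the same amount for a given block (the sum of |Xi,j| over i, j ≥ 1, which no prediction mode affects), so the ordering of the modes, and hence the selected mode, is unchanged.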
Conclusion
Although the description above contains many details, these should not be construed as limiting the scope of the invention but as merely providing illustrations of some of the presently preferred embodiments of this invention. Therefore, it will be appreciated that the scope of the present invention fully encompasses other embodiments which may become obvious to those skilled in the art. In the appended claims, reference to an element in the singular is not intended to mean “one and only one” unless explicitly so stated, but rather “one or more.” All equivalents to the elements of the above-described preferred embodiments that are known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the present claims. Moreover, it is not necessary for a device or method to address each and every problem sought to be solved by the present invention, for it to be encompassed by the present claims. Furthermore, no element, component, or method step in the present disclosure is intended to be dedicated to the public regardless of whether the element, component, or method step is explicitly recited in the claims. No claim element herein is to be construed under the provisions of 35 U.S.C. 112, sixth paragraph, unless the element is expressly recited using the phrase “means for.”
Patent | Priority | Assignee | Title |
8406286, | Jul 31 2007 | NEW FOUNDER HOLDINGS DEVELOPMENT LIMITED LIABILITY COMPANY; BEIJING FOUNDER ELECTRONICS CO , LTD ; Peking University | Method and device for selecting best mode of intra predictive coding for video coding |
8787449, | Apr 09 2010 | Sony Corporation | Optimal separable adaptive loop filter |
Patent | Priority | Assignee | Title |
7027510, | Mar 29 2002 | Sony Corporation; Sony Electronics Inc.; Sony Electronics INC | Method of estimating backward motion vectors within a video sequence |
20020159644, | |||
20060209948, | |||
20060251330, | |||
20070009027, | |||
20070053433, | |||
20070171978, | |||
EP1727370, |
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Mar 03 2008 | AUYEUNG, CHEUNG | Sony Electronics, INC | CORRECTIVE ASSIGNMENT | 020883 | /0220 | |
Mar 03 2008 | AUYEUNG, CHEUNG | Sony Corporation | CORRECTIVE ASSIGNMENT | 020883 | /0220 | |
Mar 17 2008 | Sony Corporation | (assignment on the face of the patent) | / | |||
Mar 17 2008 | Sony Electronics Inc. | (assignment on the face of the patent) | / |
Date | Maintenance Fee Events |
Mar 07 2012 | ASPN: Payor Number Assigned. |
Oct 19 2015 | M1551: Payment of Maintenance Fee, 4th Year, Large Entity. |
Oct 17 2019 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
Dec 04 2023 | REM: Maintenance Fee Reminder Mailed. |
May 20 2024 | EXP: Patent Expired for Failure to Pay Maintenance Fees. |