Provided are a method for image prediction of a multi-view video codec capable of improving coding efficiency, and a computer readable recording medium therefor. The method for image prediction of a multi-view video codec includes partitioning an image into a plurality of base blocks, acquiring information of reference images which are temporally different, acquiring information of reference images which have different views, and predicting a target block based on the acquired information. Accordingly, an image that is most similar to an image of a view to be currently compressed is generated using multiple images of different views, so that coding efficiency can be improved.
|
18. A method for image decoding of a multi-view video codec, which decodes multi-view images comprising a target-view image and at least one different-view image which is captured from a different viewpoint from a viewpoint capturing the target-view image, the target-view image and the different-view image being captured at a same time, the method comprising:
partitioning the target-view image into a plurality of base blocks to be currently decoded;
predicting a base block among the base blocks based on the different-view image and a different-time image which is captured at a time different from a time when the target-view image is captured, the target-view image and the different-time image being captured from a same viewpoint;
decoding a difference between a prediction result of the base block and the base block; and
reconstructing the base block based on the prediction result and the difference,
wherein the base block is predicted by using the different-time image and the different-view image as reference images, and the predicting the base block comprises performing weighted prediction by applying at least one weighting value to the reference images.
12. A method for image prediction of a multi-view video codec, which encodes multi-view images comprising a target-view image and at least one different-view image which is captured from a different viewpoint from a viewpoint capturing the target-view image, the target-view image and the different-view image being captured at a same time, the method comprising:
partitioning the target-view image into a plurality of base blocks to be currently encoded;
predicting a base block among the base blocks based on the different-view image and a different-time image which is captured at a time different from a time when the target-view image is captured, the target-view image and the different-time image being captured from a same viewpoint;
acquiring a difference between a prediction result of the base block and the base block, and encoding the difference; and
generating a bitstream comprising the encoded difference,
wherein the base block is predicted by using the different-time image and the different-view image as reference images, and the predicting the base block comprises performing weighted prediction by applying at least one weighting value to the reference images.
1. A method for image prediction of a multi-view video codec, which encodes multi-view images comprising a target-view image and at least one different-view image which is captured from a different viewpoint from a viewpoint capturing the target-view image, the target-view image and the different-view image being captured at a same time, the method comprising:
partitioning the target-view image into a plurality of base blocks to be currently encoded;
predicting a base block among the base blocks based on the different-view image and a different-time image which is captured at a time different from a time when the target-view image is captured, the target-view image and the different-time image being captured from a same viewpoint;
acquiring a difference between a prediction result of the base block and the base block, and encoding the difference; and
generating a bitstream comprising the encoded difference,
wherein the predicting the base block comprises using at least one of a first residual adjusted using a second residual and the second residual adjusted using the first residual, and
wherein the first residual is a difference between the target-view image and the different-time image, and the second residual is a difference between the target-view image and the different-view image.
11. A non-transitory computer-readable recording medium storing a program for executing a method for image prediction of a multi-view video codec, which encodes multi-view images comprising a target-view image and at least one different-view image which is captured from a different viewpoint from a viewpoint capturing the target-view image, the target-view image and the different-view image being captured at a same time, the method comprising:
partitioning the target-view image into a plurality of base blocks to be currently encoded;
predicting a base block among the base blocks based on the different-view image and a different-time image which is captured at a time different from a time when the target-view image is captured, the target-view image and the different-time image being captured from a same viewpoint;
acquiring a difference between a prediction result of the base block and the base block, and encoding the difference; and
generating a bitstream comprising the encoded difference,
wherein the predicting the base block comprises using at least one of a first residual adjusted using a second residual and a second residual adjusted using the first residual, and
wherein the first residual is a difference between the target-view image and the different-time image, and the second residual is a difference between the target-view image and the different-view image.
2. The method of
3. The method of
4. The method of
5. The method of
6. The method of
7. The method of
8. The method of
9. The method of
10. The method of
13. The method of
14. The method of
15. The method of
16. The method of
17. The method of
19. The method of claim 18, wherein the predicting the base block comprises decoding the weighting value from a bitstream.
20. The method of claim 18, wherein the predicting the base block comprises decoding information about the weighted prediction from a bitstream.
|
where Pred denotes a reference image of a specific size, which is most similar to a target block 210 of
According to the embodiment of the present invention, to minimize this residual information, a method of using residual information present in an image having a different view is proposed, thereby reducing the residual information being currently encoded/decoded. A video codec can be implemented such that Pred is properly selected to minimize Res. As Pred in the multi-view codec, an image that is proper in terms of view or time may be used. Pred may be defined by the following Equation (2):
Pred=F(Pred′+Res′) (2)
That is, Pred is obtained by applying a proper filter, e.g., an LPF such as a deblocking filter in H.264, to a value obtained by adding a residual to a certain reference image.
When Equation (2) is applied to Equation (1), the following Equation (3) can be obtained:
Recon=F(Pred′+Res′)+Res″ (3)
where Pred′ and Res′ are a reference image and a residual of an image that the target block 210 references, respectively. A combination of Pred′ and Res′ that are properly induced is used as a reference image of a current image, i.e., a target image, and residual information therebetween is minimized.
If Equation (3) is rearranged with respect to the terms Pred′ and Res′ by distributing F, F(Pred′) is represented by Pred, and Res is represented by F(Res′)+Res″. Thus, a gain is obtained by transmitting Res″ instead of Res as in the related art.
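The rearrangement can be written out step by step. The sketch below assumes that F is linear (as a fixed FIR filter would be); that assumption is not stated in the text, and for a non-linear filter such as an adaptive deblocking filter the second line would hold only approximately.

```latex
\begin{aligned}
\mathrm{Recon} &= F(\mathrm{Pred}' + \mathrm{Res}') + \mathrm{Res}'' \\
               &= F(\mathrm{Pred}') + F(\mathrm{Res}') + \mathrm{Res}'' \\
               &= \mathrm{Pred} + \mathrm{Res},
\qquad \mathrm{Pred} = F(\mathrm{Pred}'), \quad \mathrm{Res} = F(\mathrm{Res}') + \mathrm{Res}''
\end{aligned}
```

Only Res″ then remains to be transmitted, since F(Res′) is derived from a residual that is already available from the other domain.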
If Pred acquires a reference image in a temporal domain, the term Res′ is obtained from a view domain, whereas if Pred acquires a reference image in a view domain, the term Res′ is obtained from a temporal domain. F( ), which is a filter suitable for the obtained term Res′, may be additionally used. For example, the simplest filter having a filter coefficient {½, ½} may be used, or a filter such as 1/20{1, −4, 20, 20, −4, 1} may be used.
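Equations (2) and (3) can be illustrated with a minimal Python sketch. It assumes NumPy, applies the {½, ½} filter horizontally with edge replication, and uses the hypothetical names `smooth_rows`, `build_predictor` and `reconstruct`; none of these choices is specified in the text, so this is an illustration rather than the codec's actual filter.

```python
import numpy as np


def smooth_rows(block, taps=(0.5, 0.5)):
    """Apply a 1-D FIR filter along each row, replicating edge samples.

    Stands in for F( ); the filtering direction and edge handling are
    assumptions made for this sketch.
    """
    taps = np.asarray(taps, dtype=np.float64)
    pad_left = (len(taps) - 1) // 2
    pad_right = len(taps) - 1 - pad_left
    padded = np.pad(block.astype(np.float64),
                    ((0, 0), (pad_left, pad_right)), mode="edge")
    # "valid" convolution on the padded row returns the original row width.
    return np.stack([np.convolve(row, taps, mode="valid") for row in padded])


def build_predictor(pred_ref, res_ref, taps=(0.5, 0.5)):
    """Equation (2): Pred = F(Pred' + Res')."""
    return smooth_rows(pred_ref + res_ref, taps)


def reconstruct(pred_ref, res_ref, res_transmitted, taps=(0.5, 0.5)):
    """Equation (3): Recon = F(Pred' + Res') + Res''."""
    return build_predictor(pred_ref, res_ref, taps) + res_transmitted


# Toy 4x4 blocks: Pred' from one domain, Res' from the other, Res'' transmitted.
pred_ref = np.full((4, 4), 10.0)
res_ref = np.full((4, 4), 2.0)
res_transmitted = np.full((4, 4), 1.0)
recon = reconstruct(pred_ref, res_ref, res_transmitted)
```

With constant inputs the filter leaves the sum unchanged, so every sample of `recon` is 10 + 2 + 1.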
To report the application of the above techniques, information of the following exemplary format may be used:
{“NewPred is equal to 1” specifies that the current slice uses the new view prediction method according to the present invention. “NewPred is equal to 0” specifies that the current slice does not use the new view prediction method according to the present invention. When NewPred is not present, NewPred is inferred to be 0. It can be located in the slice layer or a higher layer (SPS, PPS, slice header, slice header extension, SEI).
“ResPredFlag is equal to 1” specifies that the predictor of the current macroblock is derived as follows:
If the current macroblock is coded in inter mode (temporal direction), the residual signal of the neighbor view(s) and the reference block of the current macroblock(s) are used as the predictor (a deblocking filter can be applied to the predictor).
Otherwise (the current macroblock is coded in inter-view mode (view direction)), the residual signal of the collocated block(s) and the reference block of the current macroblock(s) are used as the predictor (a deblocking filter can be applied to the predictor),
taking the global disparity into account if possible. “ResPredFlag is equal to 0” specifies that no residual signal is predicted. When ViewPredFlag is not present, ViewPredFlag is inferred to be 0.}.
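The flag semantics above can be sketched as a small decision function in Python. Two points are assumptions rather than statements from the text: that NewPred gates ResPredFlag, and that an absent ResPredFlag is treated as 0. `MacroblockSyntax` and `residual_source` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MacroblockSyntax:
    mode: str                            # "inter" (temporal) or "inter_view"
    res_pred_flag: Optional[int] = None  # None means absent from the bitstream


def residual_source(new_pred: Optional[int], mb: MacroblockSyntax) -> Optional[str]:
    """Return where the residual used in the predictor is taken from.

    None means no residual prediction is applied to this macroblock.
    """
    # Absent flags are inferred to be 0 (assumed for ResPredFlag).
    new_pred = 0 if new_pred is None else new_pred
    flag = 0 if mb.res_pred_flag is None else mb.res_pred_flag
    if new_pred == 0 or flag == 0:  # gating by NewPred is an assumption
        return None
    if mb.mode == "inter":
        # Temporal reference block plus residual of the neighbor view(s).
        return "neighbor_view"
    if mb.mode == "inter_view":
        # View reference block plus residual of the collocated block(s).
        return "collocated_block"
    raise ValueError(f"unsupported macroblock mode: {mb.mode}")
```

For example, `residual_source(1, MacroblockSyntax("inter", 1))` selects the neighbor-view residual, while any macroblock in a slice without NewPred gets no residual prediction.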
According to another embodiment of the present invention, a weighting value or a weighted prediction value may be established. For example, if an image has a certain temporal characteristic such as image fading in/out, a weighting value of an image of a different view may be used. The weighting value means information indicating an extent to which image brightness or chrominance signals change over time as illustrated in
In general, even if images have different views, they may have similar temporal characteristics. That is, in the case where images are gradually brightened, an encoder may send a proper weighting value to a decoder so that the decoder can collectively apply the weighting value to the images of the different views.
If a different light source is used for each view, it may be difficult to collectively apply the weighting value. In this case, a new weighting value must be used.
As the simplest implementation method, weighting information may be defined for each view. However, this method may be inefficient because multiple redundant information pieces may be transmitted.
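The two cases above, one weighting value applied collectively and a separate value for a view with a different light source, can be sketched in Python as follows. The sketch assumes NumPy, a purely multiplicative weight with no offset, and 8-bit samples; `apply_weight` and `weighted_references` are hypothetical names, and the view labels are used only as dictionary keys.

```python
import numpy as np


def apply_weight(reference, weight, bit_depth=8):
    """Scale a reference block by a weighting value and clip to sample range."""
    max_val = (1 << bit_depth) - 1
    scaled = np.rint(reference.astype(np.float64) * weight)
    dtype = np.uint8 if bit_depth <= 8 else np.uint16
    return np.clip(scaled, 0, max_val).astype(dtype)


def weighted_references(references, shared_weight, per_view_weights=None):
    """Apply one shared weight to every view unless a view has its own weight."""
    per_view_weights = per_view_weights or {}
    return {
        view: apply_weight(ref, per_view_weights.get(view, shared_weight))
        for view, ref in references.items()
    }


refs = {
    "VIEW0": np.full((2, 2), 100, dtype=np.uint8),
    "VIEW1": np.full((2, 2), 200, dtype=np.uint8),
}
# VIEW1 is assumed to have a different light source, so it gets its own weight.
weighted = weighted_references(refs, shared_weight=1.1, per_view_weights={"VIEW1": 1.5})
```

Sending only `shared_weight` corresponds to the collective case; every entry added to `per_view_weights` is additional side information, which is the redundancy that per-view signalling introduces.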
According to the embodiment of the present invention, in order to reduce redundant information and overcome a limitation caused by using different light sources, a weighting value of a specific view such as a BaseView or VIEW0 of
For example, as illustrated in
The weighting information is inserted in a bitstream to prevent a mismatch in operation between an encoder and a decoder. The weighting information may be contained in a slice header, a slice header extension or a higher layer such as PPS, PPSE, SPS, SPSE or SEI.
To report the application of the above techniques, information of the following exemplary format may be used:
{“baseview_pred_weight_table_flag is equal to 1” specifies that the variables for weighted prediction are inferred. When baseview_pred_weight_table_flag is not present, it shall be inferred as follows:
If baseViewFlag (which indicates whether the current view is the base view) is equal to 1, base_pred_weight_table_flag shall be inferred to be equal to 0.
Otherwise, baseview_pred_weight_table_flag shall be inferred to be equal to 1.}.
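The inference rule above can be sketched as a short Python function. The function name and the use of `None` to represent a flag that is absent from the bitstream are assumptions made for illustration.

```python
from typing import Optional


def infer_baseview_pred_weight_table_flag(signalled: Optional[int],
                                          base_view_flag: int) -> int:
    """Return the effective baseview_pred_weight_table_flag.

    `signalled` is the parsed value, or None when the flag is not present.
    """
    if signalled is not None:
        return signalled
    # Not present: the base view infers 0, every other view infers 1.
    return 0 if base_view_flag == 1 else 1
```

Under this rule a non-base view that omits the flag falls back to inferring its weighted-prediction variables, so the flag only needs to be sent when a view deviates from that default.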
According to the method for image prediction of a multi-view video codec and the computer readable recording medium therefor according to the embodiments of the present invention, an image that is most similar to an image of a view to be currently compressed is generated by using inter-view prediction, i.e., using images of multiple different views, thereby improving coding efficiency.
The methods for image prediction of a multi-view video codec according to the exemplary embodiments can be realized as programs and stored in a computer-readable recording medium from which the programs can be executed. Examples of the computer-readable recording medium include CD-ROM, RAM, ROM, floppy disks, hard disks, magneto-optical disks and the like.
As the present invention may be embodied in several forms without departing from the spirit or essential characteristics thereof, it should also be understood that the above-described embodiments are not limited by any of the details of the foregoing description, unless otherwise specified, but rather should be construed broadly within its spirit and scope as defined in the appended claims, and therefore all changes and modifications that fall within the metes and bounds of the claims, or equivalents of such metes and bounds are therefore intended to be embraced by the appended claims.
Kim, Je-Woo, Kim, Yong-Hwan, Park, Ji-Ho, Choi, Byeong-Ho, Shin, Hwa-Seon
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Sep 06 2018 | Korea Electronics Technology Institute | (assignment on the face of the patent) | / |
Date | Maintenance Fee Events |
Sep 06 2018 | BIG: Entity status set to Undiscounted (note the period is included in the code). |
Sep 07 2018 | SMAL: Entity status set to Small. |
May 26 2020 | M2551: Payment of Maintenance Fee, 4th Yr, Small Entity. |
May 26 2020 | M2554: Surcharge for late Payment, Small Entity. |
Dec 27 2023 | M2552: Payment of Maintenance Fee, 8th Yr, Small Entity. |