Method for image prediction of multi-view video codec and computer readable recording medium therefor

RE47897

Provided are a method for image prediction of a multi-view video codec capable of improving coding efficiency, and a computer readable recording medium therefor. The method for image prediction of a multi-view video codec includes partitioning an image to a plurality of base blocks, acquiring information of reference images which are temporally different, acquiring information of reference images which have different views, and predicting a target block based on the acquired information. Accordingly, an image that is most similar to an image of a view to be currently compressed is generated using multiple images of different views, so that coding efficiency can be improved.

Patent RE47897
Priority Jan 11 2007
Filed Sep 06 2018
Issued Mar 03 2020
Expiry Jan 11 2028
Inventors Kim, Je-Woo
Assg.orig Korea Elec…
Assg.curr Korea Elec…
Entity Small
Referenced by 1
References 30
Maint.: currently ok

CROSS-REFERENCE TO R…

18. A method for image decoding of a multi-view video codec, which decodes multi-view images comprising a target-view image and at least one different-view image which is captured from a different viewpoint from a viewpoint capturing the target-view image, the target-view image and the different-view image being captured at a same time, the method comprising:

partitioning the target-view image into a plurality of base blocks to be currently decoded;

predicting a base block among the base blocks based on the different-view image and a different-time image which is captured at a time different from a time when the target-view image is captured, the target-view image and the different-time image being captured from a same viewpoint;

decoding a difference between a prediction result of the base block and the base block; and

reconstructing the base block based on the prediction result and the difference,

wherein the base block is predicted by using the different-time image and the different-view image as reference images, and the predicting the base block comprises performing weighted prediction by applying at least one weighting value to the reference images.

12. A method for image prediction of a multi-view video codec, which encodes multi-view images comprising a target-view image and at least one different-view image which is captured from a different viewpoint from a viewpoint capturing the target-view image, the target-view image and the different-view image being captured at a same time, the method comprising:

partitioning the target-view image into a plurality of base blocks to be currently encoded;

predicting a base block among the base blocks based on the different-view image and a different-time image which is captured at a time different from a time when the target-view image is captured, the target-view image and the different-time image being captured from a same viewpoint;

acquiring a difference between a prediction result of the base block and the base block, and encoding the difference; and

generating a bitstream comprising the encoded difference,

wherein the base block is predicted by using the different-time image and the different-view image as reference images, and the predicting the base block comprises performing weighted prediction by applying at least one weighting value to the reference images.

1. A method for image prediction of a multi-view video codec, which encodes multi-view images comprising a target-view image and at least one different-view image which is captured from a different viewpoint from a viewpoint capturing the target-view image, the target-view image and the different-view image being captured at a same time, the method comprising:

partitioning the target-view image into a plurality of base blocks to be currently encoded;

predicting a base block among the base blocks based on the different-view image and a different-time image which is captured at a time different from a time when the target-view image is captured, the target-view image and the different-time image being captured from a same viewpoint;

acquiring a difference between a prediction result of the base block and the base block, and encoding the difference; and

generating a bitstream comprising the encoded difference,

wherein the predicting the base block comprises using at least one of a first residual adjusted using a second residual and the second residual adjusted using the first residual, and

wherein the first residual is a difference between the target-view image and the different-time image, and the second residual is a difference between the target-view image and the different-view image.

11. A non-transitory computer-readable recording medium storing a program for executing a method for image prediction of a multi-view video codec, which encodes multi-view images comprising a target-view image and at least one different-view image which is captured from a different viewpoint from a viewpoint capturing the target-view image, the target-view image and the different-view image being captured at a same time, the method comprising:

partitioning the target-view image into a plurality of base blocks to be currently encoded;

predicting a base block among the base blocks based on the different-view image and a different-time image which is captured at a time different from a time when the target-view image is captured, the target-view image and the different-time image being captured from a same viewpoint;

acquiring a difference between a prediction result of the base block and the base block, and encoding the difference; and

generating a bitstream comprising the encoded difference,

wherein the predicting the base block comprises using at least one of a first residual adjusted using a second residual and a second residual adjusted using the first residual, and

2. The method of claim 1, wherein information about the first residual and the second residual is contained in a macroblock layer or a higher layer than the macroblock layer.

3. The method of claim 2, wherein the higher layer than the macroblock layer is a slice header extension (SHE), a picture parameter set extension (PPSE), or a sequence parameter set extension (SPSE).

4. The method of claim 1, wherein the adjusted first residual is a result of adding the second residual to the first residual or subtracting the second residual from the first residual, and the adjusted second residual is a result of adding the first residual to the second residual or subtracting the first residual from the second residual.

5. The method of claim 1, wherein the predicting the base block based on the different-view image comprises performing weighted prediction by applying a weighting value to at least one of the target-view image and the different-view image.

6. The method of claim 5, wherein the generating the bitstream comprises including the weighting value in the bitstream.

7. The method of claim 5, wherein in the predicting the base block, a same weighting value is applied to both the target-view image and the different-view image.

8. The method of claim 5, wherein in the predicting the base block, different weighting values are applied to the target-view image and the different-view image, respectively.

9. The method of claim 5, wherein the generating the bitstream comprises including information about the weighted prediction in the bitstream.

10. The method of claim 1, wherein the different-time image is an image which is the most similar to the target-view image among a plurality of different-time images with respect to the target-view image, and the different-view image is an image which is the most similar to the target-view image among a plurality of different-view images with respect to the target-view image.

13. The method of claim 12, wherein the generating the bitstream comprises including the weighting value in the bitstream.

14. The method of claim 12, wherein in case where the base block is predicted by using the different-view image, a same weighting value is applied to both the target-view image and the different-view image.

15. The method of claim 12, wherein in case where the base block is predicted by using the different-view image, different weighting values are applied to the target-view image and the different-view image, respectively.

16. The method of claim 12, wherein in case where the base block is predicted by using the different-view image, the generating the bitstream comprises including information about the weighted prediction in the bitstream.

17. The method of claim 12, wherein the different-time image is an image which is the most similar to the target-view image among a plurality of different-time images with respect to the target-view image, and the different-view image is an image which is the most similar to the target-view image among a plurality of different-view images with respect to the target-view image.

19. The method of claim 18, wherein the predicting the base block comprises decoding the weighting value from a bitstream.

20. The method of claim 18, wherein the predicting the base block comprises decoding information about the weighted prediction from a bitstream.

CROSS-REFERENCE TO RELATED APPLICATIONS

where Pred denotes a reference image of a specific size, which is most similar to a target block 210 of FIG. 2 in a temporal/spatial domain and can be represented by motion information, and Res denotes residual information indicating a difference between a reference image and a target block 210.

According to the embodiment of the present invention, to minimize this residual information, a method of using residual information present in an image having a different view is proposed, thereby reducing the residual information being currently encoded/decoded. A video codec can be implemented such that Pred is properly selected to minimize Res. As Pred in the multi-view codec, an image that is proper in terms of view or time may be used. Pred may be defined by the following Equation (2):
Pred=F(Pred′+Res′) (2)

That is, Pred is obtained by applying a proper filter, e.g., an LPF such as a deblocking filter in H.264, to a value obtained by adding a residual to a certain reference image.

When Equation (2) is applied to Equation (1), the following Equation (3) can be obtained:
Recon=F(Pred′+Res′)+Res″ (3)
where Pred′ and Res′ are a reference image and a residual of an image that the target block 210 references, respectively. A combination of Pred′ and Res′ that are properly induced is used as a reference image of a current image, i.e., a target image, and residual information therebetween is minimized.

If Equation (3) is rearranged with respect to the terms Pred′ and Res′ by distributing F, F(Pred′) is represented by Pred, and Res is represented by F(Res′)+Res″. Thus, a gain is obtained by transmitting Res″ instead of Res as in the related art.
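The rearrangement can be written out step by step. This is a sketch under two assumptions that are not quoted from the text: that Equation (1) has the form Recon = Pred + Res, and that F is linear, which holds for a fixed-coefficient FIR filter but only approximately for an adaptive filter such as a deblocking filter.

```latex
\begin{align*}
\mathrm{Recon} &= F(\mathrm{Pred}' + \mathrm{Res}') + \mathrm{Res}'' \\
               &= F(\mathrm{Pred}') + F(\mathrm{Res}') + \mathrm{Res}'' \\
               &= \mathrm{Pred} + \mathrm{Res},
\qquad \mathrm{Pred} = F(\mathrm{Pred}'), \quad \mathrm{Res} = F(\mathrm{Res}') + \mathrm{Res}''
\end{align*}
```

Under these assumptions, the part F(Res′) of the residual is already available from a previously coded image, so only Res″ remains to be transmitted.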

If Pred acquires a reference image in a temporal domain, the term Res′ is obtained from a view domain, whereas if Pred acquires a reference image in a view domain, the term Res′ is obtained from a temporal domain. F( ), which is a filter suitable for the obtained term Res′, may be additionally used. For example, the simplest filter having a filter coefficient {½, ½} may be used, or a filter such as 1/20{1, −4, 20, 20, −4, 1} may be used.
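As an illustration of Equation (3) with the {½, ½} filter, the following minimal Python sketch reconstructs one row of samples. The function names, the sample values, and the way the two taps are aligned and clamped at the border are assumptions made for this example and are not taken from the patent.

```python
def smooth(samples, taps=(0.5, 0.5)):
    """Apply a short FIR filter along a row, repeating the last sample at the border."""
    out = []
    n = len(samples)
    for i in range(n):
        acc = 0.0
        for k, tap in enumerate(taps):
            j = min(i + k, n - 1)  # clamp the index at the right border (assumed policy)
            acc += tap * samples[j]
        out.append(acc)
    return out


def reconstruct(ref_block, borrowed_residual, own_residual, taps=(0.5, 0.5)):
    """Compute Recon = F(Pred' + Res') + Res'' for one row of samples."""
    combined = [p + r for p, r in zip(ref_block, borrowed_residual)]
    pred = smooth(combined, taps)
    return [p + r for p, r in zip(pred, own_residual)]


# Hypothetical sample values chosen only to make the arithmetic easy to follow.
ref_block = [100, 102, 104, 106]    # Pred': reference block in the temporal domain
borrowed_residual = [2, 0, -2, 0]   # Res': residual borrowed from the view domain
own_residual = [1, 1, 0, -1]        # Res'': the residual that is actually transmitted
recon = reconstruct(ref_block, borrowed_residual, own_residual)
```

Here the filter is applied after the borrowed residual is added to the reference block, which matches the order of operations in Equation (2).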

To report the application of the above techniques, information of the following exemplary format may be used:

{“NewPred is equal to 1” specifies that the current slice uses new view prediction method according to the present invention. “NewPred is equal to 0” specifies that the current slice does not use new view prediction method according to the present invention. When NewPred is not present, NewPred is inferred as 0. It can be located in slice layer or higher layer (SPS, PPS, Slice header, slice header extension, SEI).

“ResPredFlag is equal to 1” specifies that the predictor of the current macroblock is derived as follows:

If the current macroblock is coded in inter mode (temporal direction), the residual signal of the neighbor view(s) and the reference block of the current macroblock(s) are used as the predictor (a deblocking filter can be applied to the predictor).

Otherwise (the current macroblock is coded in inter-view mode (view direction)), the residual signal of the collocated block(s) and the reference block of the current macroblock(s) are used as the predictor (a deblocking filter can be applied to the predictor),

if possible, taking the global disparity into account. “ResPredFlag is equal to 0” specifies that no residual signal is predicted. When ViewPredFlag is not present, ViewPredFlag is inferred as 0.}.
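The flag semantics above can be sketched as a small decision function in Python. The dictionary representation of the parsed syntax, the function names, and the returned labels are hypothetical. The sketch also assumes that ResPredFlag is consulted only when NewPred is 1 and that an absent ResPredFlag defaults to 0; the quoted text states the default for NewPred and names ViewPredFlag in the second inference rule.

```python
def infer_flag(syntax, name):
    """Return the flag value, treating an absent flag as 0."""
    return syntax.get(name, 0)


def residual_source(slice_syntax, mb_syntax, mb_mode):
    """Choose where the borrowed residual comes from for one macroblock.

    mb_mode is "inter" (temporal direction) or "inter_view" (view direction).
    Returns None when residual prediction is not used.
    """
    if infer_flag(slice_syntax, "NewPred") == 0:
        return None  # the slice does not use the new prediction method
    if infer_flag(mb_syntax, "ResPredFlag") == 0:
        return None  # no residual signal is predicted for this macroblock
    if mb_mode == "inter":
        return "neighbor_view"      # residual taken from the neighbor view(s)
    if mb_mode == "inter_view":
        return "collocated_block"   # residual taken from the collocated block(s)
    raise ValueError("unknown macroblock mode: " + mb_mode)


source = residual_source({"NewPred": 1}, {"ResPredFlag": 1}, "inter")
```

With both flags set, a macroblock coded in the temporal direction borrows its residual from the view direction, and vice versa.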

According to another embodiment of the present invention, a weighting value or a weighted prediction value may be established. For example, if an image has a certain temporal characteristic such as image fading in/out, a weighting value of an image of a different view may be used. The weighting value means information indicating an extent to which image brightness or chrominance signals change over time as illustrated in FIG. 4. FIG. 4 illustrates a weighting-value reference model according to the embodiment of the present invention.

In general, even if images have different views, they may have similar temporal characteristics. That is, in the case where images are gradually brightened, an encoder may send a proper weighting value to a decoder so that the decoder can collectively apply the weighting value to the images of the different views.
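A minimal Python sketch of applying one transmitted weighting value collectively to the reference samples of several views might look as follows. The 8-bit clipping range, the multiplicative form of the weighting, and all sample values are assumptions made for illustration; the patent does not specify the arithmetic.

```python
def weighted_prediction(ref_samples, weight, max_value=255):
    """Scale reference samples by a weight and clip to the sample range."""
    out = []
    for s in ref_samples:
        v = round(weight * s)
        out.append(max(0, min(max_value, v)))
    return out


# Hypothetical reference samples for two views captured at the same time.
references_by_view = {
    "VIEW0": [100, 120, 140],
    "VIEW1": [90, 110, 250],
}
fade_weight = 1.5  # single weighting value signalled by the encoder (assumed value)

# The decoder applies the same weight to every view.
predictions = {
    view: weighted_prediction(ref, fade_weight)
    for view, ref in references_by_view.items()
}
```

Because the weight is sent once and reused, the per-view signalling cost is avoided whenever the views brighten or darken together.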

If a different light source is used for each view, it may be difficult to collectively apply the weighting value. In this case, a new weighting value must be used.

As the simplest implementation method, weighting information may be defined for each view. However, this method may be inefficient because multiple redundant information pieces may be transmitted.

According to the embodiment of the present invention, in order to reduce redundant information and overcome a limitation caused by using different light sources, a weighting value of a specific view such as a BaseView or VIEW0 of FIG. 4 is shared, and information reporting whether weighting values of different views are used as they are (hereinafter, referred to as weighting information) is used.

For example, as illustrated in FIG. 4, images of VIEW1 may contain weighting information reporting the use of a weighting value of BaseView (VIEW0), and images of VIEW2 may contain weighting information reporting the use of their own weighting values without using the weighting value of BaseView.
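The sharing scheme in this example can be sketched in Python as a lookup that resolves the effective weighting value per view. The field names `use_base_weight` and `weight`, the function name, and the numeric values are hypothetical and only illustrate the idea.

```python
def resolve_weights(views, base_view="VIEW0"):
    """Return the effective weighting value for every view.

    A non-base view either reuses the base view's weight or carries its own,
    depending on its weighting information.
    """
    base_weight = views[base_view]["weight"]
    resolved = {}
    for name, info in views.items():
        if name != base_view and info.get("use_base_weight", False):
            resolved[name] = base_weight      # shared weight, nothing extra transmitted
        else:
            resolved[name] = info["weight"]   # the view's own weight
    return resolved


views = {
    "VIEW0": {"weight": 1.5},                             # base view
    "VIEW1": {"use_base_weight": True},                   # reuses the base view's weight
    "VIEW2": {"use_base_weight": False, "weight": 0.75},  # e.g. a different light source
}
effective = resolve_weights(views)
```

Only VIEW2 needs to carry a weighting value of its own in this configuration.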

The weighting information is inserted in the bitstream to prevent a mismatch between the encoder and the decoder. The weighting information may be contained in a slice header, a slice header extension or a higher layer such as PPS, PPSE, SPS, SPSE or SEI.

To report the application of the above techniques, information of the following exemplary format may be used:

{“baseview_pred_weight_table_flag is equal to 1” specifies that the variables for weighted prediction are inferred. When baseview_pred_weight_table_flag is not present, it shall be inferred as follows:

If baseViewFlag (which indicates whether the current view is the base view) is equal to 1, baseview_pred_weight_table_flag shall be inferred to be equal to 0.

Otherwise, baseview_pred_weight_table_flag shall be inferred to be equal to 1.}.
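The inference rule can be sketched in Python as follows. The dictionary standing in for the parsed header and the function name are hypothetical; the sketch assumes both branches of the rule refer to the same flag, baseview_pred_weight_table_flag.

```python
def infer_baseview_pred_weight_table_flag(syntax, base_view_flag):
    """Return the flag if it was parsed; otherwise apply the inference rule."""
    if "baseview_pred_weight_table_flag" in syntax:
        return syntax["baseview_pred_weight_table_flag"]
    # Flag absent: the base view has no other view to take weights from,
    # so it is inferred as 0; any other view is inferred as 1.
    return 0 if base_view_flag == 1 else 1


flag_for_base_view = infer_baseview_pred_weight_table_flag({}, base_view_flag=1)
flag_for_other_view = infer_baseview_pred_weight_table_flag({}, base_view_flag=0)
```

An explicitly signalled value always takes precedence over the inferred one in this sketch.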

In the method for image prediction of a multi-view video codec and the computer readable recording medium therefor according to the embodiments of the present invention, an image that is most similar to an image of a view to be currently compressed is generated by using inter-view prediction, i.e., using images of multiple different views, thereby improving coding efficiency.

The methods for image prediction of a multi-view video codec according to the exemplary embodiments can be realized as programs and stored in a computer-readable recording medium that can execute the programs. Examples of the computer-readable recording medium include CD-ROM, RAM, ROM, floppy disks, hard disks, magneto-optical disks and the like.

As the present invention may be embodied in several forms without departing from the spirit or essential characteristics thereof, it should also be understood that the above-described embodiments are not limited by any of the details of the foregoing description, unless otherwise specified, but rather should be construed broadly within its spirit and scope as defined in the appended claims, and therefore all changes and modifications that fall within the metes and bounds of the claims, or equivalents of such metes and bounds are therefore intended to be embraced by the appended claims.

INVENTORS:

Kim, Je-Woo, Kim, Yong-Hwan, Park, Ji-Ho, Choi, Byeong-Ho, Shin, Hwa-Seon

THIS PATENT IS REFERENCED BY THESE PATENTS:

Patent	Priority	Assignee	Title
11558597,	Aug 13 2018	LG Electronics Inc	Method for transmitting video, apparatus for transmitting video, method for receiving video, and apparatus for receiving video

ASSIGNMENT RECORDS

Executed on	Assignor	Assignee	Conveyance	Frame	Reel	Doc
Sep 06 2018		Korea Electronics Technology Institute	(assignment on the face of the patent)

MAINTENANCE FEES AND DATES:

Date	Maintenance Fee Events
Sep 06 2018	BIG: Entity status set to Undiscounted (note the period is included in the code).
Sep 07 2018	SMAL: Entity status set to Small.
May 26 2020	M2551: Payment of Maintenance Fee, 4th Yr, Small Entity.
May 26 2020	M2554: Surcharge for late Payment, Small Entity.
Dec 27 2023	M2552: Payment of Maintenance Fee, 8th Yr, Small Entity.

Date	Maintenance Schedule
Mar 03 2023	4 years fee payment window open
Sep 03 2023	6 months grace period start (w surcharge)
Mar 03 2024	patent expiry (for year 4)
Mar 03 2026	2 years to revive unintentionally abandoned end. (for year 4)
Mar 03 2027	8 years fee payment window open
Sep 03 2027	6 months grace period start (w surcharge)
Mar 03 2028	patent expiry (for year 8)
Mar 03 2030	2 years to revive unintentionally abandoned end. (for year 8)
Mar 03 2031	12 years fee payment window open
Sep 03 2031	6 months grace period start (w surcharge)
Mar 03 2032	patent expiry (for year 12)
Mar 03 2034	2 years to revive unintentionally abandoned end. (for year 12)