Embedded graphics coding (EGC) is used to encode images with sparse histograms. In EGC, an image is divided into blocks of pixels. For each block, the pixels are converted into binary representations. For each block, the pixels are scanned and encoded bit-plane by bit-plane from the most significant bit-plane (MSB) to the least significant bit-plane (LSB). The pixels in the block are partitioned into groups. Each group contains pixels with the same value. From the MSB to the LSB, the groups in the current bit-plane are processed. During the processing, a group is split into two if pixels in the group have different bit values in the bit-plane being encoded. Then, the encoder sends the refinement bit for each pixel in the group and the encoder splits the original group into two. A method is described herein to compress the refinement bits which employs context-adaptive prediction and binary run-length coding.
1. A method of encoding programmed in a controller in a device comprising:
a. performing embedded graphics coding by
i. partitioning an image into blocks;
ii. converting pixels of the blocks into binary representations;
iii. partitioning the pixels into groups; and
iv. processing the groups in a current bit-plane;
b. applying context-adaptive prediction comprising performing prediction on refinement bits and generating prediction residuals, if members of a group have a different bit value; and
c. applying binary run-length coding including applying an adaptive codebook to the prediction residuals.
17. A system for encoding programmed in a controller in a device comprising:
a. an embedded graphics coding module to process an image and send refinement bits by:
i. partitioning an image into blocks;
ii. converting pixels of the blocks into binary representations;
iii. partitioning the pixels into groups; and
iv. processing the groups in a current bit-plane;
b. context-adaptive prediction module to apply context-adaptive prediction to the refinement bits and generate prediction residuals, if members of a group have a different bit value; and
c. a binary run-length coding module to apply binary run-length coding to the prediction residuals including applying an adaptive codebook to the prediction residuals.
2. A method of encoding programmed in a controller in a device comprising:
a. partitioning an image into blocks;
b. setting a dynamically changeable bit budget for the blocks;
c. converting pixels of the blocks into binary representations;
d. partitioning the pixels into groups;
e. processing the groups in a current bit-plane;
f. if members of a group of the groups have a same bit value, then a same indicator is indicated; and
g. if the members of the group have a different bit value, then:
i. a different indicator is indicated;
ii. refinement bits are sent;
iii. the group is split;
iv. context-adaptive prediction is applied to the refinement bits to generate prediction residuals; and
v. binary run-length coding is applied to the prediction residuals.
20. A camera device comprising:
a. a video acquisition component for acquiring a video;
b. a memory for storing an application, the application for:
i. performing embedded graphics coding by
A. partitioning an image into blocks;
B. converting pixels of the blocks into binary representations;
C. partitioning the pixels into groups; and
D. processing the groups in a current bit-plane;
ii. applying context-adaptive prediction comprising performing prediction on refinement bits and generating prediction residuals, if members of a group have a different bit value; and
iii. applying binary run-length coding including applying an adaptive codebook to the prediction residuals; and
c. a processing component coupled to the memory, the processing component configured for processing the application.
3. The method of
4. The method of
5. The method of
6. The method of
7. The method of
8. The method of
10. The method of
a. initializing a context model;
b. coding the refinement bits in a raster-scan order;
c. predicting using the context model;
d. updating the context model; and
e. calculating a prediction residual.
11. The method of
14. The method of
15. The method of
16. The method of
18. The system of
19. The system of
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 61/273,611, filed Aug. 5, 2009 and entitled “A Method for Improving the Performance of Embedded Graphics Coding,” which is hereby incorporated by reference in its entirety for all purposes.
The present invention relates to the field of image processing. More specifically, the present invention relates to embedded graphics coding.
Most image compression schemes are designed for “natural images” such as photos taken by a digital camera. For natural images, strong correlation exists among neighboring pixels. Hence, most image compression schemes work as follows:
1. The pixels are decorrelated using prediction or transform or both, resulting in a sparse histogram of the prediction residuals or transform coefficients. The histogram has a single peak which is located around 0.
2. Quantization is applied as necessary.
3. The (quantized) prediction residuals or transform coefficients are entropy coded. The entropy coder is designed for distributions described above. If the distribution has a significantly different shape, the coding performance is able to be poor.
However, there are many “unnatural images” such as images of graphics or text which typically have a large dynamic range, strong contrast, sharp edges, strong textures and sparse histograms. These types of images are usually not handled well by conventional image compression algorithms. Inter-pixel correlation is weaker, and prediction or transform does not provide a sparse distribution as it does for natural images.
Some schemes have been proposed for unnatural images. One example is referred to as “histogram packing” where the encoder goes through the whole image, computes the histogram and does a non-linear mapping of the pixels before compressing the image. The compression requires a two-pass processing, causing increased memory cost and more computations. The bitstream is not scalable which means that the decoder needs the whole bitstream to decode the image. Partial reconstruction is not possible without re-encoding.
Embedded Graphics Coding (EGC) is used to encode images that have sparse histograms. In EGC, an image is divided into blocks of pixels. For each block, the pixels are converted into binary representations. For each block, the pixels are scanned and encoded bit-plane by bit-plane from the most significant bit-plane (MSB) to the least significant bit-plane (LSB). The pixels in the block are partitioned into groups. Each group contains pixels that have the same value. From the MSB to the LSB, the groups in the current bit plane are processed. During the processing, a group is split into two, if pixels in the group have different bit values in the bit-plane being encoded. Then, the encoder sends the refinement bit for each pixel in the group and the encoder splits the original group into two. A method is described herein to compress the refinement bits which employs context-adaptive prediction and binary run-length coding.
In one aspect, a method of encoding programmed in a controller in a device comprises performing embedded graphics coding, applying context-adaptive prediction and applying binary run-length coding. Performing embedded graphics coding further comprises partitioning an image into blocks, converting pixels of the blocks into binary representations, partitioning the pixels into groups and processing the groups in a current bit-plane. Applying context-adaptive prediction further comprises performing prediction on refinement bits and generating prediction residuals. Applying binary run-length coding further comprises applying a (fixed or adaptive) codebook to the prediction residuals.
In another aspect, a method of encoding programmed in a controller in a device comprises partitioning an image into blocks, converting pixels of the blocks into binary representations, partitioning the pixels into groups, processing the groups in a current bit-plane, if members of a group of the groups have a same bit value, then a same indicator is indicated and if the members of the group have a different bit value, then a different indicator is indicated, refinement bits are generated, the group is split, context-adaptive prediction is applied to the refinement bits to generate prediction residuals and binary run-length coding is applied to the prediction residuals. The method further comprises setting a bit budget for the blocks. The bit budget is able to be dynamically changed. The context-adaptive prediction is selected from the group consisting of one-dimensional prediction and two-dimensional prediction. The bits are vectorized in a one-dimensional array in the one-dimensional prediction. The bits are vectorized by scanning the current bit-plane in a raster scan order while skipping pixels in a different group. The prediction residual is an exclusive-or between an original bit and a predicted bit. A context model is utilized in the one-dimensional prediction and the two-dimensional prediction. Two adjacent pixels are used for the two-dimensional prediction. Two-dimensional prediction comprises initializing a context model, coding the refinement bits in a raster-scan order, predicting using the context model, updating the context model and calculating a prediction residual. The prediction residual is entropy coded using binary run-length coding. The binary run-length coding includes variable length coding. Variable length coding utilizes a codebook. The codebook is selected from the group consisting of a fixed codebook or an adaptively selected codebook.
The controller is selected from the group consisting of a programmed computer readable medium and an application-specific circuit. The device is selected from the group consisting of a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPhone, an iPod®, a video player, a DVD writer/player, a television and a home entertainment system.
In another aspect, a system for encoding programmed in a controller in a device comprises an embedded graphics coding module for processing an image and generating refinement bits, a context-adaptive prediction module for applying context-adaptive prediction to the refinement bits and generating prediction residuals and a binary run-length coding module for applying binary run-length coding to the prediction residuals. The controller is selected from the group consisting of a programmed computer readable medium and an application-specific circuit. The device is selected from the group consisting of a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPhone, an iPod®, a video player, a DVD writer/player, a television and a home entertainment system.
In yet another aspect, a camera device comprises a video acquisition component for acquiring a video, a memory for storing an application, the application for performing embedded graphics coding, applying context-adaptive prediction and applying binary run-length coding and a processing component coupled to the memory, the processing component configured for processing the application.
Embedded Graphics Coding (EGC) is able to be used to encode images that have sparse histograms. Thus, it is typically used for, but not limited to, graphical or textual images. EGC provides a lossy to lossless scalability which means the bitstream is able to be stopped in the middle of the bitstream, and the decoder is able to still make a reasonable decoded block based on the bits it has received. If the entire bitstream is sent, the image is able to be losslessly reconstructed. EGC has low complexity and a high coding performance.
In EGC, an image (such as each frame of a video) is divided into blocks. A bit budget is set for each block. The bit budget is agreed upon by the encoder and decoder. In some embodiments, the bit budget is able to be dynamically changed if the bandwidth is time-varying. The coded bitstream is fully embedded, so that the bit budget is able to be changed arbitrarily. For each block, the pixels are converted into binary representations. Predictions and transforms are not performed because the pixels are assumed to have a sparse histogram. For each block, the pixels are scanned and encoded bit-plane by bit-plane from the most significant bit-plane (MSB) to the least significant bit-plane (LSB). The encoding process of each bit-plane is described herein. The encoding is able to be stopped if the bit budget of the current block is depleted or if the reconstructed block has reached the end of the LSB. If the reconstructed block has reached the end of the LSB, the decoder also knows where the end of the block is. For each block, if all of the bit-planes are coded, the compression is lossless.
The pixels in the block are partitioned into groups. Each group contains pixels that have the same value. A group is able to be split into two, if pixels in the group have different bit values in the bit-plane being encoded. For sparse histogram images, group splitting rarely occurs.
A method of encoding a bit-plane is used in EGC. Before coding the MSB, the pixels are assumed to be in the same group. Then, from the MSB to the LSB, the groups in the current bit plane are processed, where for each group: the encoder sends a “0” if all group members have the same bit value in the current bit-plane (and then sends a “0” or a “1” to indicate the value) or the encoder sends a “1”. Following the “1”, the encoder sends the refinement bit for each pixel in the group and the encoder splits the original group into two. The decoder is able to split the group based on the refinement bits.
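The per-group signalling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function name `egc_encode_block`, the flat list of pixels, the fixed `bit_depth` and the ordering of the two subgroups after a split are all hypothetical, and the bit budget is ignored. Refinement bits are emitted raw, without compression.

```python
from typing import List


def egc_encode_block(pixels: List[int], bit_depth: int = 8) -> List[int]:
    """Encode one block bit-plane by bit-plane and return the emitted bits."""
    out: List[int] = []
    # Before coding the MSB, all pixels are assumed to be in the same group.
    groups: List[List[int]] = [list(range(len(pixels)))]
    for plane in range(bit_depth - 1, -1, -1):  # MSB down to LSB
        next_groups: List[List[int]] = []
        for group in groups:
            plane_bits = [(pixels[i] >> plane) & 1 for i in group]
            if len(set(plane_bits)) == 1:
                # "0": all members share one bit value, followed by that value.
                out += [0, plane_bits[0]]
                next_groups.append(group)
            else:
                # "1": members differ, followed by one refinement bit per pixel.
                out.append(1)
                out += plane_bits
                # Split the group; zeros-first ordering is an assumption.
                next_groups.append([i for i, b in zip(group, plane_bits) if b == 0])
                next_groups.append([i for i, b in zip(group, plane_bits) if b == 1])
        groups = next_groups
    return out
```

For a block whose four pixels all equal 5 at a bit depth of 3, no group is ever split and the sketch emits two bits per bit-plane: `[0, 1, 0, 0, 0, 1]`.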
As described in the approach above, when a group is split, raw bits are sent for all members of the group in the current bit-plane without any compression. A method is described herein to compress these bits which employs context-adaptive prediction and binary run-length coding.
Context-Adaptive Prediction
Prediction is usually used in compression to reduce the information entropy of the signal to be coded. The signal to be coded is binary. Before prediction, the raw bits are expected to have equal probability of 0 and 1, hence the information entropy is 1 bit per sample. However, after prediction, a prediction residual is obtained and the probability of the prediction residual being 0 is more than the probability of the prediction residual being 1, hence the information entropy is expected to be less than 1 bit per sample.
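The entropy argument can be checked numerically with the standard binary entropy function. The sketch below is illustrative only; the function name `binary_entropy` and the probability 0.9 are assumptions chosen for the example, not values taken from the method described herein.

```python
import math


def binary_entropy(p_zero: float) -> float:
    """Entropy in bits per sample of a binary source with P(0) = p_zero."""
    if p_zero <= 0.0 or p_zero >= 1.0:
        return 0.0  # a deterministic source carries no information
    p_one = 1.0 - p_zero
    return -(p_zero * math.log2(p_zero) + p_one * math.log2(p_one))


raw = binary_entropy(0.5)       # equiprobable raw bits
residual = binary_entropy(0.9)  # hypothetical residual skewed toward 0
```

With equiprobable bits the entropy is exactly 1 bit per sample; with a residual that is 0 nine times out of ten it drops to roughly 0.47 bits per sample, which is the gain an entropy coder can then exploit.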
There are two methods of context-adaptive prediction described herein: 1D prediction and 2D prediction. The former has lower complexity, and the latter has higher performance. In both cases, contexts are used to improve the prediction performance. A context model is usually initialized first.
One-Dimensional (1D) Prediction
In the 1D prediction case, the bits to be coded (bits in the current group) are vectorized into a 1D array.
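A possible vectorization step is sketched below, assuming the bits are gathered by scanning the current bit-plane in raster-scan order while skipping pixels that belong to a different group. The names `vectorize_group_bits` and `group_ids` (a per-pixel map of group labels) are hypothetical and used only for illustration.

```python
from typing import List


def vectorize_group_bits(block: List[List[int]],
                         group_ids: List[List[int]],
                         group: int,
                         plane: int) -> List[int]:
    """Collect the bits of one group in the given bit-plane, in raster order."""
    return [
        (block[r][c] >> plane) & 1
        for r in range(len(block))           # rows, top to bottom
        for c in range(len(block[r]))        # columns, left to right
        if group_ids[r][c] == group          # skip pixels in other groups
    ]
```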
1D Context-Adaptive Prediction
The bits to be coded are vectorized as [b_1, b_2, . . . , b_n]. The number of occurrences of each binary pair (b_{i-1}, b_i) is recorded by maintaining a table as shown in
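One way such a pair-count table could drive the prediction is sketched below. The source text does not spell out the prediction rule here, so the following are assumptions: the predicted bit is the value that has most often followed the previous bit so far (ties predict 0), the table starts at zero, and the context for the first bit is 0. The residual is the exclusive-or of the original bit and the predicted bit. The function names are hypothetical.

```python
from typing import List


def _predict(counts: List[List[int]], prev: int) -> int:
    # Assumed rule: pick the bit seen most often after `prev`; ties give 0.
    return 1 if counts[prev][1] > counts[prev][0] else 0


def predict_1d(bits: List[int]) -> List[int]:
    """Return prediction residuals for a vector of refinement bits."""
    counts = [[0, 0], [0, 0]]  # counts[prev][cur], assumed zero-initialized
    prev = 0                   # assumed context for the first bit
    residuals = []
    for b in bits:
        residuals.append(b ^ _predict(counts, prev))
        counts[prev][b] += 1   # update the context model after coding
        prev = b
    return residuals


def unpredict_1d(residuals: List[int]) -> List[int]:
    """Decoder side: rebuild the bits by mirroring the encoder's model."""
    counts = [[0, 0], [0, 0]]
    prev = 0
    bits = []
    for r in residuals:
        b = r ^ _predict(counts, prev)
        bits.append(b)
        counts[prev][b] += 1
        prev = b
    return bits
```

Because the model is updated only from bits that have already been coded, the decoder can maintain the same table and invert the prediction without side information.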
Two Dimensional (2D) Context-Adaptive Prediction
The 1D prediction is able to be extended to the 2D case. In some embodiments, the left and upper neighbors are used for prediction. Binary contexts are still used, with the entry being (b_up, b_left). The total number of contexts is reduced. However, if the neighbors are in different groups, their values in the current bit-plane are not used (as is) for context. Context modeling is performed such that: if the left pixel is in the same group as the current pixel, b_left is the bit in the current bit-plane of the left pixel. If the left pixel is not in the same group as the current pixel, b_left is the logic (left>current), where: ‘left’ is the value of the left pixel, only taking into account its MSB down to the previous bit-plane; ‘current’ is the value of the current pixel, only taking into account its MSB down to the previous bit-plane. For b_up, if the up pixel is in the same group as the current pixel, b_up is the bit in the current bit-plane of the up pixel. If the up pixel is not in the same group as the current pixel, b_up is the logic (up>current), where: ‘up’ is the value of the up pixel, only taking into account its MSB down to the previous bit-plane; ‘current’ is the value of the current pixel, only taking into account its MSB down to the previous bit-plane.
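The context derivation for a single pixel can be sketched as follows, assuming the rule for the left neighbor mirrors the one stated for the upper neighbor (the current-plane bit when the neighbor is in the same group, otherwise a comparison of the already-coded higher bits). Bit-planes are indexed with 0 as the LSB, so "MSB down to the previous bit-plane" is taken to mean the value shifted right by `plane + 1`. The function names and the explicit `same_group` flags are hypothetical, and neighbors outside the block are not handled.

```python
from typing import Tuple


def context_bit(neighbor: int, current: int, same_group: bool, plane: int) -> int:
    """Binary context contributed by one neighbor for the given bit-plane."""
    if same_group:
        # Same group: use the neighbor's bit in the current bit-plane.
        return (neighbor >> plane) & 1
    # Different group: compare only the bits above the current bit-plane.
    return int((neighbor >> (plane + 1)) > (current >> (plane + 1)))


def context_2d(up: int, left: int, current: int,
               up_same_group: bool, left_same_group: bool,
               plane: int) -> Tuple[int, int]:
    """Return the context entry (b_up, b_left) for the current pixel."""
    return (
        context_bit(up, current, up_same_group, plane),
        context_bit(left, current, left_same_group, plane),
    )
```

The resulting pair selects one of four contexts, each of which would hold its own statistics in the context model.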
Binary Run-Length Coding
Run-length coding is an entropy coding method. It is efficient if the signal to be coded has many consecutive 0's. Because the prediction residual is still binary, the method is referred to as binary run-length coding (BRLC). Compared to standard run-length coding, BRLC does not code levels (i.e., the level is always 1). For example, for the residual signal 1,0,0,0,1,0,1,1,0,0,0,0, the BRLC from the 2nd bit encodes the runs 3, 1, 0, 4.
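The run extraction can be sketched as below. Each run is the number of 0's preceding a 1; how a trailing run of 0's is terminated is not stated in the text, so emitting it as a final run (with the decoder relying on the known group size) is an assumption. The function name `binary_run_lengths` is hypothetical.

```python
from typing import List


def binary_run_lengths(residuals: List[int]) -> List[int]:
    """Convert a binary residual sequence into runs of consecutive zeros."""
    runs: List[int] = []
    run = 0
    for bit in residuals:
        if bit == 0:
            run += 1
        else:
            runs.append(run)  # a 1 terminates the current run of zeros
            run = 0
    if run > 0:
        runs.append(run)      # assumed handling of trailing zeros
    return runs


residual = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0]
runs = binary_run_lengths(residual[1:])  # start from the 2nd bit
```

Applied to the residual signal of the example starting from its 2nd bit, the sketch yields the runs 3, 1, 0, 4.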
For run-length coding, variable length coding (VLC) is used. The VLC is able to be implemented using a codebook.
A fixed codebook is able to be used or a selective method similar to that of JPEG-LS is able to be used. The selective method is able to improve efficiency by allowing a choice of one of several codebooks. In the selective method, the codec adaptively selects, for example, one of the codebooks in
Adaptive Codebook Selection
If a codeword ‘1’ is observed, then there is potentially a long run; therefore, g=g+1 is used for the next run. If other codewords are observed, then g=g−1 is used for the next run. In some embodiments, selecting the codebook makes use of the block size. Both the encoder and the decoder know the size of the group, and g never grows beyond:
ceil(log2(number of not yet coded group members))
where the notation ceil(x) is defined as the smallest integer that is not smaller than x.
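The update of the codebook index g can be sketched as follows. The codebooks themselves are not reproduced here, so the sketch covers only the index update. The function name `next_codebook_index`, the lower bound of 0 on g and the point at which the number of not yet coded group members is measured are assumptions.

```python
def next_codebook_index(g: int, codeword: str, remaining: int) -> int:
    """Return the codebook index to use for the next run.

    g         -- index of the codebook used for the run just coded
    codeword  -- the codeword that was emitted, as a string of bits
    remaining -- number of not yet coded group members (assumed >= 1)
    """
    # A codeword '1' hints at a long run, so move to a larger codebook;
    # any other codeword moves to a smaller one.
    g = g + 1 if codeword == "1" else g - 1
    g = max(g, 0)  # assumption: g is never negative
    # (remaining - 1).bit_length() equals ceil(log2(remaining)) for
    # remaining >= 1 and avoids floating-point rounding.
    cap = (remaining - 1).bit_length()
    return min(g, cap)
```

Since both the encoder and the decoder know the group size and see the same codewords, they can run this update in lockstep without transmitting g.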
In some embodiments, the IEGC application(s) 830 include several applications and/or modules. As described herein, the modules include an embedded graphics coding module, a context-adaptive prediction module and a binary run-length coding module. In some embodiments, modules include one or more sub-modules as well. In some embodiments, fewer or additional modules are able to be included.
Examples of suitable computing devices include a personal computer, a laptop computer, a computer workstation, a server, a mainframe computer, a handheld computer, a personal digital assistant, a cellular/mobile telephone, a smart appliance, a gaming console, a digital camera, a digital camcorder, a camera phone, an iPod®/iPhone, a video player, a DVD writer/player, a television, a home entertainment system or any other suitable computing device.
To utilize the improved embedded graphics coding method, a user acquires a video/image such as on a digital camcorder, and while or after the video is acquired, or when sending the video to another device such as a computer, the improved embedded graphics coding method automatically encodes each image of the video, so that the video is encoded appropriately to maintain a high quality video. The improved embedded graphics coding method occurs automatically without user involvement.
In operation, improved embedded graphics coding is used when a group is to be split at the current bit-plane. Context-adaptive prediction and binary run-length coding are used. The prediction is able to be 1D or 2D. The binary run-length coding is able to use a fixed codebook or an adaptive codebook. Each image block is processed from the MSB to the LSB, hence the resulting bitstream is still embedded. When encoding a group in the current bit-plane, no information is utilized from another group in the same bit-plane, thus different groups in the same bit-plane are able to be encoded in parallel. The improved embedded graphics coding method is able to be used in any implementation including, but not limited to, wireless high definition (Wireless HD).
The improved embedded graphics coding method described herein is able to be used with videos and/or images.
High definition video is able to be in any format including but not limited to HDCAM, HDCAM-SR, DVCPRO HD, D5 HD, XDCAM HD, HDV and AVCHD.
The present invention has been described in terms of specific embodiments incorporating details to facilitate the understanding of principles of construction and operation of the invention. Such reference herein to specific embodiments and details thereof is not intended to limit the scope of the claims appended hereto. It will be readily apparent to one skilled in the art that other various modifications may be made in the embodiment chosen for illustration without departing from the spirit and scope of the invention as defined by the claims.
Liu, Wei, Gharavi-Alkhansari, Mohammad
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Jul 15 2010 | Sony Corporation | (assignment on the face of the patent) | / | |||
Jul 15 2010 | LIU, WEI | Sony Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 024693 | /0854 | |
Jul 15 2010 | GHARAVI-ALKHANSARI, MOHAMMAD | Sony Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 024693 | /0854 |