A method of image compression includes significance switching of DCT coefficients in block-based embedded DCT procedures. Bitwise digitized DCT coefficients are passed through successive significance sweeps of the whole image from the most significant down to the least significant coefficient bit planes. With each new sweep, newly significant coefficients may appear within a block, and block-masking is used to transmit the addresses of those newly significant coefficients. An off-mask may also be used. The invention further relates to a hardware or software-based image encoder.
|
18. A coder device, comprising:
(a) means for dividing an image to be compressed into a plurality of image blocks;
(b) means for carrying out a two-dimensional block transform on each block to produce a corresponding plurality of coefficient blocks;
(c) means for bitwise digitizing the coefficients within each coefficient block to define a plurality of bit planes for each coefficient block;
(d) means for defining a group of one or more consecutive bit planes starting with the most significant bit plane;
(e) a mechanism configured to selectively flag coefficients which first become significant within a group of one or more consecutive bit planes of corresponding coefficient blocks of image blocks resulting from a block transform, starting with the most significant bit plane; and
(f) a mechanism configured to transmit information representative of the positions of said flagged coefficients and the bits within the group of said flagged coefficients; and,
(g) means for repeating steps (d) to (f) one or more times, with each new group starting with the most significant bit plane not previously dealt with, and means for transmitting, at each repeated pass, the bits within the current group of those coefficients which were previously flagged on an earlier pass.
1. A method of image compression comprising the steps of:
(a) dividing an image to be compressed into a plurality of image blocks;
(b) carrying out a two-dimensional block transform on each block to produce a corresponding plurality of coefficient blocks;
(c) bitwise digitizing the coefficients within each coefficient block to define a plurality of bit planes for each coefficient block;
(d) defining a group of one or more consecutive bit planes starting with the most significant bit plane;
(e) selectively flagging, by a coder device, coefficients which first become significant within a group of one or more consecutive bit planes of corresponding coefficient blocks of image blocks resulting from a block transform, starting with the most significant bit plane; and
(f) transmitting, by the coder device, information representative of the positions of said flagged coefficients and transmitting the bits within the group of said flagged coefficients; and,
(g) repeating steps (d) to (f) one or more times, with each new group starting with the most significant bit plane not previously dealt with and, at each repeated pass, also transmitting the bits within the current group of those coefficients which were previously flagged on an earlier pass.
21. A video coder/decoder comprising:
a coder and an associated decoder, wherein
(1) the coder comprises:
(a) a mechanism configured to divide an image to be compressed into a plurality of image blocks;
(b) a mechanism configured to carry out a two-dimensional block transform on each block to produce a corresponding plurality of coefficient blocks;
(c) a mechanism configured to bitwise digitize coefficients within each coefficient block to define a plurality of bit planes for each coefficient block;
(d) a mechanism configured to define a group of one or more consecutive bit planes starting with a most significant bit plane;
(e) a mechanism configured to selectively flag those coefficients which first become significant within the group;
(f) a mechanism configured to transmit information representative of the positions of said flagged coefficients and for transmitting the bits within a group of said flagged coefficients; and,
(g) a mechanism configured to repeat operation of the mechanism configured to define a group and the mechanism configured to transmit information one or more times, with each new group starting with a most significant bit plane not previously dealt with, and means for transmitting, at each repeated pass, the bits within a current group of those coefficients which were previously flagged on an earlier pass, and
(2) the decoder being arranged to maintain a running record, as transmission between the coder and the decoder proceeds, of the coefficients which are currently significant.
2. A method as claimed in
repeating the selectively flagging and the transmitting one or more times, each with a new group starting with a most significant bit plane not previously dealt with and, at each repeated pass, also transmitting bits within a current group of those coefficients which were previously flagged on an earlier pass.
3. A method as claimed in claim 2 in which the repeating is separately performed for each image block.
4. A method as claimed in claim 2 in which the block transform is a two-dimensional Discrete Cosine Transform.
5. A method as claimed in claim 2 in which the block transform is a Lapped Orthogonal Transform.
7. A method as claimed in claim 2 further including transmitting, by the coder device, mask information representative of a binary mask which defines the positions of said flagged coefficients.
8. A method as claimed in
9. A method as claimed in
10. A method as claimed in
11. A method as claimed in
12. A method as claimed in
13. A method as claimed in
15. A method as claimed in claim 2 in which a binary mask defines the positions of said flagged coefficients within each coefficient block in JPEG zig-zag order, the binary mask is associated with a mask length code to define a mask end point, and the mask length code defines a mask end point zig-zag address.
16. A method as claimed in claim 2 in which a binary mask defines the positions of said flagged coefficients within each coefficient block in JPEG zig-zag order, the binary mask is associated with a mask length code to define a mask end point, and the mask length code defines a Manhattan distance from a DC term to the mask end point.
17. A method as claimed in claim 2 further including the step of transmitting information representative of a binary off-mask for defining the positions of coefficients whose bits are no longer required to be sent.
19. A coder device as claimed in
20. A coder device as claimed in
22. A method as claimed in claim 2 in which the repeating step is carried out across an entire image to be compressed.
|
This Application is a continuation of International Application No. PCT/GB98/00360, filed Feb. 5, 1998, now pending (which is hereby incorporated by reference).
The present invention relates to image compression and particularly, although not exclusively, to a progressive block-based embedded DCT coder, and to a method of encoding.
The JPEG baseline method for still image coding uses the Discrete Cosine Transform (DCT) in a fixed 8×8 pixel partition. Through a linear quantization table and zig-zag scanning of DCT coefficients, the redundancy and bandwidth characteristics of the DCT are exploited over a range of compressions. Recently, however, it has become clear that the JPEG coder is not particularly efficient at higher compression ratios, and other methods such as wavelets have produced better results while having the advantage of being fully embedded. Some researchers have also attempted to combine DCT with zerotree quantization, usually associated with wavelet transforms: see Xiong, Guleryuz and Orchard, ‘A DCT-Based Image Coder’, IEEE Sig. Proc. Lett., Vol 3, No 11, November 1996, p. 289.
It is an object of the present invention to advance the field of image compression generally, and in particular to provide an improved method of image compression which is capable of use with well understood transforms such as the DCT.
According to a first aspect of the present invention there is provided a method of image compression comprising:
Such a method could also be applied using the one-dimensional DCT to audio recording.
According to a second aspect of the invention there is provided a coder for encoding images, comprising:
Preferably, the encoder provides for significance switching of DCT coefficients in block-based embedded DCT image compression. The encoder provides output on one or more data streams that may be terminated within a few bits of any point.
The invention also extends to a video coder/decoder including a coder as claimed in claim 17 and an associated decoder, the decoder being arranged to maintain a running record, as transmission between the coder and the decoder proceeds, of the coefficients which are currently significant.
The preferred two-dimensional block transform of the present invention is the Discrete Cosine Transform, although other transforms such as the Fast Fourier Transform (FFT) or the Lapped Orthogonal Transform could be used.
It will be appreciated that in the method of the present invention the order of transmission is not significant. It is therefore to be understood, except where logic requires, that the various parts of the method are not necessarily carried out sequentially in the order specified in section (d) of claim 1. For example, the bits from the newly selected coefficients could be transmitted either before or after the bits of those coefficients which were previously flagged on an earlier pass. Similarly, the bits of those coefficients which were previously flagged on an earlier pass could be transmitted either before or after the new coefficients are selected for flagging.
The bit planes are swept consecutively from the most significant bit plane to the least significant bit plane. This may either be repeated separately for each image block, or alternatively all blocks may be dealt with at the first bit plane, then all blocks dealt with at the second bit plane, and so on.
The philosophy of significance switching, as used in the present invention, is that the overheads introduced will be compensated for by the savings in not transmitting bits for small coefficients until they are switched on. Good performance might naturally be expected at high compression ratios, but what is surprising is the excellent performance both for lossless compression and for compression at low ratios. The coder of the present invention is preferably embedded, in other words the bit stream can be stopped within a few bits of any point while still guaranteeing the least possible distortion overall. When used with an appropriate decoder, either coder or decoder can terminate the bit stream as needed, dependent upon the available bandwidth or the available bit budget.
The coder of the present invention has been found to out-perform the base-line JPEG method in peak signal to noise ratio (PSNR) at any compression ratio, and is similar to state-of-the-art wavelet coders. The results show that block transform (for example DCT) coding is competitive across the whole range of compression ratios, including lossless, so that a significance-switched block coder would be capable of meeting the requirements of future image compression standards in an evolutionary manner.
The invention may be carried into practice in a number of ways and one specific embodiment will now be described, by way of example, with reference to the figures in which:
In the preferred method of the present invention, the image to be encoded is first partitioned into a plurality of square image blocks. The partitioning may either be by way of a regular tiling, for example of 8×8 pixel blocks, or alternatively some more complex tiling using blocks of differing sizes. One convenient method of tiling is to vary the block size across the image according to the power in the image (measured by the sum of the squares of pixel intensities). As shown in
The encoded integer coefficients are now required to be manipulated in a progressive fashion, and transmitted within a data-stream that can be rapidly terminated at any point. In order to achieve this, the integer coefficients are first rearranged into an ordered array using the zig-zag sequence of the JPEG standard.
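The zig-zag rearrangement can be illustrated with a short Python sketch. The function names `zigzag_order` and `zigzag_scan` are hypothetical and are not taken from the specification; the sketch assumes a square block held as a list of rows.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in JPEG zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes each anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even diagonals are walked with the row decreasing, odd ones with it increasing.
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten a square block (list of rows) into a zig-zag ordered list."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

The DC term lands at index 0 of the ordered array, and the low-frequency AC terms, which tend to become significant first, cluster at the start.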
Turning now to
In order to determine the order in which the individual bits will be transmitted, the following algorithm is used:
Sweep DCT bit planes from MSB to LSB
Next bit plane
This algorithm may perhaps best be described by way of example, with reference to
Once the significant coefficients have been selected, all of the corresponding bits in the lower bit planes 2 to 4 are automatically switched on, as indicated by the filled in squares representing the bits of coefficients A to D. The selected bits within bit plane 1 are then transmitted.
Next, a second sweep is made over the whole image across bit plane 2. Bits from coefficients which have already been switched on in the previous sweep are automatically transmitted, so in this example bits 1001 are transmitted, these representing the second most significant bits for the coefficients A to D. Any coefficients which newly become significant at this level are switched on, as illustrated by the crossed squares representing the second, third and fourth bits for coefficient E. The bits for all such newly significant coefficients on bit plane 2 are also transmitted.
Another sweep is then carried out on the third bit plane. Once again, all the bits representative of coefficients which have already been switched on are automatically sent: in this case, these are the third bits of coefficients A, B, C, D and E. At this level, coefficient F newly becomes significant, and accordingly the ‘1’ representing bit 3 of that coefficient is also sent. At the same time, that coefficient is switched on, as indicated by the half-shaded squares representative of the third and fourth bits of that coefficient.
Finally, a sweep is performed across the fourth bit plane. In this example, all of the illustrated coefficients have previously been switched on at a higher level, and hence all of the resultant bits 111000 are transmitted.
The process continues for as many bit planes as were initially required to digitize and bitwise encode each individual coefficient, although the very last bit plane may need to be dealt with as a special case, to be discussed below. The process is progressive, in the sense that the most important information is sent first, so that the transmission may be stopped part-way through if transmission time is limited and/or limited bandwidth is available.
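The sweep described above can be sketched in Python as follows. This is a minimal illustration, not the coder itself: it handles non-negative coefficients only, omits the DC term, entropy coding and the final-plane special case, and the function name `significance_sweep` is hypothetical. The six sample values stand in for coefficients A to F; they are chosen so that the second-plane bits (1001) and fourth-plane bits (111000) agree with the example, while the third-plane bits are not stated in the text and are assumed here.

```python
def significance_sweep(coeffs, num_planes):
    """Sweep bit planes from MSB to LSB and list what each pass would transmit.

    Each pass records the bits of coefficients already switched on
    ("refinement") and a mask marking coefficients that newly become significant.
    """
    on = [False] * len(coeffs)  # which coefficients are currently switched on
    passes = []
    for plane in range(num_planes - 1, -1, -1):
        bits = [(c >> plane) & 1 for c in coeffs]
        # Bits of coefficients switched on during an earlier pass are always sent.
        refinement = [b for b, s in zip(bits, on) if s]
        # A '1' in the mask flags a coefficient that first becomes significant here.
        mask = [int(b and not s) for b, s in zip(bits, on)]
        on = [s or bool(m) for s, m in zip(on, mask)]
        passes.append({"plane": plane, "refinement": refinement, "mask": mask})
    return passes


# Assumed 4-bit values for A..F: 1101, 1001, 1001, 1110, 0100, 0010
passes = significance_sweep([13, 9, 9, 14, 4, 2], 4)
for p in passes:
    print(p)
```

Because the most significant planes are emitted first, truncating the list of passes at any point still leaves the most important information in the part already sent.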
It has been found in practice to be more efficient to exclude the DC component of each block from the above scheme, and to send that separately. Accordingly, in the preferred embodiment switching is applied only to the AC terms of the coefficient block. Rather than sweeping across each of the coefficient blocks for a particular bit plane, it would in an alternative embodiment be possible to sweep through all the bit planes within one block before proceeding to the next block. The corresponding algorithm for this would be:
Sweep all image blocks
Next block
During each significance sweep, the significant bits within each bit plane may be sent in any convenient order. For example, within each bit plane, the first bits to be sent may be those for each of the coefficients which have already been “switched on” (in other words those coefficients which are already significant); the addresses of any newly-significant coefficients are then transmitted, to switch them on, followed by a stop symbol. Finally, the next bit is sent for all of the newly-significant coefficients. Alternatively, the switches may be sent first followed by all the data: this has the advantage of improving the run length coding of the data.
For improved efficiency at higher compressions, the DC coefficient within each block is preferably sent separately, prior to the significance sweeps of the AC terms.
It should be understood that the system needs to transmit addressing information representative of the positions of those coefficients which are significant. While this could be done simply by transmitting a list of addresses sent in zig-zag order, the applicants have determined that the sending of a binary mask, also in zig-zag order, can further improve efficiency. For example, referring to
The applicant has realized that the addresses may be transmitted not directly but by way of a bit mask, in zig-zag order. Thus, in order to switch on the mentioned coefficients (assuming none have previously been switched on), the mask 001100110000011001 may be sent. A ‘1’ in the mask indicates that that particular coefficient has newly become significant. This may be run length coded as 2020502 STOP.
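The run length coding of the mask can be checked with a short Python sketch. The function names are hypothetical; each run is taken to be the number of zeros preceding a ‘1’, which reproduces the code 2020502 quoted above, and the STOP symbol is left to the caller.

```python
def run_length_encode_mask(mask):
    """Return the number of '0' symbols preceding each '1' in the mask string."""
    runs, zeros = [], 0
    for symbol in mask:
        if symbol == "1":
            runs.append(zeros)
            zeros = 0
        else:
            zeros += 1
    # Zeros after the last '1' are not coded; a STOP symbol would end the mask.
    return runs


def run_length_decode_mask(runs):
    """Rebuild the mask, up to and including its last '1', from the run lengths."""
    return "".join("0" * r + "1" for r in runs)


runs = run_length_encode_mask("001100110000011001")
print("".join(str(r) for r in runs), "STOP")
```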
The coefficients within each DCT block may be negative as well as positive, and if appropriate negative values may be suitably bitwise encoded using a 2's complement representation. If the coefficient is positive, the “1” bits are significant; if negative the “0” bits are significant. The first data bit sent when a coefficient becomes significant determines the sign.
The final bit plane may need to be dealt with as a special case. Any coefficient not previously switched on will be either 0 or −1. One way of dealing with the final phase is to send a mask only for the −1 coefficients; there is no need to send any data, as all non-significant coefficients are 0 and all those newly switched are −1.
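One possible reading of the sign convention is sketched below in Python. It assumes that a coefficient becomes significant at the highest bit plane whose bit differs from the 2's complement sign bit, so that the first data bit is ‘1’ for a positive value and ‘0’ for a negative one. The function name `first_significant_plane` and the word length `num_bits` are hypothetical.

```python
def first_significant_plane(value, num_bits):
    """Return (plane, first_data_bit) for a 2's complement value, or None.

    Assumes -2**(num_bits - 1) <= value < 2**(num_bits - 1). Plane 0 is the LSB.
    """
    pattern = value & ((1 << num_bits) - 1)  # 2's complement bit pattern
    sign_bit = (pattern >> (num_bits - 1)) & 1
    for plane in range(num_bits - 2, -1, -1):
        bit = (pattern >> plane) & 1
        if bit != sign_bit:
            # Positive values switch on at their first '1', negative at their first '0'.
            return plane, bit
    return None  # all bits equal the sign bit: the value is 0 or -1


for v in (5, -5, 1, -2, 0, -1):
    print(v, first_significant_plane(v, 8))
```

Under this reading only 0 and −1 never switch on during the ordinary sweeps, which is consistent with the special case described for the final phase.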
It should be recalled that only new coefficients have to be masked at each pass, since once a coefficient has been switched on it remains switched on until the end of the procedure. The masking method is efficient, with typically fewer bits needed to transmit the switching information than a direct list of coefficient addresses.
Various methods of packaging the mask prior to transmission may be used. The mask may be sent sequentially, followed by a special stop symbol to indicate its end. In an alternative scheme, a length symbol representative of the zig-zag length may be sent before the mask to obviate the need for the special stop symbol. Finally, the mask could be preceded by the Manhattan depth of its highest order coefficient; for example in
In each case, the switching mask will contain mostly “Off” symbols (zeros), which can be efficiently compressed using an arithmetic coder. It is particularly useful to encode the mask data using first order predictive adaptive arithmetic coding, since much of the data comprises long runs of low entropy off symbols. Other methods such as Huffman could also be used to compact the mask data prior to transmission. Run length coding may also be used.
Depending upon the method chosen to package the mask, the output from the coder will take the form of one or more bit streams. In the first arrangement mentioned above, where the mask is terminated by a special stop symbol, the mask and symbol may be sent as one stream with the DCT data being sent as another stream. In the alternatives, where a separate length symbol or Manhattan depth symbol is used, it may be convenient to output three separate streams: one specifying the Manhattan depth, one for the mask data, and one for the DCT data.
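The Manhattan depth used as a length symbol can be illustrated in Python. The sketch assumes that index 0 of the mask corresponds to the DC position of the zig-zag sequence and that the depth is the row index plus the column index of the last ‘1’ in the mask; both function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in JPEG zig-zag order."""
    order = []
    for s in range(2 * n - 1):
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def manhattan_depth(mask, n=8):
    """Manhattan distance from the DC term to the last '1' of a zig-zag ordered mask.

    Returns None when the mask contains no '1'.
    """
    last = max((i for i, m in enumerate(mask) if m), default=None)
    if last is None:
        return None
    row, col = zigzag_order(n)[last]
    return row + col


mask = [int(ch) for ch in "001100110000011001"]
print(manhattan_depth(mask))
```

Since every position on one anti-diagonal shares the same depth, the depth symbol takes far fewer distinct values than a zig-zag address, at the cost of locating the mask end point only to within a diagonal.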
Where several separate data streams are used, the synchronization may be controlled so that each stream is maintained within a few bytes of synchrony with the others. This allows a decoder to interrupt the transmission at any time.
At the far end of the transmission stream, a decoder maintains a record of the mask for each image block, giving the current status of each of its DCT coefficients. The mask is updated at each significance pass.
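The decoder's running record can be sketched in Python as below. The class name `BlockDecoder` and the per-pass format (a bit plane index, the bits of already switched-on coefficients, and a mask of newly significant coefficients) are assumptions made for illustration; signs, the DC term and entropy decoding are omitted.

```python
class BlockDecoder:
    """Keep a per-block record of which coefficients are on and rebuild their values."""

    def __init__(self, num_coeffs):
        self.on = [False] * num_coeffs  # the running mask record
        self.values = [0] * num_coeffs  # coefficient values reconstructed so far

    def apply_pass(self, plane, refinement, mask):
        """Update the record with one significance pass."""
        bits = iter(refinement)
        # Refinement bits belong to coefficients switched on before this pass.
        for i, is_on in enumerate(self.on):
            if is_on:
                self.values[i] |= next(bits) << plane
        # Newly flagged coefficients are switched on; their first bit is a '1'.
        for i, flag in enumerate(mask):
            if flag:
                self.on[i] = True
                self.values[i] |= 1 << plane


decoder = BlockDecoder(6)
decoder.apply_pass(3, [], [1, 1, 1, 1, 0, 0])
decoder.apply_pass(2, [1, 0, 0, 1], [0, 0, 0, 0, 1, 0])
print(decoder.values)  # a coarse approximation after two passes
```

Stopping after any pass leaves `values` holding the best approximation available from the bits received so far.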
In a variant of the method, an “off” mask may be maintained and sent as well as the “on” mask discussed above. This allows the coder to avoid the sending of bits which are so far down the bit planes as to represent noise rather than real data. In practice, once a coefficient has been significant for several bit planes it may well be sufficiently-well defined for visual purposes, and it could then be turned off.
In a further more general variant of the method, the significance testing need not be carried out for each consecutive bit plane. In many circumstances, it may be more efficient, and may give an acceptable result, to mask several planes at once. This reduces the overhead of masking each bit plane individually.
It will be understood that masking each bit plane separately is mathematically equivalent to making comparisons against a decreasing threshold value which goes as 2^n. Other threshold values could be used instead, providing for either wider or narrower significance steps. In one embodiment, the bit planes are divided up into groups, with each group being masked separately. Depending on the application, each group might consist of the same number or alternatively of a different number of bit planes. Some of the groups might comprise only a single bit plane, while others are made up of several.
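The equivalence between bit-plane masking and threshold comparison can be checked with a short Python sketch for non-negative coefficient magnitudes. The function names and the `group_sizes` parameter (the number of planes in each group, most significant group first) are hypothetical.

```python
def first_plane_by_bits(c, num_planes):
    """Highest bit plane (0 = LSB) in which c has a '1' bit, or None."""
    for n in range(num_planes - 1, -1, -1):
        if (c >> n) & 1:
            return n
    return None


def first_plane_by_threshold(c, num_planes):
    """Highest n for which c reaches the threshold 2**n, or None."""
    for n in range(num_planes - 1, -1, -1):
        if c >= 2 ** n:
            return n
    return None


def first_group_by_threshold(c, group_sizes, num_planes):
    """Index of the first group of planes in which c becomes significant, or None."""
    low = num_planes
    for g, size in enumerate(group_sizes):
        low -= size  # lowest plane covered by this group
        if c >= 2 ** low:
            return g
    return None


# The two tests agree for every 4-bit magnitude.
print(all(first_plane_by_bits(c, 4) == first_plane_by_threshold(c, 4) for c in range(16)))
```

With `group_sizes = [1, 1, 2]` and four planes, the thresholds are 8, 4 and 1, so the last two planes share a single mask.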
It is found in practice that the mask switching algorithm described above performs slightly better than JPEG at low compression ratios, and substantially better at high compression ratios. Indeed, using a 16×16 block size, performance over the whole range of compression ratios is very similar to that achieved by the wavelet coder of Said and Pearlman: see SAID, A. and PEARLMAN, W. A.: ‘Image Compression Using the Spatial-Orientation Tree’, IEEE International Symposium on Circuits and Systems, 1993, (694), pp. 279-282. The present invention therefore provides state of the art performance while having the advantage of being usable with DCT, a transformation which is widely used and understood as a result of its adoption as the core technology of JPEG and all current versions of MPEG.
In a typical embodiment of the present invention, a video codec (Coder/Decoder) comprises a hardware or software based coder, and a hardware or software based decoder. Bits are transmitted progressively from the coder to the decoder, with the coder being instructed to keep sending bits until a certain compression target has been reached, or a certain distortion achieved. Using the two or three individual streams of data previously referred to, the decoder can progressively reconstruct the image. As the mask data are received, the decoder updates a record, held in memory, of which coefficients are currently switched on. As data transmission proceeds, more and more coefficients are switched on. If the process is allowed to continue until all of the data has been transmitted, the decoder can reconstruct a lossless image, with the exception of any small rounding errors that may have occurred during the DCT digitization process. A decoder in a multi-media system which uses a progressive, embedded coder as described above can begin to reveal an image as soon as transmission commences. This is an advantage often claimed for wavelets, but significance-switched block transforms also have this capability.
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Sep 09 2004 | MONRO, DONALD M | Ayscough Visuals LLC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 021973 | /0836 | |
Apr 29 2005 | Ayscough Visuals LLC | (assignment on the face of the patent) | / | |||
Aug 11 2015 | Ayscough Visuals LLC | ZARBAÑA DIGITAL FUND LLC | MERGER SEE DOCUMENT FOR DETAILS | 037219 | /0345 | |
Dec 06 2019 | INTELLECTUAL VENTURES ASSETS 161 LLC | HANGER SOLUTIONS, LLC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 052159 | /0509 |