A one-dimensional (1D) inverse Discrete Cosine Transform (IDCT) is applied to an input two-dimensional (2D) transform block along the axis to be modified. Since the one-dimensional IDCT is not performed on the other axis, each block is left in a one-dimensional transform space (called hybrid space). For a shift (merge), the appropriate "m" elements are picked up from one block and the "8-m" elements are picked up from the other block and are used as input to the one-dimensional forward DCT (FDCT) along that same axis. For two-dimensional shifts or merges, the results of the first one-dimensional IDCT and FDCT can be stored with extra precision to be used as input to a second one-dimensional IDCT and FDCT along the other axis. The worst-case execution time is approximately constant for all shift/merge amounts. Taking advantage of fast paths can improve the execution times for typical blocks.
5. A computer implemented method for hybrid domain processing of a multi-dimensional transformed image comprising the steps of:
receiving an n-dimensional real image; performing an (n-m)-dimensional forward orthonormal transform on the n-dimensional real image to produce "hybrid" data, where 1≦m<n; manipulating the "hybrid" data to effect a desired at most m-dimensional change in the n-dimensional real image; applying an m-dimensional forward transform to the manipulated "hybrid" data to generate a processed n-dimensional transformed image; and outputting the processed n-dimensional orthonormally transformed image to an output device which generates an output n-dimensional real image representation of the processed n-dimensional orthonormally transformed image.
1. A computer implemented method for hybrid domain processing of a multi-dimensional transformed image comprising the steps of:
receiving an n-dimensional orthonormally transformed image representing some original n-dimensional real image; performing an m-dimensional inverse transform on the n-dimensional orthonormally transformed image to produce "hybrid" data, where 1≦m<n; manipulating the "hybrid" data to effect a desired at most m-dimensional change in the n-dimensional real image; applying an m-dimensional forward transform to the manipulated "hybrid" data to return the "hybrid" data to n-dimensional transform space as a processed n-dimensional orthonormally transformed image; and outputting the processed n-dimensional orthonormally transformed image to an output device which generates an output n-dimensional real image representation of the processed n-dimensional orthonormally transformed image.
6. A computer implemented method for hybrid domain processing of a multi-dimensional transformed image comprising the steps of:
receiving first and second input image blocks, {tilde over (G)}1 and {tilde over (G)}2, respectively, from an unquantized two-dimensional transformed image; and for each row or column of the image blocks {tilde over (G)}1 and {tilde over (G)}2, (1) performing a one-dimensional inverse transform on the ith row or column of the image block {tilde over (G)}1 to produce "hybrid" data, (2) performing a one-dimensional inverse transform on the ith row or column of the image block {tilde over (G)}2 to produce "hybrid" data, (3) merging the resulting "hybrid" data to obtain the "hybrid" data of a shifted row or column, (4) computing one-dimensional forward transform data corresponding to the "hybrid" data of step (3) producing a row or column {tilde over (F)}i representing a shifted image block, and (5) outputting an image block containing {tilde over (F)}i to an output device which generates a real image representation of the image block.
8. A computer implemented method for hybrid domain processing of a multi-dimensional transformed image comprising the steps of:
receiving first and second two-dimensional (2D) input N1×N2 transform image blocks to be modified by a shift or merge operation, where N1 and N2 are dimensions of the first and second image blocks along first and second axes, respectively; applying a one-dimensional (1D) inverse transform to the first and second 2D input transform image blocks to be modified along the first axis, leaving each of the first and second image blocks as first and second transform blocks in a "hybrid" domain; selecting m elements of the first transform block along said first axis and N1-m elements of the second transform block along said first axis, where m<N1; using the m elements of the first transform block and N1-m elements of the second transform block in a one-dimensional forward transform along said first axis to generate a merged 2D transform block; and outputting the merged 2D transform block to an output device which generates a 2D real image representation of the merged 2D transform block.
11. A computer implemented method for hybrid domain processing of a multi-dimensional transformed image comprising the steps of:
receiving first and second input blocks of transform coefficients for two N1×N2 image blocks G and H; performing a one-dimensional inverse transform on n1 rows of image block G and N1-n1 rows of image block H to produce N1 rows of "hybrid" data, where n1<N1 and is a vertical merge/shift parameter; merging/shifting the image blocks G and H into a temporary N1×N2 block K by combining the n1 "hybrid" rows with the N1-n1 "hybrid" rows and performing a one-dimensional forward transformation on the resulting N1 "hybrid" rows; performing a one-dimensional inverse transform on N2-n2 columns of block K and on image block G to produce N2 columns of "hybrid" data, where n2<N2 and is a horizontal merge/shift parameter; merging/shifting the block K and image block G by combining the n2 "hybrid" columns with the N2-n2 "hybrid" columns and performing a one-dimensional forward transformation on the resulting N2 "hybrid" columns; and outputting the merged/shifted block K and image block G to an output device which generates a real image representation of the merged/shifted block K and image block G.
14. A computer implemented method for hybrid domain processing of a multi-dimensional transformed image comprising the steps of:
receiving first and second input blocks of transform coefficients for two N1×N2 image blocks G and H; performing a one-dimensional inverse transform on n2 columns of the image block G and on N2-n2 columns of the image block H to produce N2 columns of "hybrid" data, where n2<N2 and is a horizontal merge/shift parameter; merging/shifting the image blocks G and H into a temporary N1×N2 block K by combining the n2 "hybrid" columns with the N2-n2 "hybrid" columns and performing a one-dimensional forward transformation on the resulting N2 "hybrid" columns; performing a one-dimensional inverse transform on N1-n1 rows of block K and on n1 rows of image block G to produce N1 rows of "hybrid" data, where n1<N1 and is a vertical merge/shift parameter; merging/shifting the block K and the image block G by combining the n1 "hybrid" rows with the N1-n1 "hybrid" rows and performing a one-dimensional forward transformation on the resulting N1 "hybrid" rows; and outputting the merged/shifted block K and image block G to an output device which generates a real image representation of the merged/shifted block K and image block G.
15. A computer implemented method for hybrid domain processing of a multi-dimensional transformed image comprising the steps of:
receiving first and second input blocks of quantized transform coefficients for two N1×N2 image blocks G and H; performing a dequantization and inverse transform on n2 columns of the image block G and N2-n2 columns of the image block H to produce N2 columns of "hybrid" data, where n2<N2 and is a horizontal merge/shift parameter; merging/shifting the image blocks G and H into a temporary N1×N2 block K by combining the n2 "hybrid" columns with the N2-n2 "hybrid" columns and performing a requantization with extra precision and a one-dimensional forward transformation on the resulting N2 "hybrid" columns; performing a dequantization and a one-dimensional inverse transform on N1-n1 rows of block K and on n1 rows of the image block G to produce N1 rows of "hybrid" data, where n1<N1 and is a vertical merge/shift parameter; merging/shifting the block K and the image block G by combining the n1 "hybrid" rows with the N1-n1 "hybrid" rows and performing requantization and a one-dimensional forward transformation on the resulting N1 "hybrid" rows; and outputting the merged/shifted block K and image block G to an output device which generates a real image representation of the merged/shifted block K and image block G.
12. A computer implemented method for hybrid domain processing of a multi-dimensional transformed image comprising the steps of:
receiving first and second input blocks of quantized transform coefficients for two N1×N2 image blocks G and H; performing a dequantization and inverse transform on n1 rows of image block G and N1-n1 rows of image block H to produce N1 rows of "hybrid" data, where n1<N1 and is a vertical merge/shift parameter; merging/shifting the image blocks G and H into a temporary N1×N2 block K by combining the n1 "hybrid" rows with the N1-n1 "hybrid" rows and performing a requantization with extra precision and a one-dimensional forward transformation on the resulting N1 "hybrid" rows; performing a dequantization and a one-dimensional inverse transform on N2-n2 columns of block K and on n2 columns of image block G to produce N2 columns of "hybrid" data, where n2<N2 and is a horizontal merge/shift parameter; merging/shifting the block K and image block G by combining the n2 "hybrid" columns with the N2-n2 "hybrid" columns and performing requantization and a one-dimensional forward transformation on the resulting N2 "hybrid" columns; and outputting the merged/shifted block K and image block G to an output device which generates a real image representation of the merged/shifted block K and image block G.
7. A computer implemented method for hybrid domain processing of a multi-dimensional transformed image comprising the steps of:
receiving first and second input image blocks, {tilde over (G)}1 and {tilde over (G)}2, respectively, from quantized two-dimensional transformed data of a first image; receiving first and second input image blocks, {tilde over (H)}1 and {tilde over (H)}2, respectively, from quantized two-dimensional transformed data of a second image; and for each row or column of the image blocks {tilde over (G)}1 and {tilde over (G)}2 and the image blocks {tilde over (H)}1 and {tilde over (H)}2, (1) dequantizing the ith row or column of the image blocks {tilde over (G)}1 and {tilde over (G)}2 and the ith row or column of the image blocks {tilde over (H)}1 and {tilde over (H)}2, (2) performing a one-dimensional inverse transform on the ith row or column of the image blocks {tilde over (G)}1 and {tilde over (G)}2 and the image blocks {tilde over (H)}1 and {tilde over (H)}2 to produce "hybrid" data corresponding to the dequantized ith row or column of image blocks {tilde over (G)}1 and {tilde over (G)}2 and image blocks {tilde over (H)}1 and {tilde over (H)}2, (3) merging or shifting the resulting "hybrid" data to obtain "hybrid" data of the merged or shifted row or column, (4) computing one-dimensional forward transform data corresponding to the "hybrid" data of step (3) producing a row or column {tilde over (F)}i representing a merged or shifted image, and (5) outputting an image block containing {tilde over (F)}i to an output device which generates a real image representation of the block.
2. The computer implemented method for hybrid domain processing of a multi-dimensional transformed image recited in
3. The computer implemented method for hybrid domain processing of a multi-dimensional transformed image recited in
4. The computer implemented method for hybrid domain processing of a multi-dimensional transformed image recited in
9. The computer implemented method for hybrid domain processing of a multi-dimensional transformed image recited in
10. The computer implemented method for hybrid domain processing of a multi-dimensional transformed image recited in
13. The computer implemented method for hybrid domain processing of a multi-dimensional transformed image recited in
16. The computer implemented method for hybrid domain processing of a multi-dimensional transformed image recited in
The present application is related to U.S. patent application Ser. No. 09/524,389 filed Mar. 13, 2000, by Timothy J. Trenary, Joan L. Mitchell, Charles A. Micchelli, and Marco Martens for "Shift And/or Merge of Transformed Data along Two Axes", assigned to a common assignee with this application and the disclosure of which is incorporated herein by reference.
1. Field of the Invention
The present invention generally relates to transform coding of digital data, specifically to processing of transformed data and, more particularly, to a shift and/or merge of two-dimensional transformed data using hybrid domain processing which increases the speed of, for example, processing of color images printed by color printers. The invention implements an efficient method for two-dimensional merging and shifting of JPEG (Joint Photographic Experts Group) images compressed in the Discrete Cosine Transform (DCT) domain. Since each dimension is handled separately, the shift or merge amounts are independent for the two axes. The invention provides fast shifting of the basic 8×8 DCT blocks contained in baseline JPEG compressed images to create JPEG images on a new grid.
Transform coding is the name given to a wide family of techniques for data coding, in which each block of data to be coded is transformed by some mathematical function prior to further processing. A block of data may be a part of a data object being coded, or may be the entire object. The data generally represent some phenomenon, which may be for example a spectral or spectrum analysis, an image, a video clip, etc. The transform function is usually chosen to reflect some quality of the phenomenon being coded; for example, in coding of still images and motion pictures, the Fourier transform or Discrete Cosine Transform (DCT) can be used to analyze the data into frequency terms or coefficients. Given the phenomenon being compressed, there is generally a concentration of the information into a few frequency coefficients. Therefore, the transformed data can often be more economically encoded or compressed than the original data. This means that transform coding can be used to compress certain types of data to minimize storage space or transmission time over a communication link.
An example of transform coding in use is found in the Joint Photographic Experts Group (JPEG) international standard for still image compression, as defined by ITU-T Rec. T.81 (1992) | ISO/IEC 10918-1:1994, Information technology--Digital compression and coding of continuous-tone still images, Part 1: Requirements and Guidelines. Another example is the Moving Pictures Experts Group (MPEG) international standard for motion picture compression, defined by ISO/IEC 11172:1993, Information Technology--Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbits/s. This MPEG-1 standard defines video compression in Part 2 of the standard. A more recent MPEG video standard (MPEG-2) is defined by ITU-T Rec. H.262 | ISO/IEC 13818-2: 1996 Information Technology--Generic Coding of moving pictures and associated audio--Part 2: video. All three international image data compression standards use the DCT on 8×8 blocks of samples to achieve image compression. DCT compression of images is used herein to give illustrations of the general concepts put forward below; a complete explanation can be found in Chapter 4 "The Discrete Cosine Transform (DCT)" in W. B. Pennebaker and J. L. Mitchell, JPEG: Still Image Data Compression Standard, Van Nostrand Reinhold: New York, (1993).
Wavelet coding is another form of transform coding. Special localized basis functions allow wavelet coding to preserve edges and small details. For compression the transformed data is usually quantized. Wavelet coding is used for fingerprint identification by the Federal Bureau of Investigation (FBI). Wavelet coding is a subset of the more general subband coding technique. Subband coding uses filter banks to decompose the data into particular bands. Compression is achieved by quantizing the lower frequency bands more finely than the higher frequency bands while sampling the lower frequency bands more coarsely than the higher frequency bands. A summary of wavelet, DCT, and other transform coding is given in Chapter 5 "Compression Algorithms for Diffuse Data" in Roy Hoffman, Data Compression in Digital Systems, Chapman and Hall: New York, (1997).
In any technology and for any phenomenon represented by digital data, the data before a transformation is performed are referred to as being "in the real domain". After a transformation is performed, the new data are often called "transform data" or "transform coefficients", and referred to as being "in the transform domain". Since the present invention works on multi-dimensional transformed data after taking the inverse transform on less than the total dimension, we are defining a new term, "hybrid domain", to indicate that the orthogonal axis/axes is still transformed. To simplify notation, we will describe the invention for two dimensional transform data. Unless the context makes another meaning clear, the term "transform domain" will refer to the full multi-dimensional transform domain. The function used to take data from the real domain to the transform domain is called the "forward transform". The mathematical inverse of the forward transform, which takes data from the transform domain to the real domain, is called the respective "inverse transform".
In general, the forward transform will produce real-valued data, not necessarily integers. To achieve data compression, the transform coefficients are converted to integers by the process of quantization. Suppose that (λi) is a set of real-valued transform coefficients resulting from the forward transform of one unit of data. Note that one unit of data may be a one-dimensional or two-dimensional block of data samples or even the entire data. The "quantization values" (qi) are parameters to the encoding process. The "quantized transform coefficients" or "transform-coded data" are the sequence of values (ai) defined by the quantization function Q:
$$a_i = Q(\lambda_i) = \left\lfloor \frac{\lambda_i}{q_i} + 0.5 \right\rfloor$$
where ⌊x⌋ means the greatest integer less than or equal to x.
The resulting integers are then passed on for possible further encoding or compression before being stored or transmitted. To decode the data, the quantized coefficients are multiplied by the quantization values to give new "dequantized coefficients" (λi') given by
$$\lambda_i' = q_i a_i$$
The process of quantization followed by de-quantization (also called inverse quantization) can thus be described as "rounding to the nearest multiple of qi". The quantization values are chosen so that the loss of information in the quantization step is within some specified bound. For example, for image data, one quantization level is usually the smallest change in data that can be perceived. It is quantization that allows transform coding to achieve good data compression ratios. A good choice of transform allows quantization values to be chosen which will significantly cut down the amount of data to be encoded. For example, the DCT is chosen for image compression because the frequency components which result produce almost independent responses from the human visual system. This means that the coefficients relating to those components to which the visual system is less sensitive, namely the high-frequency components, may be quantized using large quantization values without loss of image quality. Coefficients relating to components to which the visual system is more sensitive, namely the low-frequency components, are quantized using smaller quantization values.
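As a minimal sketch of the quantization and dequantization steps described above, the following assumes the rounding form of Q, floor(λi/qi + 0.5), which matches the "rounding to the nearest multiple of qi" description. The function names and the sample coefficient and quantization values are illustrative assumptions, not taken from any standard table.

```python
import math


def quantize(coeffs, qvals):
    """Map real-valued coefficients to integers: a_i = floor(lambda_i / q_i + 0.5)."""
    return [math.floor(c / q + 0.5) for c, q in zip(coeffs, qvals)]


def dequantize(quantized, qvals):
    """Recover approximate coefficients: lambda_i' = q_i * a_i."""
    return [a * q for a, q in zip(quantized, qvals)]


# Hypothetical coefficients and quantization values, chosen only for illustration.
coeffs = [103.7, -41.2, 7.9, 0.4]
qvals = [16, 11, 10, 16]

quantized = quantize(coeffs, qvals)
restored = dequantize(quantized, qvals)

# Each restored value is a multiple of q_i within q_i / 2 of the original.
errors = [abs(c - r) for c, r in zip(coeffs, restored)]
print(quantized, restored, errors)
```

Larger quantization values give smaller integers to encode at the cost of a larger possible error, which is the trade-off the surrounding text describes.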
The inverse transform also generally produces non-integer data. Usually the decoded data are required to be in integer form. For example, systems for the display of image data generally accept input in the form of integers. For this reason, a transform decoder generally includes a step that converts the non-integer data from the inverse transform to integer data, either by truncation or by rounding to the nearest integer. There is also often a limit on the range of the integer data output from the decoding process in order that the data may be stored in a given number of bits. For this reason the decoder also often includes a "clipping" stage that ensures that the output data are in an acceptable range. If the acceptable range is [a, b], then all values less than a are changed to a, and all values greater than b are changed to b.
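The integer conversion and clipping stage can be sketched as follows. The helper name `to_sample` and its default range [0, 255] (8-bit samples) are assumptions made for illustration; the sketch only shows how rounding, truncation, and clipping differ.

```python
import math


def to_sample(value, lo=0, hi=255, truncate=False):
    """Convert one inverse-transform output value to an integer sample in [lo, hi]."""
    if truncate:
        n = math.trunc(value)          # drop the fractional part
    else:
        n = math.floor(value + 0.5)    # round to the nearest integer
    return max(lo, min(hi, n))         # clip to the acceptable range


# Hypothetical inverse-transform outputs.
values = [127.6, -3.2, 260.9, 12.49]
rounded = [to_sample(v) for v in values]
truncated = [to_sample(v, truncate=True) for v in values]
print(rounded, truncated)
```

Truncation can be off by almost a full sample value, whereas rounding is off by at most half of one, which is why truncating conversions introduce larger errors.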
These rounding and clipping processes are often considered an integral part of the decoder, and they are the cause of inaccuracies in decoded data, in particular when decoded data are re-encoded. For example, the JPEG standard (Part 1) specifies that a source image sample is defined as an integer with precision P bits, with any value in the range 0 to 2^P - 1. The decoder is expected to reconstruct the output from the inverse discrete cosine transform (IDCT) to the specified precision. For the baseline JPEG coding P is defined to be 8; for other JPEG DCT-based coding P can be 8 or 12. The MPEG-2 video standard states in Annex A (Discrete Cosine Transform), "The input to the forward transform and the output from the inverse transform is represented with 9 bits."
For the JPEG standard, the compliance test data for the encoder source image test data and the decoder reference test data are 8 bit/sample integers. Even though rounding to integers is typical, some programming languages convert from floating point to integers by truncation. Implementations in software that accept this conversion to integers by truncation introduce larger errors into the real-domain integer output from the inverse transform.
The term "high-precision" is used herein to refer to numerical values which are stored to a precision more accurate than the precision used when storing the values as integers. Examples of high-precision numbers are floating-point or fixed-point representations of numbers.
In performing a printing operation, there is a need for the printer to be able to merge a portion of an 8×8 Discrete Cosine Transform (DCT) domain block with the complementary portion of a second DCT block quickly. The traditional approach involves conversion from the DCT domain for each of the original blocks to the respective real domains (each a 64-sample space) via an inverse DCT, followed by merging the components of interest from each block in the real domain and finally transforming this new image back to the DCT domain. This method involves more computations than are necessary and lengthens total processing time.
While it is commonplace for graphics utilities to merge two independent images with brute force pixel-by-pixel merges as described above, it is also possible to approach the problem by working exclusively in the frequency domain. This approach potentially has at least two advantages over the traditional method in that it (1) provides for faster and more flexible image processing than the traditional technology and (2) eliminates errors which routinely take place when working in the real domain with fixed precision computation by avoiding the real domain entirely. The invention described in copending application Ser. No. 09/524,389 works exclusively in the transform domain.
The present invention relates to processing in an intermediate domain, called the "hybrid" domain, a domain between the transform domain and the real domain. This type of processing is called hybrid domain processing.
2. Background Description
The purpose of image compression is to represent images with less data in order to save storage costs or transmission time and costs. The most effective compression is achieved by approximating the original image, rather than reproducing it exactly. The JPEG standard allows the interchange of images between diverse applications and opens up the capability to provide digital continuous-tone color images in multi-media applications. JPEG is primarily concerned with images that have two spatial dimensions, contain grayscale or color information, and possess no temporal dependence, as distinguished from the MPEG (Moving Pictures Experts Group) standard. The amount of data in a digital image can be extremely large, sometimes being millions of bytes. JPEG compression can reduce the storage requirements by more than an order of magnitude and improve system response time in the process.
One of the basic building blocks for JPEG is the Discrete Cosine Transform (DCT). An important aspect of this transform is that it produces uncorrelated coefficients. Decorrelation of the coefficients is very important for compression because each coefficient can be treated independently without loss of compression efficiency. Another important aspect of the DCT is the ability to quantize the DCT coefficients using visually-weighted quantization values. Since the human visual system response is very dependent on spatial frequency, by decomposing an image into a set of waveforms, each with a particular spatial frequency, it is possible to separate the image structure the eye can see from the structure that is imperceptible. The DCT provides a good approximation to this decomposition.
The most straightforward way to implement the DCT is to follow the theoretical equations. When this is done, an upper limit of 64 multiplications and 56 additions is required for each one-dimensional (1-D) 8-point DCT. A full 8×8 DCT done in separable one-dimensional format--eight rows and then eight columns--would require 1,024 multiplications and 896 additions plus additional operations to quantize the coefficients. In order to improve processing speed, fast DCT algorithms have been developed. The origins of these algorithms go back to the algorithm for the Fast Fourier Transform (FFT) implementation of the Discrete Fourier Transform (DFT). The most efficient algorithm for the 8×8 DCT requires only 54 multiplications, 464 additions and 6 arithmetic shifts.
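The operation counts for the direct implementation can be checked with a small sketch. It assumes the orthonormal 8-point DCT-II and simply counts multiplications and additions while evaluating the defining sums; the function names and the `counts` dictionary are illustrative, and no fast algorithm is attempted.

```python
import math

N = 8
# Orthonormal DCT-II basis: C[k][n] = s(k) * cos((2n + 1) * k * pi / (2N)).
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]


def fdct_1d(x, counts):
    """Direct 8-point forward DCT that tallies every multiply and add."""
    out = []
    for k in range(N):
        acc = C[k][0] * x[0]
        counts["mul"] += 1
        for n in range(1, N):
            acc += C[k][n] * x[n]
            counts["mul"] += 1
            counts["add"] += 1
        out.append(acc)
    return out


def fdct_2d(block, counts):
    """Separable 8x8 forward DCT: eight row transforms, then eight column transforms."""
    rows = [fdct_1d(r, counts) for r in block]
    cols = [fdct_1d([rows[i][j] for i in range(N)], counts) for j in range(N)]
    return [[cols[j][i] for j in range(N)] for i in range(N)]


counts_1d = {"mul": 0, "add": 0}
fdct_1d([1.0] * N, counts_1d)

counts_2d = {"mul": 0, "add": 0}
coeffs = fdct_2d([[1.0] * N for _ in range(N)], counts_2d)
print(counts_1d, counts_2d)
```

Sixteen one-dimensional transforms at 64 multiplications and 56 additions each give the 1,024 and 896 totals quoted above.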
The two basic components of an image compression system are the encoder and the decoder. The encoder compresses the "source" image (the original digital image) and provides a compressed data (or coded data) output. The compressed data may be either stored or transmitted, but at some point are fed to the decoder. The decoder recreates or "reconstructs" an image from the compressed data. In general, a data compression encoding system can be broken into three basic parts: an encoder model, an encoder statistical model, and an entropy encoder. The encoder model generates a sequence of "descriptors" that is an abstract representation of the image. The statistical model converts these descriptors into symbols and passes them on to the entropy encoder. The entropy encoder, in turn, compresses the symbols to form the compressed data. The encoder may require external tables; that is, tables specified externally when the encoder is invoked. Generally, there are two classes of tables; model tables that are needed in the procedures that generate the descriptors and entropy-coding tables that are needed by the JPEG entropy-coding procedures. JPEG uses two techniques for entropy encoding: Huffman coding and arithmetic coding. Similarly to the encoder, the decoder can be broken into basic parts that have an inverse function relative to the parts of the encoder.
JPEG compressed data contains two classes of segments: entropy-coded segments and marker segments. Other parameters that are needed by many applications are not part of the JPEG compressed data format. Such parameters may be needed as application-specific "wrappers" surrounding the JPEG data; e.g., image aspect ratio, pixel shape, orientation of image, etc. Within the JPEG compressed data, the entropy-coded segments contain the entropy-coded data, whereas the marker segments contain header information, tables, and other information required to interpret and decode the compressed image data. Marker segments always begin with a "marker", a unique 2-byte code that identifies the function of the segment.
There is a need for fast shifting of the basic 8×8 DCT blocks contained in baseline JPEG compressed images to create JPEG images on a new grid. There is also a need to merge two JPEG images into a combined JPEG compressed image. The amount of shift "m" (or the number of sample-rows or sample-columns "m" from the first image's blocks to be merged into a new block) is independent for the horizontal and vertical axes.
Traditionally, graphics utilities operate by first doing the two-dimensional (2-D) inverse transform back to the real domain and then working pixel by pixel. The pixel values have been converted to finite precision (usually 8-bits/component sample). The new two-dimensional blocks are generated by doing the two-dimensional forward transform on the processed image. The two-dimensional inverse and forward transform contain too many operations to be fast enough for high speed printing operations. Fast paths exist for the inverse DCT (IDCT) which, due to quantization, has many coefficients of zero. These special cases are estimated to improve throughput by about a factor of three. However, the forward DCT (FDCT) does not have similar fast paths because no pixel values are likely to be zero.
According to the invention, the solution applies a one-dimensional (1D) inverse DCT (IDCT) to the input two-dimensional (2D) transform blocks along the axis to be modified. Since the one-dimensional IDCT is not performed on the other axis, each block is neither real data nor transform data. It is called hybrid data. For a shift (merge), the appropriate "m" elements are picked up from one block and the "8-m" elements are picked up from the other block (in a different image for the merge) and are used as input to the one-dimensional forward DCT (FDCT) along that same axis. The re-quantization step can be merged into the final one-dimensional FDCTs. For two-dimensional shifts or merges, the results of the first one-dimensional IDCT and FDCT can be stored with extra precision to be used as input to a second one-dimensional IDCT and FDCT along the other axis. Note that the only difference between the shift and merge (assuming correct alignment of the participating blocks) is the location of the second block. In the shift case, it is the adjacent block in the image along the desired axis. In the merge case, it is from another image.
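The single-axis procedure can be illustrated with a minimal sketch, under several assumptions: unquantized floating-point coefficients, an orthonormal 8-point DCT evaluated directly rather than by a fast algorithm, and a merge that keeps the first m samples of each hybrid row from the first block and the remaining 8-m from the second. All function names and the test blocks are hypothetical. The sketch compares the hybrid-domain path with the traditional path through the real domain.

```python
import math

N = 8
# Orthonormal DCT-II basis matrix (assumed transform for this sketch).
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]


def fdct_1d(x):
    return [sum(C[k][n] * x[n] for n in range(N)) for k in range(N)]


def idct_1d(X):
    return [sum(C[k][n] * X[k] for k in range(N)) for n in range(N)]


def transpose(b):
    return [list(r) for r in zip(*b)]


def along_rows(f, b):
    return [f(r) for r in b]


def along_cols(f, b):
    return transpose(along_rows(f, transpose(b)))


def fdct_2d(b):
    return along_cols(fdct_1d, along_rows(fdct_1d, b))


def idct_2d(B):
    return along_rows(idct_1d, along_cols(idct_1d, B))


def merge_hybrid(G, H, m):
    """Merge along the horizontal axis using only 1D transforms on that axis."""
    out = []
    for g_row, h_row in zip(G, H):
        g_hyb = idct_1d(g_row)   # hybrid data: real horizontally, still transformed vertically
        h_hyb = idct_1d(h_row)
        out.append(fdct_1d(g_hyb[:m] + h_hyb[m:]))
    return out


def merge_real(G, H, m):
    """Traditional path: full 2D IDCT, merge pixel columns, full 2D FDCT."""
    g, h = idct_2d(G), idct_2d(H)
    return fdct_2d([gr[:m] + hr[m:] for gr, hr in zip(g, h)])


# Hypothetical pixel blocks, transformed to obtain the input DCT blocks.
g = [[(3 * i + 5 * j) % 17 for j in range(N)] for i in range(N)]
h = [[(7 * i * j + 2) % 23 for j in range(N)] for i in range(N)]
G, H = fdct_2d(g), fdct_2d(h)

a, b = merge_hybrid(G, H, 3), merge_real(G, H, 3)
max_diff = max(abs(a[i][j] - b[i][j]) for i in range(N) for j in range(N))
print(max_diff)
```

Because the transform is linear and separable, the vertical transform commutes with the horizontal merge, so the hybrid path needs 24 one-dimensional transforms per output block here instead of the 48 used by the real-domain path. The same loop works for every m, which is why no special cases per shift amount are needed.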
There are several advantages of handling operations on two-dimensional transform blocks such as the two-dimensional shift and/or merge by doing the one-dimensional IDCT, the operation along that axis, and the one-dimensional FDCT and then repeating for the other axis. These include the following:
1. The computations for the one-dimensional IDCT have known fast paths. For columns (or rows) along the direction of block change that contain only zero elements in both blocks (usually at least half the columns or rows), the one-dimensional IDCT calculation, shift/merge, and then the one-dimensional FDCT calculation can be skipped because the result has only zeros for elements. After a full two-dimensional IDCT, few elements are zero.
2. This invention takes advantage of the fast one-dimensional IDCT and FDCT code which may already be implemented and optimized for the product.
3. This invention eliminates the need for special, separate cases for the different "m" shift or merge amounts.
4. The generic shift of the two-dimensional DCT block by (m,n) should execute faster than any other known method. The merge operation should execute in similar times. The execution times for all combinations of m and n should be about the same.
5. For one-dimensional operations on the two-dimensional transform blocks, the process is much faster than performing the full two-dimensional inverse transform to get to the real domain and then the full two-dimensional forward transform to get back to two-dimensional transform blocks.
6. The precision after the one-dimensional IDCTs and the first one-dimensional FDCTs is maintained sufficiently to eliminate concerns about multi-generation problems. The second one-dimensional IDCT along the orthogonal axis does not need to include dequantization if the first FDCT does not include quantization. The quantization is done after the second IDCTs, as part of the second FDCTs.
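The two-pass processing of items 4 and 6 can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the optimized fast code referred to above: the helper names `dct_matrix` and `shift_2d`, the block names `A`, `B`, `C`, `E`, and the convention for which rows and columns are picked up are all hypothetical, and dense matrix products stand in for fast one-dimensional IDCT and FDCT routines. Intermediate results are kept in floating point, and quantization would be applied only to the returned block.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal n-point DCT-II matrix (row = frequency, column = sample)."""
    u = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    d = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def shift_2d(A, B, C, E, m, n, D):
    """Shift by (m, n) across four dequantized 2-D DCT blocks laid out as
           A B
           C E
    Assumed convention: the output takes the last m rows of the upper blocks
    and the first 8-m rows of the lower blocks, and likewise n / 8-n columns."""
    def vertical(top, bottom):
        ht, hb = D.T @ top, D.T @ bottom                  # 1-D IDCT on columns
        return D @ np.vstack([ht[8 - m:], hb[:8 - m]])    # merge, 1-D FDCT
    def horizontal(left, right):
        hl, hr = left @ D, right @ D                      # 1-D IDCT on rows
        return np.hstack([hl[:, 8 - n:], hr[:, :8 - n]]) @ D.T
    # First pass along the vertical axis, kept at full float precision.
    left, right = vertical(A, C), vertical(B, E)
    # Second pass along the horizontal axis; quantize only after this step.
    return horizontal(left, right)

D = dct_matrix()
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(16, 16)).astype(float)
A, B, C, E = [D @ image[r:r + 8, c:c + 8] @ D.T for r in (0, 8) for c in (0, 8)]
m, n = 3, 5
shifted = shift_2d(A, B, C, E, m, n, D)
# Reference: crop the same window in the real domain, then transform it.
reference = D @ image[8 - m:16 - m, 8 - n:16 - n] @ D.T
print(np.allclose(shifted, reference))
```

The same routine is used for every combination of m and n, which is the reason no special cases per shift amount are needed.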
This invention will also work for other orthogonal transforms, or even higher dimensionality orthogonal transformations, if they are fully separable into one-dimensional transforms along each axis. It also works for processing other than the shift and merge examples, provided that the sequence of full inverse transform, processing in the real domain, and full forward transform can be mathematically separated into one-dimensional transformations, so that a less than full inverse transform can be performed, the processing done in the resulting hybrid transform space, and the corresponding forward transform then performed to return to the full multi-dimensional transform.
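As an illustration of the higher-dimensional case, the following sketch merges two three-dimensional transform blocks along a single axis, so that only a one-dimensional inverse transform is ever applied. It assumes 8×8×8 blocks and a separable orthonormal DCT; the helper names `apply_along`, `forward_3d` and `merge_axis0` and the merge amount `k` are hypothetical and introduced only for this example.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal n-point DCT-II matrix (row = frequency, column = sample)."""
    u = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    d = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def apply_along(T, X, axis):
    """Multiply every 1-D fibre of X along `axis` by the matrix T."""
    return np.moveaxis(np.tensordot(T, X, axes=(1, axis)), 0, axis)

def forward_3d(X, D):
    """Full separable 3-D transform: one 1-D transform per axis."""
    for axis in range(3):
        X = apply_along(D, X, axis)
    return X

def merge_axis0(X1_t, X2_t, k, D):
    """Merge two 3-D transform blocks along axis 0 in the hybrid domain."""
    h1 = apply_along(D.T, X1_t, 0)    # inverse transform on one axis only
    h2 = apply_along(D.T, X2_t, 0)
    merged = np.concatenate([h1[8 - k:], h2[:8 - k]], axis=0)
    return apply_along(D, merged, 0)  # back to full 3-D transform space

D = dct_matrix()
rng = np.random.default_rng(1)
X1, X2 = rng.normal(size=(2, 8, 8, 8))
k = 2
result = merge_axis0(forward_3d(X1, D), forward_3d(X2, D), k, D)
# Reference: merge in the real domain, then apply the full 3-D transform.
reference = forward_3d(np.concatenate([X1[8 - k:], X2[:8 - k]], axis=0), D)
print(np.allclose(result, reference))
```

The other two axes remain in transform space throughout, which is what distinguishes the hybrid block from both real data and fully transformed data.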
The foregoing and other objects, aspects and advantages will be better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:
Although the preferred embodiment uses JPEG compressed data, it will be understood by those skilled in the art that the principles of the invention can be applied to MPEG compressed data or any data transformed by multi-dimensional transforms that separate into orthogonal one-dimensional transforms.
Referring now to the drawings, and more particularly to
The example of this single gray component can be extended to color by those familiar with color printing. Color images are traditionally printed in Cyan, Magenta, Yellow and blacK (CMYK) toner or ink. High speed printers can have dedicated hardware for each component to decode independent JPEG images. Lower speed printers may take Red, Green and Blue (RGB) images and do the color conversion internally to the printer. In either case, the JPEG blocks are composed of only one component.
The present invention will be described in the case of two-dimensional transforms which can be separated into one-dimensional orthogonal transforms; in particular, the two-dimensional DCT transform. However, the technique can be generalized to multi-dimensional transforms based on lower dimensional orthogonal transforms, like the two-dimensional DCT transform.
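Let $D: \mathbb{R}^8 \to \mathbb{R}^8$ be the one-dimensional DCT, whose matrix is
$$D = (d_{ux}), \qquad d_{ux} = \frac{C_u}{2}\cos\frac{(2x+1)u\pi}{16}, \qquad u, x = 0, 1, \ldots, 7,$$
where $C_0 = 1/\sqrt{2}$ and $C_u = 1$ for $u > 0$.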
This matrix is orthogonal, and so the inverse of the matrix, $D^{-1}$, is equal to its transpose, $D^{T}$. The input domain of $D$ is called the sampled or real domain, and the output is said to be in the DCT-domain. The points in the DCT-domain are denoted by capitals with tildes on top; e.g., $\tilde{F}$, $\tilde{G}$, $\tilde{H}$. The points in the real domain are denoted by $F$, $G$, $H$.
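The orthogonality property can be checked numerically. The sketch below assumes the standard orthonormal 8-point DCT-II normalization used with JPEG; the helper name `dct_matrix` is hypothetical. It confirms that the transpose acts as the inverse and that a real-domain block is recovered from its DCT-domain counterpart.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal n-point DCT-II matrix (row = frequency, column = sample)."""
    u = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    d = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

D = dct_matrix()
F = np.arange(64, dtype=float).reshape(8, 8)   # a block in the real domain
F_t = D @ F @ D.T                              # the same block in the DCT-domain
F_back = D.T @ F_t @ D                         # the inverse uses the transpose
print(np.allclose(D @ D.T, np.eye(8)), np.allclose(F_back, F))
```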
Merging Operator in Hybrid (One-Dimensional) Domain
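Let
$$\mathcal{M}: \mathbb{R}^8 \times \mathbb{R}^8 \to \mathbb{R}^8$$
be a linear one-dimensional merge operator. This merge operator defines a two-dimensional merge operator as follows. Let $G_1, G_2 \in M(8)$ be elements in the real domain, $G_1$ being a block above $G_2$. Now, we can merge the $G_1$ and $G_2$ blocks column-by-column
$$F = \mathcal{M}(G_1, G_2)$$
or
$$F = \mathcal{M}\begin{pmatrix} G_1 \\ G_2 \end{pmatrix}.$$
The transforms are denoted by $\tilde{F}$, $\tilde{G}_1$, and $\tilde{G}_2$. Then
$$\tilde{F} = D\,\mathcal{M}\left(D^{-1}\tilde{G}_1,\; D^{-1}\tilde{G}_2\right)$$
or in block form notation
$$\tilde{F} = D\,\mathcal{M}\begin{pmatrix} D^{-1}\tilde{G}_1 \\ D^{-1}\tilde{G}_2 \end{pmatrix}.$$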
Observe that $D^{-1}\tilde{G}_1$ and $D^{-1}\tilde{G}_2$ are neither real data blocks nor transform data blocks; they are hybrid data. The above formula describes the algorithm. First apply a fast inverse one-dimensional DCT algorithm to the columns of $\tilde{G}_1$ and $\tilde{G}_2$, then merge (in the hybrid domain), and finally apply a fast forward one-dimensional DCT algorithm to the resultant columns.
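The three steps just described (inverse one-dimensional DCT on the columns, merge in the hybrid domain, forward one-dimensional DCT on the resultant columns) can be sketched as below. This is an illustration under assumptions rather than the fast implementation: dense matrix products replace the fast algorithms, the names `dct_matrix`, `merge_rows` and `hybrid_merge` are hypothetical, and the merge is assumed to take the last m rows of the upper block followed by the first 8-m rows of the lower block.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal n-point DCT-II matrix (row = frequency, column = sample)."""
    u = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    d = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def merge_rows(top, bottom, m):
    """Column-by-column merge: m elements from `top`, 8-m from `bottom`."""
    return np.vstack([top[8 - m:], bottom[:8 - m]])

def hybrid_merge(G1_t, G2_t, m, D):
    """Merge two 2-D DCT blocks (G1 above G2) without leaving the hybrid domain."""
    H1 = D.T @ G1_t                    # 1-D IDCT on every column of block 1
    H2 = D.T @ G2_t                    # 1-D IDCT on every column of block 2
    return D @ merge_rows(H1, H2, m)   # merge, then 1-D FDCT on every column

D = dct_matrix()
rng = np.random.default_rng(0)
G1 = rng.integers(0, 256, size=(8, 8)).astype(float)
G2 = rng.integers(0, 256, size=(8, 8)).astype(float)
m = 3
F_t = hybrid_merge(D @ G1 @ D.T, D @ G2 @ D.T, m, D)
# Reference: merge in the real domain, then apply the full 2-D DCT.
F_ref = D @ merge_rows(G1, G2, m) @ D.T
print(np.allclose(F_t, F_ref))
```

The row axis of each block is never inverse transformed, so only half of the separable two-dimensional work is done in each direction.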
The fast forward and inverse one-dimensional DCT algorithms allow incorporation of quantization and de-quantization.
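The algebra behind this incorporation can be shown with dense matrices. The sketch below is an assumption-laden illustration only: in a real fast algorithm the scaling would be folded into the algorithm's constants, whereas here a scaled copy of the DCT matrix is formed per column. The quantization table `Q`, the coefficient block `C_q` and the function names are hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal n-point DCT-II matrix (row = frequency, column = sample)."""
    u = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    d = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def idct_cols_with_dequant(C_q, Q, D):
    """Column-wise 1-D IDCT with de-quantization folded into the matrix."""
    H = np.empty((8, 8))
    for j in range(8):
        scaled = D.T * Q[:, j]            # column k of D^T scaled by Q[k, j]
        H[:, j] = scaled @ C_q[:, j]
    return H

def fdct_cols_with_quant(H, Q, D):
    """Column-wise 1-D FDCT with quantization folded into the matrix."""
    out = np.empty((8, 8), dtype=int)
    for j in range(8):
        scaled = D / Q[:, j][:, None]     # row k of D divided by Q[k, j]
        out[:, j] = np.rint(scaled @ H[:, j])
    return out

D = dct_matrix()
rng = np.random.default_rng(2)
Q = rng.integers(1, 32, size=(8, 8)).astype(float)   # made-up quantization table
C_q = rng.integers(-10, 11, size=(8, 8))             # made-up quantized block
H = idct_cols_with_dequant(C_q, Q, D)
print(np.allclose(H, D.T @ (C_q * Q)))
print(np.array_equal(fdct_cols_with_quant(H, Q, D), C_q))
```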
This essentially one-dimensional algorithm can be improved by using Fast-Paths. Namely, the rows and columns of the matrices $\tilde{G}_1$ and $\tilde{G}_2$, which are formed by quantization, will be non-zero only in their first τ coefficients, where τ=1, 2, 3, . . . , 8. There are two possibilities. First, we can determine the size τ of the non-zero region each time during the process. Alternatively, we could fix τ and always perform the calculation based on that size.
Assume the matrices $\tilde{G}_1$ and $\tilde{G}_2$ are non-zero only in the upper left square of size τ×τ. We will describe the computation to be done in this case. As an example, take τ=3. This means that the columns 4, 5, 6, 7, and 8 are zero columns. For these columns the inverse DCT has nothing to do. The remaining columns 1, 2, and 3 have only the first three entries non-zero. To these first three columns we do not have to apply the complete inverse DCT algorithm; we only have to apply the corresponding Fast Path fast inverse algorithm. We have gained by exploiting the structure of matrices in the DCT domain in two different ways.
Now, we have to re-group the six columns of hybrid data obtained above (three from each block). To finish the computation, we have to apply a fast FDCT algorithm to all three resultant columns. The method described above has three main steps.
1) Hybrid domain processing;
2) Fast algorithms for 1-D FDCT and 1-D IDCT; and
3) Fast Path algorithms.
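The Fast Path idea can be sketched as follows, again with dense matrix products standing in for the fast algorithms. The names `nonzero_size` and `hybrid_merge_fast`, the sample coefficient values, and the merge convention (last m rows of the upper block, first 8-m rows of the lower block) are assumptions made for this illustration. Zero columns are skipped entirely, and the inverse transform of the remaining columns uses only the first τ basis vectors.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal n-point DCT-II matrix (row = frequency, column = sample)."""
    u = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    d = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    d[0, :] /= np.sqrt(2.0)
    return d

def nonzero_size(block):
    """Smallest tau such that the block is zero outside its upper-left tau x tau."""
    rows, cols = np.nonzero(block)
    return 0 if rows.size == 0 else int(max(rows.max(), cols.max())) + 1

def hybrid_merge_fast(G1_t, G2_t, m, D, tau):
    """Hybrid-domain merge that touches only the first tau columns."""
    H1 = D[:tau].T @ G1_t[:tau, :tau]    # reduced IDCT: tau inputs per column
    H2 = D[:tau].T @ G2_t[:tau, :tau]
    merged = np.vstack([H1[8 - m:], H2[:8 - m]])   # 8 x tau hybrid columns
    F_t = np.zeros((8, 8))               # skipped columns stay zero
    F_t[:, :tau] = D @ merged            # FDCT on the tau merged columns only
    return F_t

D = dct_matrix()
G1_t = np.zeros((8, 8))
G2_t = np.zeros((8, 8))
G1_t[:3, :3] = [[40, -3, 2], [5, 1, 0], [-2, 0, 1]]
G2_t[:3, :3] = [[-12, 4, 0], [3, 0, 0], [0, 0, 0]]
tau = max(nonzero_size(G1_t), nonzero_size(G2_t))
m = 5
fast = hybrid_merge_fast(G1_t, G2_t, m, D, tau)
# Reference: the same merge computed on all eight columns.
full = D @ np.vstack([(D.T @ G1_t)[8 - m:], (D.T @ G2_t)[:8 - m]])
print(tau, np.allclose(fast, full))
```

Note that after the merge the τ resultant columns are in general non-zero in all eight entries, so the forward transform on those columns is a full one-dimensional FDCT.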
It should be noticed that the steps 1) and 3) are generally applicable whenever we consider two-dimensional transforms of the form
$$\tilde{F} = D F D^{T}$$
where $D$ is an orthogonal matrix, or even multi-dimensional transforms which can be separated into orthogonal one-dimensional transforms. The specific implementations are based on DCT and the fast algorithms for DCT and IDCT are explored.
More generally, given a block of n-dimensional orthonormally transformed data representing some original real n-dimensional data, one can perform an m-dimensional (1≦m<n) inverse transform on this block to produce an n-dimensional "hybrid" block which can then be manipulated to effect a desired m-dimensional change in the n-dimensional real data. At this point, an m-dimensional forward transform is applied to return the manipulated hybrid block back to n-dimensional transform space. By way of illustrative example,
It is a simple matter to extend the previous one-dimensional construction to allow for the merging of two 8×8 sample blocks as illustrated in FIG. 11.
Arbitrary corner merges can be obtained by sixteen successive applications of the one-dimensional algorithm, eight on each column and eight on each row, as shown in FIG. 14. To get the hybrid block shown in
As shown in
Returning to
Although we have focused our attention on the Discrete Cosine Transform (DCT) since we are interested in processing JPEG (Joint Photographic Experts Group) images, the disclosed algorithm can be easily extended to any other situation where orthogonal linear transformations are applied to a real sample space to merge signals (e.g., the Hadamard-Walsh Transform, used in digital image and speech processing).
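The extension to another orthogonal transform can be illustrated by substituting a normalized Hadamard matrix for the DCT matrix. The sketch below is an assumption-based illustration: it builds the matrix by the Sylvester construction (natural Hadamard ordering rather than sequency-ordered Walsh ordering), and the names `hadamard_matrix` and `hybrid_merge` are hypothetical. The merge routine itself depends only on the transform matrix being orthogonal.

```python
import numpy as np

def hadamard_matrix(order=3):
    """Normalized Sylvester-Hadamard matrix of size 2**order (orthogonal)."""
    h = np.array([[1.0]])
    h2 = np.array([[1.0, 1.0], [1.0, -1.0]])
    for _ in range(order):
        h = np.kron(h, h2)
    return h / np.sqrt(h.shape[0])

def hybrid_merge(G1_t, G2_t, m, T):
    """Merge two transform blocks (block 1 above block 2) for any orthogonal T."""
    size = T.shape[0]
    h1, h2 = T.T @ G1_t, T.T @ G2_t     # 1-D inverse transform on the columns
    return T @ np.vstack([h1[size - m:], h2[:size - m]])

T = hadamard_matrix()
rng = np.random.default_rng(3)
G1 = rng.normal(size=(8, 8))
G2 = rng.normal(size=(8, 8))
m = 6
F_t = hybrid_merge(T @ G1 @ T.T, T @ G2 @ T.T, m, T)
# Reference: merge in the real domain, then apply the full 2-D transform.
F_ref = T @ np.vstack([G1[8 - m:], G2[:8 - m]]) @ T.T
print(np.allclose(T @ T.T, np.eye(8)), np.allclose(F_t, F_ref))
```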
Another application example for use of the present invention is in the high-end digital graphics market, which uses digital images with sometimes more than 100 megapixels. Glossy advertising brochures and the large photographic trade show booth backdrops are just two examples of the use of such high quality digital imagery. High-quality lossy JPEG compression is sometimes used to keep the transmission and storage costs down. Shifting the block grid, merging multiple images into one image, and/or cropping an image has traditionally required de-compressing and re-compressing, which introduces undesirable errors. Our invention avoids these problems by working at higher precision in the hybrid domain.
The above examples for the concepts of the present invention are usual for image and video transform data. The wide use of the Internet has shown the value of JPEG and MPEG compressed image data. When JPEG images are to be printed, then manipulations such as a change of scale or a change of orientation may be required. Use of the present invention overcomes the problem inherent in propagating the errors from the rounding and clipping.
Fan-folded advertising brochures typically are composed of multiple individual pictures. Today's highest end laser printers print more than one page at a time. In such cases, the images generally do not overlap, but may not have the same quantization, positioning relative to the reference grid such as the 8×8 block structure for JPEG DCTs, or orientation. If a single image is required to simplify on-the-fly decoding and printing, then composing the final picture in the transform domain avoids the precision problems inherent in the traditional ways of working in the real domain.
Similar implementations are performed for other industrial, commercial, and military applications of digital processing employing a transform and an inverse transform of data representing a phenomenon when the data is stored in the transform domain. These are thus other representative applications wherein the use of the present invention is highly advantageous.
It is further noted that this invention may also be provided as an apparatus or a computer product. For example, it may be implemented as an article of manufacture comprising a computer usable medium having computer readable program code means embodied therein for causing a computer to perform the methods of the present invention.
While the invention has been described in terms of preferred embodiments, those skilled in the art will recognize that the invention can be practiced with modification within the spirit and scope of the appended claims.
Martens, Marco, Mitchell, Joan L., Trenary, Timothy J.
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Jun 06 2000 | MARTENS, MARCO | International Business Machines Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011179 | /0649 | |
Jun 06 2000 | MITCHELL, JOAN L | International Business Machines Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011179 | /0649 | |
Jun 07 2000 | International Business Machines Corporation | (assignment on the face of the patent) | / | |||
Jun 07 2000 | TRENARY, TIMOTHY J | International Business Machines Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 011179 | /0649 | |
Aug 24 2007 | NeoMedia Technologies, Inc | YA GLOBAL INVESTMENTS, LP | SECURITY AGREEMENT | 019943 | /0778 | |
Sep 14 2007 | NEOMEDIA MIGRATION, INC | YA GLOBAL INVESTMENTS, LP | SECURITY AGREEMENT | 019943 | /0778 | |
Sep 14 2007 | NEOMEDIA MICRO PAINT REPAIR, INC | YA GLOBAL INVESTMENTS, LP | SECURITY AGREEMENT | 019943 | /0778 | |
Sep 14 2007 | NEOMEDIA TELECOM SERVICES, INC | YA GLOBAL INVESTMENTS, LP | SECURITY AGREEMENT | 019943 | /0778 | |
May 03 2011 | International Business Machines Corporation | Google Inc | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 026664 | /0866 | |
Sep 29 2017 | Google Inc | GOOGLE LLC | CHANGE OF NAME SEE DOCUMENT FOR DETAILS | 044127 | /0735 |