An encoder/decoder is based on a combination of two audio or video channels to obtain a first combination signal as a mid-signal and a residual signal derivable using a predicted side signal derived from the mid-signal. A decoder uses the prediction residual signal, the first combination signal, a prediction direction indicator and prediction information to derive decoded first channel and second channel signals. A real-to-imaginary transform can be applied for estimating the imaginary part of the spectrum of the first combination signal. For the prediction signal used in the derivation of the prediction residual signal, the real-valued first combination signal is multiplied by a real portion of the complex prediction information and the estimated imaginary part of the first combination signal is multiplied by an imaginary portion of the complex prediction information.

Patent: RE49469
Priority: Apr 13 2010
Filed: Nov 17 2020
Issued: Mar 21 2023
Expiry: Feb 17 2031
Assg.orig Entity: Large
0. 23. A method of decoding an encoded multi-channel audio or video signal, the encoded multi-channel audio or video signal comprising an encoded first combination signal, an encoded prediction residual signal and prediction information, comprising:
decoding the encoded first combination signal to acquire a decoded first combination signal, and decoding the encoded residual signal to acquire a decoded residual signal; and
calculating a decoded multi-channel signal comprising a decoded first channel signal, and a decoded second channel signal using the decoded residual signal, the prediction information, the decoded first combination signal and a prediction direction indicator indicating a prediction direction associated with the decoded prediction residual signal, so that the decoded first channel signal and the decoded second channel signal are at least approximations of a first channel signal and a second channel signal of a multi-channel signal,
wherein calculating the decoded multi-channel signal comprises using a first calculation rule for calculating the decoded multi-channel signal in case of a first state of the prediction direction indicator and using a second different calculation rule for calculating the decoded multi-channel signal in case of a second different state of the prediction direction indicator,
wherein the prediction information comprises a real-valued factor different from zero, and
wherein the calculating the decoded multi-channel signal comprises multiplying the decoded first combination signal by the real factor to acquire a part of a prediction signal, and linearly combining the decoded residual signal and the part of the prediction signal.
0. 24. A non-transitory storage medium having stored thereon a computer program for performing, when running on a computer or a processor, the method of decoding an encoded multi-channel audio or video signal, the encoded multi-channel audio or video signal comprising an encoded first combination signal, an encoded prediction residual signal and prediction information, said method comprising:
decoding the encoded first combination signal to acquire a decoded first combination signal, and decoding the encoded residual signal to acquire a decoded residual signal; and
calculating a decoded multi-channel signal comprising a decoded first channel signal, and a decoded second channel signal using the decoded residual signal, the prediction information, the decoded first combination signal and a prediction direction indicator indicating a prediction direction associated with the decoded prediction residual signal, so that the decoded first channel signal and the decoded second channel signal are at least approximations of the first channel signal and the second channel signal of the multi-channel signal,
wherein calculating the decoded multi-channel signal comprises using a first calculation rule for calculating the decoded multi-channel signal in case of a first state of the prediction direction indicator and using a second different calculation rule for calculating the decoded multi-channel signal in case of a second different state of the prediction direction indicator,
wherein the prediction information comprises a real-valued factor different from zero, and
wherein the calculating the decoded multi-channel signal comprises multiplying the decoded first combination signal by the real factor to acquire a part of a prediction signal, and linearly combining the decoded residual signal and the part of the prediction signal.
0. 20. An audio or video decoder for decoding an encoded multi-channel audio or video signal, the encoded multi-channel audio or video signal comprising an encoded first combination signal, an encoded prediction residual signal and prediction information, comprising:
a signal decoder for decoding the encoded first combination signal to acquire a decoded first combination signal, and for decoding the encoded residual signal to acquire a decoded residual signal; and
a decoder calculator for calculating a decoded multi-channel signal comprising a decoded first channel signal, and a decoded second channel signal using the decoded residual signal, the prediction information, the decoded first combination signal and a prediction direction indicator indicating a prediction direction associated with the decoded prediction residual signal, so that the decoded first channel signal and the decoded second channel signal are at least approximations of a first channel signal and a second channel signal of a multi-channel signal,
wherein the decoder calculator is configured for using a first calculation rule for calculating the decoded multi-channel signal in case of a first state of the prediction direction indicator and for using a second different calculation rule for calculating the decoded multi-channel signal in case of a second different state of the prediction direction indicator,
wherein the prediction information comprises a real-valued factor different from zero, and
wherein the decoder calculator comprises a predictor and a combination signal calculator, wherein the predictor is configured for multiplying the decoded first combination signal by the real factor to acquire a part of a prediction signal, and wherein the combination signal calculator is configured for linearly combining the decoded residual signal and the part of the prediction signal.
0. 1. An audio or video decoder for decoding an encoded multi-channel audio or video signal, the encoded multi-channel audio or video signal comprising an encoded first combination signal, an encoded prediction residual signal and prediction information, comprising:
a signal decoder for decoding the encoded first combination signal to acquire a decoded first combination signal, and for decoding the encoded residual signal to acquire a decoded residual signal; and
a decoder calculator for calculating a decoded multi-channel signal comprising a decoded first channel signal, and a decoded second channel signal using the decoded residual signal, the prediction information, the decoded first combination signal and a prediction direction indicator indicating a prediction direction associated with the decoded prediction residual signal, so that the decoded first channel signal and the decoded second channel signal are at least approximations of a first channel signal and a second channel signal of a multi-channel signal,
wherein the decoder calculator is configured for using a first calculation rule for calculating the decoded multi-channel signal in case of a first state of the prediction direction indicator and for using a second different calculation rule for calculating the decoded multi-channel signal in case of a second different state of the prediction direction indicator.
0. 2. The audio or video decoder in accordance with claim 1, in which the prediction direction indicator is comprised by the encoded multi-channel signal, and in which the audio or video decoder further comprises an input interface for extracting the prediction direction indicator and for forwarding the prediction direction indicator to the decoder calculator.
0. 3. The audio or video decoder in accordance with claim 1, in which the decoded first combination signal comprises a mid signal, in which the first calculation rule comprises the calculation of a side signal from the decoded first combination signal and the decoded residual signal; or
in which the decoded first combination signal comprises a side signal, and in which the second calculation rule comprises the calculation of a mid signal from the decoded first combination signal and the decoded residual signal.
0. 4. The audio or video decoder in accordance with claim 1, in which the decoded first combination signal comprises a mid signal, and in which the first calculation rule comprises the calculation of the decoded first channel signal and the calculation of the decoded second channel signal using the mid signal, the prediction information and the decoded residual signal without an explicit calculation of the side signal; or
in which the decoded first combination signal comprises a side signal, and in which the second calculation rule comprises the calculation of the decoded first channel signal and the calculation of the decoded second channel signal using the side signal, the prediction information and the decoded residual signal without an explicit calculation of the mid signal.
0. 5. The audio or video decoder in accordance with claim 1, in which the decoder calculator is configured for using the prediction information where the prediction information comprises a real-valued portion different from zero and/or an imaginary portion different from zero.
0. 6. The audio or video decoder of claim 1, in which the decoder calculator comprises:
a predictor for applying the prediction information to the decoded first combination signal or to a signal derived from the decoded first combination signal to acquire a prediction signal;
a combination signal calculator for calculating a second combination signal by combining the decoded residual signal and the prediction signal; and
a combiner for combining the decoded first combination signal and the second combination signal to acquire a decoded multi-channel audio or video signal comprising the decoded first channel signal and the decoded second channel signal,
wherein in case of a first state of the prediction direction indicator, the first combination signal is a sum signal and the second combination signal is a difference signal, or
wherein in case of a second state of the prediction direction indicator, the first combination signal is a difference signal and the second combination signal is a sum signal.
0. 7. The audio or video decoder in accordance with claim 1,
in which the encoded first combination signal and the encoded residual signal have been generated using an aliasing generating time-spectral conversion,
wherein the decoder further comprises:
a spectral-time converter for generating a time-domain first channel signal and a time-domain second channel signal using a spectral-time conversion algorithm matched to the time-spectral conversion algorithm;
an overlap/add processor for conducting an overlap-add processing for the time-domain first channel signal and for the time-domain second channel signal to acquire an aliasing-free first time-domain signal and an aliasing-free second time-domain signal.
0. 8. The audio or video decoder in accordance with claim 1, in which the prediction information comprises a real-valued factor different from zero,
in which the predictor is configured for multiplying the decoded first combination signal by the real factor to acquire a first part of the prediction signal, and
in which the combination signal calculator is configured for linearly combining the decoded residual signal and the first part of the prediction signal.
0. 9. The audio or video decoder in accordance with claim 1, in which the prediction information comprises an imaginary factor different from zero,
in which the predictor is configured for estimating an imaginary part of the decoded first combination signal using a real-valued part of the decoded first combination signal,
in which the predictor is configured for multiplying the imaginary part of the decoded first combination signal by the imaginary factor of the prediction information to acquire a second part of the prediction signal; and
in which the combination signal calculator is configured for linearly combining the first part of the prediction signal and the second part of the prediction signal and the decoded residual signal to acquire a second combination signal.
0. 10. The audio or video decoder in accordance with claim 6,
in which the predictor is configured for filtering at least two time-subsequent frames, where one of the two time-subsequent frames precedes or follows a current frame of the first combination signal to acquire an estimated imaginary part of a current frame of the first combination signal using a linear filter.
0. 11. The audio or video decoder in accordance with claim 6,
in which the decoded first combination signal comprises a sequence of real-valued signal frames, and
in which the predictor is configured for estimating an imaginary part of the current signal frame using only the current real-valued signal frame of the decoded first combination signal, or
in which the predictor is configured for estimating an imaginary part of the current signal frame using the current real-valued signal frame of the decoded first combination signal and only one or more preceding real-valued signal frames of the decoded first combination signal, or
in which the predictor is configured for estimating an imaginary part of the current signal frame using the current real-valued signal frame of the decoded first combination signal and only one or more following real-valued signal frames of the decoded first combination signal, or
in which the predictor is configured for estimating an imaginary part of the current signal frame using the current real-valued signal frame of the decoded first combination signal and one or more preceding real-valued signal frames and one or more following real-valued signal frames of the decoded first combination signal.
0. 12. An audio or video encoder for encoding a multi-channel audio or video signal comprising two or more channel signals, comprising:
an encoder calculator for calculating a first combination signal and a prediction residual signal using a first channel signal and a second channel signal and prediction information and a prediction direction indicator indicating a prediction direction associated with the prediction residual signal, so that a prediction residual signal, when combined with a prediction signal derived from the first combination signal or a signal derived from the first combination signal and the prediction information results in a second combination signal,
wherein the encoder calculator comprises a combiner for combining the first channel signal and the second channel signal in two different ways to acquire the first combination signal and the second combination signal;
an optimizer for calculating the prediction information so that the prediction residual signal fulfills an optimization target;
a prediction direction calculator for calculating the prediction direction indicator indicating the prediction direction associated with the prediction residual signal;
a signal encoder for encoding the first combination signal and the prediction residual signal to acquire an encoded first combination signal and an encoded prediction residual signal; and
an output interface for combining the encoded first combination signal, the encoded prediction residual signal and the prediction information to acquire an encoded multi-channel audio or video signal.
0. 13. The audio or video encoder in accordance with claim 12, in which the encoder calculator comprises:
a combiner for combining the first channel signal and the second channel signal in two different ways to acquire the first combination signal and the second combination signal;
a predictor for applying the prediction information to the first combination signal or a signal derived from the first combination signal to acquire a prediction signal or for applying prediction information to the second combination signal or a signal derived from the second combination signal to acquire a prediction signal depending on the prediction direction indicator; and
a residual signal calculator for calculating the prediction residual signal by combining the prediction signal and the second combination signal or by combining the prediction signal and the first combination signal depending on the prediction direction indicator.
0. 14. The audio or video encoder in accordance with claim 12,
in which the first channel signal is a spectral representation of a block of samples;
in which the second channel signal is a spectral representation of a block of samples,
wherein the spectral representations are either pure real-valued spectral representations or pure imaginary spectral representations,
in which the optimizer is configured for calculating the prediction information as a real-valued factor different from zero and/or as an imaginary factor different from zero, and
in which the encoder calculator comprises a real-to-imaginary transformer or an imaginary-to-real transformer for deriving a transform spectral representation from the first combination signal or from the second combination signal depending on the prediction direction indicator, and
in which the encoder calculator is configured to calculate the first combination signal or the second combination signal depending on the prediction direction indicator and to calculate the prediction residual signal from the transformed spectrum and the imaginary factor.
0. 15. The encoder in accordance with claim 12,
in which the predictor is configured for multiplying the first combination signal by a real part of the prediction information to acquire a first part of the prediction signal;
for estimating an imaginary part of the first combination signal or of the second combination signal using the first combination signal or the second combination signal;
for multiplying the imaginary part of the first or the second combined signal by an imaginary part of the prediction information to acquire a second part of the prediction signal; and
wherein the residual calculator is configured for linearly combining the first part signal of the prediction signal or the second part signal of the prediction signal and the second combination signal or the first combination signal to acquire the prediction residual signal.
0. 16. A method of decoding an encoded multi-channel audio or video signal, the encoded multi-channel audio or video signal comprising an encoded first combination signal, an encoded prediction residual signal and prediction information, comprising:
decoding the encoded first combination signal to acquire a decoded first combination signal, and decoding the encoded residual signal to acquire a decoded residual signal; and
calculating a decoded multi-channel signal comprising a decoded first channel signal, and a decoded second channel signal using the decoded residual signal, the prediction information, the decoded first combination signal and a prediction direction indicator indicating a prediction direction associated with the decoded prediction residual signal, so that the decoded first channel signal and the decoded second channel signal are at least approximations of a first channel signal and a second channel signal of a multi-channel signal,
wherein calculating the decoded multi-channel signal comprises using a first calculation rule for calculating the decoded multi-channel signal in case of a first state of the prediction direction indicator and using a second different calculation rule for calculating the decoded multi-channel signal in case of a second different state of the prediction direction indicator.
0. 17. A method of encoding a multi-channel audio or video signal comprising two or more channel signals, comprising:
calculating a first combination signal and a prediction residual signal using a first channel signal and a second channel signal, prediction information and a prediction direction indicator indicating a prediction direction associated with the prediction residual signal, so that a prediction residual signal, when combined with a prediction signal derived from the first combination signal or a signal derived from the first combination signal and the prediction information results in a second combination signal,
combining the first channel signal and the second channel signal in two different ways to acquire the first combination signal and the second combination signal;
calculating the prediction information so that the prediction residual signal fulfills an optimization target;
calculating the prediction direction indicator indicating the prediction direction associated with the prediction residual signal;
encoding the first combination signal and the prediction residual signal to acquire an encoded first combination signal and an encoded residual signal; and
combining the encoded first combination signal, the encoded prediction residual signal and the prediction information to acquire an encoded multi-channel audio or video signal.
0. 18. A non-transitory storage medium having stored thereon a computer program for performing, when running on a computer or a processor, the method of decoding an encoded multi-channel audio or video signal, the encoded multi-channel audio or video signal comprising an encoded first combination signal, an encoded prediction residual signal and prediction information, said method comprising:
decoding the encoded first combination signal to acquire a decoded first combination signal, and decoding the encoded residual signal to acquire a decoded residual signal; and
calculating a decoded multi-channel signal comprising a decoded first channel signal, and a decoded second channel signal using the decoded residual signal, the prediction information, the decoded first combination signal and a prediction direction indicator indicating a prediction direction associated with the decoded prediction residual signal, so that the decoded first channel signal and the decoded second channel signal are at least approximations of the first channel signal and the second channel signal of the multi-channel signal,
wherein calculating the decoded multi-channel signal comprises using a first calculation rule for calculating the decoded multi-channel signal in case of a first state of the prediction direction indicator and using a second different calculation rule for calculating the decoded multi-channel signal in case of a second different state of the prediction direction indicator.
0. 19. A non-transitory storage medium having stored thereon a computer program for performing, when running on a computer or a processor, the method of encoding a multi-channel audio or video signal comprising two or more channel signals, said method comprising:
calculating a first combination signal and a prediction residual signal using a first channel signal and a second channel signal prediction information and a prediction direction indicator indicating a prediction direction associated with the decoded prediction residual signal, so that a prediction residual signal, when combined with a prediction signal derived from the first combination signal or a signal derived from the first combination signal and the prediction information results in a second combination signal,
combining the first channel signal and the second channel signal in two different ways to acquire the first combination signal and the second combination signal;
calculating the prediction information so that the prediction residual signal fulfills an optimization target;
calculating the prediction direction indicator indicating the prediction direction associated with the prediction residual signal;
encoding the first combination signal and the prediction residual signal to acquire an encoded first combination signal and an encoded residual signal; and
combining the encoded first combination signal, the encoded prediction residual signal and the prediction information to acquire an encoded multi-channel audio or video signal.
0. 21. The audio or video decoder in accordance with claim 20, in which the decoded first combination signal comprises a mid signal, and in which the first calculation rule comprises the calculation of the decoded first channel signal and the calculation of the decoded second channel signal using the mid signal, the prediction information and the decoded residual signal without an explicit calculation of the side signal; or
in which the decoded first combination signal comprises a side signal, and in which the second calculation rule comprises the calculation of the decoded first channel signal and the calculation of the decoded second channel signal using the side signal, the prediction information and the decoded residual signal without an explicit calculation of the mid signal.
0. 22. The audio or video decoder in accordance with claim 20,
in which the encoded first combination signal and the encoded residual signal have been generated using an aliasing generating time-spectral conversion, wherein the decoder further comprises:
a spectral-time converter for generating a time-domain first channel signal and a time-domain second channel signal using a spectral-time conversion algorithm matched to the time-spectral conversion algorithm;
an overlap/add processor for conducting an overlap-add processing for the time-domain first channel signal and for the time-domain second channel signal to acquire an aliasing-free first time-domain signal and an aliasing-free second time-domain signal.

This application is a
alpha_im=alpha_q_im*0.1

Without prediction direction reversal, problems may occur when the side signal S has a rather high energy compared to the downmix signal M. In such cases, it may become difficult to predict the dominant part of the signal present in S, especially when M is of very low level and thus primarily consists of noise components.

Furthermore, the range of values for the prediction coefficient α may become very large, potentially leading to coding artifacts due to unwanted amplification or panning of quantization noise (e.g. spatial unmasking effects).

To give an example, one can consider a slightly panned out-of-phase signal with R=−0.9·L:
M=0.5·(L+R)=0.05·L;
S=0.5·(L−R)=0.95·L;
RES=S−(α·M);
optimum α:
α=19;
which leads to a rather large optimum prediction factor of 19.

In accordance with the present invention, the direction of prediction is switched, and this results in an increase in prediction gain with minimum computational effort and a smaller α.
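The effect of reversing the direction on the example above can be checked numerically. The following is a minimal sketch, assuming real-valued signals and a least-squares optimum for α; the helper name `optimal_alpha` and the sample values are hypothetical and not taken from the patent.

```python
def optimal_alpha(target, source):
    """Least-squares real factor alpha that minimises |target - alpha*source|^2."""
    num = sum(t * s for t, s in zip(target, source))
    den = sum(s * s for s in source)
    return num / den

# Slightly panned out-of-phase test signal, R = -0.9 * L (as in the text).
left = [0.3, -1.0, 0.7, 0.2, -0.5]          # arbitrary sample values
right = [-0.9 * x for x in left]

mid = [0.5 * (l + r) for l, r in zip(left, right)]    # M = 0.05 * L
side = [0.5 * (l - r) for l, r in zip(left, right)]   # S = 0.95 * L

alpha_forward = optimal_alpha(side, mid)    # S predicted from M
alpha_reversed = optimal_alpha(mid, side)   # M predicted from S

print(alpha_forward, alpha_reversed)
```

For this signal the forward factor is about 19, while the reversed factor is about 1/19, which illustrates why predicting M from S keeps α small when S dominates.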

In case of a side signal S with high energy compared to the mid signal M, it becomes of interest to reverse the direction of the prediction so that M is predicted from the complex-valued representation of S as, for example, illustrated in FIG. 13b(2). When the direction of prediction is switched so that M is predicted from S, an additional MDST is needed for S, but no MDST is needed for M. Additionally, in this case, instead of the mid signal as in the first alternative in FIG. 13b(1), the (real-valued) side signal has to be transmitted to the decoder together with the residual signal and the prediction information α.

The switching of the prediction direction can be done on a per-frame basis, i.e. on the time axis, on a per-band basis, i.e. on the frequency axis, or a combination thereof, so that switching per frame and per band is allowed. This results in a prediction direction indicator (a bit) for each frame and each band, but it might be useful to only allow a single prediction direction for each frame.

To this end, the prediction direction calculator 219 is provided, which is illustrated in FIG. 12a. As in other figures, FIG. 12a illustrates an MDCT stage 50/51, a mid/side coding stage 2031, a real-to-complex converter 2070, a prediction signal calculator 2073/2074 and a final residual signal calculator 2034. Additionally, a prediction direction-control M/S swapper 507 is provided which is configured and useful for implementing the two different prediction rules 502, 503 illustrated in FIG. 11a. The first prediction rule is implemented when the swapper 507 is in the first state, i.e. where M and S are not swapped. The second prediction rule is implemented when the swapper 507 is in the swapping state, i.e. where M and S are swapped from the input to the output. This implementation has the advantage that the whole circuitry behind the swapper 507 is the same for both prediction directions.
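The swapper arrangement on the encoder side can be sketched as follows. This is only an illustration under simplifying assumptions: α is real-valued (so the real-to-complex converter is omitted), M=0.5·(L+R) and S=0.5·(L−R) as in the numerical example, and pred_dir equal to 0 means that S is predicted from M. The function name `encode_frame` is hypothetical.

```python
def encode_frame(left, right, alpha, pred_dir):
    """Return (downmix, residual) for one frame of real-valued spectral lines."""
    # Mid/side coding stage.
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    side = [0.5 * (l - r) for l, r in zip(left, right)]

    # M/S swapper: pass-through for pred_dir == 0, swapped for pred_dir == 1.
    first, second = (mid, side) if pred_dir == 0 else (side, mid)

    # Everything behind the swapper is identical for both directions:
    # prediction signal alpha * first, residual = second - prediction.
    residual = [s - alpha * f for f, s in zip(first, second)]
    return first, residual


# Usage: the out-of-phase example, predicted in the reversed direction.
downmix, residual = encode_frame([1.0, 2.0], [-0.9, -1.8], 1.0 / 19.0, pred_dir=1)
```

With the reversed direction and α=1/19 the residual of this test signal vanishes up to rounding, and the transmitted downmix is the side signal.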

Similarly, the different decoding rules 402, 403, i.e. the different decoder calculation rules, can also be implemented by a swapper 407 at the input of the combiner 1162 which, in the FIG. 12b embodiment, is implemented to perform an inverse mid/side coding. The swapper 407, which can also be termed a “prediction switch”, receives, at its input, the downmix signal DMX and a signal IPS, where IPS stands for inversely predicted signal. Depending on the prediction direction indicator, the swapper 407 either connects DMX to M and IPS to S or connects DMX to S and IPS to M as illustrated in the table above FIG. 12b.
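A decoder-side prediction switch could look like the sketch below. It assumes a real-valued α, a residual formed as RES = second signal − α·DMX, and an inverse mid/side coding of L=M+S, R=M−S (consistent with M=0.5·(L+R), S=0.5·(L−R)). The function name `decode_frame` is hypothetical.

```python
def decode_frame(dmx, residual, alpha, pred_dir):
    """Rebuild (left, right) from the downmix DMX and the residual."""
    # Inverse prediction: IPS is the combination signal that was not transmitted.
    ips = [r + alpha * d for d, r in zip(dmx, residual)]

    # Prediction switch: route DMX and IPS to the M and S inputs of the combiner.
    mid, side = (dmx, ips) if pred_dir == 0 else (ips, dmx)

    # Inverse mid/side coding.
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

Because only the routing in front of the combiner changes, the inverse prediction and the inverse mid/side stage are shared by both prediction directions.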

FIG. 13b illustrates an implementation of the first calculation rule of FIG. 11b, i.e. the rule illustrated by block 402. In the first embodiment, the inverse prediction is explicitly performed so that the side signal is explicitly calculated from the residual signal and the transmitted mid signal. Then, in a subsequent step, L and R are calculated by the equations to the right of the explicit inverse prediction equation in FIG. 13. In an alternative implementation, an implicit inverse prediction is performed, where the side signal S is not explicitly calculated, but where the left signal L and the right signal R are directly calculated from the transmitted M signal and the transmitted residual signal using the prediction information α.

FIG. 13d illustrates the equations for the other prediction direction, i.e. when the prediction direction indicator pred_dir is equal to 1. Again, an explicit inverse prediction to obtain M can be performed using the transmitted residual signal and the transmitted side signal and a subsequent calculation of L and R can be done using the mid signal and the side signal. Alternatively, an implicit inverse prediction can be performed so that L and R are calculated from the transmitted signal S, the residual signal and the prediction information α without explicitly calculating the mid signal M.
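Since the figures are not reproduced here, the relationship between explicit and implicit inverse prediction can be sketched under stated assumptions: a real-valued α, the residual convention RES=S−α·M for pred_dir equal to 0 and, by symmetry, RES=M−α·S for pred_dir equal to 1, and L=M+S, R=M−S. The function names are hypothetical, and each call processes a single spectral line.

```python
def explicit_inverse(dmx, res, alpha, pred_dir):
    """First rebuild the missing combination signal, then compute L and R."""
    other = res + alpha * dmx
    mid, side = (dmx, other) if pred_dir == 0 else (other, dmx)
    return mid + side, mid - side


def implicit_inverse(dmx, res, alpha, pred_dir):
    """Compute L and R directly, without forming the missing signal."""
    if pred_dir == 0:
        # DMX is M:  L = (1 + alpha) * M + RES,  R = (1 - alpha) * M - RES
        return (1 + alpha) * dmx + res, (1 - alpha) * dmx - res
    # DMX is S:  L = (1 + alpha) * S + RES,  R = (alpha - 1) * S + RES
    return (1 + alpha) * dmx + res, (alpha - 1) * dmx + res


# Both variants give the same channels for either direction.
for direction in (0, 1):
    print(explicit_inverse(0.8, 0.1, 0.25, direction),
          implicit_inverse(0.8, 0.1, 0.25, direction))
```

The implicit form is obtained by substituting the inverse prediction equation into L=M+S and R=M−S, so the two variants differ only in how the arithmetic is grouped.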

As outlined below in FIG. 13b, the sign of α can be reversed in all equations. When this is performed, FIG. 13b has, for the residual signal calculation, a sum between the two terms. Then, the explicit inverse prediction turns into a difference calculation. Depending on the actual implementation, the notation as outlined in FIG. 13b to FIG. 13d or the inverse notation may be convenient.

In the equations in FIG. 13b to FIG. 13d, several complex multiplications may occur. These complex multiplications may occur in all cases where α is a complex number. Then, the complex approximation of M or S may be used as stated in the equations. The complex multiplication results in a difference between the product of the real parts of the two factors and the product of the imaginary parts of the two factors, as illustrated in FIG. 13e for the case of α only or for the case of (1+α).
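The difference of products mentioned above is the real part of a complex product. A minimal sketch, using hypothetical helper names and arbitrary sample values, compares the written-out real arithmetic against Python's built-in complex multiplication for both α and (1+α):

```python
def real_of_alpha_times(alpha, x_re, x_im):
    """Re{alpha * (x_re + j*x_im)} written out with real arithmetic."""
    return alpha.real * x_re - alpha.imag * x_im


def real_of_one_plus_alpha_times(alpha, x_re, x_im):
    """Re{(1 + alpha) * (x_re + j*x_im)} written out with real arithmetic."""
    return (1.0 + alpha.real) * x_re - alpha.imag * x_im


alpha = complex(0.4, -0.3)   # arbitrary complex prediction coefficient
x_re, x_im = 2.0, 0.5        # a real-valued line and its estimated imaginary part

print(real_of_alpha_times(alpha, x_re, x_im),
      (alpha * complex(x_re, x_im)).real)
print(real_of_one_plus_alpha_times(alpha, x_re, x_im),
      ((1 + alpha) * complex(x_re, x_im)).real)
```

Only the real part is needed when the transmitted signals are real-valued, so the full complex product never has to be formed.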

The prediction direction calculator 219 can be implemented in different ways. FIG. 14 illustrates two basic ways for calculating the prediction direction. One way is a feed forward calculation, where the signal M and the signal S, which are generally the first combination signal and the second combination signal, are compared by calculating an energy difference as indicated in step 550. Then, in step 551, the difference is compared to a threshold, where the threshold can be set via a threshold input line or can be fixed in the program. Hence, as a decision criterion for the actual prediction direction, the energy difference between S and M can be evaluated. It is advantageous, however, that there is some hysteresis: in order to achieve the best perceptual quality, the decision criterion may be stabilized by using different decision thresholds based on the last frame's prediction direction. Another conceivable criterion for the prediction direction would be the inter-channel phase difference of the input channels.

Regarding the hysteresis, the control of the threshold can be performed in such a way that rare changes of the prediction direction in a certain time interval are favored over many changes in this time interval. Therefore, starting from a certain threshold, the threshold may be increased in response to a prediction direction change. Then, based on this high value, the threshold can be reduced more and more during periods where no prediction direction change is calculated. Then, when the threshold approaches its value before the last change, the threshold remains at the same level and the system is once again ready to change the prediction direction. This procedure allows changes within short intervals only when there is a very high difference between S and M, but allows less frequent changes when the energy differences between M and S are not so high.
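One possible realization of the feed forward decision with an adaptive threshold is sketched below. The class name, the use of decibels, and the numeric defaults (3 dB base threshold, 6 dB increase after a change, 1 dB decay per frame) are illustrative assumptions and not values given in the text.

```python
import math


class DirectionSelector:
    """Feed forward prediction direction decision with hysteresis (sketch)."""

    def __init__(self, base_db=3.0, boost_db=6.0, decay_db=1.0):
        self.base = base_db          # resting threshold
        self.boost = boost_db        # increase applied after a direction change
        self.decay = decay_db        # per-frame reduction while nothing changes
        self.threshold = base_db
        self.pred_dir = 0            # 0: S predicted from M, 1: M predicted from S

    def update(self, mid, side):
        eps = 1e-12                  # avoids log of zero for silent frames
        e_m = sum(x * x for x in mid) + eps
        e_s = sum(x * x for x in side) + eps
        diff_db = 10.0 * math.log10(e_s / e_m)   # positive when S dominates

        # The signal being predicted must dominate by the threshold to switch.
        if self.pred_dir == 0:
            wanted = 1 if diff_db > self.threshold else 0
        else:
            wanted = 0 if diff_db < -self.threshold else 1

        if wanted != self.pred_dir:
            self.pred_dir = wanted
            self.threshold = self.base + self.boost   # make the next change harder
        else:
            self.threshold = max(self.base, self.threshold - self.decay)
        return self.pred_dir
```

Right after a change only a large energy difference can flip the direction again; as the threshold decays back to its resting value, smaller differences become sufficient.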

Alternatively, or additionally, a feedback calculation can be performed, where the residual signals for both prediction directions are calculated as illustrated in step 552. Then, in step 553, the prediction direction is selected which results in a smaller residual signal, or fewer bits for the residual signal or the downmix signal, or a smaller number of overall bits, or a better quality of the audio signal, or which satisfies any other specific condition. Therefore, the prediction direction resulting in a certain optimization target is selected in this feedback calculation.
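
A feedback decision of this kind could look roughly like the sketch below, which uses residual energy as the optimization target. The function names, the direction labels, and the use of a real-valued least-squares coefficient are assumptions for illustration; a bit count or a quality measure could replace the energy comparison.

```python
import numpy as np


def ls_coeff(source, target):
    """Real-valued least-squares coefficient for predicting target from source."""
    denom = float(np.dot(source, source))
    return float(np.dot(source, target)) / denom if denom > 0.0 else 0.0


def choose_direction_feedback(m, s):
    """Try both prediction directions and keep the smaller residual.

    Returns (direction, coefficient, residual). The labels "M_to_S" and
    "S_to_M" are hypothetical names for the two directions.
    """
    m = np.asarray(m, dtype=float)
    s = np.asarray(s, dtype=float)

    # Step 552: residuals for both directions.
    a_ms = ls_coeff(m, s)
    res_ms = s - a_ms * m          # S predicted from M
    a_sm = ls_coeff(s, m)
    res_sm = m - a_sm * s          # M predicted from S

    # Step 553: pick the direction that meets the optimization target,
    # here the smaller residual energy.
    e_ms = float(np.sum(res_ms ** 2))
    e_sm = float(np.sum(res_sm ** 2))
    if e_ms <= e_sm:
        return "M_to_S", a_ms, res_ms
    return "S_to_M", a_sm, res_sm
```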

It is to be emphasized that the invention is not only applicable to stereo signals, i.e. multi-channel signals having only two channels, but is also applicable to two channels of a multi-channel signal having three or more channels such as a 5.1 or 7.1 signal. An embodiment for a multi-channel implementation may comprise the identification of a plurality of pairs of signals and the calculation and parallel transmission or storage of the data for more than one pair of signals.

In an embodiment of the audio decoder, the encoded or decoded first combination signal 104 and the encoded or decoded prediction residual signal 106 each comprises a first plurality of subband signals, wherein the prediction information comprises a second plurality of prediction information parameters, the second plurality being smaller than the first plurality, wherein the predictor 1160 is configured for applying the same prediction parameter to at least two different subband signals of the decoded first combination signal, wherein the decoder calculator 116 or the combination signal calculator 1161 or the combiner 1162 are configured for performing a subband-wise processing; and wherein the audio decoder further comprises a synthesis filterbank 52, 53 for combining subband signals of the decoded first combination signal and the decoded second combination signal to obtain a time-domain first decoded signal and a time-domain second decoded signal.
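
Applying one prediction parameter to several subband signals might be sketched as follows. The band layout `band_offsets`, the function names, and the reconstruction S = D + αM for a real-valued α and a single prediction direction are assumptions made for illustration only.

```python
import numpy as np


def expand_band_params(alphas, band_offsets):
    """Repeat each per-band parameter over all subbands of its band.

    band_offsets holds the first subband index of every band plus the
    total number of subbands, e.g. [0, 2, 6] for bands of width 2 and 4.
    """
    widths = np.diff(band_offsets)
    return np.repeat(np.asarray(alphas, dtype=float), widths)


def reconstruct_side(m, d, alphas, band_offsets):
    """Subband-wise reconstruction of S from M and the residual D."""
    alpha_per_subband = expand_band_params(alphas, band_offsets)
    return np.asarray(d, dtype=float) + alpha_per_subband * np.asarray(m, dtype=float)
```

Here two parameters cover six subbands, so the second plurality (parameters) is smaller than the first plurality (subband signals).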

In an embodiment of the audio decoder, the predictor 1160 is configured for receiving window shape information 109 and for using different filter coefficients for calculating an imaginary spectrum, where the different filter coefficients depend on different window shapes indicated by the window shape information 109.

In an embodiment of the audio decoder, the decoded first combination signal is associated with different transform lengths indicated by a transform length indicator included in the encoded multi-channel signal 100, and in which the predictor 1160 is configured for only using one or more frames of the first combination signal having the same associated transform length for estimating the imaginary part for a current frame for a first combination signal.

In an embodiment of the audio decoder, the predictor 1160 is configured for using a plurality of subbands of the decoded first combination signal adjacent in frequency, for estimating the imaginary part of the first combination signal, and wherein, in case of low or high frequencies, a symmetric extension in frequency of the current frame of the first combination signal is used for subbands associated with frequencies lower than or equal to zero or higher than or equal to half of the sampling frequency on which the current frame is based, or in which filter coefficients of a filter included in the predictor 1160a are set to different values for the missing subbands compared to non-missing subbands.

In an embodiment of the audio decoder, the prediction information 108 is included in the encoded multi-channel signal in a quantized and entropy-encoded representation, wherein the audio decoder further comprises a prediction information decoder 65 for entropy-decoding or dequantizing to obtain a decoded prediction information used by the predictor 1160, or in which the encoded multi-channel audio signal comprises a data unit indicating in the first state that the predictor 1160 is to use at least one frame preceding or following in time to a current frame of the decoded first combination signal, and indicating in the second state that the predictor 1160 is to use only a single frame of the decoded first combination signal for an estimation of an imaginary part for the current frame of the decoded first combination signal, and in which the predictor 1160 is configured for sensing a state of the data unit and for operating accordingly.

In an embodiment of the audio decoder, the prediction information 108 comprises codewords of differences between time sequential or frequency adjacent complex values, and wherein the audio decoder is configured for performing an entropy decoding step and a subsequent difference decoding step to obtain time sequential quantized complex prediction values or complex prediction values for adjacent frequency bands.
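
The difference decoding step that follows entropy decoding might be sketched as follows for frequency-adjacent values. The function name, the zero starting value, and the quantizer step size of 0.1 are assumptions; the entropy decoder itself is not shown.

```python
import numpy as np


def decode_alpha(delta_re, delta_im, step=0.1):
    """Turn entropy-decoded differences into complex prediction values.

    delta_re, delta_im: integer differences between quantized values of
    adjacent frequency bands (the first entry is relative to zero).
    step: assumed uniform quantizer step size.
    """
    # Difference decoding: a running sum restores the quantized indices.
    q_re = np.cumsum(np.asarray(delta_re, dtype=int))
    q_im = np.cumsum(np.asarray(delta_im, dtype=int))
    # Dequantization to complex prediction values, one per band.
    return step * (q_re + 1j * q_im)
```

For time-sequential differences, the same running sum would be taken across frames for each band instead of across bands.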

In an embodiment of the audio decoder, the encoded multi-channel signal comprises, as side information, a real indicator indicating that all prediction coefficients for a frame of the encoded multi-channel signal are real-valued, wherein the audio decoder is configured for extracting the real indicator from the encoded multi-channel audio signal 100, and wherein the decoder calculator 116 is configured for not calculating an imaginary signal for a frame, for which the real indicator is indicating only real-valued prediction coefficients.

In an embodiment of the audio encoder, the predictor 2033 comprises a quantizer for quantizing the first channel signal, the second channel signal, the first combination signal or the second combination signal to obtain one or more quantized signals, and wherein the predictor 2033 is configured for calculating the residual signal using quantized signals.

In an embodiment of the audio encoder, the first channel signal is a spectral representation of a block of samples, and the second channel signal is a spectral representation of a block of samples, wherein the spectral representations are either pure real spectral representations or pure imaginary spectral representations, in which the optimizer 207 is configured for calculating the prediction information 206 as a real-valued factor different from zero and/or as an imaginary factor different from zero, and in which the encoder calculator 203 is configured to calculate the first combination signal and the prediction residual signal so that the prediction signal is derived from the pure real spectral representation or the pure imaginary spectral representation using the real-valued factor.

The inventive encoded audio signal can be stored on a digital storage medium or can be transmitted on a transmission medium such as a wireless transmission medium or a wired transmission medium such as the Internet.

Although the present invention is mainly described in the context of audio processing, it is to be emphasized that the invention can also be applied to the coding or decoding of video signals. The complex prediction with varying direction can be applied, e.g., to 3D stereo video compression. In this particular example, a 2D-MDCT is used. An example for this technique is Google WebM/VP8. However, other implementations without a 2D-MDCT can be applied as well.

Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.

Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed.

Some embodiments according to the invention comprise a non-transitory or tangible data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.

Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may for example be stored on a machine readable carrier.

Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.

In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.

A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.

A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the Internet.

A further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.

A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.

In some embodiments, a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are advantageously performed by any hardware apparatus.

While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations and equivalents as fall within the true spirit and scope of the present invention.

Robilliard, Julien; Neusinger, Matthias; Helmrich, Christian; Hilpert, Johannes; Rettelbach, Nikolaus; Disch, Sascha; Edler, Bernd

Patent Priority Assignee Title
5285498, Mar 02 1992 AT&T IPM Corp Method and apparatus for coding audio signals based on perceptual model
5754733, Aug 01 1995 Qualcomm Incorporated Method and apparatus for generating and encoding line spectral square roots
5808569, Oct 11 1993 U S PHILIPS CORPORATION Transmission system implementing different coding principles
6012025, Jan 28 1998 Nokia Technologies Oy Audio coding method and apparatus using backward adaptive prediction
6539357, Apr 29 1999 AVAGO TECHNOLOGIES INTERNATIONAL SALES PTE LIMITED Technique for parametric coding of a signal containing information
6587823, Jun 29 1999 ELECTRONICS & TELECOMMUNICATION RESEARCH; Fraunhofer-Gesellschaft Data CODEC system for computer
7340391, Mar 01 2004 Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E V Apparatus and method for processing a multi-channel signal
7437299, Apr 10 2002 Koninklijke Philips Electronics N V Coding of stereo signals
7447317, Oct 02 2003 AVAGO TECHNOLOGIES GENERAL IP SINGAPORE PTE LTD Compatible multi-channel coding/decoding by weighting the downmix channel
7822617, Feb 23 2005 TELEFONAKTIEBOLAGE LM ERICSSON PUBL Optimized fidelity and reduced signaling in multi-channel audio encoding
8290783, Mar 04 2008 Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E V Apparatus for mixing a plurality of input data streams
8655670, Apr 09 2010 Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E V; DOLBY INTERNATIONAL AB Audio encoder, audio decoder and related methods for processing multi-channel audio signals using complex prediction
8768691, Mar 25 2005 III Holdings 12, LLC Sound encoding device and sound encoding method
20020010577,
20020040299,
20050141721,
20050197831,
20080002842,
20080004883,
20080249765,
20080262853,
20090190693,
20090262945,
20100014679,
20110046946,
20110096932,
20110224994,
20110257981,
20110288872,
20120010879,
20130028426,
20130030817,
20130108077,
20130266145,
CN101067931,
CN101501760,
EP4506141,
EP673014,
EP1262956,
EP1278184,
JP2004246038,
JP2005522721,
JP2013525830,
JP4506141,
JP9073299,
KR20020077959,
RU2144261,
RU98103512,
WO2008014853,
WO2009141775,
WO2008084427,
Executed on Assignor Assignee Conveyance Frame Reel Doc
Nov 17 2020 Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V. (assignment on the face of the patent)
Date Maintenance Fee Events
Nov 17 2020 BIG: Entity status set to Undiscounted (note the period is included in the code).
Dec 17 2023 M1552: Payment of Maintenance Fee, 8th Year, Large Entity.


Date Maintenance Schedule
Mar 21 2026 4 years fee payment window open
Sep 21 2026 6 months grace period start (w surcharge)
Mar 21 2027 patent expiry (for year 4)
Mar 21 2029 2 years to revive unintentionally abandoned end. (for year 4)
Mar 21 2030 8 years fee payment window open
Sep 21 2030 6 months grace period start (w surcharge)
Mar 21 2031 patent expiry (for year 8)
Mar 21 2033 2 years to revive unintentionally abandoned end. (for year 8)
Mar 21 2034 12 years fee payment window open
Sep 21 2034 6 months grace period start (w surcharge)
Mar 21 2035 patent expiry (for year 12)
Mar 21 2037 2 years to revive unintentionally abandoned end. (for year 12)