Estimates of spectral magnitude and phase are obtained by an estimation process using spectral information from analysis filter banks such as the Modified Discrete Cosine Transform. The estimation process may be implemented by convolution-like operations with impulse responses. Portions of the impulse responses may be selected for use in the convolution-like operations to trade off between computational complexity and estimation accuracy. Mathematical derivations of analytical expressions for filter structures and impulse responses are disclosed.

Patent: RE48210
Priority: Jan 27 2004
Filed: Jan 22 2018
Issued: Sep 15 2020
Expiry: Jan 27 2024

TERM.DISCL.
Assg.orig Entity: Large
0. 58. An apparatus for generating time domain signals from spectral components conveying content intended for human perception, the apparatus comprising:
one or more devices;
a non-transitory computer readable medium storing a program of instructions that is executable by the one or more devices to perform a method, the method comprising:
determining a second set of MDCT coefficients for a second segment of the MDCT signal, the second segment of the source at least partly overlapping the segment of the MDCT signal;
using a second truncated set of non-zero filter coefficients to compute a second set of estimated contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the second set of estimated contributions approximating a second set of accurate contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the second set of accurate contributions being determinable based at least in part on a second full set of non-zero filter coefficients that represents a proper superset to the second truncated set of non-zero filter coefficients;
estimating, based at least in part on the second set of estimated contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the set of MDST coefficients for the segment of the MDCT signal.
0. 45. A method for generating time domain signals from spectral components conveying content intended for human perception, the method comprising:
determining a set of Modified Discrete Cosine Transform (MDCT) coefficients for a segment of an MDCT signal;
using a truncated set of non-zero filter coefficients to compute a set of estimated contributions from the set of MDCT coefficients for the segment of the MDCT signal to a set of Modified Discrete Sine Transform (MDST) coefficients for the segment of the MDCT signal, the set of estimated contributions approximating a set of accurate contributions from the set of MDCT coefficients for the segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the set of accurate contributions being determinable based at least in part on a full set of non-zero filter coefficients that represents a proper superset to the truncated set of non-zero filter coefficients;
estimating, based at least in part on the set of estimated contributions from the set of MDCT coefficients for the segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the set of MDST coefficients for the segment of the MDCT signal, wherein the estimating of the set of MDST coefficients excludes from calculations impulse responses that are known to be zero;
causing outputting a derived signal that is generated based at least in part on the set of MDCT coefficients and the set of estimated MDST coefficients;
wherein the method is performed by one or more computing devices.
0. 63. A non-transitory computer readable medium storing a program of instructions that is executable by a device to perform a method of generating time domain signals from spectral components conveying content intended for human perception, the method comprising:
determining a set of Modified Discrete Cosine Transform (MDCT) coefficients for a segment of an MDCT signal;
using a truncated set of non-zero filter coefficients to compute a set of estimated contributions from the set of MDCT coefficients for the segment of the MDCT signal to a set of Modified Discrete Sine Transform (MDST) coefficients for the segment of the MDCT signal, the set of estimated contributions approximating a set of accurate contributions from the set of MDCT coefficients for the segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the set of accurate contributions being determinable based at least in part on a full set of non-zero filter coefficients that represents a proper superset to the truncated set of non-zero filter coefficients;
estimating, based at least in part on the set of estimated contributions from the set of MDCT coefficients for the segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the set of MDST coefficients for the segment of the MDCT signal, wherein the estimating of the set of MDST coefficients excludes from calculations impulse responses that are known to be zero;
causing outputting a derived signal that is generated based at least in part on the set of MDCT coefficients and the set of estimated MDST coefficients.
0. 1. A method of processing information representing a source signal conveying content intended for human perception, the method comprising:
receiving first spectral components that were generated by application of an analysis filterbank to the source signal, wherein the first spectral components represent spectral content of the source signal expressed in a first subspace of a multidimensional space;
deriving one or more first intermediate components from at least some of the first spectral components, wherein at least some of the first intermediate components differ from the first spectral components from which they are derived;
forming a combination of the one or more first intermediate components according to at least a portion of one or more impulse responses to obtain one or more second intermediate components;
deriving one or more second spectral components from the one or more second intermediate components, wherein the second spectral components represent spectral content of the source signal expressed in a second subspace of the multidimensional space that includes a portion of the multidimensional space not included in the first subspace;
obtaining estimated measures of magnitude or phase using the first spectral components and the second spectral components; and
applying an adaptive process to the first spectral components to generate processed information, wherein the adaptive process is responsive to the estimated measures of magnitude or phase.
0. 2. The method of claim 1, wherein:
the first spectral components are transform coefficients arranged in one or more blocks of transform coefficients that were generated by application of one or more transforms to one or more segments of the source signal; and
the portions of the one or more impulse responses are based on frequency response characteristics of the one or more transforms.
0. 3. The method of claim 2, wherein the frequency response characteristics of the one or more transforms are dependent on characteristics of one or more analysis window functions that were applied with the one or more transforms to the one or more segments of the source signal.
0. 4. The method of claim 3, wherein at least some of the one or more transforms implement an analysis filter bank that generates the first spectral components with time-domain aliasing.
0. 5. The method of claim 3, wherein at least some of the one or more transforms generate first spectral components having real values expressed in the first subspace, and wherein the second spectral values have imaginary values expressed in the second subspace.
0. 6. The method of claim 5, wherein the transforms that generate first spectral components having real values expressed in the first subspace are Discrete Cosine Transforms or Modified Discrete Cosine Transforms.
0. 7. The method of claim 1, wherein:
the first spectral components are transform coefficients arranged in one or more blocks of transform coefficients that were generated by application of one or more transforms to one or more segments of the source signal,
the one or more second intermediate components are obtained by combining the one or more first intermediate components according to a portion of the one or more impulse responses, each of the one or more impulse responses comprise a respective set of elements arranged in order, and
the portion of each of the one or more impulse responses excludes every other element in the respective set of elements.
0. 8. The method according to claim 1 that further comprises obtaining estimated measures of magnitude or phase using one or more third spectral components that are derived from at least some of the one or more first spectral components.
0. 9. The method according to claim 8, wherein:
the first spectral components are transform coefficients arranged in one or more blocks of transform coefficients that were generated by application of one or more transforms to one or more segments of the source signal;
the third spectral components are derived from a combination of two or more of the first spectral components; and
the estimated measures of magnitude or phase for a respective segment of the source signal are obtained adaptively using either the third spectral components or using the first and second spectral components.
0. 10. The method according to claim 8, wherein:
the first spectral components are transform coefficients arranged in one or more blocks of transform coefficients that were generated by application of one or more transforms to one or more segments of the source signal;
the third spectral components are derived from a combination of two or more of the first spectral components; and
the estimated measures of magnitude or phase for at least some spectral content of a respective segment of the source signal are obtained using the third spectral components and the estimated measures of magnitude or phase for at least some of the spectral content of the respective segment of the source signal are obtained using the first and second spectral components.
0. 11. The method according to claim 8 or 10 that comprises obtaining measures of magnitude or phase adaptively using either the third spectral components or using the first and second spectral components.
0. 12. The method of claim 1 that comprises adapting the portion of the one or more impulse responses in response to a measure of spectral component significance.
0. 13. The method of claim 12, wherein the measure of spectral component significance is provided by a perceptual model that assesses perceptual significance of the spectral content of the source signal.
0. 14. The method of claim 12, wherein the measure of spectral component significance reflects isolation in frequency of one or more spectral components.
0. 15. The method of claim 1, wherein:
the first spectral components are first transform coefficients arranged in one or more blocks that were generated by application of one or more transforms to one or more segments of the source signal, a respective block having a first number of first transform coefficients;
the second spectral components are second transform coefficients;
a second number of second transform coefficients are derived that represent spectral content that is also represented by some of the first transform coefficients in the respective block; and
the second number is less than the first number.
0. 16. The method according to claim 1, 2, 9, 10 or 12 that comprises:
applying the adaptive process to the first spectral components to generate synthesized spectral components;
deriving one or more third intermediate components from the first spectral components and/or the second spectral components and from the synthesized spectral components; and
generating one or more output signals conveying content intended for human perception by applying one or more synthesis filterbanks to the one or more third intermediate components.
0. 17. The method according to claim 16, wherein at least some of the synthesized spectral components are generated by spectral component regeneration.
0. 18. The method according to claim 16, wherein at least some of the synthesized spectral components are generated by decomposition of first spectral components and/or second spectral components representing a composite of spectral content for a plurality of source signals.
0. 19. The method according to claim 16, wherein at least some of the synthesized spectral components are generated by combining first spectral components and/or second spectral components to provide a composite representation of spectral content for a plurality of source signals.
0. 20. The method according to claim 1, 2, 9, 10 or 12 that comprises:
generating the first spectral components by applying the analysis filter bank to the source signal;
applying the adaptive process to the first spectral component to generate encoded information representing at least some of the first spectral components; and
generating an output signal conveying the encoded information.
0. 21. A medium conveying a program of instructions that is executable by a device to perform a method of processing information representing a source signal conveying content intended for human perception, the method comprising:
receiving first spectral components that were generated by application of an analysis filterbank to the source signal, wherein the first spectral components represent spectral content of the source signal expressed in a first subspace of a multidimensional space;
deriving one or more first intermediate components from at least some of the first spectral components, wherein at least some of the first intermediate components differ from the first spectral components from which they are derived;
forming a combination of the one or more first intermediate components according to at least a portion of one or more impulse responses to obtain one or more second intermediate components;
deriving one or more second spectral components from the one or more second intermediate components, wherein the second spectral components represent spectral content of the source signal expressed in a second subspace of the multidimensional space that includes a portion of the multidimensional space not included in the first subspace;
obtaining estimated measures of magnitude or phase using the first spectral components and the second spectral components; and
applying an adaptive process to the first spectral components to generate processed information, wherein the adaptive process is responsive to the estimated measures of magnitude or phase.
0. 22. The medium of claim 21, wherein:
the first spectral components are transform coefficients arranged in one or more blocks of transform coefficients that were generated by application of one or more transforms to one or more segments of the source signal; and
the portions of the one or more impulse responses are based on frequency response characteristics of the one or more transforms, which are dependent on characteristics of one or more analysis window functions that were applied with the one or more transforms to the one or more segments of the source signal.
0. 23. The medium according to claim 21, wherein the method further comprises obtaining estimated measures of magnitude or phase using one or more third spectral components that are derived from at least some of the one or more first spectral components.
0. 24. The medium according to claim 23, wherein:
the first spectral components are transform coefficients arranged in one or more blocks of transform coefficients that were generated by application of one or more transforms to one or more segments of the source signal;
the third spectral components are derived from a combination of two or more of the first spectral components; and
the estimated measures of magnitude or phase for a respective segment of the source signal are obtained adaptively using either the third spectral components or using the first and second spectral components.
0. 25. The medium according to claim 23, wherein:
the first spectral components are transform coefficients arranged in one or more blocks of transform coefficients that were generated by application of one or more transforms to one or more segments of the source signal;
the third spectral components are derived from a combination of two or more of the first spectral components; and
the estimated measures of magnitude or phase for at least some spectral content of a respective segment of the source signal are obtained using the third spectral components and the estimated measures of magnitude or phase for at least some of the spectral content of the respective segment of the source signal are obtained using the first and second spectral components.
0. 26. The medium according to claim 23, wherein the method comprises obtaining measures of magnitude or phase adaptively using either the third spectral components or using the first and second spectral components.
0. 27. The medium of claim 21, wherein the method comprises adapting the portion of the one or more impulse responses in response to a measure of spectral component significance.
0. 28. The medium of claim 27, wherein the measure of spectral component significance is provided by a perceptual model that assesses perceptual significance of the spectral content of the source signal.
0. 29. The medium of claim 27, wherein the measure of spectral component significance reflects isolation in frequency of one or more spectral components.
0. 30. The medium of claim 21, wherein:
the first spectral components are first transform coefficients arranged in one or more blocks that were generated by application of one or more transforms to one or more segments of the source signal, a respective block having a first number of first transform coefficients;
the second spectral components are second transform coefficients;
a second number of second transform coefficients are derived that represent spectral content that is also represented by some of the first transform coefficients in the respective block; and
the second number is less than the first number.
0. 31. The medium according to claim 21, wherein the method comprises:
applying the adaptive process to the first spectral components to generate synthesized spectral components;
deriving one or more third intermediate components from the first spectral components and/or the second spectral components and from the synthesized spectral components; and
generating one or more output signals conveying content intended for human perception by applying one or more synthesis filterbanks to the one or more third intermediate components.
0. 32. The medium according to claim 21, wherein the method comprises:
generating the first spectral components by applying the analysis filter bank to the source signal;
applying the adaptive process to the first spectral component to generate encoded information representing at least some of the first spectral components; and
generating an output signal conveying the encoded information.
0. 33. An apparatus for processing information representing a source signal conveying content intended for human perception, the apparatus comprising:
means for receiving first spectral components that were generated by application of an analysis filterbank to the source signal, wherein the first spectral components represent spectral content of the source signal expressed in a first subspace of a multidimensional space;
means for deriving one or more first intermediate components from at least some of the first spectral components, wherein at least some of the first intermediate components differ from the first spectral components from which they are derived;
means for forming a combination of the one or more first intermediate components according to at least a portion of one or more impulse responses to obtain one or more second intermediate components;
means for deriving one or more second spectral components from the one or more second intermediate components, wherein the second spectral components represent spectral content of the source signal expressed in a second subspace of the multidimensional space that includes a portion of the multidimensional space not included in the first subspace;
means for obtaining estimated measures of magnitude or phase using the first spectral components and the second spectral components; and
means for applying an adaptive process to the first spectral components to generate processed information, wherein the adaptive process is responsive to the estimated measures of magnitude or phase.
0. 34. The apparatus of claim 33, wherein:
the first spectral components are transform coefficients arranged in one or more blocks of transform coefficients that were generated by application of one or more transforms to one or more segments of the source signal; and
the portions of the one or more impulse responses are based on frequency response characteristics of the one or more transforms, which are dependent on characteristics of one or more analysis window functions that were applied with the one or more transforms to the one or more segments of the source signal.
0. 35. The apparatus according to claim 33 that further comprises means for obtaining estimated measures of magnitude or phase using one or more third spectral components that are derived from at least some of the one or more first spectral components.
0. 36. The apparatus according to claim 35 wherein:
the first spectral components are transform coefficients arranged in one or more blocks of transform coefficients that were generated by application of one or more transforms to one or more segments of the source signal;
the third spectral components are derived from a combination of two or more of the first spectral components; and
the estimated measures of magnitude or phase for a respective segment of the source signal are obtained adaptively using either the third spectral components or using the first and second spectral components.
0. 37. The apparatus according to claim 35, wherein:
the first spectral components are transform coefficients arranged in one or more blocks of transform coefficients that were generated by application of one or more transforms to one or more segments of the source signal;
the third spectral components are derived from a combination of two or more of the first spectral components; and
the estimated measures of magnitude or phase for at least some spectral content of a respective segment of the source signal are obtained using the third spectral components and the estimated measures of magnitude or phase for at least some of the spectral content of the respective segment of the source signal are obtained using the first and second spectral components.
0. 38. The apparatus according to claim 35 that comprises means for obtaining measures of magnitude or phase adaptively using either the third spectral components or using the first and second spectral components.
0. 39. The apparatus of claim 33 that comprises means for adapting the portion of the one or more impulse responses in response to a measure of spectral component significance.
0. 40. The apparatus of claim 39, wherein the measure of spectral component significance is provided by a perceptual model that assesses perceptual significance of the spectral content of the source signal.
0. 41. The apparatus of claim 39, wherein the measure of spectral component significance reflects isolation in frequency of one or more spectral components.
0. 42. The apparatus of claim 33, wherein:
the first spectral components are first transform coefficients arranged in one or more blocks that were generated by application of one or more transforms to one or more segments of the source signal, a respective block having a first number of first transform coefficients;
the second spectral components are second transform coefficients;
a second number of second transform coefficients are derived that represent spectral content that is also represented by some of the first transform coefficients in the respective block; and
the second number is less than the first number.
0. 43. The apparatus according to claim 33 that comprises:
means for applying the adaptive process to the first spectral components to generate synthesized spectral components;
means for deriving one or more third intermediate components from the first spectral components and/or the second spectral components and from the synthesized spectral components; and
means for generating one or more output signals conveying content intended for human perception by applying one or more synthesis filterbanks to the one or more third intermediate components.
0. 44. The apparatus according to claim 33 that comprises:
means for generating the first spectral components by applying the analysis filter bank to the source signal;
means for applying the adaptive process to the first spectral component to generate encoded information representing at least some of the first spectral components; and
means for generating an output signal conveying the encoded information.
0. 46. The method of claim 45, further comprising:
determining a second set of MDCT coefficients for a second segment of the MDCT signal, the second segment of the source at least partly overlapping the segment of the MDCT signal;
using a second truncated set of non-zero filter coefficients to compute a second set of estimated contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the second set of estimated contributions approximating a second set of accurate contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the second set of accurate contributions being determinable based at least in part on a second full set of non-zero filter coefficients that represents a proper superset to the second truncated set of non-zero filter coefficients;
estimating, based at least in part on the second set of estimated contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the set of MDST coefficients for the segment of the MDCT signal.
0. 47. The method of claim 46, wherein the second truncated set of non-zero filter coefficients exhibits an even symmetry.
0. 48. The method of claim 46, wherein the second truncated set of non-zero filter coefficients comprises a subset of consecutive non-zero filter coefficients in the second full set of non-zero filter coefficients.
0. 49. The method of claim 45, further comprising using encoder-generated spectral envelope estimation to perform spectral component regeneration.
0. 50. The method of claim 45, wherein the set of MDCT coefficients is of an even total number of MDCT coefficients.
0. 51. The method of claim 45, wherein the truncated set of non-zero filter coefficients exhibits an odd symmetry.
0. 52. The method of claim 45, wherein the truncated set of non-zero filter coefficients comprises most significant filter coefficients in the full set of non-zero filter coefficients.
0. 53. The method of claim 45, wherein the truncated set of non-zero filter coefficients comprises a subset of consecutive non-zero filter coefficients in the full set of non-zero filter coefficients.
0. 54. The method of claim 45, wherein the truncated set of non-zero filter coefficients comprises values dependent on one or more window functions used in filter banks that have generated the set of mdct coefficients.
0. 55. The method of claim 54, wherein the one or more window functions comprises a sine function.
0. 56. The method of claim 54, wherein the one or more window functions comprises a non-sine function.
0. 57. The method of claim 45, wherein at least one of the MDCT signal or the derived signal represents an audio signal.
0. 59. The apparatus of claim 58, wherein the method further comprises:
determining a second set of MDCT coefficients for a second segment of the MDCT signal, the second segment of the source at least partly overlapping the segment of the MDCT signal;
using a second truncated set of non-zero filter coefficients to compute a second set of estimated contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the second set of estimated contributions approximating a second set of accurate contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the second set of accurate contributions being determinable based at least in part on a second full set of non-zero filter coefficients that represents a proper superset to the second truncated set of non-zero filter coefficients;
estimating, based at least in part on the second set of estimated contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the set of MDST coefficients for the segment of the MDCT signal.
0. 60. The apparatus of claim 59, wherein the second truncated set of non-zero filter coefficients exhibits an even symmetry.
0. 61. The apparatus of claim 59, wherein the second truncated set of non-zero filter coefficients comprises a subset of consecutive non-zero filter coefficients in the second full set of non-zero filter coefficients.
0. 62. The apparatus of claim 58, wherein the method further comprises using encoder-generated spectral envelope estimation to perform spectral component regeneration.
0. 64. The medium of claim 63, wherein the method further comprises:
determining a second set of MDCT coefficients for a second segment of the MDCT signal, the second segment of the source at least partly overlapping the segment of the MDCT signal;
using a second truncated set of non-zero filter coefficients to compute a second set of estimated contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the second set of estimated contributions approximating a second set of accurate contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the second set of accurate contributions being determinable based at least in part on a second full set of non-zero filter coefficients that represents a proper superset to the second truncated set of non-zero filter coefficients;
estimating, based at least in part on the second set of estimated contributions from the second set of MDCT coefficients for the second segment of the MDCT signal to the set of MDST coefficients for the segment of the MDCT signal, the set of MDST coefficients for the segment of the MDCT signal.

and rewritten as

$$X_{\mathrm{ODFT}}(k) = \sum_{n=0}^{N-1} x(n) \cdot \cos\left[\frac{2\pi}{N}\left(k+\frac{1}{2}\right)(n+n_0)\right] - j \cdot \sum_{n=0}^{N-1} x(n) \cdot \sin\left[\frac{2\pi}{N}\left(k+\frac{1}{2}\right)(n+n_0)\right] \tag{3}$$
where $X_{\mathrm{ODFT}}(k)$=ODFT coefficient for spectral component k,

x(n)=source signal amplitude at time n;

Re[X]=real part of X; and

Im[X]=imaginary part of X.

The magnitude and phase of each spectral component k may be calculated as follows:

$$\mathrm{Mag}[X_{ODFT}(k)] = |X_{ODFT}(k)| = \sqrt{\mathrm{Re}[X_{ODFT}(k)]^2 + \mathrm{Im}[X_{ODFT}(k)]^2} \tag{4}$$
$$\mathrm{Phs}[X_{ODFT}(k)] = \arctan\left[\frac{\mathrm{Im}[X_{ODFT}(k)]}{\mathrm{Re}[X_{ODFT}(k)]}\right] \tag{5}$$
where Mag[X]=magnitude of X; and

Phs[X]=phase of X.

Many coding applications implement the analysis filter bank 3 by applying the Modified Discrete Cosine Transform (MDCT) discussed above to overlapping segments of the source signal that are modulated by an analysis window function. This transform may be expressed as:

$$X_{MDCT}(k) = \sum_{n=0}^{N-1} x(n) \cos\left[\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(n+n_0)\right] \tag{6}$$
where XMDCT(k)=MDCT coefficient for spectral component k. It may be seen that the spectral components that are generated by the MDCT are equivalent to the real part of the ODFT coefficients.
XMDCT(k)=Re[XODFT(k)]  (7)

A particular Modified Discrete Sine Transform (MDST) that generates coefficients representing spectral components in quadrature with the spectral components represented by coefficients of the MDCT may be expressed as:

$$X_{MDST}(k) = \sum_{n=0}^{N-1} x(n) \sin\left[\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(n+n_0)\right] \tag{8}$$
where XMDST(k)=MDST coefficient for spectral component k. It may be seen that the spectral components that are generated by the MDST are equivalent to the negative imaginary part of the ODFT coefficients.
XMDST(k)=−Im[XODFT(k)]  (9)

Accurate measures of magnitude and phase cannot be calculated directly from MDCT coefficients but they can be calculated directly from a combination of MDCT and MDST coefficients, which can be seen by substituting equations 7 and 9 into equations 4 and 5:

$$\mathrm{Mag}[X_{ODFT}(k)] = \sqrt{X_{MDCT}^2(k) + X_{MDST}^2(k)} \tag{10}$$
$$\mathrm{Phs}[X_{ODFT}(k)] = \arctan\left[\frac{-X_{MDST}(k)}{X_{MDCT}(k)}\right] \tag{11}$$
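As a numerical illustration of equations 7, 9, 10 and 11, the following Python sketch evaluates the ODFT, MDCT and MDST sums directly for a short test signal and recovers magnitude and phase from the MDCT/MDST pair. The test signal, the block length N=16, the value n0=(N/2+1)/2 and all function names are assumptions chosen for illustration, and `math.atan2` is substituted as a four-quadrant form of the arctangent in equation 11.

```python
import cmath
import math

def odft(x, n0):
    """Direct evaluation of equation 3 (cosine sum minus j times sine sum)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi / N * (k + 0.5) * (n + n0))
                for n in range(N)) for k in range(N)]

def mdct(x, n0):
    """Direct evaluation of equation 6 (no explicit window)."""
    N = len(x)
    return [sum(x[n] * math.cos(2 * math.pi / N * (k + 0.5) * (n + n0))
                for n in range(N)) for k in range(N)]

def mdst(x, n0):
    """Direct evaluation of equation 8 (no explicit window)."""
    N = len(x)
    return [sum(x[n] * math.sin(2 * math.pi / N * (k + 0.5) * (n + n0))
                for n in range(N)) for k in range(N)]

N = 16                      # illustrative block length
n0 = (N / 2 + 1) / 2        # phase term taken from the Princen paper
x = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(N)]

X_odft = odft(x, n0)
X_mdct = mdct(x, n0)
X_mdst = mdst(x, n0)

# Equations 10 and 11: magnitude and phase from the MDCT/MDST pair.
mag = [math.hypot(c, s) for c, s in zip(X_mdct, X_mdst)]
phs = [math.atan2(-s, c) for c, s in zip(X_mdct, X_mdst)]
```

Under these assumptions, each pair `(mag[k], phs[k])` describes the same complex value as `X_odft[k]`.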

The Princen paper mentioned above indicates that a correct use of the MDCT requires the application of an analysis window function that satisfies certain design criteria. The expressions of transform equations in this section of the disclosure omit an explicit reference to any analysis window function, which implies a rectangular analysis window function that does not satisfy these criteria. This does not affect the validity of expressions 10 and 11.

Implementations of the present invention described below obtain measures of spectral component magnitude and phase from MDCT coefficients and from MDST coefficients derived from the MDCT coefficients. These implementations are described below following a discussion of the underlying mathematical basis.

This section discusses the derivation of an analytical expression for calculating exact MDST coefficients from MDCT coefficients. This expression is shown below in equations 41a and 41b. The derivations of simpler analytical expressions for two specific window functions are also discussed. Considerations for practical implementations are presented following a discussion of the derivations.

One implementation of the present invention discussed below is derived from a process for calculating exact MDST coefficients from MDCT coefficients. This process is equivalent to another process that applies an Inverse Modified Discrete Cosine Transform (IMDCT) synthesis filter bank to blocks of MDCT coefficients to generate windowed segments of time-domain samples, overlap-adds the windowed segments of samples to reconstruct a replica of the original source signal, and applies an MDST analysis filter bank to a segment of the recovered signal to generate the MDST coefficients.

Exact MDST coefficients cannot be calculated from a single segment of windowed samples that is recovered by applying the IMDCT synthesis filter bank to a single block of MDCT coefficients because the segment is modulated by an analysis window function and because the recovered samples contain time-domain aliasing. The exact MDST coefficients can be computed only with the additional knowledge of the MDCT coefficients for the preceding and subsequent segments. For example, in the case where the segments overlap one another by one-half the segment length, the effects of windowing and the time-domain aliasing for a given segment II can be canceled by applying the synthesis filter bank and associated synthesis window function to three blocks of MDCT coefficients representing three consecutive overlapping segments of the source signal, denoted as segment I, segment II and segment III. Each segment overlaps an adjacent segment by an amount equal to one-half of the segment length. Windowing effects and time-domain aliasing in the first half of segment II are canceled by an overlap-add with the second half of segment I, and these effects in the second half of segment II are canceled by an overlap-add with the first half of segment III.

The expression that calculates MDST coefficients from MDCT coefficients depends on the number of segments of the source signal, the overlap structure and length of these segments, and the choice of the analysis and synthesis window functions. None of these features are important in principle to the present invention. For ease of illustration, however, it is assumed in the examples discussed below that the three segments have the same length N, which is even, and overlap one another by an amount equal to one-half the segment length, that the analysis and synthesis window functions are identical to one another, that the same window functions are applied to all segments of the source signal, and that the window functions are such that their overlap-add properties satisfy the following criterion, which is required for perfect reconstruction of the source signal as explained in the Princen paper.

$$w(r)^2 + w\left(r+\tfrac{N}{2}\right)^2 = 1 \quad \text{for } r \in \left[0, \tfrac{N}{2}-1\right]$$
where w(r)=analysis and synthesis window function; and

N=length of each source signal segment.

The MDCT coefficients Xi for the source signal x(n) in each of the segments i may be expressed as:

$$X_I(p) = \sum_{n=0}^{N-1} w(n)\, x(n) \cos\left(\frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(n+n_0)\right) \tag{12}$$
$$X_{II}(p) = \sum_{n=0}^{N-1} w(n)\, x\left(n+\tfrac{N}{2}\right) \cos\left(\frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(n+n_0)\right) \tag{13}$$
$$X_{III}(p) = \sum_{n=0}^{N-1} w(n)\, x(n+N) \cos\left(\frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(n+n_0)\right) \tag{14}$$

The windowed time-domain samples x̂ that are obtained from an application of the IMDCT synthesis filter bank to each block of MDCT coefficients may be expressed as:

$$\hat{x}_I(r) = \frac{2\,w(r)}{N} \sum_{p=0}^{N-1} X_I(p) \cos\left(\frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(r+n_0)\right) \tag{15}$$
$$\hat{x}_{II}(r) = \frac{2\,w(r)}{N} \sum_{p=0}^{N-1} X_{II}(p) \cos\left(\frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(r+n_0)\right) \tag{16}$$
$$\hat{x}_{III}(r) = \frac{2\,w(r)}{N} \sum_{p=0}^{N-1} X_{III}(p) \cos\left(\frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(r+n_0)\right) \tag{17}$$

Samples s(r) of the source signal for segment II are reconstructed by overlapping and adding the three windowed segments as described above, thereby removing the time-domain aliasing from the source signal x. This may be expressed as:

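$$s(r) = \begin{cases} \hat{x}_I\left(r+\tfrac{N}{2}\right) + \hat{x}_{II}(r) & \text{for } r \in \left[0, \tfrac{N}{2}-1\right] \\ \hat{x}_{II}(r) + \hat{x}_{III}\left(r-\tfrac{N}{2}\right) & \text{for } r \in \left[\tfrac{N}{2}, N-1\right] \end{cases} \tag{18}$$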

A block of MDST coefficients S(k) may be calculated for segment II by applying an MDST analysis filter bank to the time-domain samples in the reconstructed segment II, which may be expressed as:

$$S(k) = \sum_{r=0}^{N-1} w(r)\, s(r) \sin\left(\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right) \tag{19}$$
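The reference process described by equations 12 through 19 can be sketched in Python as follows. This is only an illustration under stated assumptions: a sine window, a block length N=16 that is a multiple of four, n0=(N/2+1)/2, an arbitrary test signal, and hypothetical function names. It forms MDCT blocks for three half-overlapped segments, applies the IMDCT to each block, overlap-adds the results to recover the samples of segment II, and then applies the windowed MDST.

```python
import math

def sine_window(N):
    # Satisfies w(r)^2 + w(r + N/2)^2 = 1 for r in [0, N/2 - 1].
    return [math.sin(math.pi / N * (r + 0.5)) for r in range(N)]

def mdct(seg, w, n0):
    """Windowed MDCT of one segment (equations 12-14)."""
    N = len(seg)
    return [sum(w[n] * seg[n] * math.cos(2 * math.pi / N * (p + 0.5) * (n + n0))
                for n in range(N)) for p in range(N)]

def imdct(X, w, n0):
    """Windowed IMDCT of one block (equations 15-17)."""
    N = len(X)
    return [2 * w[r] / N * sum(X[p] * math.cos(2 * math.pi / N * (p + 0.5) * (r + n0))
                               for p in range(N)) for r in range(N)]

def mdst(seg, w, n0):
    """Windowed MDST of one segment (equation 19)."""
    N = len(seg)
    return [sum(w[r] * seg[r] * math.sin(2 * math.pi / N * (k + 0.5) * (r + n0))
                for r in range(N)) for k in range(N)]

N = 16                      # illustrative segment length, a multiple of four
H = N // 2                  # hop between segments (half the segment length)
n0 = (N / 2 + 1) / 2
w = sine_window(N)
x = [math.sin(0.37 * n) + 0.25 * math.cos(1.3 * n) for n in range(2 * N)]

# MDCT blocks for segments I, II and III.
X1, X2, X3 = (mdct(x[i * H:i * H + N], w, n0) for i in range(3))
xh1, xh2, xh3 = (imdct(X, w, n0) for X in (X1, X2, X3))

# Equation 18: overlap-add to reconstruct the samples of segment II.
s = ([xh1[r + H] + xh2[r] for r in range(H)] +
     [xh2[r] + xh3[r - H] for r in range(H, N)])

# Equation 19: MDST coefficients of the reconstructed segment II.
S = mdst(s, w, n0)
```

With these choices the reconstructed samples `s` agree with `x[H:H + N]` to within rounding error, which is the aliasing cancellation the text relies on.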

Using expression 18 to substitute for s(r), expression 19 can be rewritten as:

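$$\begin{aligned} S(k) = {} & \sum_{r=0}^{\frac{N}{2}-1} w(r) \left[ \hat{x}_I\left(r+\tfrac{N}{2}\right) + \hat{x}_{II}(r) \right] \sin\left(\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right) \\ & + \sum_{r=\frac{N}{2}}^{N-1} w(r) \left[ \hat{x}_{II}(r) + \hat{x}_{III}\left(r-\tfrac{N}{2}\right) \right] \sin\left(\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right) \end{aligned} \tag{20}$$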
This equation can be rewritten in terms of the MDCT coefficients by using expressions 15-17 to substitute for the time-domain samples:

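$$\begin{aligned} S(k) = {} & \sum_{r=0}^{\frac{N}{2}-1} w(r) \left( \frac{2\,w\left(r+\frac{N}{2}\right)}{N} \sum_{p=0}^{N-1} X_I(p) \cos\left(\frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)\left(r+\tfrac{N}{2}+n_0\right)\right) \right) \sin\left(\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right) \\ & + \sum_{r=0}^{\frac{N}{2}-1} w(r) \left( \frac{2\,w(r)}{N} \sum_{p=0}^{N-1} X_{II}(p) \cos\left(\frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(r+n_0)\right) \right) \sin\left(\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right) \\ & + \sum_{r=\frac{N}{2}}^{N-1} w(r) \left( \frac{2\,w(r)}{N} \sum_{p=0}^{N-1} X_{II}(p) \cos\left(\frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(r+n_0)\right) \right) \sin\left(\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right) \\ & + \sum_{r=\frac{N}{2}}^{N-1} w(r) \left( \frac{2\,w\left(r-\frac{N}{2}\right)}{N} \sum_{p=0}^{N-1} X_{III}(p) \cos\left(\frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)\left(r-\tfrac{N}{2}+n_0\right)\right) \right) \sin\left(\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0)\right) \end{aligned} \tag{21}$$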
The remainder of this section of the disclosure shows how this equation can be simplified as shown below in equations 41a and 41b.

Using the trigonometric identity sin α·cos β=½[sin (α+β)+ sin (α−β)] to gather terms and switching the order of summation, expression 21 can be rewritten as

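$$\begin{aligned} S(k) = {} & \frac{1}{N} \sum_{p=0}^{N-1} X_I(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \sin\left[\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0) + \frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(r+n_0) + \frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)\left(\tfrac{N}{2}\right)\right] \\ & + \frac{1}{N} \sum_{p=0}^{N-1} X_I(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \sin\left[\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0) - \frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(r+n_0) - \frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)\left(\tfrac{N}{2}\right)\right] \\ & + \frac{1}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k+p+1)(r+n_0)\right] \\ & + \frac{1}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{1}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k+p+1)(r+n_0)\right] \\ & + \frac{1}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{1}{N} \sum_{p=0}^{N-1} X_{III}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\, w\left(r-\tfrac{N}{2}\right) \sin\left[\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0) + \frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(r+n_0) - \frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)\left(\tfrac{N}{2}\right)\right] \\ & + \frac{1}{N} \sum_{p=0}^{N-1} X_{III}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\, w\left(r-\tfrac{N}{2}\right) \sin\left[\frac{2\pi}{N}\left(k+\tfrac{1}{2}\right)(r+n_0) - \frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)(r+n_0) + \frac{2\pi}{N}\left(p+\tfrac{1}{2}\right)\left(\tfrac{N}{2}\right)\right] \end{aligned} \tag{22}$$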

This expression can be simplified by combining pairs of terms that are equal to each other. The first and second terms are equal to each other. The third and fourth terms are equal to each other. The fifth and sixth terms are equal to each other and the seventh and eighth terms are equal to each other. The equality between the third and fourth terms, for example, may be shown by proving the following lemma:

$$\frac{1}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k+p+1)(r+n_0)\right] = \frac{1}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \tag{23}$$

This lemma may be proven by rewriting the left-hand and right-hand sides of equation 23 as functions of p as follows:

$$\frac{1}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k+p+1)(r+n_0)\right] = \frac{1}{N} \sum_{p=0}^{N-1} F(p) \tag{24a}$$
$$\frac{1}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] = \frac{1}{N} \sum_{p=0}^{N-1} G(p) \tag{24b}$$
where

$$F(p) = X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k+p+1)(r+n_0)\right] \tag{25a}$$
$$G(p) = X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \tag{25b}$$
The expression of G as a function of (p) can be rewritten as a function of (N−1−p) as follows:

$$G(N-1-p) = X_{II}(N-1-p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}\bigl(k-(N-1-p)\bigr)(r+n_0)\right] \tag{26}$$

It is known that MDCT coefficients are odd symmetric; therefore, XII(N−1−p)=−XII(p) for p∈[0, N/2−1].
By rewriting (k−(N−1−p)) as (k+1+p)−N, it may be seen that (k−(N−1−p))·(r+n0)=(k+1+p)·(r+n0)−N·(r+n0). These two equalities allow expression 26 to be rewritten as:

$$G(N-1-p) = -X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k+p+1)(r+n_0) - 2\pi(r+n_0)\right] \tag{27}$$
Referring to the Princen paper, the value for n0 is ½(N/2+1), which is mid-way between two integers. Because r is an integer, it can be seen that the final term 2π(r+n0) in the summand of expression 27 is equal to an odd integer multiple of π; therefore, expression 27 can be rewritten as

$$G(N-1-p) = +X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k+p+1)(r+n_0)\right] = F(p) \tag{28}$$
which proves the lemma shown in equation 23. The equality between the other pairs of terms in equation 22 can be shown in a similar manner.
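The odd symmetry of the MDCT coefficients that this proof relies on can be checked numerically. The sketch below is an illustration only, assuming a sine window, a block length N=16 that is a multiple of four (so that n0=(N/2+1)/2 lies mid-way between two integers), an arbitrary test segment, and hypothetical function names.

```python
import math

def mdct(seg, w, n0):
    """Windowed MDCT evaluated for all N coefficient indices."""
    N = len(seg)
    return [sum(w[n] * seg[n] * math.cos(2 * math.pi / N * (p + 0.5) * (n + n0))
                for n in range(N)) for p in range(N)]

N = 16                      # illustrative length, a multiple of four
n0 = (N / 2 + 1) / 2
w = [math.sin(math.pi / N * (r + 0.5)) for r in range(N)]
seg = [math.cos(0.9 * n) - 0.4 * math.sin(0.21 * n) for n in range(N)]

X = mdct(seg, w, n0)

# Largest deviation from the odd symmetry X(N-1-p) = -X(p).
asym = max(abs(X[N - 1 - p] + X[p]) for p in range(N // 2))
```

For these assumptions `asym` is zero to within rounding error.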

By omitting the first, third, fifth and seventh terms in expression 22 and doubling the second, fourth, sixth and eighth terms, equation 22 can be rewritten as follows after simplifying the second and eighth terms:

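$$\begin{aligned} S(k) = {} & \frac{2}{N} \sum_{p=0}^{N-1} X_I(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0) - \pi p - \frac{\pi}{2}\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} X_{III}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\, w\left(r-\tfrac{N}{2}\right) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0) + \pi p + \frac{\pi}{2}\right] \end{aligned} \tag{29}$$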
Using the following identities:

$$\sin(\alpha \pm \pi p) = (-1)^p \sin\alpha, \qquad \sin\left(\alpha + \frac{\pi}{2}\right) = +\cos\alpha, \qquad \sin\left(\alpha - \frac{\pi}{2}\right) = -\cos\alpha \tag{30}$$
expression 29 can be rewritten as:

$$\begin{aligned} S(k) = {} & \frac{2}{N} \sum_{p=0}^{N-1} (-1)^{p+1} X_I(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} (-1)^{p} X_{III}(p) \sum_{r=\frac{N}{2}}^{N-1} w(r)\, w\left(r-\tfrac{N}{2}\right) \cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \end{aligned} \tag{31}$$

The inner summations of the third and fourth terms are changed so that their limits of summation are from r=0 to r=(N/2−1) by making the following substitutions:

$$\sin\left(\frac{2\pi}{N}(k-p)\left(r+n_0+\tfrac{N}{2}\right)\right) = (-1)^{k-p} \sin\left(\frac{2\pi}{N}(k-p)(r+n_0)\right)$$
$$\cos\left(\frac{2\pi}{N}(k-p)\left(r+n_0+\tfrac{N}{2}\right)\right) = (-1)^{k-p} \cos\left(\frac{2\pi}{N}(k-p)(r+n_0)\right)$$
This allows equation 31 to be rewritten as

$$\begin{aligned} S(k) = {} & \frac{2}{N} \sum_{p=0}^{N-1} (-1)^{p+1} X_I(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} (-1)^{(k-p)} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w\left(r+\tfrac{N}{2}\right) w\left(r+\tfrac{N}{2}\right) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} (-1)^{p} (-1)^{(k-p)} X_{III}(p) \sum_{r=0}^{\frac{N}{2}-1} w\left(r+\tfrac{N}{2}\right) w(r) \cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \end{aligned} \tag{32}$$

Equation 32 can be simplified by using the restriction imposed on the window function mentioned above that is required for perfect reconstruction of the source signal. This restriction is w(r)^2+w(r+N/2)^2=1. With this restriction, equation 32 can be simplified to

$$\begin{aligned} S(k) = {} & \frac{2}{N} \sum_{p=0}^{N-1} \left[ (-1)^{p+1} X_I(p) + (-1)^{k} X_{III}(p) \right] \cdot \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} w^2(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} (-1)^{(k-p)} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} \left(1 - w^2(r)\right) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \end{aligned} \tag{33}$$
Gathering terms, equation 33 can be rewritten as

$$\begin{aligned} S(k) = {} & \frac{2}{N} \sum_{p=0}^{N-1} \left[ (-1)^{p+1} X_I(p) + (-1)^{k} X_{III}(p) \right] \cdot \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} \left[ X_{II}(p) - (-1)^{(k-p)} X_{II}(p) \right] \sum_{r=0}^{\frac{N}{2}-1} w^2(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} (-1)^{(k-p)} X_{II}(p) \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \end{aligned} \tag{34}$$

Equation 34 can be simplified by recognizing the inner summation of the third term is equal to zero. This can be shown by proving two lemmas. One lemma postulates the following equality:

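$$I_{\alpha,q}(r) = \sum_{r=0}^{\frac{N}{2}-1} \sin\left(\frac{2\pi}{N}(q)(r+\alpha)\right) = \sin\left(\frac{2\pi q\alpha}{N} + \frac{\pi q}{2} - \frac{\pi q}{N}\right) \frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \tag{35}$$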

This equality may be proven by rewriting the summand into exponential form, rearranging, simplifying and combining terms as follows:

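$$\begin{aligned} I_{\alpha,q}(r) &= \sum_{r=0}^{\frac{N}{2}-1} \frac{1}{2j} \left[ \exp\left(+j\frac{2\pi q}{N}(r+\alpha)\right) - \exp\left(-j\frac{2\pi q}{N}(r+\alpha)\right) \right] \\ &= \frac{1}{2j} \exp\left(+j\frac{2\pi q\alpha}{N}\right) \sum_{r=0}^{\frac{N}{2}-1} \exp\left(+j\frac{2\pi q r}{N}\right) - \frac{1}{2j} \exp\left(-j\frac{2\pi q\alpha}{N}\right) \sum_{r=0}^{\frac{N}{2}-1} \exp\left(-j\frac{2\pi q r}{N}\right) \\ &= \frac{1}{2j} \exp\left(+j\frac{2\pi q\alpha}{N}\right) \left[ \frac{1 - \exp\left(+j\frac{2\pi q}{N}\frac{N}{2}\right)}{1 - \exp\left(+j\frac{2\pi q}{N}\right)} \right] - \frac{1}{2j} \exp\left(-j\frac{2\pi q\alpha}{N}\right) \left[ \frac{1 - \exp\left(-j\frac{2\pi q}{N}\frac{N}{2}\right)}{1 - \exp\left(-j\frac{2\pi q}{N}\right)} \right] \\ &= \frac{1}{2j} \exp\left(+j\frac{2\pi q\alpha}{N}\right) \frac{\exp\left(+j\frac{\pi q}{2}\right)}{\exp\left(+j\frac{\pi q}{N}\right)} \left[ \frac{\exp\left(-j\frac{\pi q}{2}\right) - \exp\left(+j\frac{\pi q}{2}\right)}{\exp\left(-j\frac{\pi q}{N}\right) - \exp\left(+j\frac{\pi q}{N}\right)} \right] - \frac{1}{2j} \exp\left(-j\frac{2\pi q\alpha}{N}\right) \frac{\exp\left(-j\frac{\pi q}{2}\right)}{\exp\left(-j\frac{\pi q}{N}\right)} \left[ \frac{\exp\left(+j\frac{\pi q}{2}\right) - \exp\left(-j\frac{\pi q}{2}\right)}{\exp\left(+j\frac{\pi q}{N}\right) - \exp\left(-j\frac{\pi q}{N}\right)} \right] \\ &= \frac{1}{2j} \exp\left(+j\frac{2\pi q\alpha}{N} + j\frac{\pi q}{2} - j\frac{\pi q}{N}\right) \frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} - \frac{1}{2j} \exp\left(-j\frac{2\pi q\alpha}{N} - j\frac{\pi q}{2} + j\frac{\pi q}{N}\right) \frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \\ I_{\alpha,q}(r) &= \sin\left(\frac{2\pi q\alpha}{N} + \frac{\pi q}{2} - \frac{\pi q}{N}\right) \frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \end{aligned} \tag{36}$$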
The other lemma postulates

$$\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] = 0 \quad \text{for } n_0 = \frac{1}{2}\left(\frac{N}{2}+1\right).$$
This may be proven by substituting n0 for α in expression 35 to obtain the following:

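$$\begin{aligned} I_{n_0,q}(r) &= \sin\left(\frac{2\pi q \cdot \frac{1}{2}\left(\frac{N}{2}+1\right)}{N} + \frac{\pi q}{2} - \frac{\pi q}{N}\right) \frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \\ &= \sin\left(\frac{\pi q}{N}\left(\frac{N}{2}+1\right) + \frac{\pi q}{2} - \frac{\pi q}{N}\right) \frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \\ &= \sin\left(\frac{\pi q}{2} + \frac{\pi q}{N} + \frac{\pi q}{2} - \frac{\pi q}{N}\right) \frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \\ &= \sin(\pi q)\, \frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} = 0 \quad \text{for } q \text{ an integer.} \end{aligned} \tag{37}$$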
By substituting (k−p) for q in expression 35 and using the preceding two lemmas, the inner summation of the third term in equation 34 may be shown to equal zero as follows:

$$\sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\, q\, (r+n_0)\right] = \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] = 0 \quad \text{for } n_0 = \frac{1}{2}\left(\frac{N}{2}+1\right).$$
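Both lemmas can be checked numerically. The sketch below compares the half-length sine sum of equation 35 with its closed form for several offsets and then confirms that the sum vanishes when the offset equals n0. The block length N=16, the sample offsets and the function names are assumptions made for illustration.

```python
import math

def half_sum(q, alpha, N):
    """Left-hand side of equation 35: sum over r = 0 .. N/2 - 1."""
    return sum(math.sin(2 * math.pi / N * q * (r + alpha)) for r in range(N // 2))

def closed_form(q, alpha, N):
    """Right-hand side of equation 35; requires sin(pi*q/N) != 0."""
    return (math.sin(2 * math.pi * q * alpha / N + math.pi * q / 2 - math.pi * q / N)
            * math.sin(math.pi * q / 2) / math.sin(math.pi * q / N))

N = 16
n0 = (N / 2 + 1) / 2

# Equation 35 for q = 1 .. N-1 and a few illustrative offsets.
max_err = max(abs(half_sum(q, a, N) - closed_form(q, a, N))
              for q in range(1, N) for a in (0.0, 0.3, n0))

# Second lemma: the sum vanishes for every q = k - p that can occur, when alpha = n0.
zero_sum = max(abs(half_sum(q, n0, N)) for q in range(-N + 1, N))
```

Both `max_err` and `zero_sum` are zero to within rounding error for these values.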

Using this equality, equation 34 may be simplified to the following:

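$$\begin{aligned} S(k) = {} & \frac{2}{N} \sum_{p=0}^{N-1} \left[ (-1)^{p+1} X_I(p) + (-1)^{k} X_{III}(p) \right] \cdot \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \cos\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} \left[ \left[ 1 - (-1)^{(k-p)} \right] X_{II}(p) \right] \sum_{r=0}^{\frac{N}{2}-1} w^2(r) \sin\left[\frac{2\pi}{N}(k-p)(r+n_0)\right] \end{aligned} \tag{38}$$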

The MDST coefficients S(k) of a real-valued signal are symmetric according to the expression S(k)=S(N−1−k), for k∈[0, N−1]. Using this property, all even numbered coefficients can be expressed as S(2v)=S(N−1−2v)=S(N−2(v+1)+1), for v∈[0, N/2−1].
Because N and 2(v+1) are both even numbers, the quantity (N−2(v+1)+1) is an odd number. From this, it can be seen the even numbered coefficients can be expressed in terms of the odd numbered coefficients. Using this property of the coefficients, equation 38 can be rewritten as follows:

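$$\begin{aligned} S(2v) = {} & \frac{2}{N} \sum_{p=0}^{N-1} \left[ (-1)^{p+1} X_I(p) + X_{III}(p) \right] \cdot \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \cos\left[\frac{2\pi}{N}(2v-p)(r+n_0)\right] \\ & + \frac{2}{N} \sum_{p=0}^{N-1} \left[ \left( 1 - (-1)^{-p} \right) X_{II}(p) \right] \sum_{r=0}^{\frac{N}{2}-1} w^2(r) \sin\left[\frac{2\pi}{N}(2v-p)(r+n_0)\right] \\ & \text{where } k = 2v,\ v \in \left[0, \tfrac{N}{2}-1\right] \end{aligned} \tag{39}$$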

The second term in this equation is equal to zero for all even values of p. The second term needs to be evaluated only for odd values of p, or for p=2l+1 for l∈[0, N/2−1].

$$\begin{aligned} S(2v) = {} & \frac{2}{N} \sum_{p=0}^{N-1} \left[ (-1)^{p+1} X_I(p) + X_{III}(p) \right] \cdot \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \cos\left[\frac{2\pi}{N}(2v-p)(r+n_0)\right] \\ & + \frac{4}{N} \sum_{l=0}^{\frac{N}{2}-1} X_{II}(2l+1) \sum_{r=0}^{\frac{N}{2}-1} w^2(r) \sin\left[\frac{2\pi}{N}\bigl(2v-(2l+1)\bigr)(r+n_0)\right] \\ & \text{where } v \in \left[0, \tfrac{N}{2}-1\right] \end{aligned} \tag{40}$$

Equation 40 can be rewritten as a summation of two modified convolution operations of two functions hI,III and hII with two sets of intermediate spectral components mI,III and mII that are derived from the MDCT coefficients XI, XII, and XIII for three segments of the source signal as follows:

$$\begin{aligned} & S(2v) = \frac{2}{N} \sum_{p=0}^{N-1} m_{I,III}(p)\, h_{I,III}(2v-p) + \frac{4}{N} \sum_{l=0}^{\frac{N}{2}-1} m_{II}(2l+1)\, h_{II}\bigl(2v-(2l+1)\bigr), \quad \text{where} \\ & m_{I,III}(\tau) = \left[ (-1)^{\tau+1} X_I(\tau) + X_{III}(\tau) \right] \\ & m_{II}(\tau) = X_{II}(\tau) \\ & h_{I,III}(\tau) = \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \cos\left[\frac{2\pi}{N}(\tau)(r+n_0)\right], \\ & h_{II}(\tau) = \sum_{r=0}^{\frac{N}{2}-1} w^2(r) \sin\left[\frac{2\pi}{N}(\tau)(r+n_0)\right], \quad v \in \left[0, \tfrac{N}{2}-1\right], \end{aligned} \tag{41a}$$
$$S(2v+1) = S\bigl(N-2(1+v)\bigr) \tag{41b}$$

The results of the modified convolution operations depend on the properties of the functions hI,III and hII, which are impulse responses of hypothetical filters that are related to the combined effects of the IMDCT synthesis filter bank, the subsequent MDST analysis filter bank, and the analysis and synthesis window functions. The modified convolutions need to be evaluated only for even integers.
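A direct, unoptimized evaluation of equations 41a and 41b can be sketched in Python as follows and compared against an MDST computed from the time-domain samples of segment II. The sine window, the block length N=16 (a multiple of four), the test signal and the function names are assumptions for illustration; the impulse responses are evaluated from their defining summations rather than from any simplified expression.

```python
import math

def mdct(seg, w, n0):
    N = len(seg)
    return [sum(w[n] * seg[n] * math.cos(2 * math.pi / N * (p + 0.5) * (n + n0))
                for n in range(N)) for p in range(N)]

def mdst(seg, w, n0):
    N = len(seg)
    return [sum(w[r] * seg[r] * math.sin(2 * math.pi / N * (k + 0.5) * (r + n0))
                for r in range(N)) for k in range(N)]

def h_I_III(tau, w, n0):
    """Impulse response h_I,III of equation 41a, from its defining sum."""
    N = len(w)
    return sum(w[r] * w[r + N // 2] * math.cos(2 * math.pi / N * tau * (r + n0))
               for r in range(N // 2))

def h_II(tau, w, n0):
    """Impulse response h_II of equation 41a, from its defining sum."""
    N = len(w)
    return sum(w[r] ** 2 * math.sin(2 * math.pi / N * tau * (r + n0))
               for r in range(N // 2))

def mdst_from_mdct(X1, X2, X3, w, n0):
    N = len(w)
    S = [0.0] * N
    for v in range(N // 2):                      # equation 41a, even indices
        k = 2 * v
        a = sum(((-1) ** (p + 1) * X1[p] + X3[p]) * h_I_III(k - p, w, n0)
                for p in range(N))
        b = sum(X2[2 * l + 1] * h_II(k - (2 * l + 1), w, n0)
                for l in range(N // 2))
        S[k] = 2 / N * a + 4 / N * b
    for v in range(N // 2):                      # equation 41b, odd indices
        S[2 * v + 1] = S[N - 2 * (1 + v)]
    return S

N = 16
H = N // 2
n0 = (N / 2 + 1) / 2
w = [math.sin(math.pi / N * (r + 0.5)) for r in range(N)]
x = [math.sin(0.37 * n) + 0.25 * math.cos(1.3 * n) for n in range(2 * N)]

X1, X2, X3 = (mdct(x[i * H:i * H + N], w, n0) for i in range(3))
S_from_mdct = mdst_from_mdct(X1, X2, X3, w, n0)
S_reference = mdst(x[H:H + N], w, n0)            # MDST of segment II samples
max_err = max(abs(a - b) for a, b in zip(S_from_mdct, S_reference))
```

Under these assumptions `max_err` is at the level of floating-point rounding, consistent with the derivation above.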

Each of the impulse responses is symmetric. It may be seen from inspection that hI,III(τ)=hI,III(−τ) and hII(τ)=−hII (−τ). These symmetry properties may be exploited in practical digital implementations to reduce the amount of memory needed to store a representation of each impulse response. An understanding of how the symmetry properties of the impulse responses interact with the symmetry properties of the intermediate spectral components mI,III and mII may also be exploited in practical implementations to reduce computational complexity.

The impulse responses hI,III(τ) and hII(τ) may be calculated from the summations shown above; however, it may be possible to simplify these calculations by deriving simpler analytical expressions for the impulse responses. Because the impulse responses depend on the window function w(r), the derivation of simpler analytical expressions requires additional specifications for the window function. Examples of derivations of simpler analytical expressions for the impulse responses for two specific window functions, the rectangular and sine window functions, are discussed below.

The rectangular window function is not often used in coding applications because it has relatively poor frequency selectivity properties; however, its simplicity reduces the complexity of the analysis needed to derive a specific implementation. For this derivation, the window function w(r)=1/√2 for r∈[0, N−1] is used. For this particular window function, the second term of equation 41a is equal to zero. The calculation of the MDST coefficients does not depend on the MDCT coefficients for the second segment. As a result, equation 41a may be rewritten as

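$$\begin{aligned} & S(2v) = \frac{2}{N} \sum_{p=0}^{N-1} m_{I,III}(p)\, h_{I,III}(2v-p) \\ & m_{I,III}(\tau) = \left[ (-1)^{\tau+1} X_I(\tau) + X_{III}(\tau) \right] \\ & h_{I,III}(\tau) = \frac{1}{2} \sum_{r=0}^{\frac{N}{2}-1} \cos\left[\frac{2\pi}{N}(\tau)(r+n_0)\right], \quad v \in \left[0, \tfrac{N}{2}-1\right] \end{aligned} \tag{42}$$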

If N is restricted to have a value that is a multiple of four, this equation can be simplified further by using another lemma that postulates the following equality:

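$$\sum_{r=0}^{\frac{N}{2}-1} \cos\left[\frac{2\pi}{N}(q)(r+n_0)\right] = \begin{cases} (-1)^{q}\, \dfrac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} & q \text{ not a multiple of } N \\ (-1)^{\frac{q}{N}} \cdot \dfrac{N}{2} & q \text{ a multiple of } N \end{cases} \quad \text{where } n_0 = \frac{\frac{N}{2}+1}{2} \tag{43}$$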

This may be proven as follows:

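$$\begin{aligned} I &= \sum_{r=0}^{\frac{N}{2}-1} \cos\left[\frac{2\pi}{N}(q)(r+n_0)\right] = \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(q)(r+n_0) + \frac{\pi}{2}\right] \\ &= \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(q)(r+n_0) + \frac{2\pi}{N}(q)\left(\frac{N}{4q}\right)\right] = \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(q)\left(r+n_0+\frac{N}{4q}\right)\right] \end{aligned} \tag{44}$$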
By using the lemma shown in equation 35 with α=n0+N/(4q), expression 44 can be rewritten as

$$I = \sin\left(\frac{2\pi q\left(n_0 + \frac{N}{4q}\right)}{N} + \frac{\pi q}{2} - \frac{\pi q}{N}\right) \frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \tag{45}$$
which can be simplified to obtain the following expression:

$$I = (-1)^{q}\, \frac{\sin\frac{\pi q}{2}}{\sin\frac{\pi q}{N}} \tag{46}$$

If q is an integer multiple of N such that q=mN, then the numerator and denominator of the quotient in expression 46 are both equal to zero, causing the value of the quotient to be indeterminate. L'Hospital's rule may be used to simplify the expression further. Differentiating the numerator and denominator with respect to q and substituting q=mN yields the expression

$$\frac{N \cdot \cos\left(\frac{\pi m N}{2}\right)}{2 \cdot \cos(\pi m)}$$
Because N is an integer multiple of four, the numerator is always equal to N and the denominator is equal to 2·(−1)^m=2·(−1)^(q/N). This completes the proof of the lemma expressed by equation 43.

This equality may be used to obtain expressions for the impulse response hI,III. Different cases are considered to evaluate the response hI,III(τ). If τ is an integer multiple of N such that τ=mN then hI,III(τ)=(−1)^m·N/4. The response equals zero for even values of τ other than an integer multiple of N because the numerator of the quotient in equation 46 is equal to zero. The value of the impulse response hI,III for odd values of τ can be seen from inspection. The impulse response may be expressed as follows:

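$$h_{I,III}(\tau) = (-1)^{m}\, \frac{N}{4} \quad \text{for } \tau = mN, \qquad h_{I,III}(\tau) = 0 \quad \text{for } \tau \text{ even},\ \tau \neq mN \tag{47}$$
$$h_{I,III}(\tau) = \frac{1}{2} \cdot \frac{(-1)^{\frac{\tau+1}{2}}}{\sin\frac{\pi\tau}{N}} \quad \text{for } \tau \text{ odd} \tag{48}$$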
The impulse response hI,III for a rectangular window function and N=128 is illustrated in FIG. 6.
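The closed-form expressions of equations 47 and 48 can be compared with the defining summation of equation 42. The following Python sketch does this for N=128 over a range of τ; the range of τ and the function names are illustrative assumptions, and N is taken to be a multiple of four as required by the lemma.

```python
import math

def h_rect_sum(tau, N):
    """h_I,III for the rectangular window w(r) = 1/sqrt(2), from equation 42."""
    n0 = (N / 2 + 1) / 2
    return 0.5 * sum(math.cos(2 * math.pi / N * tau * (r + n0))
                     for r in range(N // 2))

def h_rect_closed(tau, N):
    """Closed form of equations 47 and 48."""
    if tau % N == 0:                       # tau = m * N
        return (-1) ** (tau // N) * N / 4
    if tau % 2 == 0:                       # other even tau
        return 0.0
    return 0.5 * (-1) ** ((tau + 1) // 2) / math.sin(math.pi * tau / N)

N = 128
taus = range(-2 * N, 2 * N + 1)
max_err = max(abs(h_rect_sum(t, N) - h_rect_closed(t, N)) for t in taus)
```

For this block length the two evaluations agree to within rounding error, and only the odd values of τ and the multiples of N contribute non-zero taps.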

By substituting these expressions into equation 42, equations 41a and 41b can be rewritten as:

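$$\begin{aligned} & S(2v) = \frac{2}{N} \sum_{p=0}^{N-1} m_{I,III}(p)\, h_{I,III}(2v-p) \\ & m_{I,III}(\tau) = \left[ (-1)^{\tau+1} X_I(\tau) + X_{III}(\tau) \right] \\ & h_{I,III}(\tau) = \begin{cases} (-1)^{m}\, \dfrac{N}{4}, & \tau = mN \\ 0, & \tau \neq mN \text{ and } \tau \text{ even} \\ \dfrac{1}{2} \cdot \dfrac{(-1)^{\frac{\tau+1}{2}}}{\sin\frac{\pi\tau}{N}}, & \tau \text{ odd} \end{cases} \end{aligned} \tag{49a}$$
$$S(2v+1) = S\bigl(N-2(1+v)\bigr), \quad v \in \left[0, \tfrac{N}{2}-1\right] \tag{49b}$$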

Using equations 49a and 49b, MDST coefficients for segment II can be calculated from the MDCT coefficients of segments I and III assuming the use of a rectangular window function. The computational complexity of this equation can be reduced by exploiting the fact that the impulse response hI,III(τ) is equal to zero for many even values of τ.

The sine window function has better frequency selectivity properties than the rectangular window function and is used in some practical coding systems. The following derivation uses a sine window function defined by the expression
w(r)=sin(π/N(r+½))   (50)

A simplified expression for the impulse response hI,III may be derived by using a lemma that postulates the following:

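$$I(\tau) = \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w\left(r+\tfrac{N}{2}\right) \cos\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] = \begin{cases} 0, & \tau \text{ odd},\ \tau \neq mN+1,\ \tau \neq mN-1 \\ -\dfrac{N}{8}(-1)^{m}, & \tau = mN+1 \\ -\dfrac{N}{8}(-1)^{m}, & \tau = mN-1 \\ \dfrac{(-1)^{\frac{3\tau}{2}}}{4} \left[ \dfrac{1}{\sin\frac{\pi}{N}(\tau+1)} + \dfrac{1}{\sin\frac{\pi}{N}(-\tau+1)} \right], & \tau \text{ even} \end{cases} \quad \text{where } w(r) = \sin\left(\frac{\pi}{N}\left(r+\tfrac{1}{2}\right)\right) \tag{51}$$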

This lemma may be proven by first simplifying the expression for w(r)w(r+N/2) as follows:

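$$\begin{aligned} \sin\left(\frac{\pi}{N}\left(r+\tfrac{1}{2}\right)\right) \sin\left(\frac{\pi}{N}\left(r+\tfrac{N}{2}+\tfrac{1}{2}\right)\right) &= \sin\left(\frac{\pi}{N}\left(r+\tfrac{1}{2}\right)\right) \sin\left(\frac{\pi}{N}\left(r+\tfrac{1}{2}\right) + \frac{\pi}{2}\right) \\ &= \sin\left(\frac{\pi}{N}\left(r+\tfrac{1}{2}\right)\right) \cos\left(\frac{\pi}{N}\left(r+\tfrac{1}{2}\right)\right) = \frac{1}{2} \sin\left(\frac{2\pi}{N}\left(r+\tfrac{1}{2}\right)\right) \end{aligned} \tag{52}$$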
Substituting this simplified expression into equation 51 obtains the following:

$$I(\tau) = \frac{1}{2} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left(r+\tfrac{1}{2}\right)\right] \cos\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] \tag{53}$$

Using the following trigonometric identity
sin u cos v=½[sin(u+v)+sin(u−v)]  (54)
equation 53 can be rewritten as follows:

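$$\begin{aligned} I(\tau) &= \frac{1}{4} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left(r+\tfrac{1}{2}\right) + \frac{2\pi}{N}(\tau)(r+n_0)\right] + \frac{1}{4} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left(r+\tfrac{1}{2}\right) - \frac{2\pi}{N}(\tau)(r+n_0)\right] \\ I(\tau) &= \frac{1}{4} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left(r+\tfrac{1}{2}+\tau r+\tau n_0\right)\right] + \frac{1}{4} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left((-\tau+1)\,r - \left(\tau n_0 - \tfrac{1}{2}\right)\right)\right] \\ I(\tau) &= \frac{1}{4} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left((\tau+1)\,r + \left(\tau n_0 + \tfrac{1}{2}\right)\right)\right] + \frac{1}{4} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left((-\tau+1)\,r - \left(\tau n_0 - \tfrac{1}{2}\right)\right)\right] \end{aligned} \tag{55}$$
$$I(\tau) = \frac{1}{4} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(\tau+1)\left(r + \frac{\tau n_0 + \frac{1}{2}}{\tau+1}\right)\right] + \frac{1}{4} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(-\tau+1)\left(r - \frac{\tau n_0 - \frac{1}{2}}{-\tau+1}\right)\right] \tag{56}$$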

Equation 56 can be simplified by substitution in both terms of I(τ) according to equation 35, setting q=(τ+1) and α=(τn0+½)/(τ+1) in the first term, and setting q=(−τ+1) and α=−(τn0−½)/(−τ+1) in the second term. This yields the following:

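$$\begin{aligned} I(\tau) &= \frac{1}{4} \sin\left(\frac{2\pi}{N}\left(\tau n_0 + \tfrac{1}{2}\right) + \frac{\pi}{2}(\tau+1) - \frac{\pi}{N}(\tau+1)\right) \frac{\sin\frac{\pi}{2}(\tau+1)}{\sin\frac{\pi}{N}(\tau+1)} + \frac{1}{4} \sin\left(\frac{2\pi}{N}\left(-\tau n_0 + \tfrac{1}{2}\right) + \frac{\pi}{2}(-\tau+1) - \frac{\pi}{N}(-\tau+1)\right) \frac{\sin\frac{\pi}{2}(-\tau+1)}{\sin\frac{\pi}{N}(-\tau+1)} \\ I(\tau) &= \frac{1}{4} \sin\left(\frac{\pi}{N}(\tau)\left(\frac{N}{2}+1\right) + \frac{\pi}{2}(\tau+1) - \frac{\pi}{N}(\tau)\right) \frac{\sin\frac{\pi}{2}(\tau+1)}{\sin\frac{\pi}{N}(\tau+1)} + \frac{1}{4} \sin\left(\frac{\pi}{N}(-\tau)\left(\frac{N}{2}+1\right) + \frac{\pi}{2}(-\tau+1) - \frac{\pi}{N}(-\tau)\right) \frac{\sin\frac{\pi}{2}(-\tau+1)}{\sin\frac{\pi}{N}(-\tau+1)} \\ I(\tau) &= \frac{1}{4} \sin\left(\frac{\pi}{2}(\tau) + \frac{\pi}{2}(\tau+1)\right) \frac{\sin\frac{\pi}{2}(\tau+1)}{\sin\frac{\pi}{N}(\tau+1)} + \frac{1}{4} \sin\left(\frac{\pi}{2}(-\tau) + \frac{\pi}{2}(-\tau+1)\right) \frac{\sin\frac{\pi}{2}(-\tau+1)}{\sin\frac{\pi}{N}(-\tau+1)} \\ I(\tau) &= \frac{1}{4} \sin\left(\pi(\tau) + \frac{\pi}{2}\right) \frac{\sin\frac{\pi}{2}(\tau+1)}{\sin\frac{\pi}{N}(\tau+1)} + \frac{1}{4} \sin\left(\pi(-\tau) + \frac{\pi}{2}\right) \frac{\sin\frac{\pi}{2}(-\tau+1)}{\sin\frac{\pi}{N}(-\tau+1)} \end{aligned} \tag{57}$$
$$\begin{aligned} I(\tau) &= \frac{1}{4} \cos(\pi\tau) \cdot \frac{\cos\frac{\pi}{2}\tau}{\sin\frac{\pi}{N}(\tau+1)} + \frac{1}{4} \cos(-\pi\tau) \cdot \frac{\cos\left(-\frac{\pi}{2}\tau\right)}{\sin\frac{\pi}{N}(-\tau+1)} \\ I(\tau) &= \frac{(-1)^{\tau}}{4} \cdot \frac{\cos\frac{\pi}{2}\tau}{\sin\frac{\pi}{N}(\tau+1)} + \frac{(-1)^{\tau}}{4} \cdot \frac{\cos\frac{\pi}{2}(-\tau)}{\sin\frac{\pi}{N}(-\tau+1)} \\ I(\tau) &= \frac{(-1)^{\tau}}{4} \cos\frac{\pi}{2}\tau \cdot \left[ \frac{1}{\sin\frac{\pi}{N}(\tau+1)} + \frac{1}{\sin\frac{\pi}{N}(-\tau+1)} \right] \\ I(\tau) &= \begin{cases} \dfrac{(-1)^{\frac{3\tau}{2}}}{4} \left[ \dfrac{1}{\sin\frac{\pi}{N}(\tau+1)} + \dfrac{1}{\sin\frac{\pi}{N}(-\tau+1)} \right], & \tau \text{ even} \\ 0, & \tau \text{ odd} \end{cases} \end{aligned} \tag{58}$$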

Equation 58 is valid unless the denominator for either quotient is equal to zero. These special cases can be analyzed by inspecting equation 57 to identify the conditions under which either denominator is zero. It can be seen from equation 57 that singularities occur for τ=mN+1 and τ=mN−1, where m is an integer. The following assumes N is an integer multiple of four.

For τ=mN+1 equation 57 can be rewritten as:

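$$\begin{aligned} I(mN+1) &= \frac{1}{4} \sin\left(\pi(mN+1) + \frac{\pi}{2}\right) \cdot \frac{\sin\frac{\pi}{2}(mN+2)}{\sin\frac{\pi}{N}(mN+2)} + \frac{1}{4} \sin\left(-\pi(mN+1) + \frac{\pi}{2}\right) \cdot \frac{\sin\frac{\pi}{2}\bigl(-(mN+1)+1\bigr)}{\sin\frac{\pi}{N}\bigl(-(mN+1)+1\bigr)} \\ &= 0 + \frac{1}{4} \sin\left(-\pi mN - \frac{\pi}{2}\right) \frac{\sin\left(-\frac{mN\pi}{2}\right)}{\sin\left(-\frac{mN\pi}{N}\right)} \\ &= -\frac{1}{4} \cdot \frac{\sin\left(-\frac{mN\pi}{2}\right)}{\sin\left(-\frac{mN\pi}{N}\right)} \end{aligned} \tag{59}$$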
The value of the quotient is indeterminate because the numerator and denominator are both equal to zero. L'Hospital's rule can be used to determine its value. Differentiating numerator and denominator with respect to m yields the following:

$$I(mN+1) = -\frac{1}{4} \cdot \frac{-\frac{\pi N}{2} \cos\left(-\frac{mN\pi}{2}\right)}{-\pi \cos(-m\pi)} = -\frac{N}{8}(-1)^{m} \tag{60}$$
For τ=mN−1 equation 57 can be rewritten as:

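$$\begin{aligned} I(mN-1) &= \frac{1}{4} \sin\left(\pi(mN-1) + \frac{\pi}{2}\right) \cdot \frac{\sin\frac{\pi}{2}(mN-1+1)}{\sin\frac{\pi}{N}(mN-1+1)} + \frac{1}{4} \sin\left(-\pi(mN-1) + \frac{\pi}{2}\right) \cdot \frac{\sin\frac{\pi}{2}\bigl(-(mN-1)+1\bigr)}{\sin\frac{\pi}{N}\bigl(-(mN-1)+1\bigr)} \\ I(mN-1) &= \frac{1}{4} \sin\left(\pi mN - \frac{\pi}{2}\right) \frac{\sin\frac{\pi mN}{2}}{\sin\frac{\pi mN}{N}} + 0 \end{aligned} \tag{61}$$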
The value of the quotient in this equation is indeterminate because the numerator and denominator are both equal to zero. L'Hospital's rule can be used to determine its value. Differentiating numerator and denominator with respect to m yields the following:

$$I(mN-1) = -\frac{1}{4} \cdot \frac{\frac{\pi N}{2} \cos\frac{\pi mN}{2}}{\pi \cos \pi m} = -\frac{N}{8}(-1)^{m} \tag{62}$$

The lemma expressed by equation 51 is proven by combining equations 58, 60 and 62.
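The lemma of equation 51 can be checked numerically by comparing its closed form with the defining summation for the sine window. The Python sketch below is an illustration that assumes N=128 (a multiple of four), n0=(N/2+1)/2 and hypothetical function names; for even τ it uses the fact that (−1)^(3τ/2) equals (−1)^(τ/2).

```python
import math

def h13_sum(tau, N):
    """Defining summation of equation 51 for the sine window."""
    n0 = (N / 2 + 1) / 2
    w = [math.sin(math.pi / N * (r + 0.5)) for r in range(N)]
    return sum(w[r] * w[r + N // 2] * math.cos(2 * math.pi / N * tau * (r + n0))
               for r in range(N // 2))

def h13_closed(tau, N):
    """Closed form of equation 51."""
    if (tau - 1) % N == 0:                 # tau = m*N + 1
        return -N / 8 * (-1) ** ((tau - 1) // N)
    if (tau + 1) % N == 0:                 # tau = m*N - 1
        return -N / 8 * (-1) ** ((tau + 1) // N)
    if tau % 2 != 0:                       # remaining odd tau
        return 0.0
    sign = (-1) ** (tau // 2)              # (-1)^(3*tau/2) for even tau
    return sign / 4 * (1 / math.sin(math.pi / N * (tau + 1))
                       + 1 / math.sin(math.pi / N * (-tau + 1)))

N = 128
taus = range(-N + 1, N)
max_err = max(abs(h13_sum(t, N) - h13_closed(t, N)) for t in taus)
```

Over the range of τ that occurs in equation 41a, the two evaluations agree to within rounding error under these assumptions.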

A simplified expression for the impulse response hII may be derived by using a lemma that postulates the following:

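$$I(\tau) = \sum_{r=0}^{\frac{N}{2}-1} w(r)\, w(r) \sin\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] = \begin{cases} 0, & \tau \text{ odd},\ \tau \neq mN+1,\ \tau \neq mN-1 \\ -\dfrac{N}{8}(-1)^{m}, & \tau = mN+1 \\ -\dfrac{N}{8}(-1)^{m+1}, & \tau = mN-1 \\ \dfrac{(-1)^{\frac{3\tau}{2}}}{4} \left[ -\dfrac{1}{\sin\frac{\pi}{N}(\tau+1)} + \dfrac{1}{\sin\frac{\pi}{N}(-\tau+1)} \right], & \tau \text{ even} \end{cases} \quad \text{where } w(r) = \sin\left(\frac{\pi}{N}\left(r+\tfrac{1}{2}\right)\right) \tag{63}$$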

The proof of this lemma is similar to the previous proof. This proof begins by simplifying the expression for w(r)w(r). Recall that sin^2 α=½−½ cos (2α), so that:

$$\sin^2\left(\frac{\pi}{N}\left(r+\tfrac{1}{2}\right)\right) = \frac{1}{2} - \frac{1}{2} \cos\left(\frac{2\pi}{N}\left(r+\tfrac{1}{2}\right)\right) \tag{64}$$
Using this expression, equation 63 can be rewritten as:

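$$\begin{aligned} I(\tau) &= \sum_{r=0}^{\frac{N}{2}-1} \left[ \frac{1}{2} - \frac{1}{2} \cos\left(\frac{2\pi}{N}\left(r+\tfrac{1}{2}\right)\right) \right] \sin\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] \\ &= \frac{1}{2} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] - \frac{1}{2} \sum_{r=0}^{\frac{N}{2}-1} \cos\left[\frac{2\pi}{N}\left(r+\tfrac{1}{2}\right)\right] \sin\left[\frac{2\pi}{N}(\tau)(r+n_0)\right] \end{aligned} \tag{65}$$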

From equation 37 and the associated lemma, it may be seen the first term in equation 65 is equal to zero. The second term may be simplified using the trigonometric identity cos u·sin v=½[ sin (u+v)−sin (u−v)], which obtains the following:

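$$I(\tau) = -\frac{1}{4} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left(r+\tfrac{1}{2}\right) + \frac{2\pi}{N}(\tau)(r+n_0)\right] + \frac{1}{4} \sum_{r=0}^{\frac{N}{2}-1} \sin\left[\frac{2\pi}{N}\left(r+\tfrac{1}{2}\right) - \frac{2\pi}{N}(\tau)(r+n_0)\right] \tag{66}$$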

Referring to equation 66, its first term is equal to the negative of the first term in equation 55 and its second term is equal to the second term of equation 55. The lemma expressed in equation 63 may be proven in a manner similar to that used to prove the lemma expressed in equation 51. The principal difference in the proof is the singularity analyses of equation 59 and equation 61. For this proof, I(mN−1) is multiplied by an additional factor of −1; therefore,

$$I(mN-1) = -\frac{N}{8}(-1)^{m+1}.$$
Allowing for this difference along with the minus sign preceding the first term of equation 66, the lemma expressed in equation 63 is proven.

An exact expression for impulse response hII(τ) is given by this lemma; however, it needs to be evaluated only for odd values of τ because the modified convolution of hII in equation 41a is evaluated only for τ=(2v−(2l+1)). According to equation 63, hII(τ)=0 for odd values of τ except for τ=mN+1 and τ=mN−1. Because hII(τ) is non-zero for only two values of τ, this impulse response can be expressed as:

$$
h_{II}(\tau) = \begin{cases}
-\frac{N}{8}(-1)^{m}, & \tau = mN+1 \\
-\frac{N}{8}(-1)^{m+1}, & \tau = mN-1 \\
0, & \text{otherwise}
\end{cases} \tag{67}
$$
The impulse responses hI,III(τ) and hII(τ) for the sine window function and N=128 are illustrated in FIGS. 7 and 8, respectively.
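The closed form in equation 63 can be checked numerically against the defining sum. The following is a minimal sketch, not part of the original disclosure; the function names `I_direct` and `I_closed` are illustrative, and the check assumes a sine window, n0 = (N/2 + 1)/2, and a segment length N that is a multiple of 4.

```python
import math

def window(r, N):
    # Sine window w(r) = sin(pi/N * (r + 1/2)) as defined with equation 63.
    return math.sin(math.pi / N * (r + 0.5))

def I_direct(tau, N):
    # Direct evaluation of the sum in equation 63, assuming n0 = (N/2 + 1)/2.
    n0 = (N / 2 + 1) / 2
    return sum(window(r, N) ** 2 * math.sin(2 * math.pi / N * tau * (r + n0))
               for r in range(N // 2))

def I_closed(tau, N):
    # Piecewise closed form of equation 63 (N assumed to be a multiple of 4).
    if tau % 2 == 0:
        sign = -1.0 if (tau // 2) % 2 else 1.0          # (-1)^(tau/2)
        a = math.sin(math.pi / N * (tau + 1))
        b = math.sin(math.pi / N * (-tau + 1))
        return sign / 4 * (-1 / a + 1 / b)
    if (tau - 1) % N == 0:                               # tau = mN + 1
        m = (tau - 1) // N
        return -N / 8 * (-1) ** m
    if (tau + 1) % N == 0:                               # tau = mN - 1
        m = (tau + 1) // N
        return -N / 8 * (-1) ** (m + 1)
    return 0.0                                           # all other odd tau

# Largest disagreement between the two forms over a few periods.
N = 128
max_error = max(abs(I_direct(t, N) - I_closed(t, N)) for t in range(-2 * N, 2 * N + 1))
```

Under these assumptions `max_error` should be at the level of floating-point rounding, and the only odd values of τ with a non-zero result are τ=mN+1 and τ=mN−1, in agreement with equation 67.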

Using the analytical expressions for the impulse responses hI,III and hII provided by equations 51 and 67, equations 41a and 41b can be rewritten as:

$$
S(2v) = \frac{2}{N}\sum_{p=0}^{N-1} m_{I,III}(p)\,h_{I,III}(2v-p) + \frac{4}{N}\sum_{l=0}^{\frac{N}{2}-1} m_{II}(2l+1)\,h_{II}(2v-(2l+1)), \tag{68a}
$$

where

$$
\begin{aligned}
m_{I,III}(\tau) &= \left[(-1)^{\tau+1} X_{I}(\tau) + X_{III}(\tau)\right] \\
m_{II}(\tau) &= X_{II}(\tau) \\
h_{I,III}(\tau) &= \begin{cases}
0, & \tau \text{ odd},\ \tau \neq mN+1,\ \tau \neq mN-1 \\
-\frac{N}{8}(-1)^{m}, & \tau = mN+1 \\
-\frac{N}{8}(-1)^{m}, & \tau = mN-1 \\
\frac{(-1)^{\tau/2}}{4}\left[\frac{1}{\sin\frac{\pi}{N}(\tau+1)} + \frac{1}{\sin\frac{\pi}{N}(-\tau+1)}\right], & \tau \text{ even}
\end{cases} \\
h_{II}(\tau) &= \begin{cases}
-\frac{N}{8}(-1)^{m}, & \tau = mN+1 \\
-\frac{N}{8}(-1)^{m+1}, & \tau = mN-1 \\
0, & \text{otherwise}
\end{cases}
\end{aligned}
$$

$$
S(2v+1) = S(N-2(1+v)) \tag{68b}
$$

Using equations 68a and 68b, MDST coefficients for segment II can be calculated from the MDCT coefficients of segments I, II and III assuming the use of a sine window function. The computational complexity of this equation can be reduced further by exploiting the fact that the impulse response hI,III(τ) is equal to zero for many odd values of τ.

Equations 41a and 41b express a calculation of exact MDST coefficients from MDCT coefficients for an arbitrary window function. Equations 49a, 49b, 68a and 68b express calculations of exact MDST coefficients from MDCT coefficients using a rectangular window function and a sine window function, respectively. These calculations include operations that are similar to the convolution of impulse responses. The computational complexity of calculating the convolution-like operations can be reduced by excluding from the calculations those values of the impulse responses that are known to be zero.

The computational complexity can be reduced further by excluding from the calculations those portions of the full responses that are of lesser significance; however, this resulting calculation provides only an estimate of the MDST coefficients because an exact calculation is no longer possible. By controlling the amounts of the impulse responses that are excluded from the calculations, an appropriate balance between computational complexity and estimation accuracy can be achieved.

The impulse responses themselves are dependent on the shape of the window function that is assumed. As a result, the choice of window function affects the portions of the impulse responses that can be excluded from calculation without reducing coefficient estimation accuracy below some desired level.

An inspection of equation 49a for rectangular window functions shows the impulse response hI,III is symmetric about τ=0 and decays moderately rapidly. An example of this impulse response for N=128 is shown in FIG. 6. The impulse response hII is equal to zero for all values of τ.

An inspection of equation 68a for the sine window function shows the impulse response hI,III is symmetric about τ=0 and decays more rapidly than the corresponding response for the rectangular window function. For the sine window function, the impulse response hII is non-zero for only two values of τ. Examples of the impulse responses hI,III and hII for a sine window function and N=128 are shown in FIGS. 7 and 8, respectively.

Based on these observations, a modified form of equations 41a and 41b that provides an estimate of MDST coefficients for any analysis or synthesis window function may be expressed in terms of two filter structures as follows:

$$
S(2v) = \text{filter\_structure\_1}(2v) + \text{filter\_structure\_2}(2v) \tag{69}
$$

$$
\text{filter\_structure\_1}(2v) = \frac{2}{N}\sum_{p=0}^{N-1} m_{I,III}(p)\,h_{I,III}(2v-p) \tag{70}
$$

$$
m_{I,III}(\tau) = \left[(-1)^{\tau+1} X_{I}(\tau) + X_{III}(\tau)\right] \tag{71}
$$

$$
h_{I,III}(\tau) = \begin{cases}
0, & \text{if } \tau \in [\tau_{trunc1},\ N-\tau_{trunc1}] \\
\sum_{r=0}^{\frac{N}{2}-1} w(r)\,w\left(r+\frac{N}{2}\right)\cos\left[\frac{2\pi}{N}(\tau)(r+n_0)\right], & \text{o.w.}
\end{cases} \tag{72}
$$

$$
\text{filter\_structure\_2}(2v) = \frac{4}{N}\sum_{l=0}^{\frac{N}{2}-1} m_{II}(2l+1)\,h_{II}(2v-(2l+1)) \tag{73}
$$

$$
m_{II}(\tau) = X_{II}(\tau) \tag{74}
$$

$$
h_{II}(\tau) = \begin{cases}
0, & \text{if } \tau \in \left[\tau_{trunc2},\ \frac{N}{2}-1-\tau_{trunc2}\right] \\
\sum_{r=0}^{\frac{N}{2}-1} w^{2}(r)\sin\left[\frac{2\pi}{N}(\tau)(r+n_0)\right], & \text{o.w.}
\end{cases} \tag{75}
$$

$$
S(2v+1) = S(N-2(1+v)) \tag{76}
$$

where

$$
v \in \left[0,\ \frac{N}{2}-1\right], \qquad n_0 = \frac{\frac{N}{2}+1}{2} \tag{77}
$$

and ntaps_tot, τ_trunc1 and τ_trunc2 are chosen to satisfy

$$
\tau_{trunc1} \in \left[1,\ \frac{N}{2}\right], \qquad \tau_{trunc2} \in \left[1,\ \frac{N}{4}-1\right], \qquad ntaps_{tot} = 2\tau_{trunc1} - 1 + 2\tau_{trunc2} \tag{78}
$$
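The relationship between the truncation values and the total tap count in equation 78 can be illustrated with a short sketch. It is not taken from the disclosure: the function names are hypothetical, and it assumes that the index ranges in equations 72 and 75 are read over one period, 0 to N−1 for the first filter structure and 0 to N/2−1 for the second.

```python
def kept_taps_structure_1(N, tau_trunc1):
    # Equation 72 sets h_I,III to zero for tau in [tau_trunc1, N - tau_trunc1];
    # every other index in one period 0..N-1 is kept (assumed interpretation).
    if not 1 <= tau_trunc1 <= N // 2:
        raise ValueError("tau_trunc1 must lie in [1, N/2]")
    return [t for t in range(N) if not (tau_trunc1 <= t <= N - tau_trunc1)]

def kept_taps_structure_2(N, tau_trunc2):
    # Equation 75 sets h_II to zero for tau in [tau_trunc2, N/2 - 1 - tau_trunc2];
    # every other index in 0..N/2-1 is kept (assumed interpretation).
    if not 1 <= tau_trunc2 <= N // 4 - 1:
        raise ValueError("tau_trunc2 must lie in [1, N/4 - 1]")
    half = N // 2
    return [t for t in range(half) if not (tau_trunc2 <= t <= half - 1 - tau_trunc2)]

def ntaps_total(tau_trunc1, tau_trunc2):
    # Total number of filter taps according to equation 78.
    return 2 * tau_trunc1 - 1 + 2 * tau_trunc2

# Example: N = 128 with a short first response and a minimal second response.
taps_1 = kept_taps_structure_1(128, 5)
taps_2 = kept_taps_structure_2(128, 1)
total = ntaps_total(5, 1)
```

With this reading, the number of kept indices in the two structures adds up to ntaps_tot for any admissible pair of truncation values, which is one way to sanity-check a chosen complexity budget.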

An example of a device 30 that estimates MDST coefficients according to equation 69 is illustrated by a schematic block diagram in FIG. 3. In this implementation, the intermediate component generator 32 receives MDCT coefficients from the path 1 and derives first intermediate components mI,III from the MDCT coefficients XI and XIII of segments I and III, respectively, by performing the calculations shown in equation 71, and derives first intermediate components mII from the MDCT coefficients XII of segment II by performing the calculations shown in equation 74. The intermediate component generator 34 derives second intermediate components by forming a combination of first intermediate components mI,III according to a portion of the impulse response hI,III received from the impulse responses 33 by performing the calculations shown in equation 70, and derives second intermediate components by forming a combination of first intermediate components mII according to a portion of the impulse response hII received from the impulse responses 33 by performing the calculations shown in equation 73. Any portion of the two impulse responses, including the entire responses, may be used as expressed by the values τtrunc1 and τtrunc2. The use of longer impulse responses increases computational complexity and generally increases the accuracy of MDST coefficient estimation. The spectral component generator 35 obtains MDST coefficients from the second intermediate components by performing the calculations shown in equations 69 and 76.

The magnitude and phase estimator 36 calculates measures of magnitude and phase from the calculated MDST coefficients and the MDCT coefficients received from the path 31 and passes these measures along the paths 38 and 39. The MDST coefficients may also be passed along the path 37. Measures of spectral magnitude and phase may be obtained by performing the calculations shown above in equations 10 and 11, for example. Other examples of measures that may be obtained include spectral flux, which may be obtained from the first derivative of spectral magnitude, and instantaneous frequency, which may be obtained from the first derivative of spectral phase.
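Equations 10 and 11 are not reproduced in this part of the document. The sketch below therefore assumes the conventional complex-spectrum reading, in which the MDCT coefficient acts as the real part and the MDST coefficient as the imaginary part; the sign convention for phase and the function names are assumptions. Spectral flux is approximated here as a first difference of magnitude between two successive segments, which is one plausible discrete reading of "first derivative".

```python
import math

def magnitude_and_phase(mdct, mdst):
    # Assumed form: magnitude = sqrt(X^2 + S^2), phase = atan2(S, X),
    # evaluated per spectral component.
    magnitudes = [math.hypot(x, s) for x, s in zip(mdct, mdst)]
    phases = [math.atan2(s, x) for x, s in zip(mdct, mdst)]
    return magnitudes, phases

def spectral_flux(previous_magnitudes, current_magnitudes):
    # First difference of magnitude across two successive segments,
    # used here as a discrete stand-in for the first derivative.
    return [c - p for p, c in zip(previous_magnitudes, current_magnitudes)]

# Example with two spectral components.
mags, phases = magnitude_and_phase([3.0, 0.0], [4.0, -2.0])
```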

Referring to the impulse responses shown in FIGS. 6-8, for example, it may be seen that the coefficient values obtained by the convolution-type operations of the two filter structures are dominated by the portions of the responses that are near τ=0. A balance between computational complexity and estimation accuracy may be achieved for a particular implementation by choosing the total number of filter taps ntapstot that are used to implement the two filter structures. The total number of taps ntapstot may be distributed between the first and second filter structures as desired according to the values of τtrunc1 and τtrunc2, respectively, to adapt MDST coefficient estimation to the needs of specific applications. The distribution of taps between the two filter structures can affect estimation accuracy but it does not affect computational complexity.

The number and choice of taps for each filter structure can be selected using any criteria that may be desired. For example, an inspection of two impulse responses hI,III and hII will reveal the portions of the responses that are more significant. Taps may be chosen for only the more significant portions. In addition, computational complexity may be reduced by obtaining only selected MDST coefficients such as the coefficients in one or more frequency ranges.

An adaptive implementation of the present invention may use larger portions of the impulse responses to estimate the MDST coefficients for spectral components that are judged to be perceptually more significant by a perceptual model. For example, a measure of perceptual significance for a spectral component could be derived from the amount by which the spectral component exceeds a perceptual masking threshold that is calculated by a perceptual model. Shorter portions of the impulse responses may be used to estimate MDST coefficients for perceptually less significant spectral components. Calculations needed to estimate MDST coefficients for the least significant spectral components can be avoided.
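One way such an adaptive choice could be organized is sketched below. The mapping from perceptual significance to impulse response length is purely hypothetical: the function name, the use of decibels, and the `db_per_step` parameter are illustrative assumptions and are not specified by the disclosure.

```python
def truncation_for_component(level_db, mask_db, max_trunc, db_per_step=6.0):
    # Perceptual significance is taken as the amount (in dB) by which the
    # component exceeds the masking threshold from a perceptual model.
    excess = level_db - mask_db
    if excess <= 0:
        # Least significant components: skip the MDST estimate entirely.
        return 0
    # Hypothetical rule: one extra step of impulse response length for every
    # db_per_step of excess, capped at the largest permitted truncation value.
    return min(max_trunc, 1 + int(excess // db_per_step))

# Example: a component 20 dB above its masking threshold.
tau_trunc = truncation_for_component(20.0, 0.0, 64)
```

A return value of zero means that no taps are spent on that component, which corresponds to avoiding the calculation for the least significant spectral components.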

A non-adaptive implementation may obtain estimates of MDST coefficients in various frequency subbands of a signal using portions of the impulse responses whose lengths vary according to the perceptual significance of the subbands as determined previously by an analysis of exemplary signals. In many audio coding applications, spectral content in lower frequency subbands generally has greater perceptual significance than spectral content in higher frequency subbands. In these applications, for example, a non-adaptive implementation could estimate MDST coefficients in subbands using portions of the impulse responses whose length varies inversely with the frequency of the subbands.

The preceding disclosure sets forth examples that describe only a few implementations of the present invention. Principles of the present invention may be applied and implemented in a wide variety of ways. Additional considerations are discussed below.

The exemplary implementations described above are derived from the MDCT that is expressed in terms of the ODFT as applied to fixed-length segments of a source signal that overlap one another by half the segment length. A variation of the examples discussed above as well as a variation of the alternatives discussed below may be obtained by deriving implementations from the MDST that is expressed in terms of the ODFT.

Additional implementations of the present invention may be derived from expressions of other transforms including the DFT, the FFT and a generalized expression of the MDCT filter bank discussed in the Princen paper cited above. This generalized expression is described in U.S. Pat. No. 5,727,119 issued Mar. 10, 1998.

Implementations of the present invention also may be derived from expressions of transforms that are applied to varying-length signal segments and transforms that are applied to segments having no overlap or amounts of overlap other than half the segment length.

Some empirical results suggest that an implementation of the present invention with a specified level of computational complexity is often able to derive measures of spectral component magnitude that are more accurate for spectral components representing a band of spectral energy than for spectral components representing a single sinusoid or a few sinusoids that are isolated from one another in frequency. The process that estimates spectral component magnitude may be adapted in at least two ways to improve estimation accuracy for signals that have isolated spectral components.

One way to adapt the process is by adaptively increasing the length of the impulse responses for two filter structures shown in equation 69 so that more accurate computations can be performed for a restricted set of MDST coefficients that are related to the one or more isolated spectral components.

Another way to adapt this process is by adaptively performing an alternate method for deriving spectral component magnitudes for isolated spectral components. The alternate method derives an additional set of spectral components from the MDCT coefficients, and the additional set of spectral components is used to obtain measures of magnitude and/or phase. This adaptation may be done by selecting the more appropriate method for segments of the source signal, and it may be done by using the more appropriate method for portions of the spectrum for a particular segment. A method that is described in the Merdjani paper cited above is one possible alternate method. If it is used, this method preferably is extended to provide magnitude estimates for more than a single sinusoid. This may be done by dynamically arranging MDCT coefficients into bands of frequencies in which each band has a single dominant spectral component and applying the Merdjani method to each band of coefficients.

The presence of a source signal that has one dominant spectral component or a few isolated dominant spectral components may be detected using a variety of techniques. One technique detects local maxima in MDCT coefficients having magnitudes that exceed the magnitudes of adjacent and nearby coefficients by some threshold amount and either counts the number of local maxima or determines the spectral distance between local maxima. Another technique determines the spectral shape of the source signal by calculating an approximate Spectral Flatness Measure (SFM) of the source signal. The SFM is described in N. Jayant et al., “Digital Coding of Waveforms,” Prentice-Hall, 1984, p. 57, and is defined as the ratio of the geometric mean and the arithmetic mean of samples of the power spectral density of a signal.
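Both detection techniques can be sketched in a few lines. The code below is illustrative only: the function names, the use of a magnitude ratio as the "threshold amount", the `reach` neighbourhood size, and the small floor that guards the logarithm are all assumptions rather than details from the disclosure.

```python
import math

def spectral_flatness(power, floor=1e-12):
    # SFM: ratio of the geometric mean to the arithmetic mean of
    # power spectral density samples. Values near 1 indicate a flat
    # spectrum; values near 0 indicate a few dominant components.
    samples = [max(p, floor) for p in power]
    n = len(samples)
    geometric = math.exp(sum(math.log(p) for p in samples) / n)
    arithmetic = sum(samples) / n
    return geometric / arithmetic

def isolated_peaks(magnitudes, ratio=4.0, reach=2):
    # Indices whose magnitude exceeds every coefficient within `reach`
    # positions by at least the factor `ratio`.
    peaks = []
    for k, value in enumerate(magnitudes):
        lo, hi = max(0, k - reach), min(len(magnitudes), k + reach + 1)
        neighbours = [magnitudes[j] for j in range(lo, hi) if j != k]
        if neighbours and all(value > ratio * m for m in neighbours):
            peaks.append(k)
    return peaks

# Example: two isolated components in otherwise low-level coefficients.
mags = [0.1, 0.1, 5.0, 0.1, 0.1, 0.1, 3.0, 0.1]
peaks = isolated_peaks(mags)
flatness = spectral_flatness([m * m for m in mags])
```

The number of entries in `peaks`, or the spacing between them, could then drive the choice between the two estimation methods; an approximate SFM computed from squared MDCT magnitudes is one possible stand-in for the power spectral density.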

The present invention may be used advantageously in a wide variety of applications. Schematic block diagrams of a transmitter and a receiver incorporating various aspects of the present invention are shown in FIGS. 4 and 5, respectively.

The transmitter shown in FIG. 4 is similar to the transmitter shown in FIG. 1 and includes the estimator 30, which incorporates various aspects of the present invention to provide measures of magnitude and phase along the paths 38 and 39, respectively. The encoder 6 uses these measures to generate encoded information representing the spectral components received from the analysis filter bank 3. Examples of processes that may be used in the encoder 6, which may depend on the measures of magnitude or phase, include perceptual models used to determine adaptive quantization levels, coupling, and spectral envelope estimation for later use by spectral regeneration decoding processes.

The receiver shown in FIG. 5 is similar to the receiver shown in FIG. 2 and includes the estimator 30, which incorporates various aspects of the present invention to provide measures of magnitude and phase along the paths 38 and 39, respectively. The estimator 30 may also provide MDST coefficients along the path 37. The decoder 26 uses these measures to obtain spectral components from encoded information received from the deformatter 23. Examples of processes that may be used in the decoder 26, which may depend on the measures of magnitude or phase, include perceptual models used to determine adaptive quantization levels, spectral component synthesis from composite or coupled representations, and spectral component regeneration.

Devices that incorporate various aspects of the present invention may be implemented in a variety of ways including software for execution by a computer or some other apparatus that includes more specialized components such as digital signal processor (DSP) circuitry coupled to components similar to those found in a general-purpose computer. FIG. 9 is a schematic block diagram of device 70 that may be used to implement aspects of the present invention. DSP 72 provides computing resources. RAM 73 is system random access memory (RAM) used by DSP 72 for signal processing. ROM 74 represents some form of persistent storage such as read only memory (ROM) for storing programs needed to operate device 70 and to carry out various aspects of the present invention. I/O control 75 represents interface circuitry to receive and transmit signals by way of communication channels 76, 77. Analog-to-digital converters and digital-to-analog converters may be included in I/O control 75 as desired to receive and/or transmit analog signals. In the embodiment shown, all major system components connect to bus 71, which may represent more than one physical bus; however, a bus architecture is not required to implement the present invention.

In embodiments implemented in a general purpose computer system, additional components may be included for interfacing to devices such as a keyboard or mouse and a display, and for controlling a storage device having a storage medium such as magnetic tape or disk, or an optical medium. The storage medium may be used to record programs of instructions for operating systems, utilities and applications, and may include embodiments of programs that implement various aspects of the present invention.

The functions required to practice various aspects of the present invention can be performed by components that are implemented in a wide variety of ways including discrete logic components, integrated circuits, one or more ASICs and/or program-controlled processors. The manner in which these components are implemented is not important to the present invention.

Software implementations of the present invention may be conveyed by a variety of machine readable media such as baseband or modulated communication paths throughout the spectrum including from supersonic to ultraviolet frequencies, or storage media that convey information using essentially any recording technology including magnetic tape, cards or disk, optical cards or disc, and detectable markings on media like paper.

Cheng, Corey I., Smithers, Michael J.

Patent Priority Assignee Title
5285498, Mar 02 1992 AT&T IPM Corp Method and apparatus for coding audio signals based on perceptual model
5297236, Jan 27 1989 DOLBY LABORATORIES LICENSING CORPORATION A CORP OF CA Low computational-complexity digital filter bank for encoder, decoder, and encoder/decoder
5451954, Aug 04 1993 Dolby Laboratories Licensing Corporation Quantization noise suppression for encoder/decoder system
5481614, Mar 02 1992 AT&T IPM Corp Method and apparatus for coding audio signals based on perceptual model
5592584, Mar 02 1992 THE CHASE MANHATTAN BANK, AS COLLATERAL AGENT Method and apparatus for two-component signal compression
5627938, Mar 02 1992 THE CHASE MANHATTAN BANK, AS COLLATERAL AGENT Rate loop processor for perceptual encoder/decoder
5682463, Feb 06 1995 GOOGLE LLC Perceptual audio compression based on loudness uncertainty
5699479, Feb 06 1995 THE CHASE MANHATTAN BANK, AS COLLATERAL AGENT Tonality for perceptual audio compression based on loudness uncertainty
5699484, Dec 20 1994 Dolby Laboratories Licensing Corporation Method and apparatus for applying linear prediction to critical band subbands of split-band perceptual coding systems
5727119, Mar 27 1995 Dolby Laboratories Licensing Corporation Method and apparatus for efficient implementation of single-sideband filter banks providing accurate measures of spectral magnitude and phase
5781888, Jan 16 1996 THE CHASE MANHATTAN BANK, AS COLLATERAL AGENT Perceptual noise shaping in the time domain via LPC prediction in the frequency domain
5945940, Mar 12 1998 Massachusetts Institute of Technology Coherent ultra-wideband processing of sparse multi-sensor/multi-spectral radar measurements
6035177, Feb 26 1996 NIELSEN COMPANY US , LLC, THE Simultaneous transmission of ancillary and audio signals by means of perceptual coding
6131084, Mar 14 1997 Digital Voice Systems, Inc Dual subframe quantization of spectral magnitudes
6161089, Mar 14 1997 Digital Voice Systems, Inc Multi-subframe quantization of spectral parameters
6182030, Dec 18 1998 TELEFONAKTIEKTIEBOLAGET L M ERICSSON PUBL Enhanced coding to improve coded communication signals
6266644, Sep 26 1998 Microsoft Technology Licensing, LLC Audio encoding apparatus and methods
6453289, Jul 24 1998 U S BANK NATIONAL ASSOCIATION Method of noise reduction for speech codecs
6680972, Jun 10 1997 DOLBY INTERNATIONAL AB Source coding enhancement using spectral-band replication
6708145, Jan 27 1999 DOLBY INTERNATIONAL AB Enhancing perceptual performance of sbr and related hfr coding methods by adaptive noise-floor addition and noise substitution limiting
6847737, Mar 13 1998 IOWA STATE UNIVERSITY RESEARCH FOUNDATION, INC ; Iowa State University Methods for performing DAF data filtering and padding
6862326, Feb 20 2001 Comsys Communication & Signal Processing Ltd. Whitening matched filter for use in a communications receiver
6980933, Jan 27 2004 Dolby Laboratories Licensing Corporation Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients
7242710, Apr 02 2001 DOLBY INTERNATIONAL AB Aliasing reduction using complex-exponential modulated filterbanks
7707030, Jul 26 2002 Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V. Device and method for generating a complex spectral representation of a discrete-time signal
7783032, Aug 16 2002 DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT Method and system for processing subband signals using adaptive filters
8126709, Mar 28 2002 Dolby Laboratories Licensing Corporation Broadband frequency translation for high frequency regeneration
8155954, Jul 26 2002 Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E.V. Device and method for generating a complex spectral representation of a discrete-time signal
20030016772,
20030093282,
20030187663,
20040071284,
20040078205,
20050197831,
20100161319,
JP2000048481,
RE42935, Jan 27 2004 Dolby Laboratories Licensing Corporation Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients
RE46684, Jan 27 2004 Dolby Laboratories Licensing Corporation Coding techniques using estimated spectral magnitude and phase derived from MDCT coefficients
WO45379,
Executed on | Assignor | Assignee | Conveyance | Frame Reel | Doc
Jul 13 2004 | CHENG, COREY I | Dolby Laboratories Licensing Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 0449040811 | pdf
Jul 13 2004 | SMITHERS, MICHAEL J | Dolby Laboratories Licensing Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 0449040811 | pdf
Jan 22 2018 | Dolby Laboratories Licensing Corporation | (assignment on the face of the patent)
Date Maintenance Fee Events
Jan 22 2018 | BIG: Entity status set to Undiscounted (note the period is included in the code).


Date Maintenance Schedule
Sep 15 2023 | 4 years fee payment window open
Mar 15 2024 | 6 months grace period start (w surcharge)
Sep 15 2024 | patent expiry (for year 4)
Sep 15 2026 | 2 years to revive unintentionally abandoned end. (for year 4)
Sep 15 2027 | 8 years fee payment window open
Mar 15 2028 | 6 months grace period start (w surcharge)
Sep 15 2028 | patent expiry (for year 8)
Sep 15 2030 | 2 years to revive unintentionally abandoned end. (for year 8)
Sep 15 2031 | 12 years fee payment window open
Mar 15 2032 | 6 months grace period start (w surcharge)
Sep 15 2032 | patent expiry (for year 12)
Sep 15 2034 | 2 years to revive unintentionally abandoned end. (for year 12)