Stereophonic audio signal decompression switching to monaural audio signal

RE42949

A communication system for sending a sequence of symbols on a communication link. The system includes a transmitter for placing information indicative of the sequence of symbols on the communication link and a receiver for receiving the information placed on the communication link by the transmitter. The transmitter includes a clock for defining successive frames, each of the frames including M time intervals, where M is an integer greater than 1. A modulator modulates each of M carrier signals with a signal related to the value of one of the symbols thereby generating a modulated carrier signal corresponding to each of the carrier signals. The modulated carriers are combined into a sum signal which is transmitted on the communication link. The carrier signals include first and second carriers, the first carrier having a different bandwidth than the second carrier. In one embodiment, the modulator includes a tree-structured array of filter banks having M leaf nodes, each of the values related to the symbols forming an input to a corresponding one of the leaf nodes. Each of the nodes includes one of the filter banks. Similarly, the receiver can be constructed of a tree-structured array of sub-band filter banks for converting M time-domain samples received on the communication link to M symbol values. A stereophonic audio signal decompression method that includes decoding, using a decoder, a compressed stereophonic audio signal. A de-quantizer de-quantizes the compressed stereophonic audio signal to generate sets of frequency components for synthesizing left and right audio signals. A controller switches to constructing a single set of frequency components by averaging corresponding frequency components in the left and right audio signals when a computational workload exceeds a capacity of a decompression system and a synthesizer synthesizes a monaural audio time domain signal.


Patent RE42949
Priority Sep 21 1992
Filed Sep 14 2007
Issued Nov 22 2011
Expiry Sep 21 2012
Inventors Tzannes, M…
Assg.orig Hybrid Aud…
Assg.curr HYBRID AUD…
Entity unknown
Referenced by 8
References 25
Maint.: EXPIRED

5. A stereophonic audio signal decompression method comprising:

decoding, using a decoder, a compressed stereophonic audio signal;

de-quantizing, using a de-quantizer, the compressed stereophonic audio signal to generate sets of frequency components for synthesizing left and right audio signals;

switching, using a controller, to constructing a single set of frequency components by averaging corresponding frequency components in the left and right audio signals when a computational workload exceeds a capacity of a decompression system; and

synthesizing, using a synthesizer, a monaural audio time domain signal.

1. A communication system for sending a sequence of symbols on a communication link a sequence of symbols having values representative of said symbols, said communication system comprising a transmitter for placing information indicative of said sequence of symbols on said communication link and a receiver for receiving said information placed on said communication link by said transmitter, said transmitter comprising

a clock for defining successive frames, each said frame comprising M time intervals, where M is an integer greater than 1;

a modulator modulating each of M carrier signals with a signal related to the value of one of said symbols thereby generating a modulated carrier signal corresponding to each of said carrier signals that is to be modulated and generating a sum signal comprising a sum of said modulated carrier signals, said modulator comprising a tree-structured array of filter banks having nodes, including a root node and M leaf nodes, each of said values related to said symbols forming an input to a corresponding one of said leaf nodes, each of said nodes, other than said leaf nodes, comprising one of said filter banks; and

an output circuit for transmitting said sum signal on said communication link, wherein said carrier signals comprise first and second carriers, said first carrier having a different bandwidth than said second carrier.

2. The communication system of claim 1 wherein said receiver comprises:

an input circuit for receiving and storing M time-domain samples transmitted on said communication link; and

a decoder for recovering said M symbol values, said decoder comprising a tree-structured array of sub-band filter banks, said received M time-domain samples forming the input of a root node of said tree-structured array of said decoder and said M symbol values being generated by the leaf nodes of said tree-structured array of said decoder, each said sub-band filter bank comprising a plurality of FIR filters having a common input for receiving an input time-domain signal, each said filter generating an output signal representing a symbol value in a corresponding frequency band.

3. A communication system for sending a sequence of symbols on a communication link, said communication system comprising a transmitter for placing information indicative of said sequence of symbols on said communication link, said transmitter comprising:

a clock for defining successive frames, each said frame comprising M time intervals, where M is an integer greater than 1;

an output circuit transmitting said sum signal on said communication link, wherein said carrier signals comprise first and second carriers, said first carrier having a different bandwidth than said second carrier; and

a receiver comprising:

an input circuit for receiving and storing M time-domain samples transmitted on said communication link; and

a decoder for recovering said M symbol values, said decoder comprising a tree-structured array of sub-band filter banks, said received M time-domain samples forming the input of a root node of said tree-structured array of said decoder and said M symbol values being generated by the leaf nodes of said tree-structured array of said decoder, each said sub-band filter bank comprising a plurality of FIR filters having a common input for receiving an input time-domain signal, each said filter generating an output signal representing a symbol value in a corresponding frequency band.

4. The communication system of claim 3 wherein said modulator comprises a tree-structured array of filter banks having nodes, including a root node and M leaf nodes, each of said values related to said symbols forming an input to a corresponding one of said leaf nodes, each of said nodes, other than said leaf nodes, comprising one of said filter banks.

The present invention comprises audio compression and decompression systems. An audio compression system according to the present invention converts an audio signal into a series of sets of frequency components. Each frequency component represents an approximation to the audio signal in a corresponding frequency band over a time interval that depends on the frequency band. The received audio signal is analyzed in a tree-structured sub-band analysis filter. The sub-band analysis filter bank comprises a tree-structured array of sub-band filters, the audio signal forming the input of the root node of the tree-structured array and the frequency components being generated at the leaf nodes of the tree-structured array. Each of the sub-band filter banks comprises a plurality of FIR filters having a common input for receiving an input audio signal. Each filter generates an output signal representing the input audio signal in a corresponding frequency band, the number of FIR filters in at least one of the sub-band filter banks is greater than two, and the number of FIR filters in at least one of the sub-band filters is different from the number of FIR filters in another of the sub-band filters. The frequency components generated by the sub-band analysis filter are then quantized using information about the masking features of the human auditory system.

FIG. 4 is a block diagram of an audio compression system discussed hereinafter.
where the x_i, for i=0 . . . W−1, are the values stored in shift register 320, and the h_i are coefficients of a low pass prototype filter which are stored in controller 325. For those wishing a more detailed explanation of the process for generating sets of filter coefficients, see J. Rothweiler, “POLYPHASE QUADRATURE FILTERS—A NEW SUB-BAND CODING TECHNIQUE” IEEE Proceedings of the 1983 ICASSP Conference, pp 1280-1283. The polyphase components are then generated from the Z_i by the following summing operations:

$\begin{matrix} P_{k} = \sum_{j = 0}^{2 M} Z_{k + 2 Mj} & (2) \end{matrix}$

Consider the filtered signal value in a band f_o±Δf. Denote the minimum value of L in this frequency band by L_min. It should be noted that L_min may depend on frequency components outside the band in question, since a peak in an adjacent band may mask a signal in the band in question.

According to the masking model, any noise in this frequency band that has an energy less than L_min will not be perceived by the listener. In particular, the noise introduced by replacing the measured signal amplitude in this band by a quantized approximation therefore will not be perceived if the quantization error is less than L_min. The noise in question will be less than L_min if the signal amplitude is quantized to accuracy equal to S/L_min, where S is the energy of the signal in the band in question.

The above-described quantization procedure requires knowledge of the frequency spectrum of the incoming audio signal at a resolution which is significantly greater than that of the sub-band analysis of the incoming signal. In general, the minimum value of the mask function L will depend on the precise location of any peaks in the frequency spectrum of the audio signal. The signal amplitude provided by the sub-band analysis filter measures the average energy in the frequency band; however, it does not provide any information about the specific location of any spectral peaks within the band.

Hence, a more detailed frequency analysis of the incoming audio signal is required. This can be accomplished by defining a time window about each filtered signal component and performing a frequency analysis of the audio samples in this window to generate an approximation to E(f). In prior art systems, the frequency analysis is typically performed by calculating an FFT of the audio samples in the time window.

In one embodiment of a quantization sub-component according to the present invention, this is accomplished by further subdividing each sub-band using another layer of filter banks. The output of each of the sub-band filters in the analysis filter bank is inputted to another sub-band analysis filter which splits the original sub-band into a plurality of finer sub-bands. These finer sub-bands provide a more detailed spectral measurement of the audio signal in the frequency band in question, and hence, can be used to compute the overall mask function L discussed above.

While a separate L_min value may be calculated for each filtered signal value from each sub-band filter, the preferred embodiment of the present invention operates on blocks of filtered signal values. If a separate quantization step size is used for each filtered value, then the step size would need to be communicated with each filtered value. The bits needed to specify the step size reduce the degree of compression. To reduce this “overhead”, a block of samples is quantized using the same step size. This approach reduces the number of overhead bits/sample, since the step size need only be communicated once. The blocks of filtered samples utilized consist of a sequential set of filtered signal values from one of the sub-band filters. As noted above, these values can be inputted to a second sub-band analysis filter to obtain a fine spectral measurement of the energy in the sub-band.
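The block quantization described above can be illustrated with a minimal Python sketch. The function names, the sample values, and the step size of 0.1 are hypothetical and are not taken from the specification; in particular, the sketch does not show how the step size would be derived from the masking threshold.

```python
def quantize_block(values, step):
    """Quantize a block of filtered values with one shared step size."""
    if step <= 0:
        raise ValueError("step must be positive")
    return [round(v / step) for v in values]


def dequantize_block(codes, step):
    """Recover approximations to the filtered values from the integer codes."""
    return [c * step for c in codes]


# Hypothetical block of 8 filtered signal values from one sub-band filter.
block = [0.82, -0.41, 0.07, 0.33, -0.90, 0.15, 0.02, -0.26]
step = 0.1  # assumed step size; it is sent once for the whole block

codes = quantize_block(block, step)
approx = dequantize_block(codes, step)

# Rounding to the nearest multiple of step bounds each error by step / 2.
max_error = max(abs(a - b) for a, b in zip(block, approx))
```

Only the eight integer codes and the single step size need to be communicated, which is the reduction in overhead bits per sample noted above.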

One embodiment of such a system is shown in FIG. 9 at 400. The audio signal values are input to a sub-band analysis filter 402 which is similar to that shown in FIG. 5. The filtered outputs are quantized by quantizer 404 in blocks of 8 values. Each set of 8 values leaving sub-band analysis filter 402 is processed by a sub-band analysis filter 408 to provide a finer spectral measurement of the audio signal. Sub-band analysis filters 408 divide each band into 8 uniform sub-bands. The outputs of sub-band analysis filters 408 are then used by psycho-acoustic analyzer 406 to determine the masking thresholds for each of the frequency components in the block. While the above embodiment splits each band into 8 sub-bands for the purpose of measuring the energy spectrum, it will be apparent to those skilled in the art that other numbers of sub-bands may be used. Furthermore, the number of sub-bands may be varied with the frequency band.

The manner in which an audio decompression system according to the present invention operates will now be explained with the aid of FIG. 10 which is a block diagram of an audio decompression system 410 for decompressing the compressed audio signals generated by a compression system such as that shown in FIG. 9. The compressed signal is first decoded to recover the quantized signal values by a decoder 412. The quantized signal values are then used to generate approximations to the filtered signal values by de-quantizer 414. Since the present invention utilizes multi-rate sampling, the number of filtered signal values depends on the specific frequency bands. In the case in point, there are 21 such bands. As discussed above, the five highest bands are sampled at 8 times the rate of the lowest 8 frequency bands, and the intermediate frequency bands are sampled at twice the rate of the lowest frequency bands. The filtered signal values are indicated by ^kS_m, where m indicates the frequency band, and k indicates the number of the signal value relative to the lowest frequency bands, i.e., k runs from 1 to 8 for the highest frequency bands, and 1 to 2 for the intermediate frequency bands.

The filtered samples are inputted to an inverse sub-band filter 426 which generates an approximation to the original audio signal from the filtered signal values. Filter 402 shown in FIG. 9 and filter 426 form a perfect, or near perfect, reconstruction filter bank. Hence, if the filtered samples had not been replaced by approximations thereto by quantizer 404, the decompressed signal generated by filter bank 426 would exactly match the original audio signal input to filter 402 to a specified precision.

Inverse sub-band filter bank 426 also comprises a tree-structured filter bank. To distinguish the filters used in the inverse sub-band filters from those used in the sub-band filter banks which generated the filtered audio samples, the inverse filter banks will be referred to as synthesizers. The filtered signal values enter the tree at the leaf nodes thereof, and the reconstructed audio signal exits from the root node of the tree. The low and intermediate filtered samples pass through two levels of synthesizers. The first level of synthesizers are shown at 427 and 428. For each group of four filtered signal values accepted by synthesizers 427 and 428, four sequential values which represent filtered signal values in a frequency band which is four times wider are generated. Similarly, for each group of eight filtered signal values accepted by synthesizer 429, eight sequential values which represent filtered signal values in a frequency band which is eight times as wide are generated. Hence, the number of signal values entering synthesizer 430 on each input is now the same even though the number of signal values provided by de-quantizer 414 for each frequency band varied from band to band.

The synthesis of the audio signal from the sub-band components is carried out by analogous operations. Given M sub-band components that were obtained from 2M polyphase components P_i, the original polyphase components can be obtained from the following matrix multiplication:

$\begin{matrix} P_{i} = \sum_{k = 0}^{M - 1} S_{k} \cos\left[\frac{(i + \frac{M}{2}) (2 k + 1) \pi}{2 M}\right] & (5 a) \end{matrix}$

As noted above, there are a number of different cosine modulations that may be used. Eq. (5a) corresponds to modulation using the relationship shown in Eq. 3(a). If the modulation shown in Eq. 3(b) is utilized, then the polyphase components are obtained from the following matrix multiplication:

$\begin{matrix} P_{i} = \sum_{k = 0}^{M - 1} S_{k} \cos\left[\frac{(2 i + 1 + M) (2 k + 1) \pi}{4 M}\right] & (5 b) \end{matrix}$
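The matrix multiplications of Eqs. (5a) and (5b) may be sketched in Python as follows. The function name and the `variant` parameter are hypothetical, and the sketch assumes that the index i runs over the 2M polyphase components, i.e., from 0 to 2M−1.

```python
import math


def polyphase_from_subbands(S, variant="a"):
    """Map M sub-band components S_k to 2M polyphase components P_i."""
    M = len(S)
    P = []
    for i in range(2 * M):  # assumed range of i: 0 .. 2M-1
        total = 0.0
        for k in range(M):
            if variant == "a":  # cosine argument of Eq. (5a)
                angle = (i + M / 2) * (2 * k + 1) * math.pi / (2 * M)
            else:               # cosine argument of Eq. (5b)
                angle = (2 * i + 1 + M) * (2 * k + 1) * math.pi / (4 * M)
            total += S[k] * math.cos(angle)
        P.append(total)
    return P
```

Each call performs 2M² multiplications and additions, since every one of the 2M outputs sums over all M sub-band components.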

The time domain samples x_k are computed from the polyphase components by the inverse of the windowing transform described above. A block diagram of a synthesizer according to the present invention is shown in FIG. 11 at 500. The M frequency components are first transformed into the corresponding polyphase components by a matrix multiplication shown at 510. The resultant 2M polyphase components are then shifted into a 2W entry shift register 512 and the oldest 2M values in the shift register are shifted out and discarded. The contents in the shift register are inputted to array generator 513 which builds a W value array 514 by iterating the following loop 8 times: take the first M samples from shift register 512, ignore the next 2M samples, then take the next M samples. The contents of array 514 are then multiplied by W weight coefficients, h′_i, which are related to the h_i used in the corresponding sub-band analysis filter, to generate a set of weighted values w_i=h′_i*u_i, which are stored in array 516. Here the u_i are the contents of array 514. The M time domain samples, x_j for j=0, . . . M−1, are then generated by summing circuit 518 which sums the appropriate w_i values, i.e.,

$\begin{matrix} x_{j} = \sum_{i = 0}^{W / M - 1} w_{j + Mi} & (6) \end{matrix}$
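The weighting and summing steps that produce Eq. (6) can be sketched as below. This is an illustration only: the function name is hypothetical, and the construction of the array u from the shift register is not shown.

```python
def synthesize_block(u, h_prime, M):
    """Weight the W-value array u by h_prime and sum it into M time samples."""
    W = len(u)
    if len(h_prime) != W or W % M != 0:
        raise ValueError("u and h_prime need W entries, with W a multiple of M")
    # Weighted values w_i = h'_i * u_i (array 516 in the description).
    w = [h * x for h, x in zip(h_prime, u)]
    # Eq. (6): x_j is the sum of w_(j + M*i) for i = 0 .. W/M - 1.
    return [sum(w[j + M * i] for i in range(W // M)) for j in range(M)]
```

For example, with M = 2, u = [1, 2, 3, 4] and unit weights, the two output samples are 1 + 3 = 4 and 2 + 4 = 6.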

While the above-described embodiments of synthesizers and sub-band analysis filters are described in terms of special purpose hardware for carrying out the various operations, it will be apparent to those skilled in the art that the entire operation may be carried out on a general purpose digital computer.

As pointed out above, it would be advantageous to provide a single high-quality compressed audio signal that could be played back on a variety of playback platforms having varying computational capacities. Each such playback platform would reproduce the audio material at a quality consistent with the computational resources of the platform.

Furthermore, the quality of the playback should be capable of being varied in real time as the computational capability of the platform varies. This last requirement is particularly important in playback systems comprising multi-tasking computers. In such systems, the available computational capacity for the audio material varies in response to the computational needs of tasks having equal or higher priority. Prior art decompression systems do not provide this capability.

The present invention allows the quality of the playback to be varied in response to the computational capability of the playback platform without the use of multiple copies of the compressed material. Consider an audio signal that has been compressed using a sub-band analysis filter bank in which the window contains W audio samples. The computational workload required to decompress the audio signal is primarily determined by the computations carried out by the synthesizers. The computational workload inherent in a synthesizer is W multiplies and adds from the windowing operations and 2M² multiplies and adds from the matrix multiplication. The extent to which the filters approximate an ideal band pass filter, in general, depends on the number of samples in the window, i.e., W. As the number of samples increases, the discrepancy between the sub-band analysis filter performance and that of an ideal band pass filter decreases. For example, a filter utilizing 128 samples has a side lobe suppression in excess of 48 dB, while a filter utilizing 512 samples has a side lobe suppression in excess of 96 dB. Hence, synthesis quality can be traded for a reduction in computational workload if a smaller window is used for the synthesizers.
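As a worked example of the workload figures given above, the following Python sketch counts the multiply-add operations per synthesizer pass for the two window sizes mentioned. The value M = 32 is an assumption chosen for illustration and is not stated in this passage.

```python
def synth_ops(W, M):
    """Multiply-adds per pass: W for the windowing, 2*M*M for the matrix."""
    return W + 2 * M * M


M = 32  # assumed number of frequency components per synthesizer
full_window = synth_ops(512, M)   # 512 + 2048 = 2560
short_window = synth_ops(128, M)  # 128 + 2048 = 2176
saved = full_window - short_window
```

Under this assumption the shorter window removes 384 of the 2560 operations, so the matrix multiplication remains the larger share of the workload.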

In the preferred embodiment of the present invention, the size of the window used to generate the sub-band analysis filters in the compression system is chosen to provide filters having 96 dB rejection of signal energy outside a filter band. This value is consistent with playback on a platform having 16 bit D/A converters. In the preferred embodiment of the present invention, this condition can be met by 512 samples. The prototype filter coefficients, h_i, viewed as a function of i have a more or less sine-shaped appearance with tails extending from a maximum. The tails provide the corrections which result in the 96 dB rejection. If the tails are truncated, the filter bands would have substantially the same bandwidths and center frequencies as those obtained from the non-truncated coefficients. However, the rejection of signal energy outside a specific filter's band would be less than the 96 dB discussed above. As a result, a compression and decompression system based on the truncated filter would show significantly more aliasing than the non-truncated filter.

The present invention utilizes this observation to trade sound quality for a reduction in computational workload in the decompression apparatus. In the preferred embodiment of the present invention, the audio material is compressed using filters based on a non-truncated prototype filter. When the available computational capacity of the playback platform is insufficient to provide decompression using synthesis filters based on the non-truncated prototype filter, synthesizers based on the truncated filters are utilized. Truncating the prototype filter leads to synthesizers which have the same size window as those based on the non-truncated prototype. However, many of the filter coefficients used in the windowing operation are zero. Since the identity of the coefficients which are now zero is known, the multiplications and additions involving these coefficients can be eliminated. It is the elimination of these operations that provides the reduced computational workload.
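The saving obtained from a truncated prototype filter can be sketched as follows. The helper names are hypothetical, and the sketch assumes that truncation keeps a central run of coefficients and sets the tails on both sides to zero.

```python
def truncate_window(h, keep):
    """Zero the tails of h, keeping `keep` central coefficients (assumed layout)."""
    start = (len(h) - keep) // 2
    return [c if start <= i < start + keep else 0.0 for i, c in enumerate(h)]


def nonzero_taps(h):
    """Indices of the nonzero coefficients; computed once and then reused."""
    return [i for i, c in enumerate(h) if c != 0.0]


def apply_window(u, h, taps):
    """Weighted values w_i = h_i * u_i, computed only at the nonzero taps."""
    w = [0.0] * len(h)
    for i in taps:
        w[i] = h[i] * u[i]
    return w
```

The window keeps its original length, but the loop in `apply_window` performs only as many multiplications as there are nonzero coefficients.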

It should be noted that many playback platforms use D/A converters with less than 16 bits. In these cases, the full 96 dB rejection is beyond the capability of the platform; hence, the system performance will not be adversely affected by using the truncated filter. These platforms also tend to be the less expensive computing systems, and hence, have lower computational capacity. Thus, the trade-off between computational capacity and audio quality is made at the filter level, and the resultant system provides an audio quality which is limited by its D/A converters rather than its computational capacity.

Another method for trading sound quality for a reduction in computational workload is to eliminate the synthesis steps that involve specific high-frequency components. If the sampled values in one or more of the high-frequency bands are below some predetermined threshold value, then the values can be replaced by zero. Since the specific components for which the substitution is made are known, the multiplications and additions involving these components may be eliminated, thereby reducing the computational workload. The magnitude of the distortion generated in the reconstructed audio signal will, of course, depend on the extent of the error made in replacing the sampled values by zeros. If the original values were small, then the degradation will be small. This is more often the case for the high-frequency filtered samples than for the low frequency filtered samples. In addition, the human auditory system is less sensitive at high frequencies; hence, the distortion is less objectionable.

It should also be noted that the computational workload inherent in decompressing a particular piece of audio material varies during the material. For example, the high-frequency filtered samples may only have a significant amplitude during parts of the sound track. When the high-frequency components are not present or are sufficiently small to be replaced by zeros without introducing noticeable distortions, the computational workload can be reduced by not performing the corresponding multiplications and additions. When the high-frequency components are large, e.g., during attacks, the computational workload is much higher.

It should be noted that the computational work associated with generating the P_k values from the S_i values can be organized by S_i. That is, the contribution to each P_k from a given S_i is calculated, then the contribution to each P_k from S_{i+1}, and so on. Since there are 2M P values involved with each value of S, the overhead involved in testing each value of S before proceeding with the multiplications and additions is small compared to the computations saved if a particular S value is 0 or deemed to be negligible. In the preferred embodiment of the present invention, the computations associated with S_i are skipped if the absolute value of S_i is less than some predetermined value, ε.
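Organizing the matrix multiplication by S_i, as described above, may be sketched in Python as follows. The function name and the returned counter are hypothetical; the cosine term is that of Eq. (5a) with the index names used in this paragraph.

```python
import math


def polyphase_skipping_small(S, eps):
    """Accumulate the P_k one S_i at a time, skipping any |S_i| < eps."""
    M = len(S)
    P = [0.0] * (2 * M)
    used = 0  # number of S_i values that were actually processed
    for i, s in enumerate(S):
        if abs(s) < eps:
            continue  # one test saves 2M multiplications and additions
        used += 1
        for k in range(2 * M):
            angle = (k + M / 2) * (2 * i + 1) * math.pi / (2 * M)
            P[k] += s * math.cos(angle)
    return P, used
```

With S = [1.0, 0.001, 0.0, 0.5] and ε = 0.01, only two of the four components are processed, so half of the matrix multiplication is avoided.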

Because of the variation in workload, the preferred embodiment of the present invention utilizes a buffering system to reduce the required computational capacity from that needed to accommodate the peak workload to that needed to accommodate the average workload. In addition, this buffering facilitates the use of the above-described techniques for trading off the required computational capacity against sound quality. For example, when the computational workload is determined to be greater than that available, the value of ε can be increased which, in turn, reduces the number of calculations needed to generate the P_k values.

A block diagram of an audio decompression system utilizing the above-described variable computational load techniques is shown in FIG. 12 at 600. The incoming compressed audio stream is decoded by decoder 602 and de-quantizer 604 to generate sets of frequency components {S_i} which are used to reconstruct the time domain audio signal values. The output of synthesizer 606 is loaded into a FIFO buffer 608 which feeds a set of D/A converters 610 at a constant rate determined by clock 609. The outputs of the D/A converters are used to drive speakers 612. Buffer 608 generates a signal that indicates the number of time domain samples stored therein. This signal is used by controller 614 to adjust the parameters that control the computational complexity of the synthesis operations in synthesizer 606. When this number falls below a predetermined minimum value, the computational algorithm used by synthesizer 606 is adjusted to reduce the computational complexity, thereby increasing the number of time domain samples generated per unit time. For example, controller 614 can increase the value of ε described above. Alternatively, controller 614 could force all of the high-frequency components from bands having frequencies above some predetermined frequency to be zero. In this case, controller 614 also instructs de-quantizer 604 not to unpack the high-frequency components that are not going to be used in the synthesis of the signal. This provides additional computational savings. Finally, controller 614 could change the windowing algorithm, i.e., use a truncated prototype filter.

If the number of stored values exceeds a second predetermined value, controller 614 adjusts the computational algorithm to regain audio quality if synthesizer 606 is not currently running in a manner that provides the highest audio quality. In this case, controller 614 reverses the approximations introduced into synthesizer 606 discussed above.
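The two-threshold behaviour of controller 614 can be illustrated by the following Python sketch, in which the threshold ε is the adjusted parameter. The function name, the limits on ε, and the increment are hypothetical values chosen for illustration.

```python
def adjust_epsilon(buffered, eps, low_mark, high_mark,
                   eps_min=0.0, eps_max=0.1, step=0.01):
    """Return the new skip threshold given the number of samples in the FIFO."""
    if buffered < low_mark:
        # The buffer is draining: raise eps to cut the synthesis workload.
        return min(eps + step, eps_max)
    if buffered > high_mark:
        # The buffer is well filled: lower eps to regain audio quality.
        return max(eps - step, eps_min)
    return eps  # between the two marks, leave the setting unchanged
```

Keeping the two marks apart avoids switching the setting back and forth when the buffer level sits near a single threshold.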

While audio decompression system 600 has been discussed in terms of individual computational elements, it will be apparent to those skilled in the art that the functions of decoder 602, de-quantizer 604, synthesizer 606, buffer 608 and controller 614 can be implemented on a general purpose digital computer. In this case, the functions provided by clock 609 may be provided by the computer's clock circuitry.

In stereophonic decompression systems having parallel computational capacity, two synthesizers may be utilized. A stereophonic decompression system according to the present invention is shown in FIG. 13 at 700. The incoming compressed audio signal is decoded by a decoder 702 and de-quantized by de-quantizer 704 which generates two sets of frequency components 705 and 706. Set 705 is used to regenerate the time domain signal for the left channel with the aid of synthesizer 708, and set 706 is used to generate the time domain signal for the right channel with the aid of synthesizer 709. The outputs of the synthesizers are stored in buffers 710 and 712 which feed time domain audio samples at regular intervals to D/A converters 714 and 715, respectively. The timing of the signal feed is determined by clock 720. The operation of decompression system 700 is controlled by a controller 713 which operates in a manner analogous to controller 614 described above.

If a stereophonic decompression system does not have parallel computational capacity, then the regeneration of the left and right audio channels must be carried out by time-sharing a single synthesizer. When the computational workload exceeds the capacity of the decompression system, the trade-offs discussed above may be utilized to trade audio quality for a reduction in the computational workload. In addition, the computational workload may be reduced by switching to a monaural reproduction mode, thereby reducing the computational workload imposed by the synthesis operations by a factor of two.

A stereophonic decompression system using this type of serial computation system is shown in FIG. 14 at 800. The incoming compressed audio signal is decoded by a decoder 802 and de-quantized by de-quantizer 804 which generates sets of frequency components for use in synthesizing the left and right audio signals. When there is sufficient computational capacity available to synthesize both left and right channels, controller 813 time shares synthesizer 806 with the aid of switches 805 and 806. When there is insufficient computational capacity, controller 813 causes switch 805 to construct a single set of frequency components by averaging the corresponding frequency components in the left and right channels. The resulting set of frequency components is then used to synthesize a single set of monaural time domain samples which is stored in buffers 810 and 812.
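The construction of the single set of frequency components can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the left and right sets contain the same number of components in the same band order.

```python
def to_mono(left, right):
    """Average corresponding frequency components of the two channels."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of components")
    return [(l + r) / 2 for l, r in zip(left, right)]


# Hypothetical frequency components for one block of each channel.
mono = to_mono([1.0, 2.0, -0.5], [3.0, 0.0, 0.5])
```

Since the matrix multiplication, windowing and summing performed by the synthesizer are all linear, synthesizing the averaged components once gives the same result as averaging the two separately synthesized channels, at half the synthesis workload.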

The techniques described above for varying the computational complexity required to synthesize a signal may also be applied to vary the computational complexity required to analyze a signal. This is particularly important in situations in which the audio signal must be compressed in real time prior to being distributed through a communication link having a capacity which is less than that needed to transmit the uncompressed audio signal. If a computational platform having sufficient capacity to compress the audio signal at full audio quality is available, the methods discussed above can be utilized.

However, there are situations in which the computational capacity of the compression platform may be limited. This can occur when the computational platform has insufficient computing power, or in cases in which the platform performing the compression may also include a general purpose computer that is time-sharing its capacity among a plurality of tasks. In the latter case, the ability to trade off computational workload against audio quality is particularly important.

A block diagram of an audio compression apparatus 850 utilizing variable computational complexity is shown in FIG. 15 at 850. Compression apparatus 850 must provide a compressed signal to a communication link. For the purposes of this discussion, it will be assumed that the communication link requires a predetermined amount of data for regenerating the audio signal at the other end of the communication link. Incoming audio signal values from an audio source such as microphone 852 are digitized and stored in buffer 854. In the case of stereophonic systems, a second audio stream is provided by microphone 851. To simplify the following discussion, it will be assumed that apparatus 850 is operating in a monaural mode unless otherwise indicated. In this case, only one of the microphones provides signal values.

When M such signal values have been received, sub-band analysis filter bank 856 generates M signal components from these samples while the next M audio samples are being received. The signal components are then quantized and coded by quantizer 858 and stored in an output buffer 860. The compressed audio signal data is then transmitted to the communication link at a regular rate that is determined by clock 862 and controller 864.

Consider the case in which sub-band analysis filter bank 856 utilizes a computational platform that is shared with other applications running on the platform. When the computational capacity is restricted, sub-band analysis filter bank 856 will not be able to process incoming signal values at the same rate at which said signal values are received. As a result, the number of signal values stored in buffer 854 will increase. Controller 864 periodically senses the number of values stored in buffer 854. If the number of values exceeds a predetermined number, controller 864 alters the operation of sub-band analysis filter bank 856 in a manner that decreases the computational workload of the analysis process. The audio signal synthesized from the resulting compressed audio signal will be of lesser quality than the original audio signal; however, compression apparatus 850 will be able to keep up with the incoming data rate. When controller 864 senses that the number of samples in buffer 854 has returned to a safe operating level, it alters the operation of sub-band analysis filter bank 856 in such a manner that the computational workload and audio quality increase.
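The buffer-driven control described above may be sketched as follows. The class name, the two thresholds `high_mark` and `low_mark`, and the integer quality levels are all assumptions made for illustration; the specification only refers to "a predetermined number" and "a safe operating level".

```python
class WorkloadController:
    """Hypothetical controller that steps a quality level from buffer fill."""

    def __init__(self, high_mark, low_mark, max_level):
        if low_mark >= high_mark:
            raise ValueError("low_mark must be below high_mark")
        self.high_mark = high_mark   # above this, reduce the workload
        self.low_mark = low_mark     # at or below this, restore quality
        self.max_level = max_level   # level 0 = least work, max_level = full quality
        self.level = max_level

    def sense(self, buffered):
        """Called periodically with the number of values waiting in the buffer."""
        if buffered > self.high_mark and self.level > 0:
            self.level -= 1          # falling behind: cheaper analysis
        elif buffered <= self.low_mark and self.level < self.max_level:
            self.level += 1          # caught up: better audio quality
        return self.level


ctrl = WorkloadController(high_mark=1024, low_mark=256, max_level=2)
print([ctrl.sense(n) for n in [100, 1500, 1600, 800, 200, 100]])  # [2, 1, 0, 0, 1, 2]
```

Using two separate thresholds is a design choice of this sketch; it keeps the controller from toggling between levels when the buffer fill hovers near a single limit.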

Many of the techniques described above may be used to vary the computational workload of the sub-band analysis filter bank. First, the prototype filter may be replaced by a shorter filter or a truncated filter, thereby reducing the computational workload of the windowing operation. Second, the higher frequency signal components can be replaced by zeros. This has the effect of reducing “M” and thereby reducing the computational workload.

Third, in stereophonic systems, the audio signals from each of the microphones 851 and 852 can be combined by circuitry in buffer 854 to form a monaural signal which is analyzed. The compressed monaural signal is then used for both the left and right channel signals.
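The first two techniques listed above can be illustrated with the following minimal sketch. The function names are hypothetical, and keeping the centre taps when truncating is an assumption; the specification does not say which taps of the prototype filter are retained.

```python
def truncate_prototype(taps, length):
    """Keep the `length` centre taps of a prototype filter (assumed policy)."""
    if not 0 < length <= len(taps):
        raise ValueError("length must be between 1 and len(taps)")
    start = (len(taps) - length) // 2
    return taps[start:start + length]


def zero_high_components(components, keep):
    """Replace every component above the lowest `keep` with zero."""
    return list(components[:keep]) + [0.0] * (len(components) - keep)


print(truncate_prototype([1, 2, 3, 4, 5, 6], 4))       # [2, 3, 4, 5]
print(zero_high_components([4.0, 3.0, 2.0, 1.0], 2))   # [4.0, 3.0, 0.0, 0.0]
```

A shorter window means fewer multiply-accumulate operations per frame, and components known to be zero need not be computed or quantized at all.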

However, for the purposes of the present discussion, it is sufficient to note that the filters may be implemented as finite impulse response filters with real filter coefficients. If the synthesis filter generates M coefficients per frame representing the amplitude of the transmitted signal, the filter bank accepts M frequency-domain symbols and generates M time-domain coefficients. However, it should be noted that the M coefficients generated may also depend on symbols received prior to the M frequency-domain symbols of the current frame. Similarly, the analysis filter bank demodulates M frequency-domain symbols from M time-domain received signal values in a given frame, and the resulting M symbols may depend on previous frames of M time-domain signal values processed by the filter bank.

The communication bandwidth may alternatively be broken up into subbands of distinct (nonuniform) bandwidths by means of a single nonuniform filter bank transform. The synthesis filter bank, or frequency-domain-to-time-domain transform for converting symbols into signal values for transmission, is depicted in FIG. 16 at 300 for a system having K subchannels. If the subchannels are nonuniform in their bandwidth, distinct subchannels of the filter bank will operate at different upsampling rates. The upsampling rate of the k^th subchannel will be denoted by M_k. The upsampling rates are subject to the critical sampling condition

$\begin{matrix} \sum_{k = 0}^{K - 1} \frac{1}{M_{k}} = 1 & (7) \end{matrix}$

Referring to FIG. 16, synthesis filter bank 300 generates M_tot time-domain samples in each time frame. Here, M_tot is the least common multiple of the upsampling rates M_k provided by the upsamplers, of which 302 is typical. Define the integers n_k by

$\begin{matrix} n_{k} = \frac{M_{tot}}{M_{k}} . & (8) \end{matrix}$
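As a worked check of these definitions, the following sketch verifies the critical sampling condition for a set of upsampling rates and computes M_tot and the integers n_k, taking n_k as M_tot divided by the upsampling rate M_k. The function name `frame_layout` and the example rates are illustrative assumptions.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def frame_layout(rates):
    """Return (M_tot, [n_k, ...]) for a list of upsampling rates M_k."""
    # Critical sampling: the reciprocals of the rates must sum to exactly 1.
    if sum(Fraction(1, m) for m in rates) != 1:
        raise ValueError("rates violate the critical sampling condition")
    # M_tot is the least common multiple of the rates.
    m_tot = reduce(lambda a, b: a * b // gcd(a, b), rates)
    # n_k is the number of symbols carried by subchannel k in each frame.
    return m_tot, [m_tot // m for m in rates]


print(frame_layout([2, 4, 4]))  # (4, [2, 1, 1])
```

For rates 2, 4 and 4, one subchannel occupies half of the bandwidth and carries two symbols per frame, while the other two each occupy a quarter and carry one symbol. The n_k always sum to M_tot, so each frame maps M_tot symbols onto M_tot time-domain samples.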

In each frame of transform processing, n_k symbols, denoted by s_k,i, are mapped onto the k^th subchannel using the sequence f_k as the modulating waveform to generate a time domain sequence, x_k, representing the symbols in the k^th subchannel, i.e.,

$\begin{matrix} x_{k} [n] = \sum_{i} s_{k, i} f_{k} [n - i M_{k}] & (9) \end{matrix}$

Note that symbols from previous frames may contribute to the output of a given frame. The contributions x_k from the K distinct subchannels are added together, as shown at 301, to produce a set of M_tot time-domain signal values x[n] from M_tot input symbols s_k,i during the given frame. The k^th subchannel will have a bandwidth that is 1/M_k as large as that occupied by the full transmitted signal.
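The synthesis operation can be sketched directly from the sum over shifted modulating waveforms given above. The function and parameter names are hypothetical, and the short integer-valued waveforms in the example are Haar-like sequences chosen only to make the arithmetic easy to follow; they are not waveforms taken from this specification.

```python
def synthesize(symbols, waveforms, rates, length):
    """Return x[n] = sum over k and i of s[k][i] * f_k[n - i * M_k], 0 <= n < length."""
    x = [0.0] * length
    for s_k, f_k, m_k in zip(symbols, waveforms, rates):
        for i, s in enumerate(s_k):
            # Symbol i of subchannel k places a copy of f_k starting at sample i * M_k.
            for j, tap in enumerate(f_k):
                n = i * m_k + j
                if n < length:
                    x[n] += s * tap
    return x


rates = [2, 4, 4]                                    # M_k for K = 3 subchannels
waveforms = [[1, -1], [1, 1, 1, 1], [1, 1, -1, -1]]  # illustrative f_k only
symbols = [[1, 2], [3], [4]]                         # n_k = 2, 1, 1 symbols per frame
print(synthesize(symbols, waveforms, rates, length=4))  # [8.0, 6.0, 1.0, -3.0]
```

In this example each waveform is no longer than its upsampling rate, so nothing spills into the next frame. With longer waveforms, the samples beyond `length` that this sketch discards would instead be carried over and added to the following frame, which is how symbols from previous frames come to contribute to a given frame.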

At the receiver, the incoming discrete signal values x′[n] are passed through an analysis filter bank 400, depicted in FIG. 17. The received signal values are denoted by x′ to emphasize that the samples have been altered by the transmission link. Each filter in this bank has a characteristic downsampling ratio M_k imposed after filtering by a finite impulse response filter, producing a set of M_tot output symbols s′ per frame. A typical filter is shown at 401 with its corresponding downsampler at 402. The output symbol stream for the k^th subchannel is given by

$\begin{matrix} s_{k, n}^{'} = \sum_{i} x^{'} [i - n M_{k}] * H_{k} [i] & (10) \end{matrix}$

Again, input signal values from preceding frames may contribute to the set of symbols output during a given frame.

We require that, in an ideal channel, the subchannel waveforms f_k, together with the receive filters H_k, satisfy perfect-reconstruction or near-perfect-reconstruction conditions, with an output symbol stream that is identical (except for a possible delay of an integer number of samples) to the input symbol stream. This is equivalent to the absence of inter-symbol and inter-channel interference upon reconstruction. Methods for the design of such finite-impulse-response filter bank waveforms are known in the art. The reader is referred to J. Li, T. Q. Nguyen, S. Tantaratana, “A simple design method for nonuniform multirate filter banks,” in Proc. Asilomar Conf. on Signals, Systems, and Computers, November 1994, for a detailed discussion of such filter banks.
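One simple special case in which the perfect-reconstruction condition can be checked numerically is sketched below. It assumes, purely for illustration, that each receive filter is matched to its transmit waveform and that every waveform fits within one frame, so that the condition reduces to the shifted waveforms of one frame being orthonormal. The function names and the Haar-like waveforms are assumptions, not designs taken from the cited reference.

```python
from math import isclose, sqrt


def frame_basis(waveforms, rates, m_tot):
    """List every shifted waveform of one frame as a length-M_tot vector.

    Assumes len(f_k) <= M_k so that no waveform crosses a frame boundary.
    """
    basis = []
    for f_k, m_k in zip(waveforms, rates):
        for i in range(m_tot // m_k):
            row = [0.0] * m_tot
            for j, tap in enumerate(f_k):
                row[i * m_k + j] = tap
            basis.append(row)
    return basis


def is_orthonormal(basis, tol=1e-12):
    """True if all pairwise inner products equal 1 (same vector) or 0 (different)."""
    for a, u in enumerate(basis):
        for b, v in enumerate(basis):
            dot = sum(p * q for p, q in zip(u, v))
            if not isclose(dot, 1.0 if a == b else 0.0, abs_tol=tol):
                return False
    return True


r = 1 / sqrt(2)
waveforms = [[r, -r], [0.5, 0.5, 0.5, 0.5], [0.5, 0.5, -0.5, -0.5]]
print(is_orthonormal(frame_basis(waveforms, [2, 4, 4], 4)))  # True
```

A zero inner product between two shifts of the same waveform corresponds to no inter-symbol interference, and a zero inner product between waveforms of different subchannels corresponds to no inter-channel interference. Practical filter banks use longer, overlapping waveforms, for which the check must also span neighboring frames.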

Various modifications to the present invention will become apparent to those skilled in the art from the foregoing description and accompanying drawings. Accordingly, the present invention is to be limited solely by the scope of the following claims.

INVENTORS:

Tzannes, Michael A., Jayasimha, Sriram, Stautner, John P., Heller, Peter N., Morrell, William R.

THIS PATENT IS REFERENCED BY THESE PATENTS:

Patent	Priority	Assignee	Title
10687059,	Oct 01 2012	DOLBY VIDEO COMPRESSION, LLC	Scalable video coding using subblock-based coding of transform coefficient blocks in the enhancement layer
11134255,	Oct 01 2012	DOLBY VIDEO COMPRESSION, LLC	Scalable video coding using inter-layer prediction contribution to enhancement layer prediction
11477467,	Oct 01 2012	DOLBY VIDEO COMPRESSION, LLC	Scalable video coding using derivation of subblock subdivision for prediction from base layer
11575921,	Oct 01 2012	DOLBY VIDEO COMPRESSION, LLC	Scalable video coding using inter-layer prediction of spatial intra prediction parameters
11589062,	Oct 01 2012	DOLBY VIDEO COMPRESSION, LLC	Scalable video coding using subblock-based coding of transform coefficient blocks in the enhancement layer
12155867,	Oct 01 2012	DOLBY VIDEO COMPRESSION, LLC	Scalable video coding using inter-layer prediction contribution to enhancement layer prediction
8255212,	Jul 04 2006	DOLBY INTERNATIONAL AB	Filter compressor and method for manufacturing compressed subband filter impulse responses

THIS PATENT REFERENCES THESE PATENTS:

Patent	Priority	Assignee	Title
3976863,	Jul 01 1974	Alfred, Engel	Optimal decoder for non-stationary signals
4157455,	Jul 14 1976	Pioneer Electronic Corporation	FM stereophonic receiver having muting and mode changing
4251690,	Jul 28 1978	Toko, Inc.	Frequency-modulation stereophonic receiver
4399325,	Dec 28 1979	SANYO ELECTRIC CO , LTD	Demodulating circuit for controlling stereo separation
4479235,	May 08 1981	RCA LICENSING CORPORATION, TWO INDEPENDENCE WAY, PRINCETON, NJ 08540, A CORP OF DE	Switching arrangement for a stereophonic sound synthesizer
4653095,	Feb 06 1986		AM stereo receivers having platform motion protection
4713776,	May 16 1983	NEC Corporation	System for simultaneously coding and decoding a plurality of signals
4747142,	Jul 25 1985	TOFTE SOUND SYSTEMS, INC , A CORP OF OR	Three-track sterophonic system
4833715,	Mar 06 1987	ALPS Electric Co., Ltd.	FM stereo receiver
5048054,	May 12 1989	CIF LICENSING, LLC	Line probing modem
5170413,	Dec 24 1990	Motorola, Inc.	Control strategy for reuse system assignments and handoff
5225904,	Sep 15 1989	Intel Corporation	Adaptive digital video compression system
5243629,	Sep 03 1991	AT&T Bell Laboratories	Multi-subcarrier modulation for HDTV transmission
5285474,	Jun 12 1992	Silicon Valley Bank	Method for equalizing a multicarrier signal in a multicarrier communication system
5285498,	Mar 02 1992	AT&T IPM Corp	Method and apparatus for coding audio signals based on perceptual model
5347499,	Feb 27 1992	Samsung Electronics Co., Ltd.	Circuit for selectively setting a monaural playback channel in a stereo audio apparatus
5408580,	Sep 21 1992	HYBRID AUDIO, LLC	Audio compression system employing multi-rate signal analysis
5479447,	May 03 1993	BOARD OF TRUSTEES OF THE LELAND STANFORD, JUNIOR UNIVERSITY, THE	Method and apparatus for adaptive, variable bandwidth, high-speed data transmission of a multicarrier signal over digital subscriber lines
5515442,	Feb 25 1994	Sony Corporation; SONY ELECTRONICS CORPORATION	Mono/stereo switching circuit
5583962,	Jan 08 1992	Dolby Laboratories Licensing Corporation	Encoder/decoder for multidimensional sound fields
5606642,	Sep 21 1992	HYBRID AUDIO, LLC	Audio decompression system employing multi-rate signal analysis
5699432,	Nov 16 1995	Bayerische Motoren Werke Aktiengesellschaft	Circuit for mobile radio receivers
5771293,	Nov 16 1995	Bayerische Motoren Werke Aktiengesellschaft	Switching arrangement for mobile radio receivers
6252909,	Sep 12 1992	HYBRID AUDIO, LLC	Multi-carrier transmission system utilizing channels of different bandwidth
RE40281,	Sep 21 1992	HYBRID AUDIO, LLC	Signal processing utilizing a tree-structured array

ASSIGNMENT RECORDS:

Executed on	Assignor	Assignee	Conveyance	Frame	Reel	Doc
Sep 14 2007		Hybrid Audio LLC	(assignment on the face of the patent)
Dec 10 2010	AWARE, INC	Hybrid Audio LLC	ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS	025534	0671	pdf
Mar 28 2016	HYBRID AUDIO, LLC	HYBRID AUDIO, LLC	ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS	038115	0877	pdf

MAINTENANCE FEES AND DATES:

Date	Maintenance Schedule
Nov 22 2014	4 years fee payment window open
May 22 2015	6 months grace period start (w surcharge)
Nov 22 2015	patent expiry (for year 4)
Nov 22 2017	2 years to revive unintentionally abandoned end. (for year 4)
Nov 22 2018	8 years fee payment window open
May 22 2019	6 months grace period start (w surcharge)
Nov 22 2019	patent expiry (for year 8)
Nov 22 2021	2 years to revive unintentionally abandoned end. (for year 8)
Nov 22 2022	12 years fee payment window open
May 22 2023	6 months grace period start (w surcharge)
Nov 22 2023	patent expiry (for year 12)
Nov 22 2025	2 years to revive unintentionally abandoned end. (for year 12)