An apparatus for processing an audio signal and a method thereof are disclosed. The present invention includes receiving, by an audio processing apparatus, an audio signal including a first data of a first block encoded with a rectangular coding scheme and a second data of a second block encoded with a non-rectangular coding scheme; receiving a compensation signal corresponding to the second block; estimating a prediction of an aliasing part using the first data; and obtaining a reconstructed signal for the second block based on the second data, the compensation signal and the prediction of the aliasing part.
|
1. A method for processing an audio signal, comprising:
receiving, by an audio processing apparatus, an audio signal including a first data of a first block encoded with a rectangular first coding scheme using a rectangular window and a second data of a second block encoded with a non-rectangular second coding scheme using a non-rectangular window;
generating an output signal for the first block using the first data of the first block based on the rectangular first coding scheme;
receiving a compensation signal corresponding to the second block;
obtaining a prediction of an aliasing part by applying a window of the second block to the output signal for the first block; and
obtaining a reconstructed signal for the second block based on the second data, the compensation signal and the prediction of the aliasing part, wherein, when the first data is encoded with a lpd (Linear prediction Domain) coding scheme and the window of the second block belongs to a transition window class, the window of the second block has an ascending line with a first slope,
wherein the first slope is gentler than a second slope.
6. An apparatus for processing an audio signal, comprising:
a de-multiplexer receiving an audio signal including a first data of a first block encoded with a rectangular first coding scheme using a rectangular window and a second data of a second block encoded with a non-rectangular second coding scheme using a non-rectangular window, and receiving a compensation signal corresponding to the second block;
a rectangular decoding unit generating an output signal for the first block using the first data of the first block based on the rectangular first coding scheme, and obtaining a prediction of an aliasing part by applying a window of the second block to the output signal for the first block; and
a non-rectangular decoding unit obtaining a reconstructed signal for the second block based on the second data, the compensation signal and the prediction of the aliasing part, wherein, when the first data is encoded with a lpd (Linear prediction Domain) coding scheme and the window of the second block belongs to a transition window class, the window of the second block has an ascending line with a first slope,
wherein the first slope is gentler than a second slope.
2. The method of
3. The method of
the long_stop window and the stop_start window have horizontal-asymmetry, and have a zero part in a left half.
4. The method of
5. The method of
7. The apparatus of
8. The apparatus of
the long_stop window and the stop_start window have horizontal-asymmetry, and have a zero part in a left half.
9. The apparatus of
10. The apparatus of
|
In Formula 1, ‘C’ indicates data corresponding to the block C, ‘D’ indicates data corresponding to the block D, ‘r’ indicates reversion, ‘L1’ indicates a result from applying the part L1 of the non-rectangular window, and ‘L2’ indicates a result from applying the part L2 of the non-rectangular window.
In the following description, a method of compensating an uncompensated signal to become identical or similar to an original signal is described with reference to
Meanwhile, a non-rectangular window has symmetry. Characteristics of the non-rectangular window, as shown in
$L_i^2 + R_i^2 = 1$, where $i = 1$ or $2$
$(L_1)_r = R_2$
$(L_2)_r = R_1$ [Formula 2]
In Formula 2, ‘L1’ indicates a left first part, ‘L2’ indicates a left second part, ‘R1’ indicates a right first part, and ‘R2’ indicates a right second part.
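For illustration only, the characteristics of Formula 2 can be checked numerically. The following Python sketch assumes a symmetric sine window spanning four blocks; the window function, the block length of 8 samples and the helper name `window_parts` are assumptions made for this example and are not taken from the present disclosure.

```python
import numpy as np

def window_parts(m):
    """Return (L1, L2, R1, R2), each of length m, cut from a symmetric
    sine window of length 4*m (an assumed example window)."""
    n = np.arange(4 * m)
    w = np.sin(np.pi * (n + 0.5) / (4 * m))
    return w[:m], w[m:2 * m], w[2 * m:3 * m], w[3 * m:]

L1, L2, R1, R2 = window_parts(8)

# Li^2 + Ri^2 = 1 for i = 1 and i = 2
power_ok = np.allclose(L1**2 + R1**2, 1.0) and np.allclose(L2**2 + R2**2, 1.0)
# Reversed L1 equals R2, and reversed L2 equals R1
mirror_ok = np.allclose(L1[::-1], R2) and np.allclose(L2[::-1], R1)
print(power_ok, mirror_ok)
```

Any other window satisfying Formula 2 could be substituted for the sine window in this sketch.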
Hence, if the above characteristics of the non-rectangular window are applied, Formula 1 can be summarized in the following.
Uncompensated signal = $(-C_r (L_1)_r + D (L_2))(L_2) = D (L_2)^2 - C_r (R_2 L_2)$ (because $(L_1)_r = R_2$) [Formula 3]
Hence, in order for the uncompensated signal to become equal to the original signal D, i.e., in order to perform a perfect compensation, a needed signal is shown in
Needed signal for perfect compensation = original signal - uncompensated signal = $D - (D (L_2)^2 - C_r (R_2 L_2))$ [Formula 4-1]
Meanwhile, using the characteristics shown in Formula 2, Formula 4-1 can be summarized into the following.
Needed signal for perfect compensation = $D (R_2)^2 + C_r (R_2 L_2)$ (because $1 - (L_2)^2 = (R_2)^2$) [Formula 4-2]
In Formula 4-2, a first term ($D (R_2)^2$) corresponds to a correction part and a second term ($C_r (R_2 L_2)$) can be named an aliasing part.
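As a non-limiting numerical check of Formula 3 to Formula 4-2, the following Python sketch evaluates both sides of each formula for random blocks C and D. The sine window, the block length and the random test data are assumptions made for this example only.

```python
import numpy as np

M = 8                                     # assumed block length
n = np.arange(4 * M)
w = np.sin(np.pi * (n + 0.5) / (4 * M))   # assumed symmetric sine window
L1, L2, R2 = w[:M], w[M:2 * M], w[3 * M:]

rng = np.random.default_rng(0)
C = rng.standard_normal(M)                # previous block (rectangular side)
D = rng.standard_normal(M)                # current block
Cr = C[::-1]                              # 'r' indicates reversion

# Formula 3: both forms of the uncompensated signal
uncompensated = (-Cr * L1[::-1] + D * L2) * L2
formula_3 = D * L2**2 - Cr * R2 * L2

# Formula 4-1 and Formula 4-2: signal needed for perfect compensation
needed = D - uncompensated
formula_4_2 = D * R2**2 + Cr * R2 * L2

print(np.allclose(uncompensated, formula_3), np.allclose(needed, formula_4_2))
```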
If homogeneous windows (e.g., a non-rectangular window and a non-rectangular window) are overlapped with each other, the correction part CP and the aliasing part AP correspond to parts that are cancelled out on addition by time domain aliasing cancellation (TDAC). In contrast, if heterogeneous windows (i.e., a rectangular window and a non-rectangular window) are overlapped with each other, the correction part CP and the aliasing part AP remain as errors instead of being cancelled.
Specifically, the correction part CP corresponds to a part of a current block (e.g., block D) (i.e., a block behind a window crossing point) to which a non-rectangular window (particularly, R2) is applied. And, the aliasing part AP corresponds to a part of a previous block (e.g., block C) (i.e., a block ahead of the window crossing point) (e.g., a block at which a rectangular window and a non-rectangular window are overlapped with each other) to which a non-rectangular window (particularly, R2 and L2) is applied.
Meanwhile, since a decoder is able to reconstruct a previous block (e.g., block C) using data of the previous block, it is able to generate a prediction of an aliasing part using the reconstructed previous block. This is represented as Formula 5.
Prediction of aliasing part = $qC_r (R_2 L_2)$ [Formula 5]
Meanwhile, an error of an aliasing part, which is a difference (or a quantization error) between a prediction of the aliasing part and an original aliasing part can be represented as Formula 6.
Error of aliasing part = $e_r (R_2 L_2) = C_r (R_2 L_2) - qC_r (R_2 L_2)$ [Formula 6]
Using Formula 5 and Formula 6, Formula 4-2 is summarized into Formula 7.
Needed signal for perfect compensation = $D (R_2)^2 + C_r (R_2 L_2) = D (R_2)^2 + (qC_r + e_r)(R_2 L_2)$ [Formula 7]
In Formula 7, $D (R_2)^2$ indicates a correction part CP, $qC_r (R_2 L_2)$ indicates a prediction of an aliasing part AP, and $e_r (R_2 L_2)$ indicates an error of the aliasing part.
Hence, the signal needed for perfect compensation is a sum of the correction part CP and the aliasing part AP, as shown in Formula 7.
In the following description, three kinds of methods for compensating a correction part CP and an aliasing part AP are explained with reference to
Referring to
Method A: Compensation signal = $D (R_2)^2 + e_r (R_2 L_2)$, where a reconstructed signal is $D$ [Formula 8-1]
In case of a compensation signal according to the first embodiment, as mentioned in the foregoing description with reference to Formula 5, a prediction of an aliasing part AP can be obtained by a decoder based on data of a previous block (i.e., a block corresponding to an overlapped part between a rectangular window and a non-rectangular window) without transmission from an encoder to the decoder. Since the decoder is able to generate the prediction of the aliasing part by itself, a compensation signal including only a correction part CP and an error of the aliasing part is sufficient to obtain a signal for perfect compensation (cf. Formula 7). According to the first embodiment, the number of bits can be reduced by transmitting the error instead of the aliasing part AP itself. Moreover, a perfectly compensated signal can be obtained by compensating the error of the aliasing part AP.
According to the second embodiment, a compensation signal includes a signal corresponding to a correction part CP only.
Method B: Compensation signal = $D (R_2)^2$, where a reconstructed signal is $D - e_r (R_2 L_2)$ [Formula 8-2]
As mentioned in the foregoing description (or like the first embodiment), a decoder generates a prediction of an aliasing part AP and then obtains a compensated signal using a compensation signal corresponding to a correction part CP together with the prediction. According to the second embodiment, since an error of the aliasing part AP may remain in the compensated signal, a reconstruction rate or a sound quality may be degraded. Yet, a compression ratio of the compensation signal can be raised higher than that of the first embodiment.
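The difference between Method A and Method B can be illustrated with a short Python sketch. The sine window, the coarse quantizer applied to block C and the lossless transmission of the compensation signal are assumptions made for this example; an actual codec would also quantize the compensation signal.

```python
import numpy as np

M = 8                                     # assumed block length
n = np.arange(4 * M)
w = np.sin(np.pi * (n + 0.5) / (4 * M))   # assumed symmetric sine window
L2, R2 = w[M:2 * M], w[3 * M:]

rng = np.random.default_rng(1)
C, D = rng.standard_normal(M), rng.standard_normal(M)
qC = np.round(C * 4) / 4                  # assumed coarse quantizer for block C

Cr, qCr = C[::-1], qC[::-1]
er = Cr - qCr                             # quantization error of reversed C
uncompensated = D * L2**2 - Cr * R2 * L2  # Formula 3
prediction = qCr * R2 * L2                # Formula 5, computed in the decoder

def reconstruct(compensation):
    """Decoder side: uncompensated signal + prediction + received compensation."""
    return uncompensated + prediction + compensation

recon_a = reconstruct(D * R2**2 + er * R2 * L2)   # Method A (Formula 8-1)
recon_b = reconstruct(D * R2**2)                  # Method B (Formula 8-2)

print(np.allclose(recon_a, D), np.allclose(recon_b, D - er * R2 * L2))
```

Under these assumptions Method A returns the block D itself, whereas Method B leaves the error of the aliasing part in the reconstructed signal.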
According to the third embodiment, a compensation signal is not transmitted but a decoder estimates a correction part CP and an aliasing part AP.
Method C: Compensation signal = not transmitted; compensation signal generated in the decoder = $qC_r (L_2 R_2) + D (R_2)^2$, where a reconstructed signal is $D - e_r (R_2)/(L_2)$ [Formula 8-3]
As mentioned in the foregoing description (or like the first embodiment and the second embodiment), a prediction of an aliasing part AP can be generated by a decoder. Meanwhile, a correction part CP can be generated by compensating the window shape for a signal corresponding to a current block (e.g., block D). In particular, $qC_r (L_2 R_2)$, which is generated using data of the previous block ($qC$), is added to the uncompensated signal of Formula 1, so that $D (L_2)^2 - e_r (L_2 R_2)$ is generated. Then, by dividing $D (L_2)^2 - e_r (L_2 R_2)$ by $(L_2)^2$ (which may correspond to adding $D (R_2)^2$ to $D (L_2)^2 - e_r (L_2 R_2)$), $D - e_r (R_2)/(L_2)$ is obtained. In Formula 8-3, the quantization error of the current block (block D) is not represented.
A reconstruction rate of the third embodiment may be lower than that of the first or second embodiment. Yet, since the third embodiment does not need bits for transmitting a compensation signal at all, a compression ratio of the third embodiment is considerably high.
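The decoder-side estimation of Method C can be sketched in the same manner. As before, the sine window and the quantizer for block C are assumptions made for this example, and the quantization error of block D is ignored, as in Formula 8-3.

```python
import numpy as np

M = 8                                     # assumed block length
n = np.arange(4 * M)
w = np.sin(np.pi * (n + 0.5) / (4 * M))   # assumed symmetric sine window
L2, R2 = w[M:2 * M], w[3 * M:]

rng = np.random.default_rng(2)
C, D = rng.standard_normal(M), rng.standard_normal(M)
qC = np.round(C * 4) / 4                  # assumed coarse quantizer for block C

Cr, qCr = C[::-1], qC[::-1]
er = Cr - qCr
uncompensated = D * L2**2 - Cr * R2 * L2  # Formula 3

# Step 1: add the prediction of the aliasing part (no side information needed)
partial = uncompensated + qCr * R2 * L2   # equals D*L2^2 - er*R2*L2
# Step 2: undo the window shape by dividing by L2 squared
recon_c = partial / L2**2                 # equals D - er*R2/L2

print(np.allclose(recon_c, D - er * R2 / L2))
```

The division is well defined here because the L2 part of the assumed window never reaches zero; a window whose L2 part contains zeros would need separate handling.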
TABLE 1
    | Total length | Left zero part | Ascending line | Top line    | Descending line | Right zero part
(A) | N/4 or 256   | 0              | N/4 or 256     | 0           | N/4 or 256      | 0
(B) | N/2 or 512   | N/8 or 128     | N/4 or 256     | N/4 or 256  | N/4 or 256      | N/8 or 128
(C) | N or 1024    | 3N/8 or 384    | N/4 or 256     | 3N/4 or 768 | N/4 or 256      | N/8 or 128
In Table 1, ‘N’ indicates a frame length and each numeral indicates a number of samples (e.g., ‘256’ indicates 256 samples).
Referring to Table 1 and
Non-rectangular windows shown in
In the above description, the examples of the non-rectangular window corresponding to the B coding scheme are explained. Examples of a non-rectangular window corresponding to the C coding scheme (e.g., MDCT) shall be described later together with an audio signal processing apparatus according to a second embodiment.
Referring to
Referring to
$C_{rw} = C_r (L_1)_r + D (R_2)$ [Formula 9]
For reference, the signal is a signal before a decoder applies a window.
The embedding part EP (Crw) can be calculated by a decoder. Instead of coding the whole signal D according to a rectangular coding scheme, transmission can be performed by encoding ‘D−Crw’ (i.e., a transmission part TP shown in the drawing) only. And, the transmission part TP is represented as Formula 10.
$TP = D - C_{rw} = -C_r (L_1)_r + D (1 - R_2)$ [Formula 10]
The decoder is able to reconstruct an original signal in a manner of overlapping unfolded data corresponding to a non-rectangular coding scheme with data corresponding to a rectangular coding scheme.
In the above description so far, contents for compensating the defect in case of the overlapping of the heterogeneous coding schemes and the heterogeneous windows (i.e., rectangular window and non-rectangular window) are explained in detail with reference to
Referring now to
The rectangular scheme coding part 122 encodes the Nth block of an input signal according to a rectangular coding scheme and then delivers the encoded data (for clarity, this data is named a first data) to the rectangular scheme synthesis part 124 and the multiplexer 130. In this case, as mentioned in the foregoing description, the rectangular coding scheme is a coding scheme that applies a rectangular window. ACELP belongs to the rectangular coding scheme, although the present invention is not limited thereto. The rectangular scheme coding part 122 is able to output a result encoded by applying a rectangular window to the block B and the block C by the A coding scheme in
The rectangular scheme synthesis part 124 generates a prediction of an aliasing part AP using the encoded data, i.e., the first data. In particular, the rectangular scheme synthesis part 124 generates an output signal by performing decoding with the rectangular coding scheme. For instance, the block C (and the block B) is reconstructed into its original form by the A coding scheme. Using the output signal and the non-rectangular window, the prediction of the aliasing part AP is obtained. In this case, the prediction of the aliasing part AP can be represented as Formula 5. In Formula 5, ‘qC’ indicates the output signal and ‘R2L2’ indicates the non-rectangular window. And, the prediction of the aliasing part AP is inputted to the compensation information generating part 128.
The non-rectangular scheme coding part 126 generates an encoded data (for clarity, named a second data) by encoding the (N+1)th block by the non-rectangular coding scheme. For instance, the second data can correspond to a result from applying the non-rectangular window to the blocks C to F and then folding the blocks. As mentioned in the foregoing description, the non-rectangular coding scheme can correspond to the B coding scheme (e.g., TCX) or the C coding scheme (e.g., MDCT), by which the present invention is non-limited. And, the second data is delivered to the multiplexer 130.
The compensation information generating part 128 generates a compensation signal using the prediction of the aliasing part and an original input signal. In this case, the compensation signal can be generated according to one of the three kinds of the methods shown in
The multiplexer 130 generates at least one bitstream by multiplexing the first data (e.g., data of the Nth block), the second data (e.g., data of the (N+1)th block) and the compensation signal together and then transmits the generated at least one bitstream to a decoder. Of course, like the former multiplexer 130 shown in
Referring to
The demultiplexer 210 extracts the first data (e.g., data of the Nth block), the second data (e.g., data of the (N+1)th block) and the compensation signal from the at least one bitstream. In this case, the compensation signal can correspond to one of the three types described with reference to
The rectangular scheme decoding part 222 generates an output signal by decoding the first data by the rectangular coding scheme. This is equivalent to obtaining the block C (and the block B) shown in
Like the rectangular scheme synthesis part 124 shown in
The non-rectangular scheme decoding part 226 generates a signal by decoding the second data by the non-rectangular coding scheme. Since the generated signal is the signal before the compensation of aliasing and the like, it corresponds to the uncompensated signal mentioned in the foregoing description. Hence, this signal can be equal to the former signal represented as Formula 1.
The compensation part 228 generates a signal reconstructed using the compensation signal delivered from the demultiplexer 210, the prediction of the aliasing part obtained by the aliasing prediction part 224 and the uncompensated signal generated by the non-rectangular scheme decoding part 226. In this case, the reconstructed signal is the same as described with reference to
In the following description, an audio signal processing apparatus according to a second embodiment is explained with reference to
First of all, regarding the first embodiment, the Nth block corresponds to the rectangular coding scheme (e.g., A coding scheme) and the (N+1)th block corresponds to the non-rectangular coding scheme (e.g., B coding scheme or C coding scheme), and vice versa. On the contrary, regarding the second embodiment, when (N+1)th block corresponds to the C coding scheme, a window type of the C coding scheme is changed according to whether Nth block corresponds to a rectangular coding scheme (e.g., A coding scheme). In this case, it is a matter of course that the Nth block and the (N+1)th block can be switched to each other in order.
Referring to
In case that a second block (i.e., a current block) is encoded by a non-rectangular coding scheme, the window type determining part 127 determines a type of a window of the second block according to whether a first block (e.g., a previous block, a following block, etc.) is encoded by a rectangular coding scheme. In particular, if the second block is encoded by the C coding scheme belonging to the non-rectangular coding schemes and a window applied to the second block belongs to a transition window class, the window type determining part 127 determines the type (and a shape) of the window of the second block according to whether the first block is encoded by the rectangular coding scheme. Examples of the window type are shown in Table 1.
TABLE 1
Examples of window type in non-rectangular coding scheme (particularly, C coding scheme)
(The last five columns describe the window shape.)
Window type | Name              | Classification          | Name per shape           | Previous/following block | Left zero interval | Width of ascending line | Top line | Width of descending line | Right zero interval
1           | Only-long window  | Non-transition window   |                          | Irrespective             | 0                  | N                       | 0        | N                        | 0
2           | Long_start window | Transition window       | Steep long_start window  | C coding scheme          | 0                  | N                       | 7N/16    | N/8                      | 7N/16
            |                   |                         | Gentle long_start window | Rectangular window       | 0                  | N                       | 3N/8     | N/4                      | 3N/8
3           | Short window      | Non-transitional window |                          | Irrespective             | 0                  | Overlapping of 8 short parts, each having ascending and descending line width set to N/8
4           | Long_stop window  | Transition window       | Steep long_stop window   | C coding scheme          | 7N/16              | N/8                     | 7N/16    | N                        | 0
            |                   |                         | Gentle long_stop window  | Rectangular window       | 3N/8               | N/4                     | 3N/8     | N                        | 0
5           | Stop_start window | Transition window       | Steep stop_start window  | C coding scheme          | 7N/16              | N/8                     | 7N/8     | N/8                      | 7N/16
            |                   |                         | Gentle stop_start window | Rectangular window       | 3N/8               | N/4                     | 3N/4     | N/4                      | 3N/8
In Table 1, ‘N’ indicates a frame length, e.g., 1,024 or 960 samples or the like.
Referring to Table 1, the 2nd, 4th and 5th windows (i.e., a long_start window, a long_stop window and a stop_start window) among the total of 5 windows belong to a transition window class. A window belonging to the transition window class, as shown in the table, differs in shape according to whether a previous or following block corresponds to a rectangular window. In case that the previous or following block corresponds to a rectangular coding scheme, a width of an ascending or descending line is N/4. In case that the previous or following block corresponds to a non-rectangular coding scheme (e.g., C coding scheme), the width of the ascending or descending line of the transition window becomes N/8.
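As a consistency check on the window shapes listed in Table 1, the following Python sketch stores each transition window shape in units of the frame length N and confirms that the five intervals of every shape add up to a window length of 2N. The dictionary layout and names are assumptions made for this example; for the gentle long_start and gentle long_stop windows, the values of the cells shared with the steep variant are assumed to carry over.

```python
from fractions import Fraction as F

# (left zero, ascending, top, descending, right zero), in units of N
SHAPES = {
    ("long_start", "steep"):  (0, 1, F(7, 16), F(1, 8), F(7, 16)),
    ("long_start", "gentle"): (0, 1, F(3, 8), F(1, 4), F(3, 8)),
    ("long_stop", "steep"):   (F(7, 16), F(1, 8), F(7, 16), 1, 0),
    ("long_stop", "gentle"):  (F(3, 8), F(1, 4), F(3, 8), 1, 0),
    ("stop_start", "steep"):  (F(7, 16), F(1, 8), F(7, 8), F(1, 8), F(7, 16)),
    ("stop_start", "gentle"): (F(3, 8), F(1, 4), F(3, 4), F(1, 4), F(3, 8)),
}

def shape_for(window_type, neighbor_is_rectangular):
    """Pick the gentle shape next to a rectangular window, else the steep one."""
    variant = "gentle" if neighbor_is_rectangular else "steep"
    return SHAPES[(window_type, variant)]

totals = {key: sum(shape) for key, shape in SHAPES.items()}
print(all(total == 2 for total in totals.values()))
```

For N = 1,024, the gentle long_stop shape corresponds to 384, 256, 384, 1,024 and 0 samples, respectively.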
Referring to
In other words, the window type determining part 127 preferentially determines a type of a window corresponding to a current block, generates window type information for specifying a specific window applied to the current block (e.g., a frame or subframe) among a plurality of windows (i.e., for indicating a window type), and then delivers the generated window type information to the multiplexer 130. In case that the type of the window corresponding to the current block is classified into a transition window, the window type determining part 127 determines a shape of a window, and more particularly, a width (and a corresponding top line and a length of a left or right zero part) of an ascending or descending line according to whether a previous or following block corresponds to a rectangular coding scheme and then applies the determined window shape to the current block.
Meanwhile, like the former compensation information generating part 128 of the first embodiment, the compensation information generating part 128 generates a compensation signal when heterogeneous windows (e.g., a non-rectangular window and a rectangular window) are overlapped with each other (e.g., the case corresponding to (A) in
As mentioned in the foregoing description, since a defect generated from the heterogeneous windows overlapped with each other can be corrected using the compensation signal, the heterogeneous windows can be overlapped by 50% instead of 100%. Since the heterogeneous windows need not be overlapped with each other by 100%, it is not necessary to narrow a width of an ascending or descending line of each window classified into a transition window. Therefore, a window can have a slope relatively gentler than that of the case of the 100% overlapping.
Referring to
In case that a current block or a second block corresponds to a non-rectangular coding scheme (particularly, the C coding scheme), the window shape determining part 227 determines a specific window (i.e., a window type) applied to the current block among a plurality of windows based on the window type information delivered from the demultiplexer 210. In case that a window of a current block belongs to a transition window class, the window shape determining part 227 determines a shape of a window of the determined window type according to whether a previous/following block (i.e., a first block) is coded by a rectangular coding scheme. In particular, if the previous/following block is encoded by the rectangular coding scheme and a window of the current block belongs to the transition window class, as mentioned in the foregoing description, the window shape is determined to have an ascending or descending line with a first slope gentler than a second slope. For instance, in case of a long_start window, the window shape is determined as a gentle long_start window in Table 1 (having a descending line with a first slope (e.g., a width of N/4)). In case of a long_stop window, the window shape is determined as a gentle long_stop window (e.g., having an ascending line with a first slope (e.g., a width of N/4)). And, in case of a stop_start window, the window shape is determined in the same manner. In this case, as mentioned in the foregoing description, the first slope (e.g., a width of N/4) is gentler than the second slope. In particular, the second slope is a slope of an ascending or descending line of a steep transition window (e.g., a steep long_stop window, etc.).
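By way of example only, the shape determination for a long_stop window can be sketched as follows. The interval widths follow Table 1, whereas the sine-shaped ascending and descending lines and the function names `slope` and `long_stop_window` are assumptions made for this example, since the curve of the lines is not specified here.

```python
import numpy as np

def slope(width, ascending=True):
    """Assumed sine-shaped line of the given width, rising from 0 toward 1."""
    k = np.arange(width)
    ramp = np.sin(np.pi * (k + 0.5) / (2 * width))
    return ramp if ascending else ramp[::-1]

def long_stop_window(n, prev_is_rectangular):
    """Build a long_stop window of length 2*n for a frame length n.

    The ascending line has width n/4 (gentle) when the previous block uses
    a rectangular coding scheme and width n/8 (steep) otherwise.
    """
    asc = n // 4 if prev_is_rectangular else n // 8
    flat = (n - asc) // 2                 # left zero interval and top line
    return np.concatenate([
        np.zeros(flat),                   # left zero interval
        slope(asc),                       # ascending line
        np.ones(flat),                    # top line
        slope(n, ascending=False),        # descending line of width n
    ])

gentle = long_stop_window(1024, prev_is_rectangular=True)
steep = long_stop_window(1024, prev_is_rectangular=False)
print(len(gentle), len(steep))
```

The long_start and stop_start windows could be built the same way by mirroring or repeating the short line on the other side.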
The window type and shape determined in the above manner are delivered to the non-rectangular scheme decoding part 226. Subsequently, the non-rectangular scheme decoding part 226 generates an uncompensated signal by decoding a current block by the non-rectangular scheme according to the determined window type and shape.
Like the first embodiment, in case that the overlapping of heterogeneous windows (e.g., a rectangular window and a non-rectangular window) occurs, the compensation part 228 generates a reconstructed signal using the uncompensated signal and the compensation signal (and the prediction of the aliasing part).
In the following description, an audio signal processing apparatus according to a third embodiment is explained with reference to
Referring to
The first scheme coding part 122-1 encodes the input signal by a first coding scheme and the second scheme coding part 126-2 encodes the input signal by a second coding scheme. In this case, the first and second coding schemes are the same as those described with reference to
In case that the input signal corresponds to the second coding scheme, the window type determining part 127-2 determines a window type and shape of a current block with reference to a characteristic (and a window type) of a previous or following block, generates window type information indicating the window type corresponding to the current block (frame or subframe), and then delivers the generated window type information to the multiplexer 130.
In the following description, a window type is explained in detail with reference to Table 1, a window type and shape of a current block according to a coding scheme of a previous/following block are explained with reference to
First of all, one example of a window type corresponding to a second coding scheme can be identical to Table 1. Referring to Table 1, a total of five window types (e.g., only-long, long_start, short, long_stop and stop_start) exist. In this case, the only-long window is a window applied to a signal suitable for a long window due to a stationary characteristic of the signal and the short window is a window applied to a signal suitable for a short window due to a transient characteristic of the signal. The long_start window, the long_stop window and the stop_start window, which are classified as transition windows, are necessary for a process of transition to the short window (or a window with a first coding scheme) from the only-long window or a process of transition to the only-long window (or a window with a first coding scheme) from the short window. The stop_start window is the window used if a previous/following frame corresponds to the short window (or a window with a first coding scheme) even though a long window is suitable for a current block or frame.
Shapes of the windows of the five types shown in Table 1 are examined in detail as follows. First of all, each of the only-long, short, and stop_start windows has horizontal symmetry, while the rest of the windows have horizontal asymmetry. The long_start window includes a zero part in a right half only, whereas the long_stop window includes a zero part in a left half only.
In the following description, a process for determining a window shape of a current frame according to a previous frame or a following frame is explained in detail. First of all, if a previous frame is an only-long window and a current frame is a long_start window, a shape of a current long_start window can be determined according to whether a following frame corresponds to a short window or a window with a first coding scheme. In particular, a slope of a descending line of the long_start window can vary. A long_start window having a gentle slope of a descending line shall be named a gentle long_start window (cf. a name per shape in Table 1) and a long_start window having a steep slope of a descending line shall be named a steep long_start window. This shall be described in detail with reference to
In particular, a window of a first coding scheme shown in
Like the case shown in
In other words, in case that a following window is a window corresponding to a first coding scheme, 50% of the overlapping is acceptable. Hence, a descending line of a long_start window is maintained gentle with a first slope. As a result, a location of a crossing point becomes the same location (e.g., a point of 3N/2 from a window start point) irrespective of whether the following window follows the first coding scheme or the second coding scheme. Thus, as the crossing points become equal to each other, inter-window transition becomes different. This shall be described together with a fourth embodiment later in this disclosure.
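The coincidence of the crossing points can be checked arithmetically from the long_start shapes of Table 1. The following Python sketch assumes, for this example only, that the crossing point lies at the midpoint of the descending line; the helper name is hypothetical.

```python
from fractions import Fraction as F

def descending_crossing_point(left_zero, ascending, top, descending):
    """Offset from the window start (in units of N) of the midpoint of the
    descending line, taken here as the assumed crossing point."""
    return left_zero + ascending + top + descending / 2

# Long_start shapes from Table 1, in units of N
steep = descending_crossing_point(0, 1, F(7, 16), F(1, 8))
gentle = descending_crossing_point(0, 1, F(3, 8), F(1, 4))

print(steep, gentle)
```

Both variants place the midpoint at 3N/2, in agreement with the location stated above.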
Referring to
Referring now to Table 1, a short window has a single shape irrespective of a coding scheme of a previous or following block. This is explained with reference to
Meanwhile, if a current frame is a long_stop window and a following frame is an only-long window, a shape of a current long_stop window can be determined according to whether a previous frame corresponds to a window of a first coding scheme. This shall be explained in detail with reference to a fourth embodiment.
Referring now to
Afterwards, the multiplexer 130 generates at least one stream by multiplexing data (e.g., data of (N+1)th block) encoded by a first coding scheme, data (e.g., data of Nth block) encoded by a second coding scheme and the window type information together.
Referring to
The demultiplexer 210 receives the coding scheme information (e.g., coding identification information and subcoding identification information) described with reference to
The first scheme decoding part 222-1 is a component configured to perform a process reverse to that of the first scheme encoding part 122-1. The first scheme decoding part 222-1 generates an output signal [e.g., an output signal of (N+1)th block] by decoding data by a first coding scheme (e.g., ACELP, TCX, etc.). And, the second scheme decoding part 226-2 generates an output signal (e.g., an output signal of Nth block) by decoding data by a second coding scheme (e.g., MDCT, etc.).
The window shape determining part 227-2 identifies a window type of a current block based on the window type information and then determines a window shape for the identified window type according to a coding scheme of a previous or following block. As mentioned in the foregoing description with reference to
Subsequently, the second scheme decoding part 226-2 applies the window in the shape determined by the window shape determining part 227-2 to the current block.
In the following description, a fourth embodiment of the present invention is explained with reference to
Referring to
A window type determining part 127-2 determines a window of a current block in consideration of inter-block window transition. In particular, the window type determining part 127-2 determines a window type and shape of a current block [e.g., (N+1)th block] according to whether a previous block (e.g., Nth block) is coded by a first coding scheme. More specifically, in case that a previous block is coded by a first coding scheme, one of the three types (i.e., a short window, a long_stop window and a stop_start window) remaining after excluding an only-long window and a long_start window from the 5 types shown in Table 1 is determined as a window type. Thus, without going through a transition window necessary for inter-coding scheme transition in the first coding scheme, it is possible to directly move to a short window used in the second coding scheme or a transition window (i.e., a long_stop window or a stop_start window) used for transition between a short window and a long window.
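The transition limitation described above can be expressed as a simple lookup. In the following Python sketch, the function and constant names are hypothetical, and no limitation is modeled for the case where the previous block is coded by the second coding scheme, since only the first case is stated above.

```python
WINDOW_TYPES = ("only_long", "long_start", "short", "long_stop", "stop_start")

# Types reachable directly after a block coded by the first coding scheme
ALLOWED_AFTER_FIRST_SCHEME = frozenset({"short", "long_stop", "stop_start"})

def candidate_window_types(prev_is_first_scheme):
    """List the window types that may be chosen for the current block."""
    if prev_is_first_scheme:
        return [t for t in WINDOW_TYPES if t in ALLOWED_AFTER_FIRST_SCHEME]
    # Assumption: no restriction is modeled for other predecessors.
    return list(WINDOW_TYPES)

print(candidate_window_types(True))
```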
Such an inter-window path is shown in
Referring to the star marks, in case that a previous block is a block corresponding to a first coding scheme (e.g., ACELP or TCX), as mentioned in the foregoing description, one of a short window, a long_stop window and a stop_start window can become a window corresponding to a second coding scheme. In particular, it is unnecessary to go through a window (e.g., a window corresponding to 1,152 samples) separately provided for a transition to a second coding scheme from a first coding scheme. This is because a crossing point coincides irrespective of a coding scheme, as mentioned in the foregoing description of the third embodiment. The following description is made with reference to
First of all,
Since a rectangular window is shown in
A third case (i.e., a transition to a stop_start window) is not shown in
In case of
Referring now to the fourth embodiment, as mentioned in the above description with reference to
The window type determining part 127-2 shown in
The second scheme coding part 126-2 encodes the current block according to the second coding scheme using the determined window type and shape. And, the multiplexer 130 generates at least one bitstream by multiplexing the data of the previous block, the data of the current block and the window type information of the current block together.
Referring to
The window shape determining part 227-2 determines a specific window for a current block among a plurality of windows based on window type information. In doing so, it is able to determine one of a plurality of the windows in consideration of the transition limitation shown in
Referring to
TABLE 2
Window type information
Window type       | Window type info
only-long window  | 0
long_start window | 1
short window      | 2
long_stop window  | 3
stop_start window | 1
If window type information is set to 1, it indicates a long_start window and a stop_start window, i.e., two cases. Meanwhile, according to the transition limitation disclosed in
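One possible way of resolving the shared value of Table 2 is sketched below in Python. The sketch relies only on the limitation that a long_start window and an only-long window do not directly follow a block of the first coding scheme; the names are hypothetical, and any further rule for a previous block of the second coding scheme is left open, so both candidates are returned in that case.

```python
# Table 2: window type information value -> candidate window types
WINDOW_TYPE_INFO = {
    0: ("only_long",),
    1: ("long_start", "stop_start"),
    2: ("short",),
    3: ("long_stop",),
}

# Types that cannot directly follow a block of the first coding scheme
EXCLUDED_AFTER_FIRST_SCHEME = frozenset({"only_long", "long_start"})

def resolve_window_type(info, prev_is_first_scheme):
    """Return the window types still possible for the received value."""
    candidates = WINDOW_TYPE_INFO[info]
    if prev_is_first_scheme:
        candidates = tuple(
            c for c in candidates if c not in EXCLUDED_AFTER_FIRST_SCHEME
        )
    return candidates

print(resolve_window_type(1, prev_is_first_scheme=True))
```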
The window shape determining part 227-2 determines a window shape such as a slope of an ascending line of the current block, a slope of a descending line of the current block and the like based on the coding scheme of the previous or following block, according to the above-determined window type. Thus, the fourth embodiment has been described so far. In the following description, another method for solving a problem of a window transition between a first coding scheme and a second coding scheme is explained with reference to
In addition to the long window having the length of 1,152, in case that a short window, which includes total 9 short parts including a short part, having a length of 1,152 is used, as shown in
In the following description, a fifth embodiment of the present invention is explained with reference to
First of all, when a current block corresponds to a first coding scheme, the mode determining part 123-1 identifies whether the current block corresponds to a rectangular coding scheme (e.g., ACELP) or a non-rectangular coding scheme (e.g., TCX). If the current block corresponds to the non-rectangular coding scheme, the mode determining part 123-1 determines one of modes 1 to 3. As each of the modes 1 to 3 can correspond to a length over which the non-rectangular scheme is applied, one of a single subframe, two contiguous subframes and four contiguous subframes (i.e., a single frame) can be determined. Moreover, the length can be determined as one of 256 samples, 512 samples and 1,024 samples, as shown in
Thus, in case of a non-rectangular coding scheme, after a mode has been determined, a shape of a window of a current block is determined according to whether a window of a previous or following block is a short window. This process is explained in detail with reference to
In case that a window corresponding to a first coding scheme is overlapped with a long_stop window, as shown in
On the contrary, in case that a window corresponding to a first coding scheme is overlapped with a short window, as shown in
Thus, a width of a descending or ascending line can vary according to whether a previous or following block is a short window. By equalizing the width, it is possible to meet the TDAC condition described with reference to
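Why equal widths matter can be illustrated with the power-complementarity condition commonly used for time-domain aliasing cancellation: across the overlap, the squared descending line of one window and the squared ascending line of the next should sum to one. The sketch below assumes sine-shaped lines and hypothetical widths of 128 and 256 samples; neither the line shape nor the function names are taken from this description.

```python
import math


def ascending(width):
    """Sine-shaped ascending line over `width` samples (assumed shape)."""
    return [math.sin(math.pi * (n + 0.5) / (2 * width)) for n in range(width)]


def descending(width):
    """Time-reversed ascending line, i.e. the matching descending line."""
    return list(reversed(ascending(width)))


def tdac_error(desc, asc):
    """Largest deviation of desc[n]^2 + asc[n]^2 from 1 over the overlap."""
    if len(desc) != len(asc):
        raise ValueError("both lines must span the same overlap region")
    return max(abs(d * d + a * a - 1.0) for d, a in zip(desc, asc))


# Equal widths: the condition holds up to rounding error.
matched = tdac_error(descending(128), ascending(128))

# Unequal widths: a 128-sample descending line centred in a 256-sample
# region no longer complements a 256-sample ascending line.
narrow = [1.0] * 64 + descending(128) + [0.0] * 64
mismatched = tdac_error(narrow, ascending(256))

print(matched, mismatched)
```

With equal widths the deviation stays at rounding level, whereas the mismatched pair deviates visibly, which is the situation the width equalization avoids.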
Referring to
For reference, windows corresponding to modes 1 to 3 in Shape 1 can be equal to
Moreover, the previous block corresponds to a last subframe of a previous frame at least and the following block can correspond to a first subframe of a following frame at least.
Referring now to
Once the mode is determined, the mode determining part 123-1 determines a shape of a window among Shapes 1 to 4 according to whether a previous block and/or a following block corresponds to a short window.
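The two decisions made by the mode determining part 123-1 can be sketched as a pair of lookups. The mode lengths follow the description (one, two or four contiguous subframes, i.e., 256, 512 or 1,024 samples). The numbering of Shapes 1 to 4 against the short-window conditions is not stated in this passage, so the numbering used below is purely hypothetical, as are the function names.

```python
SUBFRAME_LENGTH = 256  # samples in a single subframe

# Modes 1 to 3 cover one, two or four contiguous subframes.
MODE_TO_SUBFRAMES = {1: 1, 2: 2, 3: 4}


def mode_length(mode: int) -> int:
    """Length in samples over which the non-rectangular scheme is applied."""
    return MODE_TO_SUBFRAMES[mode] * SUBFRAME_LENGTH


def select_shape(prev_is_short: bool, next_is_short: bool) -> int:
    """Pick one of four shapes from the neighbouring window types.

    Hypothetical numbering: 1 = neither neighbour is a short window,
    2 = only the previous block, 3 = only the following block, 4 = both.
    """
    return 1 + (1 if prev_is_short else 0) + (2 if next_is_short else 0)


print(mode_length(3), select_shape(True, False))
```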
And, the multiplexer 130 generates at least one bitstream by multiplexing the subcoding identification information, data of the current block and data of the previous or following block together.
Referring to
The window shape determining part 223-2 determines a shape of a window for the determined mode in a manner of identifying one of the Shapes 1 to 4 by determining whether a previous block and/or a following block corresponds to a short window.
The remaining components are not described further in the following description.
An encoder 100F and a decoder 200F according to a sixth embodiment of the present invention are described with reference to
Referring to
Referring to
Referring now to
In case of a block corresponding to a first coding scheme, if a long term prediction (LTP) is not performed, the first scheme coding part 122-1 generates new information amounting to bits that are saved in case of not performing the long term prediction. Examples of the new information are described as follows.
1) An excitation codebook can be utilized. In particular, more codebooks than before are designed, or a dedicated codebook sized to the surplus bits is designed. In case of using the dedicated codebook, an excitation signal is generated by combining an excitation from the original codebook with an excitation from the additional codebook. As the dedicated codebook, it is possible to use a codebook configured to encode a pitch component well, like the functionality of a long term prediction.
2) Quantization performance of LPC coefficients can be enhanced by allocating additional bits to linear prediction coding (LPC).
3) Bits can be allocated to code a compensation signal (i.e., a signal for compensating correction and aliasing parts generated from the overlapping between a non-rectangular window of a second coding scheme and a rectangular window of a first coding scheme) of the first or second embodiment.
4) Nothing is transmitted for the saved bits. In particular, since the number of bits used in audio coding can vary from frame to frame, the saved bits are utilized in other frames.
Meanwhile, the first scheme coding part 122-1 delivers additional bits to the multiplexer 130 by encoding the new information for a block on which the long term prediction is not performed.
Finally, the multiplexer 130 generates at least one bitstream by multiplexing the long term flag (LTP flag), the additional bits corresponding to the new information and data corresponding to each block together.
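The fourth option above, in which saved bits are carried over to other frames, can be sketched as a simple bit reservoir. The nominal budget of 400 bits per frame, the 30 bits attributed to long term prediction, and the policy of spending the reservoir on the next frame that performs long term prediction are all hypothetical values and assumptions chosen for illustration.

```python
def allocate_bits(ltp_flags, base_bits=400, ltp_bits=30):
    """Return the bits used per frame and the bits left in the reservoir.

    ltp_flags: one boolean per frame, True if long term prediction is
    performed for that frame.
    """
    reservoir = 0
    used = []
    for ltp_on in ltp_flags:
        if ltp_on:
            # Assumed policy: spend the carried-over bits on this frame.
            used.append(base_bits + reservoir)
            reservoir = 0
        else:
            # No long term prediction: its bits are saved, not transmitted.
            used.append(base_bits - ltp_bits)
            reservoir += ltp_bits
    return used, reservoir


used, left = allocate_bits([False, False, True, False])
print(used, left)
```

Under these assumptions the total over all frames, including what remains in the reservoir, equals the nominal budget, so the saving in one frame appears as extra bits in another.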
Referring to
If so, the first scheme decoding part 222-1 performs the long term prediction on a block becoming a target of the long term prediction according to the determination made by the long term prediction control part 222-1. In case that additional bits are transmitted, the first scheme decoding part 222-1 extracts the new information corresponding to the additional bits and then performs decoding of the corresponding block based on the extracted new information.
In the following description, applications of the encoder and decoder according to the present invention described with reference to
Referring to
The plural channel encoder 310 receives a plurality of channel signals (e.g., at least two channel signals) (hereinafter named a multi-channel signal) and then downmixes the received channel signals to generate a mono or stereo downmix signal. And, the plural channel encoder 310 generates spatial information required for upmixing the downmix signal into a multi-channel signal. In this case, the spatial information can include channel level difference information, inter-channel correlation information, a channel prediction coefficient, downmix gain information and the like. Optionally, in case that the audio signal encoding apparatus 300 receives a mono signal, the plural channel encoder 310 does not downmix the received mono signal; instead, the mono signal bypasses the plural channel encoder 310.
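A downmix and one item of spatial information can be sketched for the two-channel case as follows. The description does not specify the downmix rule or how the channel level difference is computed, so averaging the two channels and expressing the level difference as an energy ratio in decibels are assumptions based on common practice; the function names are hypothetical.

```python
import math


def downmix_to_mono(left, right):
    """Average two channels into a mono downmix (assumed downmix rule)."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def channel_level_difference_db(left, right, eps=1e-12):
    """Energy ratio between the two channels in dB (assumed definition)."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    # eps guards against division by zero for a silent channel.
    return 10.0 * math.log10((energy_left + eps) / (energy_right + eps))


left = [1.0, -1.0, 1.0, -1.0]
right = [0.5, -0.5, 0.5, -0.5]
print(downmix_to_mono(left, right))
print(channel_level_difference_db(left, right))
```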
The band extension encoder 320 is able to generate spectral data corresponding to a low frequency band and extension information for high frequency band extension by applying a band extension scheme to the downmix signal outputted from the plural channel encoder 310. In particular, spectral data of a partial band of the downmix signal is excluded and the band extension information for reconstructing the excluded data can be generated.
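The split performed by the band extension encoder 320 can be sketched as follows. The description states only that spectral data of a partial band is excluded and that information for reconstructing it is generated; representing that information as the mean energy of each high-frequency sub-band is an assumption, and the function name, cutoff index and band size are hypothetical.

```python
def band_extension_encode(spectrum, cutoff, band_size):
    """Split a spectrum into kept low-band data and a high-band envelope.

    spectrum:  list of spectral coefficients
    cutoff:    index of the first coefficient that is excluded
    band_size: number of coefficients per envelope band
    """
    low_band = list(spectrum[:cutoff])
    high_band = spectrum[cutoff:]
    envelope = []
    for start in range(0, len(high_band), band_size):
        band = high_band[start:start + band_size]
        # Assumed extension information: mean energy of the excluded band.
        envelope.append(sum(c * c for c in band) / len(band))
    return low_band, envelope


low, env = band_extension_encode([1.0] * 4 + [2.0] * 4, cutoff=4, band_size=2)
print(low, env)
```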
The signal generated by the band extension encoder 320 is inputted to an A coding unit 120A, a B coding unit 120B or a C coding unit 120C according to coding scheme information generated by a signal classifier (not shown in the drawing) (e.g., the former signal classifier 110 shown in
The A to C coding units 120A to 120C are identical to the former coding units described with reference to
First of all, in case that a specific frame or segment of the downmix signal has a dominant speech characteristic, the A coding unit 120A encodes the downmix signal by the A coding scheme (i.e., a rectangular coding scheme belonging to a first coding scheme). In this case, the A coding scheme can follow AMR-WB (adaptive multi-rate wideband) standard, by which the present invention is non-limited. Meanwhile, the A coding unit 120A is able to further use a linear prediction coding (LPC) scheme. In case that a harmonic signal has high redundancy on a time axis, it can be modeled by linear prediction for predicting a current signal from a past signal. In this case, if the linear prediction coding scheme is adopted, coding efficiency can be raised. Meanwhile, the A coding unit 120A can include a time domain encoder.
Secondly, in case that audio and speech characteristics coexist in a specific frame or segment of the downmix signal, the B coding unit 120B encodes the downmix signal by the B coding scheme (i.e., a non-rectangular coding scheme belonging to the first coding scheme). In this case, the B coding scheme may correspond to TCX (transform coded excitation), by which the present invention is non-limited. In this case, the TCX can include a scheme for performing frequency transform on an excitation signal obtained from performing linear prediction (LPC). In this case, the frequency transform can include MDCT (modified discrete cosine transform).
Thirdly, in case that a specific frame or segment of the downmix signal has a dominant audio characteristic, the C coding unit 120C encodes the downmix signal by the C coding scheme (i.e., a non-rectangular coding scheme belonging to a second coding scheme). In this case, the C coding scheme can follow AAC (advanced audio coding) standard or HE-AAC (high efficiency advanced audio coding) standard, by which the present invention is non-limited. Meanwhile, the C coding unit 120C can include an MDCT (modified discrete cosine transform) encoder.
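The three cases above amount to routing each frame or segment by its dominant characteristic. A minimal sketch of that routing is given below; the class labels "speech", "mixed" and "audio" and the function name are hypothetical stand-ins for the coding scheme information produced by the signal classifier.

```python
# Routing table built from the three cases described above.
CODING_UNITS = {
    "speech": ("120A", "A coding scheme: rectangular, first coding scheme"),
    "mixed": ("120B", "B coding scheme: non-rectangular, first coding scheme"),
    "audio": ("120C", "C coding scheme: non-rectangular, second coding scheme"),
}


def select_coding_unit(signal_class: str) -> str:
    """Return the reference numeral of the coding unit for a frame."""
    try:
        unit, _description = CODING_UNITS[signal_class]
    except KeyError:
        raise ValueError("unknown signal class: " + signal_class)
    return unit


for label in ("speech", "mixed", "audio"):
    print(label, "->", select_coding_unit(label))
```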
And, the multiplexer 330 generates at least one bitstream by multiplexing spatial information, band extension information and the signal encoded by each of the A to C coding units 120A to 120C together.
Referring to
The demultiplexer 410 extracts the data encoded by the A to C coding schemes, the band extension information, the spatial information and the like from an audio signal bitstream.
The A to C decoding units 220A to 220C correspond to the former A to C encoding units 120A to 120C to perform reverse processes thereof, respectively and their details shall be omitted from the following description.
The band extension decoding unit 420 reconstructs a high frequency band signal based on the band extension information by performing a band extension decoding scheme on an output signal of each of the A to C decoding units 220A to 220C.
In case that the decoded audio signal is a downmix signal, the plural channel decoder 430 generates an output channel signal of a multichannel signal (stereo signal included) using the spatial information.
The audio signal processing apparatus according to the present invention can be used in various products. These products can be mainly grouped into a stand-alone group and a portable group. A TV, a monitor, a settop box and the like can be included in the stand-alone group. And, a PMP, a mobile phone, a navigation system and the like can be included in the portable group.
Referring to
A user authenticating unit 520 receives an input of user information and then performs user authentication. The user authenticating unit 520 can include at least one of a fingerprint recognizing unit 520A, an iris recognizing unit 520B, a face recognizing unit 520C and a voice recognizing unit 520D. The fingerprint recognizing unit 520A, the iris recognizing unit 520B, the face recognizing unit 520C and the voice recognizing unit 520D receive fingerprint information, iris information, face contour information and voice information, respectively, and then convert them into user information. The user authentication is performed by determining whether each piece of the user information matches pre-registered user data.
An input unit 530 is an input device enabling a user to input various kinds of commands and can include at least one of a keypad unit 530A, a touchpad unit 530B and a remote controller unit 530C, by which the present invention is non-limited.
A signal coding unit 540 performs encoding or decoding on an audio signal and/or a video signal, which is received via the wire/wireless communication unit 510, and then outputs an audio signal in time domain. The signal coding unit 540 includes an audio signal processing apparatus 545. As mentioned in the foregoing description, the audio signal processing apparatus 545 corresponds to the above-described encoder 100 (first to sixth embodiments included) or the decoder 200 (first to sixth embodiments included). Thus, the audio signal processing apparatus 545 and the signal coding unit including the same can be implemented by one or more processors.
A control unit 550 receives input signals from input devices and controls all processes of the signal coding unit 540 and an output unit 560. In particular, the output unit 560 is an element configured to output an output signal generated by the signal coding unit 540 and the like and can include a speaker unit 560A and a display unit 560B. If the output signal is an audio signal, it is outputted to a speaker. If the output signal is a video signal, it is outputted via a display.
Referring to
An audio signal processing method according to the present invention can be implemented as a computer-executable program and can be stored in a computer-readable recording medium. And, multimedia data having a data structure of the present invention can be stored in the computer-readable recording medium. The computer-readable media include all kinds of recording devices in which data readable by a computer system are stored. The computer-readable media include ROM, RAM, CD-ROM, magnetic tapes, floppy discs, optical data storage devices, and the like for example and also include carrier-wave type implementations (e.g., transmission via Internet). And, a bitstream generated by the above mentioned encoding method can be stored in the computer-readable recording medium or can be transmitted via a wire/wireless communication network.
Accordingly, the present invention is applicable to processing and outputting an audio signal.
While the present invention has been described and illustrated herein with reference to the preferred embodiments thereof, it will be apparent to those skilled in the art that various modifications and variations can be made therein without departing from the spirit and scope of the invention. Thus, it is intended that the present invention covers the modifications and variations of this invention that come within the scope of the appended claims and their equivalents.
Oh, Hyen-O, Lee, Chang Heon, Kang, Hong Goo, Song, Jeungook
Patent | Priority | Assignee | Title |
5848391, | Jul 11 1996 | FRAUNHOFER-GESELLSCHAFT ZUR FORDERUNG DER ANGEWANDTEN FORSCHUNG E V ; Dolby Laboratories Licensing Corporation | Method subband of coding and decoding audio signals using variable length windows |
5890106, | Mar 19 1996 | Dolby Laboratories Licensing Corporation | Analysis-/synthesis-filtering system with efficient oddly-stacked singleband filter bank using time-domain aliasing cancellation |
6134518, | Mar 04 1997 | Cisco Technology, Inc | Digital audio signal coding using a CELP coder and a transform coder |
6475245, | Aug 29 1997 | The Regents of the University of California | Method and apparatus for hybrid coding of speech at 4KBPS having phase alignment between mode-switched frames |
8352279, | Sep 06 2008 | HUAWEI TECHNOLOGIES CO , LTD | Efficient temporal envelope coding approach by prediction between low band signal and high band signal |
8447620, | Oct 08 2008 | Fraunhofer-Gesellschaft zur Foerderung der Angewandten Forschung E V; VOICEAGE CORPORATION | Multi-resolution switched audio encoding/decoding scheme |
20030009325, | |||
20040024588, | |||
20050185850, | |||
20060195314, | |||
20070225971, | |||
20070282603, | |||
20080052068, | |||
20080065373, | |||
20100138218, | |||
20110004479, | |||
20120185257, | |||
CN102576540, | |||
JP2011527453, | |||
JP2012505423, | |||
JP2012530946, | |||
WO2008071353, | |||
WO2010062123, | |||
WO2011013980, | |||
WO2006046546, | |||
WO2007040353, | |||
WO2007040357, | |||
WO2010148516, |
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Mar 05 2012 | KANG, HONG GOO | Industry-Academic Cooperation Foundation, Yonsei University | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 046121 | /0322 | |
Mar 05 2012 | SONG, JUNG WOOK | Industry-Academic Cooperation Foundation, Yonsei University | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 046121 | /0322 | |
Mar 22 2012 | OH, HYEN-O | Industry-Academic Cooperation Foundation, Yonsei University | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 046121 | /0322 | |
Mar 22 2012 | LEE, CHANG HEON | Industry-Academic Cooperation Foundation, Yonsei University | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 046121 | /0322 | |
Dec 17 2015 | Industry-Academic Cooperation Foundation, Yonsei University | INTELLECTUAL DISCOVERY CO , LTD | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 046121 | /0434 | |
Oct 18 2017 | INTELLECTUAL DISCOVERY CO , LTD | UNIFIED SOUND SYSTEMS, INC | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 046121 | /0596 | |
Dec 15 2017 | Dolby Laboratories Licensing Corporation | (assignment on the face of the patent) | / | |||
Aug 27 2018 | UNIFIED SOUND SYSTEMS, INC | Dolby Laboratories Licensing Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 046802 | /0453 |
Date | Maintenance Fee Events |
Dec 15 2017 | BIG: Entity status set to Undiscounted (note the period is included in the code). |
May 23 2023 | M1552: Payment of Maintenance Fee, 8th Year, Large Entity. |
Date | Maintenance Schedule |
Jul 23 2022 | 4 years fee payment window open |
Jan 23 2023 | 6 months grace period start (w surcharge) |
Jul 23 2023 | patent expiry (for year 4) |
Jul 23 2025 | 2 years to revive unintentionally abandoned end. (for year 4) |
Jul 23 2026 | 8 years fee payment window open |
Jan 23 2027 | 6 months grace period start (w surcharge) |
Jul 23 2027 | patent expiry (for year 8) |
Jul 23 2029 | 2 years to revive unintentionally abandoned end. (for year 8) |
Jul 23 2030 | 12 years fee payment window open |
Jan 23 2031 | 6 months grace period start (w surcharge) |
Jul 23 2031 | patent expiry (for year 12) |
Jul 23 2033 | 2 years to revive unintentionally abandoned end. (for year 12) |