A method for processing a digital audio broadcast signal in a radio receiver, includes: receiving a hybrid broadcast signal; demodulating the hybrid broadcast signal to produce an analog audio stream and a digital audio stream; and using a normalized cross-correlation of envelopes of the analog audio stream and the digital audio stream to measure a time offset between the analog audio stream and the digital audio stream. The time offset can be used to align the analog audio stream and the digital audio stream for subsequent blending of an output of the radio receiver from the analog audio stream to the digital audio stream or from the digital audio stream to the analog audio stream.
|
12. A radio receiver comprising:
processing circuitry configured:
to receive a hybrid broadcast signal;
to demodulate the hybrid broadcast signal to produce an analog audio stream and a digital audio stream;
to use a normalized cross-correlation of envelopes of the analog audio stream and the digital audio stream to measure a time offset between the analog audio stream and the digital audio stream;
to compute a coarse envelope cross-correlation over a first range of lag values to locate a vicinity of the time offset; and
to subsequently compute a fine envelope cross-correlation over a second range of lag values, wherein the second range of lag values is narrower than the first range of lag values.
1. A method for processing a digital audio broadcast signal in a radio receiver, the method comprising:
receiving a hybrid broadcast signal;
demodulating the hybrid broadcast signal to produce an analog audio stream and a digital audio stream;
using a normalized cross-correlation of envelopes of the analog audio stream and the digital audio stream to measure a time offset between the analog audio stream and the digital audio stream;
blending an output of the radio receiver from the analog audio stream to the digital audio stream or from the digital audio stream to the analog audio stream;
phase-adjusting the digital audio stream; and
using the phase-adjusted digital audio stream to temporarily replace input digital audio frames during blend ramps used in the blending of the output of the radio receiver.
2. The method of
using the time offset to align the analog audio stream and the digital audio stream.
3. The method of
using the time offset to scale an analog signal blend metric, to control blend thresholds in the blending of the output of the radio receiver, and to inhibit blending when misalignment is detected.
4. The method of
computing cross-correlation of bass signals to detect potential inversion, to validate time offset measurements, or to improve blend quality.
5. The method of
computing cross-correlation of the analog audio stream and the digital audio stream to predict sound quality of a potential blend.
6. The method of
the normalized cross-correlation of envelopes is computed using a vector of bandpass samples of the analog audio stream and a vector of bandpass samples of the digital audio stream.
7. The method of
using a coarse envelope cross-correlation computed over a first range of lag values to locate a vicinity of the time offset; and
subsequently using a fine envelope cross-correlation computed over a second range of lag values, wherein the second range of lag values is narrower than the first range of lag values.
8. The method of
using quadratic interpolation of peak indices to improve resolution of a computed peak lag.
9. The method of
the normalized cross-correlation of envelopes is computed using a vector of samples of the analog audio stream and a vector of samples of the digital audio stream; and
the normalized cross-correlation of envelopes of the vector of samples of the analog audio stream and the vector of samples of the digital audio stream produces bifurcated and composite correlation peaks that are compared for correlation validation via temporal consistency.
10. The method of
the normalized cross-correlation of envelopes is computed using a vector of samples of the analog audio stream and a vector of samples of the digital audio stream; and
the normalized cross-correlation of envelopes of the analog audio vector and the digital audio vector produces current and previous peaks that are compared for correlation validation via temporal consistency.
11. The method of
calculating phase-adjusted frequency-domain correlation coefficients to validate the time offset.
13. The radio receiver of
14. The radio receiver of
to compute the normalized cross-correlation of envelopes using a vector of samples of the analog audio stream and a vector of samples of the digital audio stream to produce bifurcated and composite correlation peaks; and
to compare the bifurcated and composite correlation peaks for correlation validation via temporal consistency.
15. The radio receiver of
to compute the normalized cross-correlation of envelopes using a vector of samples of the analog audio stream and a vector of samples of the digital audio stream to produce current and previous peaks; and
to compare the current and previous peaks for correlation validation via temporal consistency.
16. The radio receiver of
|
The described methods and apparatus relate to the time alignment of analog and digital pathways in hybrid digital radio systems.
Digital radio broadcasting technology delivers digital audio and data services to mobile, portable, and fixed receivers. One type of digital radio broadcasting, referred to as In-Band On-Channel (IBOC) digital audio broadcasting (DAB), uses terrestrial transmitters in the existing Medium Frequency (MF) and Very High Frequency (VHF) radio bands. HD Radio™ technology, developed by iBiquity Digital Corporation, is one example of an IBOC implementation for digital radio broadcasting and reception.
Both AM and FM In-Band On-Channel (IBOC) hybrid broadcasting systems utilize a composite signal including an analog modulated carrier and a plurality of digitally modulated subcarriers. Program content (e.g., audio) can be redundantly transmitted on the analog modulated carrier and the digitally modulated subcarriers. The analog audio is delayed at the transmitter by a diversity delay. Using the hybrid mode, broadcasters may continue to transmit analog AM and FM simultaneously with higher-quality and more robust digital signals, allowing themselves and their listeners to convert from analog to digital radio while maintaining their current frequency allocations.
The digital signal is delayed in the receiver with respect to its analog counterpart such that time diversity can be used to mitigate the effects of short signal outages and provide an instant analog audio signal for fast tuning. Hybrid-compatible digital radios incorporate a feature called “blend” which attempts to smoothly transition between outputting analog audio and digital audio after initial tuning, or whenever the digital audio quality crosses appropriate thresholds.
In the absence of the digital audio signal (for example, when the channel is initially tuned) the analog AM or FM backup audio signal is fed to the audio output. When the digital audio signal becomes available, the blend function smoothly attenuates the analog backup signal while blending in the digital audio signal, eventually replacing the analog signal, such that the transition preserves some continuity of the audio program. Similar blending occurs during channel outages which corrupt the digital signal. In this case the analog signal is gradually blended into the output audio signal by attenuating the digital signal such that the audio is fully blended to analog when the digital corruption appears at the audio output.
Blending will typically occur at the edge of digital coverage and at other locations within the coverage contour where the digital waveform has been corrupted. When a short outage does occur, as when traveling under a bridge in marginal signal conditions, the digital audio is replaced by an analog signal.
When blending occurs, it is important that the content on the analog audio and digital audio channels is time-aligned to ensure that the transition is barely noticed by the listener. The listener should detect little other than possible inherent quality differences in analog and digital audio at these blend points. If the broadcast station does not have the analog and digital audio signals aligned, then the result could be a harsh-sounding transition between digital and analog audio. This misalignment or “offset” may occur because of audio processing differences between the analog audio and digital audio paths at the broadcast facility.
The analog and digital signals are typically generated with two separate signal-generation paths before combining for output. The use of different audio-processing techniques and different signal-generation methods makes the alignment of these two signals nontrivial. The blending should be smooth and continuous, which can happen only if the analog and digital audio are properly aligned.
The effectiveness of any digital/analog audio alignment technique can be quantified using two key performance metrics: measurement time and offset measurement error. Although measurement of the time required to estimate a valid offset can be straightforward, the actual misalignment between analog and digital audio sources is often neither known nor fixed. This is because audio processing typically causes different group delays within the constituent frequency bands of the source material. This group delay can change with time, as audio content variation accentuates one band over another. When the audio processing applied at the transmitter to the analog and digital sources is not the same—as is often the case at actual radio stations—audio segments in corresponding frequency bands have different group delays. As audio content changes over time, misalignment becomes dynamic. This makes it difficult to ascertain whether a particular time-alignment algorithm provides an accurate result.
Existing time alignment algorithms rely on locating a normalized cross-correlation peak generated from the analog and digital audio sample vectors. When the analog and digital audio processing is the same, a clearly visible correlation peak usually results.
However, techniques that rely solely on normalized cross-correlation of digital and analog audio vectors often produce erroneous results due to the group-delay difference described above. When the analog and digital audio processing is different, the normalized cross correlation is often relatively low and lacks a definitive peak.
Although multiple measurements averaged over time can reduce the dynamic offset measurement error, this leads to excessive measurement times and potential residual offset error due to persistent group-delay differences. Since an HD Radio receiver may use this measurement to improve real-time hybrid audio blending, excessive measurement time and offset error make this a less attractive solution. Therefore, improved techniques for measuring time offsets are desired.
In a first aspect, a method for processing a digital audio broadcast signal in a radio receiver, includes: receiving a hybrid broadcast signal; demodulating the hybrid broadcast signal to produce an analog audio stream and a digital audio stream; and using a normalized cross-correlation of envelopes of the analog audio stream and the digital audio stream to measure a time offset between the analog audio stream and the digital audio stream.
In another aspect, a radio receiver includes processing circuitry configured to receive a hybrid broadcast signal; to demodulate the hybrid broadcast signal to produce an analog audio stream and a digital audio stream; and to use a normalized cross-correlation of envelopes of the analog audio stream and the digital audio stream to measure a time offset between the analog audio stream and the digital audio stream.
In another aspect, a method for aligning analog and digital signals includes: receiving or generating an analog audio stream and a digital audio stream; using a normalized cross-correlation of envelopes of the analog audio stream and the digital audio stream to measure a time offset between the analog audio stream and the digital audio stream; and using the time offset to align the analog audio stream and the digital audio stream.
Embodiments described herein relate to the processing of the digital and analog portions of a digital radio broadcast signal. This description includes an algorithm for time alignment of analog and digital audio streams for an HD Radio receiver or transmitter. While aspects of the disclosure are presented in the context of an exemplary HD Radio system, it should be understood that the described methods and apparatus are not limited to HD Radio systems and that the teachings herein are applicable to methods and apparatus that include the measurement of time offset between two signals.
Previously known algorithms for time alignment of analog and digital audio streams rely on locating a normalized cross-correlation peak generated from the analog and digital audio sample vectors. When the analog and digital audio processing is the same, a clearly visible correlation peak usually results. For example,
However, audio processing typically causes different group delays within the constituent frequency bands of the source material. This group delay can change with time, as audio content variation accentuates one frequency band over another. When the audio processing applied at the transmitter to the analog and digital sources is not the same—as is often the case at actual radio stations—audio segments in corresponding frequency bands have different group delays. As audio content changes over time, misalignment becomes dynamic. This makes it difficult to ascertain whether a particular time-alignment algorithm provides an accurate result.
As a result of this group delay, when the analog and digital audio processing is different, the normalized cross correlation is often relatively low and lacks a definitive peak.
Correlation of audio envelopes (with phase differences removed) can be used to reduce or eliminate the problems due to group delay differences. The techniques described herein utilize the correlation of audio envelopes to solve the problem of offset measurement error caused by group-delay variations between the digital and analog audio streams.
The techniques described herein are efficient and require significantly less measurement time than previously known techniques because the need for consistency checks is reduced. Additionally, a technique for correcting group-delay differences during the blend ramp is described.
Time alignment between the analog audio and digital audio of a hybrid HD Radio waveform is needed to assure a smooth blend from digital to analog in the HD Radio receivers. Time misalignment sometimes occurs at the transmitter, although alignment should be maintained. Misalignment can also occur at the receiver due to implementation choices when creating the analog and digital audio streams. A time-offset measurement can be used to correct the misalignment when it is detected. It can also be used to adjust blending thresholds to inhibit blending when misalignment is detected and to improve sound quality during audio blends.
The described technique is validated by measuring the normalized cross correlation of the analog and digital audio vectors after correcting any group delay differences between them. This results in a more accurate, efficient, and rapid time offset measurement than previous techniques.
In the described embodiment, multistage filtering and decimation are applied to isolate critical frequency bands and improve processing efficiencies. Normalized cross-correlation of both the coarse and fine envelopes of the analog and digital audio streams is used to measure the time offset. As used in this description, a coarse envelope represents the absolute value of an input audio signal after filtering and decimation by a factor of 128, and a fine envelope represents the absolute value of an input audio signal after filtering and decimation by a factor of 4. Correlation is performed in two steps—coarse and fine—to improve processing efficiency.
A high-level functional block diagram of an HD Radio receiver 10 highlighting the time-alignment algorithm is shown in
Cyclic redundancy check (CRC) bits of the digital audio frames are checked to determine a CRC state. CRC state is determined for each audio frame (AF). For example, the CRC state value could be set to 1 if the CRC checks, and set to 0 otherwise. A blend control function 52 receives a CRC state signal on line 54 and the cross-correlation coefficient on line 44, and produces a blend control signal on line 56.
An audio analog-to-digital (A/D) blend function 58 receives the digital audio on line 60, the analog audio on line 22, the phase-adjusted digital audio on line 48, and the blend control signal on line 56, and produces a blended audio output on line 62. The analog audio signal on line 42 and the digital audio signal on line 40 constitute a pair of audio signal vectors.
In the receiver depicted in
The time offset measurement block 38 in
(1) A cross-correlation coefficient may be passed to the blend algorithm to adjust blend thresholds and inhibit blending when misalignment is detected;
(2) The delay of the digital audio signal may be adjusted in real time using the measured time offset, thereby automatically aligning the analog and digital audio; or
(3) Phase-adjusted digital audio may temporarily replace the input digital audio to improve sound quality during blends.
In another embodiment, a filtered time-offset measurement could also be used for automatic time alignment of the analog and digital audio signals in HD Radio hybrid transmitters.
Details of the time-offset measurement technique are described next.
In this embodiment, monophonic versions of the analog and digital audio streams are used to measure time offset. This measurement is performed in multiple steps to enhance efficiency. It is assumed here that the analog and digital audio streams are sampled simultaneously and input into the measurement device. The appropriate metric for estimating time offset for the analog and digital audio signals is the correlation coefficient function implemented as a normalized cross-correlation function. The correlation coefficient function has the property that it approaches unity when the two signals are time-aligned and identical, except for possibly an arbitrary scale-factor difference. The coefficient generally becomes statistically smaller as the time offset increases. The correlation coefficient is also computed for the envelope of the time-domain signals due to its tolerance to group-delay differences between the analog and digital signals.
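The properties of the correlation coefficient noted above can be illustrated with a minimal Python sketch. The helper names `corr_coef` and `envelope`, the test signal, and the use of polarity inversion as an extreme phase difference are illustrative assumptions, not part of the described embodiment; the envelope here is a bare absolute value with the mean removed, without the filtering and decimation used in the embodiment.

```python
import math

def corr_coef(u, v):
    """Normalized cross-correlation (correlation coefficient) of equal-length vectors."""
    suv = sum(a * b for a, b in zip(u, v))
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    if suu == 0.0 or svv == 0.0:
        return 0.0
    return suv / math.sqrt(suu * svv)

def envelope(x):
    """Crude envelope (assumption): absolute value with the mean removed, no filtering."""
    mag = [abs(s) for s in x]
    mean = sum(mag) / len(mag)
    return [m - mean for m in mag]

# Amplitude-modulated tone standing in for audio content.
x = [math.sin(0.3 * n) * (1.0 + 0.5 * math.sin(0.02 * n)) for n in range(400)]
y_scaled = [3.0 * s for s in x]        # identical content, arbitrary scale factor
y_shifted = x[25:] + x[:25]            # same content, offset in time
y_inverted = [-s for s in x]           # extreme phase difference (polarity flip)

aligned = corr_coef(x, y_scaled)                          # approaches unity
misaligned = corr_coef(x, y_shifted)                      # noticeably smaller
inverted = corr_coef(x, y_inverted)                       # waveform correlation flips sign
env_inverted = corr_coef(envelope(x), envelope(y_inverted))  # envelope is unaffected
```

The scale factor does not change the coefficient, a time offset lowers it, and the envelope correlation is unaffected by the phase change that defeats the waveform correlation.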
Exemplary pseudocode for the executive function that controls the time-offset measurements, MEAS_TIME_ALIGNMENT, is shown below.
MEAS_TIME_ALIGNMENT
M = 2^13; “length of analog audio vector at 44.1 ksps”
N = 2^17; “length of digital audio vector (implementation dependent)”
results = 0; resultsprev = 0; resultsprev2 = 0; “Clear output vectors”
for k = 0...K − 1 ; “K is the number of measurement vectors”
get vector x ;“vector of M analog audio samples”
get vector y ;“vector of N digital audio samples”
[xenv,yenv,xabsf , yabsf ,xbass,ybass] = filter_vectors(x, y)
lagmin = 0
lagmax = length(yenv)−length(xenv); “Set coarse lag range”
[peakabs,offset,corr_coef,corr_phadj,ynormadj,peakbass] =
meas_offset(x,y,xenv,yenv,xabsf,yabsf,xbass,ybass,lagmin,lagmax)
“output arguments are set to zero if not measured due to RETURN or invalid”
resultsprev2 = resultsprev; “save results from two iterations ago”
resultsprev = results; “save results from previous iteration”
results = [peakabs,offset,corr_coef,corr_phadj,ynormadj,peakbass]
“Analyze results to determine if time offset measurement is successful”
if (corr_phadj > 0.8) ∧
{ (peakabs > 0.8) ∨
[(|offset − offset_prev| ≤ 2) ∧ (peakabs + peakabs_prev > 1)] ∨
[(|offset − offset_prev2| ≤ 2) ∧ (peakabs + peakabs_prev2 > 1)] }
break; “PASS:return results”
end if
“Continue with next measurement vector if results are not validated”
end for
A vector y of N digital audio samples is first formed for the measurement. Another smaller M-sample vector x of analog audio samples is used as a reference analog audio vector.
The goal is to find a vector subset of y that is time-aligned with x. Ideally, the signals are nominally time-aligned with the center of the y vector. This allows the time-offset measurement to be computed over a range of ±(N−M)/2 samples relative to the midpoint of the y vector. A recommended value of N is 2^17=131072 audio samples, spanning nearly three seconds at a sample rate of 44.1 ksps. The search range is about ±1.4 seconds for M=2^13=8192 (approximately 186 msec).
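The vector durations and search range quoted above follow directly from the vector lengths and the sample rate. The short Python sketch below reproduces the arithmetic; the variable names are chosen only for illustration.

```python
FS = 44100            # audio sample rate, samples per second
N = 2 ** 17           # length of digital audio vector y
M = 2 ** 13           # length of analog audio vector x

digital_span_s = N / FS                       # duration of y, nearly three seconds
analog_span_ms = 1000.0 * M / FS              # duration of x, approximately 186 msec
search_half_range_samples = (N - M) // 2      # lag range is +/- this many samples
search_half_range_s = search_half_range_samples / FS   # about 1.4 seconds
```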
The analog and digital audio input vectors are then passed through a filter_vectors function to isolate the desired audio frequency bands and limit processor throughput requirements. The audio spectrum is separated into several distinct passbands for subsequent processing. These bands include the full audio passband, bass frequencies, and bandpass frequencies. The bandpass frequencies are used to create the audio envelopes that are required for accurate cross-correlation with phase differences removed. Bass frequencies are removed from the bandpass signals since they may introduce large group-delay errors when analog/digital audio processing is different; however, the isolated bass frequencies may be useful to validate the polarity of the audio signals. Furthermore, high frequencies are removed from the bandpass signal because time-alignment information is concentrated in lower non-bass frequencies. The entire audio passband is used to predict potential blend sound quality and validate envelope correlations.
After filtering, the range of coarse lag values is set and function meas_offset is called to perform the time-offset measurement. The coarse lag values define the range of sample offsets over which the smaller analog audio envelope is correlated against the larger digital audio envelope. This range is set to the difference in length between the analog and digital audio envelopes. After the coarse envelope correlation is complete, a fine envelope correlation is performed at a higher sample rate over a narrower range of lag values.
The results are then analyzed to determine whether the correlation peaks and offset values are valid. Validity is determined by ensuring that key correlation peaks exceed a threshold, and that these peak correlation values and their corresponding offset values are temporally consistent.
If not, the process repeats using new input measurement vectors until a valid time offset is declared. Once a valid time offset has been computed, the algorithm can be run periodically to ensure that proper time-alignment is being maintained.
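The acceptance test in MEAS_TIME_ALIGNMENT can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the grouping of the logical terms (a phase-adjusted correlation above 0.8 combined with either a strong current peak or temporal consistency with one of the two previous results) is one reading of the pseudocode, and the names `Result`, `consistent`, and `measurement_passes` are hypothetical.

```python
from typing import NamedTuple

class Result(NamedTuple):
    """Subset of one measurement's outputs; all zero when not measured."""
    peakabs: float = 0.0
    offset: float = 0.0
    corr_phadj: float = 0.0

def consistent(cur, old):
    """Temporal consistency: offsets within 2 samples and peaks summing above 1."""
    return abs(cur.offset - old.offset) <= 2 and cur.peakabs + old.peakabs > 1.0

def measurement_passes(cur, prev, prev2):
    """Assumed grouping: corr_phadj test AND (strong peak OR consistent history)."""
    if cur.corr_phadj <= 0.8:
        return False
    return cur.peakabs > 0.8 or consistent(cur, prev) or consistent(cur, prev2)

empty = Result()
strong = measurement_passes(Result(0.9, 100, 0.85), empty, empty)
weak_alone = measurement_passes(Result(0.6, 100, 0.85), empty, empty)
weak_repeated = measurement_passes(Result(0.6, 100, 0.85), Result(0.6, 101, 0.9), empty)
poor_phadj = measurement_passes(Result(0.9, 100, 0.5), empty, empty)
```

Under this reading, a single strong measurement passes immediately, while a moderate peak passes only when a recent measurement produced nearly the same offset.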
The executive pseudocode MEAS_TIME_ALIGNMENT calls subsequent functions.
The time-offset measurement is described below as a hierarchical series of functions. These functions are described either as signal-flow diagrams or pseudocode, whichever is more appropriate for the function.
The input audio vectors x and y on lines 70 and 72 are initially processed in multiple stages of filtering and decimation, as shown in
The bandpass filter stages are followed by an absolute-value function 102 and 104 to allow envelope correlation. The resulting xabs and yabs signals on lines 106 and 108 are then filtered by filters 110 and 112 to produce xabsf and yabsf on lines 114 and 116, which are used to determine the fine cross-correlation peak. These signals are further filtered and decimated in filters 118 and 120 to yield the coarse envelope signals xenv and yenv on lines 122 and 124. The coarse envelope cross-correlation is used to locate the vicinity of the correlation time offset, allowing subsequent fine correlation of xabsf and yabsf to be efficiently computed over a narrower range of lag values.
The signals are scaled in time by the number of filter coefficients K, which inversely scales frequency span. The filter coefficients for each predetermined length K can be pre-computed for efficiency using function compute_LPF_coefs, defined below.
Exemplary pseudocode for the function compute_LPF_coefs for generating filter coefficients follows.
Function[h] = compute_LPF_coefs(K)
“Compute K LPF FIR filter coefficients, K is odd, k = 0 to K − 1”
“center coefficient to avoid divide - by - zero”
for
“upper half coefficients”
“copy to lower half coefficients”
end for
“normalize filter coefficient vector for unity dc gain”
The filter inputs include the input vector u, filter coefficients h, and the output decimation rate R.
Exemplary pseudocode for the LPF function is:
Function[v] = LPF(u, h, R)
“u is the input signal vector, h is the filter coefficient vector, R is the decimation rate”
K = length(h); “Number of filter coefficients”
“N is the length of the filter output vector v”
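A decimating FIR lowpass filter of the kind described can be sketched in Python as shown below. The coefficient formula and the output-length rule are not reproduced in the listings above, so both are assumptions here: `hann_sinc_coefs` is a hypothetical Hann-windowed sinc design that follows the listed comments (odd K, center tap handled separately, mirrored halves, unity dc gain), and `lpf` keeps only fully overlapped output samples.

```python
import math

def hann_sinc_coefs(K, cutoff=None):
    """Hypothetical stand-in for compute_LPF_coefs: Hann-windowed sinc, K odd."""
    assert K % 2 == 1, "K must be odd"
    mid = (K - 1) // 2
    fc = cutoff if cutoff is not None else 1.0 / K   # cycles/sample (assumption)
    h = [0.0] * K
    h[mid] = 2.0 * fc                 # center tap set directly to avoid 0/0
    for k in range(1, mid + 1):
        window = 0.5 * (1.0 + math.cos(math.pi * k / (mid + 1)))
        tap = math.sin(2.0 * math.pi * fc * k) / (math.pi * k) * window
        h[mid + k] = tap              # upper half coefficient
        h[mid - k] = tap              # copy to lower half
    total = sum(h)
    return [c / total for c in h]     # normalize for unity dc gain

def lpf(u, h, R):
    """Filter u with symmetric FIR h and keep every R-th fully overlapped output."""
    K = len(h)
    n_out = (len(u) - K) // R + 1     # assumed output length (no edge padding)
    return [sum(h[k] * u[n * R + k] for k in range(K)) for n in range(n_out)]

h = hann_sinc_coefs(9)
v = lpf([1.0] * 40, h, 4)             # constant input passes at unity gain
```

Computing only every R-th output, rather than filtering at the full rate and discarding samples, is what keeps the multistage filtering inexpensive.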
Filter passbands for the various signals of
After filtering, the executive MEAS_TIME_ALIGNMENT estimates the time offset between input analog and digital audio signals by invoking function meas_offset. An embodiment of a signal-flow diagram of the second function meas_offset called by executive MEAS_TIME_ALIGNMENT is shown in
As alluded to above, normalized cross-correlation should be performed on the envelopes of the audio signals to avoid measurement errors from group-delay differences caused by different analog/digital audio processing. For efficiency, this correlation is performed in two steps, coarse and fine, by the function CROSS_CORRELATE.
Referring to
Exemplary pseudocode for the CROSS_CORRELATE function is provided below.
Function[peak,lagpq] = CROSS_CORRELATE(u,v,lagmin,lagmax)
“Compute the cross-correlation of the input vectors, and their components a & b”
[coefa,coefb,coef] = corr_coef_vectors(u,v,lagmin,lagmax)
“Find the peak of the vectors”
[peaka,lagpqa] = peak_lag(coefa)
[peakb,lagpqb] = peak_lag(coefb)
[peak,lagpq] = peak_lag(coef)
“Check if the measurement peak is valid”
RETURN FAIL if (peak < 0.7) ∨ (|lagpq − lagpqa| > 0.5) ∨ (|lagpq − lagpqb| > 0.5)
The CROSS_CORRELATE function first calls function corr_coef_vectors to split in half each input vector and compute cross-correlation coefficients not only for the composite input vectors (coef), but also for their bifurcated components (coefa and coefb). The peak index corresponding to each of the three correlation coefficients (lagpq, lagpqa, and lagpqb) is also determined by function peak_lag. This permits correlation validation via temporal consistency. If the lags at the peaks of the bifurcated components both fall within half a sample of the composite lag (at the native sample rate), and if the composite peak value exceeds a modest threshold, the correlation is deemed valid. Otherwise, control is passed back to meas_offset and MEAS_TIME_ALIGNMENT, and processing will continue with the next measurement vector.
After the inputs to function corr_coef_vectors have been bifurcated, the mean is removed from each half to eliminate the bias introduced by the absolute value (envelope) operation in function filter_vectors. The cross-correlation coefficient also requires normalization by the signal energy (computed via auto-correlation of each input) to ensure the output value does not exceed unity. All of this processing need only be performed once for the shorter analog input vector u. However, the digital input vector v must be truncated to the length of the analog vector, and its normalization factors (Svva and Svvb) and the resulting cross-correlation coefficients are calculated for each lag value between lagmin and lagmax. To reduce processing requirements, the correlation operations are performed only for the bifurcated vectors. The composite correlation coefficient coef is obtained through appropriate combination of the bifurcated components.
Exemplary pseudocode of the first function corr_coef_vectors called by CROSS_CORRELATE is as follows. Note that all correlation operations are concisely expressed as vector dot products.
Function[coefa,coefb,coef] = corr_coef_vectors(u,v,lagmin,lagmax)
“cross - correlate smaller vector u over longer vector v over lag range”
“bifurcate vector u into 2 parts ua and ub each of length Ka”
uam = subvector(u,0...Ka − 1); “extract first half of vector u”
ua = uam − mean(uam)
ubm = subvector(u, Ka...2 · Ka − 1); “extract second half of vector u”
ub = ubm − mean(ubm)
Suua = ua · ua ; “vector dot product, scalar result”
Suub = ub · ub ; “vector dot product, scalar result”
for lag = lagmin...lagmax; “correlation coefficients each lag”
vam = subvector(v,lag...lag + Ka − 1)
va = vam − mean(vam)
vbm = subvector(v,lag + Ka...lag + 2 · Ka − 1)
vb = vbm − mean(vbm)
Svva = va · va
Svvb = vb · vb
Suva = ua · va
Suvb = ub · vb
end for
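The listing above can be sketched in Python as follows. The final normalization steps are not reproduced in the listing, so the expressions for coefa, coefb, and coef are assumptions: each bifurcated coefficient is the cross term divided by the geometric mean of the two energies, and the composite coefficient combines the bifurcated sums in the same way. Taking Ka as half the length of u is also an assumption.

```python
import math
import random

def _centered(seg):
    """Remove the mean to eliminate the bias left by the absolute-value operation."""
    mean = sum(seg) / len(seg)
    return [s - mean for s in seg]

def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def _ratio(num, den_sq):
    return num / math.sqrt(den_sq) if den_sq > 0.0 else 0.0

def corr_coef_vectors(u, v, lagmin, lagmax):
    """Correlate shorter vector u against longer vector v over lagmin..lagmax."""
    ka = len(u) // 2                              # assumed half length Ka
    ua = _centered(u[:ka])
    ub = _centered(u[ka:2 * ka])
    suua, suub = _dot(ua, ua), _dot(ub, ub)       # computed once for u
    coefa, coefb, coef = [], [], []
    for lag in range(lagmin, lagmax + 1):
        va = _centered(v[lag:lag + ka])
        vb = _centered(v[lag + ka:lag + 2 * ka])
        svva, svvb = _dot(va, va), _dot(vb, vb)
        suva, suvb = _dot(ua, va), _dot(ub, vb)
        coefa.append(_ratio(suva, suua * svva))   # first-half coefficient (assumed form)
        coefb.append(_ratio(suvb, suub * svvb))   # second-half coefficient (assumed form)
        coef.append(_ratio(suva + suvb, (suua + suub) * (svva + svvb)))  # composite
    return coefa, coefb, coef

rng = random.Random(1)
v = [abs(rng.gauss(0.0, 1.0)) for _ in range(96)]   # synthetic envelope
u = [2.0 * s for s in v[7:7 + 32]]                  # scaled excerpt starting at lag 7
coefa, coefb, coef = corr_coef_vectors(u, v, 0, len(v) - len(u))
best_lag = max(range(len(coef)), key=coef.__getitem__)
```

Because the composite coefficient reuses the bifurcated dot products, no additional correlation pass over the full-length vectors is needed.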
Exemplary pseudocode of the second function peak_lag called by CROSS_CORRELATE is as follows.
Function[peak,lagpq] = peak_lag(coef)
“Find vector peak and lag index lagpq”
L = length(coef)
peak = 0
lagp = 0
for lag = 0...L − 1
if coef[lag] > peak
peak = coef[lag]
lagp = lag
end if
end for
if (lagp = 0) ∨ (lagp = L − 1)
peak = 0
lagpq = 0
otherwise
“quadratic fit peak”
Function peak_lag is called by CROSS_CORRELATE to find the peak value and index of the input cross-correlation coefficient. Note that if the peak lies on either end of the input vector, both the outputs (peak and lagpq) will be cleared, effectively failing the cross-correlation operation. This is because it is not possible to determine whether a maximum at either end of the vector is truly a peak. Also, since this function is run at a relatively coarse sample rate (either 44100/4=11025 Hz or 44100/128=344.53125 Hz), the resolution of the peak lag value is fairly granular. This resolution is improved via quadratic interpolation of the peak index. The resulting output lagpq typically represents a fractional number of samples; it is subsequently rounded to an integer number of samples in the meas_offset function.
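A minimal Python sketch of this peak search is given below. The quadratic fit expression is not reproduced in the listing above, so the standard three-point parabolic interpolation is used here as an assumption, as is the choice to return the interpolated peak value rather than the raw maximum.

```python
def peak_lag(coef):
    """Return (peak, lagpq): peak value and fractional lag of the maximum."""
    L = len(coef)
    peak, lagp = 0.0, 0
    for lag in range(L):
        if coef[lag] > peak:
            peak, lagp = coef[lag], lag
    if lagp == 0 or lagp == L - 1:
        return 0.0, 0.0               # maximum on an edge: not a confirmed peak
    ym, y0, yp = coef[lagp - 1], coef[lagp], coef[lagp + 1]
    denom = ym - 2.0 * y0 + yp
    if denom == 0.0:
        return peak, float(lagp)      # flat top: no refinement possible
    delta = 0.5 * (ym - yp) / denom   # vertex offset of the fitted parabola
    return y0 - 0.25 * (ym - yp) * delta, lagp + delta

# Samples of a parabola whose true maximum lies between indices 5 and 6.
samples = [1.0 - 0.01 * (n - 5.3) ** 2 for n in range(11)]
peak, lagpq = peak_lag(samples)
edge_result = peak_lag([0.9, 0.5, 0.1])
```

For samples that lie exactly on a parabola the interpolation recovers the true vertex; for real correlation peaks it is an approximation whose accuracy depends on how parabolic the peak is near its maximum.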
Function CORRELATION_METRICS in block 148 of
Exemplary pseudocode of the function CORRELATION_METRICS called by meas_offset is as follows.
Function[corr_coef, corr_phadj, ynormadj] =
CORRELATION_METRICS(x, y, offset)
Kt = 2^floor{log2[length(x)]} ; “truncate vector size to largest power of 2”
xpart = subvector(x, 0...Kt − 1)
ypart = subvector(y, offset...offset + Kt − 1)
“· dot product scalar result”
“· dot product scalar result”
corr_coef = xnorm · ynorm ; “· dot product scalar result”
XNORM = FFT(xnorm)
YNORM = FFT(ynorm)
XMAG = |XNORM| ; “compute magnitude of each element of XNORM”
YMAG = |YNORM| ; “compute magnitude of each element of YNORM”
corr_phadj = Kt · XMAG · YMAG ; “phase-adjusted correlation coefficient”
“impose XNORM phase onto YNORM elements”
ynormadj = IFFT(YNORMADJ) ; “phase-adjusted ynorm, ready for blending”
Although it is important to avoid the effects of group-delay differences by correlating the envelopes of the analog and digital audio signals, it is also important to recognize that these envelopes contain no frequency information. Function CORRELATION_METRICS in block 148 of
Standard time-domain normalized cross-correlation of the input audio signals x and y is also performed at lag value offset by function CORRELATION_METRICS, yielding the output corr_coef. The value of corr_coef can be used to predict the sound quality of the blend. As previously noted, however, corr_coef will likely yield ambiguous results if analog/digital audio processing differs. This would not be the case, however, if the phase of the digital audio input were somehow reconciled with the analog phase prior to correlation. This is achieved in CORRELATION_METRICS by impressing the phase of the analog audio signal onto the magnitude of the digital signal. The resulting phase-adjusted digital audio signal ynormadj could then be temporarily substituted for the input digital audio y during blend ramps to improve sound quality.
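The phase reconciliation described above can be illustrated with the following Python sketch. Several details are assumptions: xnorm and ynorm are taken to be the truncated segments scaled to unit energy, a plain unnormalized DFT is used in place of an FFT, and the dot product of the magnitudes is divided by Kt so that the result is bounded by unity under that DFT convention (the appropriate scale factor depends on how the transform is normalized). The helper names `dft` and `unit_norm` are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct DFT (unnormalized forward, 1/n on inverse); adequate for short vectors."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def unit_norm(seg):
    energy = math.sqrt(sum(s * s for s in seg))
    return [s / energy for s in seg] if energy > 0.0 else list(seg)

def correlation_metrics(x, y, offset):
    kt = 1 << (len(x).bit_length() - 1)            # largest power of 2 <= length(x)
    xnorm = unit_norm(x[:kt])
    ynorm = unit_norm(y[offset:offset + kt])
    corr_coef = sum(a * b for a, b in zip(xnorm, ynorm))
    X, Y = dft(xnorm), dft(ynorm)
    xmag = [abs(c) for c in X]
    ymag = [abs(c) for c in Y]
    corr_phadj = sum(a * b for a, b in zip(xmag, ymag)) / kt
    # Impose the analog phase onto the digital magnitudes, bin by bin.
    yadj = [ym * (xc / xm) if xm > 0.0 else complex(ym)
            for xc, xm, ym in zip(X, xmag, ymag)]
    ynormadj = [c.real for c in dft(yadj, inverse=True)]
    return corr_coef, corr_phadj, ynormadj

# Same spectral magnitudes, different phase per band (a group-delay difference).
w = 2.0 * math.pi / 64
x = [math.sin(3 * w * n) + 0.5 * math.sin(10 * w * n) for n in range(64)]
y = [0.0] * 5 + [math.sin(3 * w * n + 1.0) + 0.5 * math.sin(10 * w * n - 2.0)
                 for n in range(64)]
corr_coef, corr_phadj, ynormadj = correlation_metrics(x, y, 5)
xnorm_ref = unit_norm(x)
```

In this example the time-domain coefficient is low even though the two signals carry the same spectral content, while the phase-adjusted coefficient approaches unity and the phase-adjusted digital vector matches the normalized analog vector.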
Finally, cross-correlation of xbass on line 98 and ybass on line 100 is performed by function CORRELATE_BASS in block 140 of
Exemplary pseudocode of the function CORRELATE_BASS called by meas_offset is as follows.
Function [peakbass] = CORRELATE_BASS(xbass, ybass, lagpqabs)
“cross-correlate shorter vector xbass over longer vector ybass at single lag value lagpqabs”
lag = round(lagpqabs)
Kb = length(xbass)
Sxx = xbass · xbass ; “vector dot product, scalar result for normalization of xbass”
y = subvector(ybass, lag ... lag + Kb − 1) ; “select xbass-sized segment of ybass starting at lag”
Syy = y · y ; “normalization of y”
Sxy = xbass · y
peakbass = Sxy / sqrt(Sxx · Syy) ; “Cross-correlation at peak lag value”
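As an illustration, the single-lag bass correlation can be written in a few lines of Python. This is a minimal sketch: the name correlate_bass_sketch, the bounds check, and the zero-energy guard are assumptions added for the example.

```python
import math


def correlate_bass_sketch(xbass, ybass, lagpqabs):
    """Normalized cross-correlation of xbass against ybass at one lag."""
    # Note: Python's round() rounds exact halves to the nearest even integer.
    lag = int(round(lagpqabs))
    kb = len(xbass)
    if lag < 0 or lag + kb > len(ybass):
        raise ValueError("lag places xbass outside ybass")
    y = ybass[lag:lag + kb]  # xbass-sized segment of ybass starting at lag
    sxx = sum(v * v for v in xbass)
    syy = sum(v * v for v in y)
    sxy = sum(a * b for a, b in zip(xbass, y))
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0.0 else 0.0


xbass = [1.0, -2.0, 3.0, -1.0]
ybass = [0.0, 0.0, 2.0, -4.0, 6.0, -2.0, 0.0]
peakbass = correlate_bass_sketch(xbass, ybass, 2.2)
```

A scaled copy of xbass at the rounded lag gives a coefficient of 1, and a polarity-inverted copy gives −1, which is why this value is useful for detecting inverted bass content.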
Return values peakabs, offset, and corr_phadj of function meas_offset are all used by the executive MEAS_TIME_ALIGNMENT for validating the time-offset measurement.
The steps used to implement the time-offset measurement algorithm are delineated in the executive pseudocode of MEAS_TIME_ALIGNMENT. The time offset is computed in several stages from coarse (envelope) to fine correlation, with interpolation used between stages. This yields an efficient algorithm with sufficiently high accuracy. Steps 1 through 8 describe the filtering operations defined in the signal-flow diagram of
[xenv, yenv, xabsf, yabsf, xbass, ybass]=filter_vectors(x, y)
Steps 10 through 15 describe the correlation operations defined in the signal-flow diagram of
[peakabs, offset, corr_coef, corr_phadj, ynormadj, peakbass]=meas_offset(x, y, xenv, yenv, xabsf, yabsf, xbass, ybass, lagmin, lagmax)
Step 1—Pre-compute the filter coefficients for each of the four constituent filters in the filter_vectors function defined in the signal-flow diagram of
The number of coefficients for each filter (Klpf, Kbass, Kabs, and Kenv) is defined in
hlpf=compute_LPF_coefs(Klpf)
hbass=compute_LPF_coefs(Kbass)
habs=compute_LPF_coefs(Kabs)
henv=compute_LPF_coefs(Kenv)
Step 2—Prepare monophonic versions of the digital and analog audio streams sampled at 44.1 ksps. It is recommended that the audio be checked for possible missing digital audio frames or corrupted analog audio. Capture another audio segment if corruption is detected on the present segment. Form x and y input vectors. The y vector consists of N digital audio samples. The x vector consists of M<N analog audio samples which are nominally expected to align near the center of the y vector.
Step 3—Filter and decimate by rate R=4 (11,025-Hz output sample rate) both analog and digital audio (x and y) to produce new vectors xlpf and ylpf, respectively. The filter output is computed by the FIR filter function LPF defined in the above pseudocode LPF for performing filter processing.
Step 4—Filter vectors xlpf and ylpf to produce new vectors xbass and ybass, respectively. The filter output is computed by the FIR filter function LPF defined in the above pseudocode LPF for performing filter processing.
Step 5—Delay vector xlpf by D=(Kbass−1)/2 samples to accommodate bass FIR filter delay. Then subtract vector xbass from the result to yield new vector xbpf. Similarly, subtract vector ybass from ylpf (after delay of D samples) to yield new vector ybpf. The output vectors xbpf and ybpf have the same lengths as vectors xbass and ybass.
Step 6—Create new vectors xabs and yabs by computing the absolute values of each of the elements of xbpf and ybpf.
Step 7—Filter vectors xabs and yabs to produce new vectors xabsf and yabsf, respectively. The filter output is computed by the FIR filter function LPF defined in the above pseudocode LPF for performing filter processing.
Step 8—Filter and decimate by rate Renv=32 (344.53125-Hz output sample rate) both analog and digital audio (xabsf and yabsf) to produce new vectors xenv and yenv, respectively. The filter output is computed by the FIR filter function LPF defined in the above pseudocode LPF for performing filter processing.
Step 9—Compute the lag range for the coarse envelope correlation.
Step 10—Use the CROSS_CORRELATE function defined above to compute coarse envelope correlation-coefficient vectors from input vectors xenv and yenv over the range lagmin to lagmax. Find the correlation maximum peakenv and the quadratic interpolated peak index lagpqenv. If the measurement is determined invalid, control is returned to the executive and processing continues with the next measurement vector of analog and digital audio samples. Note that efficient computing can eliminate redundant computations.
Step 11—Compute the lag range for the fine correlation of xabsf and yabsf. Set the range ±0.5 samples around lagpqenv, interpolate by Renv, and round to integer sample indices.
Step 12—Use the CROSS_CORRELATE function defined above to compute fine correlation-coefficient vectors from input vectors xabsf and yabsf over the range lagabsmin to lagabsmax. Find the correlation maximum peakabs and the quadratic interpolated peak index lagpqabs. If the measurement is determined invalid, control is returned to the executive and processing continues with the next measurement vector of analog and digital audio samples. Note that efficient computing can eliminate redundant computations. Although the time offset is determined to be lagpqabs, additional measurements will follow to further improve the confidence in this measurement.
Step 13—Use the CORRELATE_BASS function defined above to compute correlation coefficient peakbass from input vectors xbass and ybass at index lagpqabs.
Step 14—Compute the offset (in number of 44.1-ksps audio samples) between the analog and digital audio vectors x and y. This is achieved by interpolating fine peak index lagpqabs by R=4 and rounding the result to integer samples.
Step 15—Use the CORRELATION_METRICS function defined above to compute the correlation value corr_coef between the 44.1-ksps analog and digital audio input vectors x and y at the measured peak index offset. The frequency-domain correlation value corr_phadj is also computed after aligning the group delays of the x and y vectors. This is used to validate the accuracy of the time-offset measurement. Finally, this function generates phase-adjusted digital audio signal ynormadj, which can be temporarily substituted for the input digital audio y during blend ramps to improve sound quality.
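The rate conversions in Steps 11 and 14 amount to scaling a fractional lag by the decimation factor and rounding. The following Python sketch shows that arithmetic under the assumption that lag indices at each sample rate share a common origin, so that filter delays need no extra compensation; the function names are hypothetical.

```python
R = 4       # decimation factor of Step 3 (44.1 ksps to 11,025 Hz)
RENV = 32   # envelope decimation factor of Step 8 (11,025 Hz to 344.53125 Hz)


def fine_lag_range(lagpqenv, renv=RENV):
    """Step 11: +/-0.5 envelope samples around lagpqenv, at the fine rate."""
    lagabsmin = int(round(renv * (lagpqenv - 0.5)))
    lagabsmax = int(round(renv * (lagpqenv + 0.5)))
    return lagabsmin, lagabsmax


def offset_in_audio_samples(lagpqabs, r=R):
    """Step 14: fine peak index converted to 44.1-ksps audio samples."""
    return int(round(r * lagpqabs))


lagabsmin, lagabsmax = fine_lag_range(10.3)   # coarse peak at 10.3 envelope samples
offset = offset_in_audio_samples(330.4)       # fine peak at 330.4 samples
```

With Renv=32, the fine search spans about 32 lags at the 11,025-Hz rate, compared with the full coarse range that would otherwise have to be searched at that rate.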
Exemplary coarse (env), fine (abs), and input audio (x, y) cross-correlation coefficients are plotted together in
The time-offset measurement technique described above was modeled and simulated with a variety of analog and digital input audio sources. The simulation was used to empirically set decision thresholds, refine logical conditions for validating correlation peaks, and gather statistical results to assess performance and compare with other automatic time-alignment approaches.
A test vector was input to the simulation and divided into multiple fixed-length blocks of analog and digital audio samples. Each pair of sample blocks was then correlated and the peak value and index were used to measure the time offset. This process was repeated for all constituent sample blocks within the test vector. The results were then analyzed and significant statistics were compiled for that particular vector.
Simulations were run on 10 different test vectors, with representative audio from various musical genres including talk, classical, rock, and hip-hop. All vectors applied different audio processing to the analog and digital streams, except for F-5+0+0CCC_Mono and F+0to-9+0+0DRR.
Correlations (as defined in the algorithm description above) were performed on all constituent blocks within a test vector. Time offset and measurement time were recorded for valid correlations. The results were then analyzed and statistics were compiled for each vector. These statistics are tabulated in Table 1.
Since actual time offset is often unknown, mean offset is not a very useful statistic. Instead, the standard deviation of the time offset over all sample blocks comprising a test vector provides a better measure of algorithm precision. Mean measurement time is also a valuable statistic, indicating the amount of time it takes for the algorithm to converge to a valid result. These statistics are bolded in Table 1.
The results of Table 1 indicate that algorithm performance appears to be robust. The average time-offset standard deviation across all test vectors is 4.2 audio samples, indicating fairly consistent precision. The average measurement time across all test vectors is 0.5 seconds, which is well within HD Radio specifications. In fact, the worst-case measurement time across all vectors was just 7.2 seconds.
It is evident from Table 1 that the algorithm yields a relatively large range of estimated time offsets for some test vectors. This range is probably accurate, and is likely caused by different audio processing and the resulting group-delay differences between the analog and digital audio inputs. Unfortunately, there is no way to know the actual time offset at any given instant in each of the test vectors. As a result, ultimate verification of the algorithm can only be achieved through listening tests when implemented on a real-time HD Radio receiver platform.
TABLE 1
Simulation Statistical Results
Test Vector | Time Offset (44.1-kHz samples): Min | Max | Mean | Std Dev | Measurement Time (seconds): Min | Max | Mean | Std Dev |
NJ_9470 MHz | −1 | 4 | 2 | 1.3 | 0.2 | 2.4 | 0.4 | 0.3 |
109Vector | −197 | −149 | −178.3 | 10.6 | 0.2 | 4.3 | 0.7 | 0.7 |
AM+0+0+0HRN | 7 | 20 | 12.5 | 2.7 | 0.2 | 7.2 | 1.1 | 1 |
F-5+0+0CCC_Mono | −7 | −4 | −5 | 0.5 | 0.2 | 0.2 | 0.2 | 0 |
F+0+0+0HuRuN_Mono | −7 | 14 | 6.9 | 4 | 0.2 | 2 | 0.4 | 0.3 |
F+0+0+0TuTuN_Mono | −11 | 32 | 5.6 | 9.7 | 0.2 | 4.1 | 1 | 0.9 |
F+0+0+0DuRuR_Mono | −19 | −3 | −9.3 | 2.6 | 0.2 | 2.2 | 0.4 | 0.3 |
F+0+0+0DuRuC_Mono | −8 | 12 | 4.2 | 2.6 | 0.2 | 2.2 | 0.3 | 0.3 |
F+0+0+0DuRuN_Mono | −10 | 28 | 6.3 | 6 | 0.2 | 3 | 0.6 | 0.5 |
F+0to-9+0+0DRR | −10 | 1 | −3.6 | 2.4 | 0.2 | 0.9 | 0.2 | 0.1 |
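The summary figures quoted in the discussion of Table 1 can be checked directly from the tabulated columns. The short Python sketch below copies the Std Dev column of the time offset and the Mean and Max columns of the measurement time; the variable names are illustrative.

```python
# Columns copied from Table 1, one entry per test vector (10 vectors).
offset_std_dev = [1.3, 10.6, 2.7, 0.5, 4.0, 9.7, 2.6, 2.6, 6.0, 2.4]
meas_time_mean = [0.4, 0.7, 1.1, 0.2, 0.4, 1.0, 0.4, 0.3, 0.6, 0.2]
meas_time_max = [2.4, 4.3, 7.2, 0.2, 2.0, 4.1, 2.2, 2.2, 3.0, 0.9]

avg_offset_std = sum(offset_std_dev) / len(offset_std_dev)  # about 4.2 samples
avg_meas_time = sum(meas_time_mean) / len(meas_time_mean)   # about 0.5 seconds
worst_meas_time = max(meas_time_max)                        # 7.2 seconds
```

At 44.1 ksps, an average standard deviation of about 4.2 samples corresponds to roughly 0.1 millisecond.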
In addition to providing automatic time alignment in HD Radio receivers, the described algorithm has other potential applications. For instance, the described algorithm could be used in conjunction with an audio blending method, such as that described in commonly owned U.S. patent application Ser. No. 15/071,389, filed Mar. 16, 2016 and titled “Method And Apparatus For Blending An Audio Signal In An In Band On-Channel Radio System”, to adjust blend thresholds and inhibit blending when misalignment is detected. This provides a dynamic blend threshold control.
The blend algorithm uses an Analog Signal Blend Metric (ASBM) to control its blend thresholds. The ASBM is currently fixed at 1 for MPS audio and 0 for SPS audio. However, the corr_coef or corr_phadj signal from the time-alignment algorithm could be used to scale ASBM on a continuum between 0 and 1. For instance, a low value of corr_coef or corr_phadj would indicate poor agreement between analog and digital audio, and would (with a few other parameters) scale ASBM and the associated blend thresholds to inhibit blending. Other alignment parameters that might be used to scale ASBM include level-alignment information, analog audio quality, audio bandwidth, and stereo separation.
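One way such scaling might look is sketched below in Python. The linear mapping, the function name scale_asbm, and the threshold values low and high are purely hypothetical; the text does not specify how the correlation value or the other parameters would be combined.

```python
def scale_asbm(corr, low=0.2, high=0.8):
    """Map a correlation value to an ASBM between 0 and 1.

    Values at or below `low` inhibit blending (ASBM = 0), values at or above
    `high` leave the metric at 1, and values in between scale linearly.
    Both thresholds are placeholders chosen for illustration only.
    """
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (corr - low) / (high - low)))


asbm_poor = scale_asbm(0.1)   # poor agreement between analog and digital audio
asbm_mid = scale_asbm(0.5)    # intermediate agreement
asbm_good = scale_asbm(0.9)   # good agreement
```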
In another embodiment, the time-offset measurement could also be used for automatic time alignment of the analog and digital audio signals in HD Radio hybrid transmitters. The offset (measured in samples at 44.1 ksps) can be filtered with a nonlinear IIR filter to improve the accuracy over a single measurement, while also suppressing occasional anomalous measurement results.
(or about 63%) of the full step size, assuming the step size is less than ±lim. Step changes in time alignment offset are generally not expected; however, they could occur with changes in audio-processor settings.
The IIR filter reduces the standard deviation of the measured offset input values by the square root of α. The filtered offset value can be used to track and correct the time-alignment offset between the analog and digital audio streams.
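A filter with these properties can be sketched in Python as a single-pole IIR whose input difference is clipped to ±lim. This form is an assumption chosen to be consistent with the parameters α and lim mentioned above; the values of alpha and lim and the function name are illustrative only.

```python
def filter_offset(measurements, alpha=0.125, lim=8.0, initial=0.0):
    """Single-pole IIR with a clipped update (a nonlinear smoothing filter)."""
    filtered = initial
    history = []
    for m in measurements:
        # Clipping bounds the influence of any single anomalous measurement.
        step = max(-lim, min(lim, m - filtered))
        filtered += alpha * step
        history.append(filtered)
    return history


# Step of 4 samples (smaller than lim): response after 1/alpha = 8 updates.
step_response = filter_offset([4.0] * 8)
fraction = step_response[-1] / 4.0

# A wildly anomalous measurement moves the output by at most alpha * lim.
outlier_response = filter_offset([1000.0])[0]
```

For a step smaller than lim, the output after 1/α updates is 1 − (1 − α)^(1/α) of the step, which approaches 1 − 1/e (about 63%) as α becomes small; with α = 0.125 it is about 66%.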
In another embodiment, the described algorithm could be used for processing of intermittent or corrupted signals.
The time-offset measurement algorithm described above includes suggestions for measurements with an intermittent or corrupted signal. Exception processing may be useful under real channel conditions when digital audio packets are missing (e.g., due to corruption) or when the analog signal is affected by multipath fading, or experiences intentional soft muting and/or bandwidth reduction in the receiver. The receiver may inhibit time-offset measurements if or when these conditions are detected.
There are several implementation choices that can influence the efficiency of the algorithm. The normalization components of the correlation-coefficient computation do not need to be fully computed for every lag value across the correlation vector. The analog audio normalization component (e.g., Suua and Suub in the pseudocode of the first function corr_coef_vectors called by CROSS_CORRELATE) remains constant for every lag, so it is computed only once. The normalization energy, mean, and other components of the digital audio vector and its subsequent processed vectors can be simply updated for every successive lag by subtracting the oldest sample and adding the newest sample. Furthermore, the normalization components could be used later in a level-alignment measurement.
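The running update of the digital-audio normalization energy can be sketched as follows. This minimal Python example covers only the energy term; the function name sliding_energy is hypothetical, and a mean term could be updated in the same way.

```python
def sliding_energy(y, window):
    """Energy of y[lag:lag + window] for every lag, updated incrementally."""
    if window <= 0 or window > len(y):
        raise ValueError("window must be between 1 and len(y)")
    energy = sum(v * v for v in y[:window])  # full computation only once
    energies = [energy]
    for lag in range(1, len(y) - window + 1):
        newest = y[lag + window - 1]
        oldest = y[lag - 1]
        energy += newest * newest - oldest * oldest  # add newest, drop oldest
        energies.append(energy)
    return energies


y = [3, -1, 4, 1, -5, 9, 2, -6]
energies = sliding_energy(y, 3)
```

Each additional lag costs two multiplications instead of a full dot product. With floating-point samples the running sum accumulates rounding error, so a long-running implementation may want to recompute it from scratch periodically.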
Also, the square-root operation can be avoided by using the square of the correlation coefficient, while preserving its polarity. Since the square is monotonically related to the original coefficient, the algorithm performance is not affected, assuming correlation threshold values are also squared.
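The polarity-preserving square can be sketched in Python as shown below; the function name and the example threshold of 0.8 are illustrative assumptions.

```python
import math


def signed_square_coef(sxy, sxx, syy):
    """Squared correlation coefficient with the sign of sxy, no square root."""
    if sxx <= 0.0 or syy <= 0.0:
        return 0.0
    return sxy * abs(sxy) / (sxx * syy)


sxy, sxx, syy = -3.0, 4.0, 9.0
r = sxy / math.sqrt(sxx * syy)             # conventional coefficient: -0.5
r_sq = signed_square_coef(sxy, sxx, syy)   # signed square: -0.25

threshold = 0.8
same_decision = (r > threshold) == (r_sq > threshold * threshold)
```

Because the signed square is a monotonic function of the coefficient, comparing it against a squared positive threshold gives the same decision as comparing the coefficient against the original threshold.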
After the initial time offset has been computed, the efficiency of the algorithm can be further improved by limiting the range of lag values, assuming alignment changes are small between successive measurements. The size M of the analog audio input vector x could also be reduced to limit processing requirements, although using too small an input vector could reduce the accuracy of the time-offset measurement.
Finally, the phase-adjusted digital audio ynormadj computed in the CORRELATION_METRICS function could actually be calculated in a different function. This signal was designed to improve sound quality by temporarily substituting it for input digital audio during blend ramps. But since blends occur sporadically, it could be more efficient to calculate ynormadj only as needed. In fact, the timing of the ynormadj calculation must be synchronized with the timing of the blend itself, to ensure that the phase-adjusted samples are ready to substitute. As a result, careful coordination with the blend algorithm is required for this feature.
From the above description it should be apparent that various embodiments of the described method for aligning analog and digital signals can be used in various types of signal processing apparatus, including radio receivers and radio transmitters. One embodiment of the method includes: receiving or generating an analog audio stream and a digital audio stream; and using a normalized cross-correlation of envelopes of the analog audio stream and the digital audio stream to measure a time offset between the analog audio stream and the digital audio stream. The normalized cross-correlation of envelopes can be computed using a vector of bandpass samples of the analog audio stream and a vector of bandpass samples of the digital audio stream.
The described method can be implemented in an apparatus such as a radio receiver or transmitter. The apparatus can be constructed using known types of processing circuitry that is programmed or otherwise configured to perform the functions described above.
While the present invention has been described in terms of its preferred embodiments, it will be apparent to those skilled in the art that various modifications can be made to the described embodiments without departing from the scope of the invention as defined by the following claims.
Kroeger, Brian W., Peyla, Paul J.
Executed on | Assignor | Assignee | Conveyance | Frame | Reel | Doc |
Apr 14 2016 | iBiquity Digital Corporation | (assignment on the face of the patent) | / | |||
Jun 14 2016 | KROEGER, BRIAN W | iBiquity Digital Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 039153 | /0787 | |
Jun 14 2016 | PEYLA, PAUL J | iBiquity Digital Corporation | ASSIGNMENT OF ASSIGNORS INTEREST SEE DOCUMENT FOR DETAILS | 039153 | /0787 | |
Jun 01 2020 | Rovi Guides, Inc | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 | |
Jun 01 2020 | Rovi Solutions Corporation | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 | |
Jun 01 2020 | iBiquity Digital Corporation | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 | |
Jun 01 2020 | PHORUS, INC | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 | |
Jun 01 2020 | DTS, INC | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 | |
Jun 01 2020 | TESSERA ADVANCED TECHNOLOGIES, INC | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 | |
Jun 01 2020 | Tessera, Inc | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 | |
Jun 01 2020 | Rovi Technologies Corporation | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 | |
Jun 01 2020 | Invensas Corporation | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 | |
Jun 01 2020 | Veveo, Inc | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 | |
Jun 01 2020 | TIVO SOLUTIONS INC | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 | |
Jun 01 2020 | INVENSAS BONDING TECHNOLOGIES, INC | BANK OF AMERICA, N A | SECURITY INTEREST SEE DOCUMENT FOR DETAILS | 053468 | /0001 |