DTV Audio Encoding and Decoding 7-51
• Terrestrial audio broadcasting
• Delivery of audio over metallic or optical cables, or over RF links
• Storage of audio on magnetic, optical, semiconductor, or other storage media
Figure 7.3.2 Example application of the AC-3 audio subsystem for satellite audio transmission. (From [3]. Used with permission.)
Downloaded from Digital Engineering Library @ McGraw-Hill (www.digitalengineeringlibrary.com)
Copyright © 2004 The McGraw-Hill Companies. All rights reserved.
Any use is subject to the Terms of Use as given at the website.
7.3.2d Encoding
The AC-3 encoder accepts PCM audio and produces the encoded bit stream for the ATSC DTV
standard [3]. The AC-3 algorithm achieves high coding gain (the ratio of the input bit rate to the
output bit rate) by coarsely quantizing a frequency-domain representation of the audio signal. A
block diagram of this process is given in Figure 7.3.3. The first step in the encoding chain is to
transform the representation of audio from a sequence of PCM time samples into a sequence of
blocks of frequency coefficients. This is done in the analysis filterbank. Overlapping blocks of
512 time samples are multiplied by a time window and transformed into the frequency domain.
Because of the overlapping blocks, each PCM input sample is represented in two sequential
transformed blocks. The frequency-domain representation then may be decimated by a factor of
2, so that each block contains 256 frequency coefficients. The individual frequency coefficients
are represented in binary exponential notation as a binary exponent and a mantissa. The set of
exponents is encoded into a coarse representation of the signal spectrum, referred to as the
spectral envelope. This spectral envelope is used by the core bit-allocation routine, which
determines how many bits should be used to encode each individual mantissa. The spectral
envelope and the coarsely quantized mantissas for six audio blocks (1536 audio samples) are
formatted into an AC-3 frame. The AC-3 bit stream is a sequence of AC-3 frames.
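The block and frame arithmetic in this paragraph can be checked with a short sketch. The Python below is illustrative only and is not the AC-3 reference encoder: the helper names, the exponent cap of 24, the 48 kHz default, and the 2-channel, 16-bit, 192 kbit/s operating point used in the closing note are assumptions chosen for the example. It shows that a 256-sample advance places an input sample in two 512-sample blocks, how a coefficient splits into a binary exponent and a mantissa, and how long a six-block frame lasts.

```python
import math

BLOCK_LEN = 512           # time samples per transform block (from the text)
HOP = BLOCK_LEN // 2      # 256 new samples per block, i.e. 50 percent overlap
BLOCKS_PER_FRAME = 6      # audio blocks per AC-3 frame (from the text)


def overlapping_blocks(samples):
    """Yield successive 512-sample blocks that advance by 256 samples."""
    for start in range(0, len(samples) - BLOCK_LEN + 1, HOP):
        yield samples[start:start + BLOCK_LEN]


def split_exponent_mantissa(coefficient, max_exponent=24):
    """Write a coefficient as mantissa * 2**-exponent.

    The cap on the exponent is an illustrative assumption.
    """
    if coefficient == 0.0:
        return max_exponent, 0.0
    _, exp2 = math.frexp(coefficient)      # coefficient = m * 2**exp2, 0.5 <= |m| < 1
    exponent = min(max(-exp2, 0), max_exponent)
    return exponent, coefficient * 2.0 ** exponent


def frame_duration_ms(sample_rate_hz=48_000):
    """Duration of one frame: six blocks of 256 new samples each."""
    return 1000.0 * BLOCKS_PER_FRAME * HOP / sample_rate_hz


def coding_gain(channels, bits_per_sample, sample_rate_hz, coded_bit_rate):
    """Ratio of the PCM input bit rate to the coded output bit rate."""
    return channels * bits_per_sample * sample_rate_hz / coded_bit_rate
```

Under these assumptions, `frame_duration_ms()` gives 32 ms per frame, and `coding_gain(2, 16, 48_000, 192_000)` gives a gain of 8 for the hypothetical stereo operating point.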
The actual AC-3 encoder is more complex than shown in the simplified system of Figure
7.3.3. The following functions also are included:
• A frame header is attached, containing information (bit rate, sample rate, number of encoded
channels, and other data) required to synchronize to and decode the encoded bit stream.
• Error-detection codes are inserted to allow the decoder to verify that a received frame of data
is error-free.
• The analysis filterbank spectral resolution may be dynamically altered to better match the
time/frequency characteristic of each audio block.
• The spectral envelope may be encoded with variable time/frequency resolution.
• A more complex bit-allocation may be performed, and parameters of the core bit-allocation
routine may be modified to produce a more optimum bit allocation.
• The channels may be coupled at high frequencies to achieve higher coding gain for operation
at lower bit rates.
• In the 2-channel mode, a rematrixing process may be selectively performed to provide additional
coding gain, and to allow improved results to be obtained in the event that the 2-channel
signal is decoded with a matrix surround decoder.
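The error-detection item in the list above can be illustrated with a cyclic redundancy check. The sketch below is a minimal example, not the AC-3 frame format: it assumes a 16-bit CRC with generator polynomial x^16 + x^15 + x^2 + 1 (0x8005), which the text above does not state, and it appends a single check word to an arbitrary payload rather than modeling where check words sit in a real frame. The function names are hypothetical.

```python
CRC16_POLY = 0x8005  # x^16 + x^15 + x^2 + 1 (assumed for illustration)


def crc16(data: bytes, poly: int = CRC16_POLY) -> int:
    """Bitwise CRC-16, most significant bit first, zero initial value."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def append_crc(payload: bytes) -> bytes:
    """Encoder side: attach the check word after the payload."""
    return payload + crc16(payload).to_bytes(2, "big")


def frame_is_error_free(frame: bytes) -> bool:
    """Decoder side: the CRC over payload plus check word is zero if intact."""
    return crc16(frame) == 0
```

With a zero initial value and no final inversion, running the same CRC over the payload followed by its check word yields zero, so the decoder needs only one pass over the received data to decide whether to decode it or to conceal the error.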
Figure 7.3.3 Overview of the AC-3 audio-compression system encoder. (From [3]. Used with permission.)
7.3.2e Decoding
The decoding process is, essentially, the inverse of the encoding process [3]. The basic decoder,
shown in Figure 7.3.4, must synchronize to the encoded bit stream, check for errors, and deformat
the various types of data (i.e., the encoded spectral envelope and the quantized mantissas).
The bit-allocation routine is run, and the results are used to unpack and dequantize the mantissas.
The spectral envelope is decoded to produce the exponents. The exponents and mantissas are
transformed back into the time domain to produce the decoded PCM time samples. Additional
steps in the audio decoding process include the following:
• Error concealment or muting may be applied in the event a data error is detected.
• Channels that have had their high-frequency content coupled must be decoupled.
• Dematrixing must be applied (in the 2-channel mode) whenever the channels have been rematrixed.
• The synthesis filterbank resolution must be dynamically altered in the same manner as the
encoder analysis filterbank was altered during the encoding process.
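The rematrixing and dematrixing steps can be sketched as a sum/difference transform and its inverse. This is a simplified illustration under stated assumptions: it treats rematrixing as replacing the left and right coefficients with their half-sum and half-difference, and it ignores the per-band selection a real coder would apply. The function names are hypothetical.

```python
def rematrix(left, right):
    """Encoder side: replace left/right with half-sum and half-difference."""
    total = [(l + r) / 2 for l, r in zip(left, right)]
    diff = [(l - r) / 2 for l, r in zip(left, right)]
    return total, diff


def dematrix(total, diff):
    """Decoder side: recover left/right from the sum and difference signals."""
    left = [t + d for t, d in zip(total, diff)]
    right = [t - d for t, d in zip(total, diff)]
    return left, right
```

When the two channels are highly correlated, the difference signal is close to zero and needs few bits, which is where the additional coding gain comes from. The decoder has to apply `dematrix` to exactly the data that were rematrixed, or the channels will not be restored.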
7.3.3 Implementation of the AC-3 System
As illustrated in Figure 7.3.5, the audio subsystem of the ATSC DTV standard comprises the
audio-encoding/decoding function and resides between the audio inputs/outputs and the transport
subsystem [4]. The audio encoder is responsible for generating the audio elementary stream,
which is an encoded representation of the baseband audio input signals. (Note that more than one
audio encoder may be used in a system.) The flexibility of the transport system allows multiple
audio elementary streams to be delivered to the receiver. At the receiver, the transport subsystem
is responsible for selecting which audio streams to deliver to the audio subsystem. The audio
subsystem is then responsible for decoding the audio elementary stream back into baseband
audio.
Figure 7.3.4 Overview of the AC-3 audio-compression system decoder. (From [3]. Used with permission.)
An audio program source is encoded by a digital television audio encoder. The output of the
audio encoder is a string of bits that represent the audio source (the audio elementary stream).
The transport subsystem packetizes the audio data into PES (packetized elementary stream)
packets, which are then further packetized into transport packets. The transmission subsystem
converts the transport packets into a modulated RF signal for transmission to the receiver. At the
receiver, the signal is demodulated by the receiver transmission subsystem. The receiver transport
subsystem converts the received audio packets back into an audio elementary stream, which
is decoded by the digital television audio decoder.
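The two packetization stages can be sized with some simple arithmetic. The sketch below is a rough estimate under several assumptions that are not stated in the text above: MPEG-2 transport packets of 188 bytes with a 4-byte header, one AC-3 frame carried per PES packet, a 14-byte PES header, no adaptation-field overhead other than stuffing in the last packet, and a hypothetical coded rate of 384 kbit/s. The function names are illustrative.

```python
import math

TS_PACKET_SIZE = 188   # bytes per MPEG-2 transport packet
TS_HEADER_SIZE = 4     # bytes of transport packet header
TS_PAYLOAD_SIZE = TS_PACKET_SIZE - TS_HEADER_SIZE


def ac3_frame_bytes(bit_rate_bps, sample_rate_hz=48_000, samples_per_frame=1536):
    """Size of one coded frame covering 1536 samples at a constant bit rate."""
    return bit_rate_bps * samples_per_frame // (8 * sample_rate_hz)


def transport_packets_per_pes(pes_payload_bytes, pes_header_bytes=14):
    """Transport packets needed to carry one PES packet (header plus payload)."""
    return math.ceil((pes_payload_bytes + pes_header_bytes) / TS_PAYLOAD_SIZE)
```

Under these assumptions a 384 kbit/s frame occupies 1536 bytes and is spread over 9 transport packets. A real multiplexer may group frames differently, so the figures indicate only the order of magnitude of the packetization overhead.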
The partitioning shown in Figure 7.3.5 is conceptual, and practical implementations may differ.
For example, the transport processing may be broken into two blocks; the first would perform
PES packetization, and the second would perform transport packetization. Or, some of the
transport functionality may be included in either the audio coder or the transmission subsystem.
7.3.3a Audio-Encoder Interface
The audio system accepts baseband inputs with up to six channels per audio program bit stream
in a channelization scheme consistent with ITU-R Rec. BS-775 [5]. The six audio channels are:
• Left
• Center
• Right
• Left surround
• Right surround
• Low-frequency enhancement (LFE)
Figure 7.3.5 The audio subsystem in the DTV standard. (From [4]. Used with permission.)
Multiple audio elementary bit streams may be conveyed by the transport system.
The bandwidth of the LFE channel is limited to 120 Hz. The bandwidth of the other (main)
channels is limited to 20 kHz. Low-frequency response may extend to dc, but it is more typically
limited to approximately 3 Hz (–3 dB) by a dc-blocking high-pass filter. Audio-coding efficiency
(and thus audio quality) is improved by removing dc offset from audio signals before they are
encoded. The input audio signals may be in analog or digital form.
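The dc-blocking high-pass filter mentioned above can be sketched as a first-order recursive filter. This is one common textbook form, shown as an assumption rather than as the filter any particular encoder uses. The pole radius is set with the small-angle approximation r = 1 - 2*pi*fc/fs, which places the -3 dB point only approximately at the requested cutoff. The function name and defaults are illustrative.

```python
import math


def dc_blocker(samples, cutoff_hz=3.0, sample_rate_hz=48_000.0):
    """First-order dc-blocking filter: y[n] = x[n] - x[n-1] + r * y[n-1]."""
    # Pole radius from the small-angle approximation (approximate cutoff).
    r = 1.0 - 2.0 * math.pi * cutoff_hz / sample_rate_hz
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out
```

Feeding the filter a constant input, for example `dc_blocker([1.0] * 48_000)`, produces an output that decays toward zero. This is the behavior wanted before encoding, because a dc offset would otherwise consume bits without contributing audible content.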
For analog input signals, the input connector and signal level are not specified [4]. Conventional
broadcast practice may be followed. One commonly used input connector is the 3-pin XLR
female (the incoming audio cable uses the male connector) with pin 1 ground, pin 2 hot or
positive, and pin 3 neutral or negative.
Likewise, for digital input signals, the input connector and signal format are not specified.
Commonly used formats such as the AES3-1992 2-channel interface are suggested. When multiple
2-channel inputs are used, the preferred channel assignment is:
• Pair 1: Left, Right
• Pair 2: Center, LFE
• Pair 3: Left surround, Right surround
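The preferred pair assignment can be written down as a small routing table. The sketch below is only an illustration of that mapping: the function name and the representation of each 2-channel input as a tuple of two sample lists are assumptions, and no AES3 framing is modeled.

```python
# Preferred assignment of 2-channel inputs to program channels (from the list above).
PAIR_ASSIGNMENT = (
    ("Left", "Right"),
    ("Center", "LFE"),
    ("Left surround", "Right surround"),
)


def route_pairs(pairs):
    """Map three (first, second) sample sequences onto the six named channels."""
    if len(pairs) != len(PAIR_ASSIGNMENT):
        raise ValueError("expected exactly three 2-channel inputs")
    channels = {}
    for (name_a, name_b), (sub_a, sub_b) in zip(PAIR_ASSIGNMENT, pairs):
        channels[name_a] = sub_a
        channels[name_b] = sub_b
    return channels
```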
Sampling Parameters
The AC-3 system conveys digital audio sampled at a frequency of 48 kHz, locked to the 27 MHz
system clock [4]. If analog signal inputs are employed, the A/D converters should sample at 48
kHz. If digital inputs are employed, the input sampling rate should be 48 kHz, or the audio
encoder should contain sampling rate converters that translate the sampling rate to 48 kHz. The
sampling rate at the input to the audio encoder must be locked to the video clock for proper
operation of the audio subsystem.
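The relationship between the 48 kHz sampling rate and the 27 MHz system clock, and the ratio a sampling-rate converter would have to realize, can be expressed with exact fractions. The sketch below is illustrative: the function names are hypothetical, and the 44.1 kHz input rate in the closing note is an example that does not appear in the text.

```python
from fractions import Fraction

SYSTEM_CLOCK_HZ = 27_000_000    # system clock (from the text)
AUDIO_SAMPLE_RATE_HZ = 48_000   # required audio sampling rate (from the text)


def clock_cycles_per_sample():
    """System clock cycles that elapse during one audio sample period."""
    return Fraction(SYSTEM_CLOCK_HZ, AUDIO_SAMPLE_RATE_HZ)


def resampling_ratio(input_rate_hz):
    """Output-to-input rate ratio a sampling-rate converter must realize."""
    return Fraction(AUDIO_SAMPLE_RATE_HZ, input_rate_hz)
```

Because the two clocks are locked, the ratio is an exact rational number: 1125/2, or 562.5 system clock cycles per audio sample. A hypothetical 44.1 kHz digital input would need conversion by a ratio of 160/147.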
In general, input signals should be quantized to at least 16-bit resolution. The audio-compression
system can convey audio signals with up to 24-bit resolution.
7.3.3b Output Signal Specification
Conceptually, the output of the audio encoder is an elementary stream that is formed into PES
packets within the transport subsystem [4]. It is possible that digital television systems will be
implemented wherein the formation of audio PES packets takes place within the audio encoder.
In this case, the output of the audio encoder would be PES packets. Physical interfaces for these
outputs (elementary streams and/or PES packets) may be defined as voluntary industry standards
by SMPTE or other organizations; they are not, however, specified in the core ATSC standard.