
7-46 Compression Technologies for Audio

nal strength weakened, a step-by-step reduction in the picture signal-to-noise ratio would occur in a way similar to that experienced in broadcast analog signals today. Viewers with poor reception, therefore, would experience a more graceful degradation in picture quality instead of a sudden dropout.

7.2.3 References

1. ATSC, “Guide to the Use of the ATSC Digital Television Standard,” Advanced Television Systems Committee, Washington, D.C., doc. A/54, Oct. 4, 1995.

2. “IEEE Standard Specifications for the Implementation of 8 × 8 Inverse Discrete Cosine Transform,” std. 1180-1990, Dec. 6, 1990.

Figure 7.2.4 ATSC DTV video system decoder functional block diagram. (From [1]. Used with permission.)

Downloaded from Digital Engineering Library @ McGraw-Hill (www.digitalengineeringlibrary.com)

Copyright © 2004 The McGraw-Hill Companies. All rights reserved.

Any use is subject to the Terms of Use as given at the website.

ATSC DTV System Compression Issues



Chapter 7.3

DTV Audio Encoding and Decoding

Jerry C. Whitaker, Editor-in-Chief

7.3.1 Introduction

Monophonic sound is the simplest form of aural communication. A wide range of acceptable listening positions is practical, although it is obvious from most positions that the sound is originating from one source rather than occurring in the presence of the listener. Consumers have accepted this limitation without much thought in the past because it was all that was available. However, monophonic sound creates a poor illusion of the sound field that the program producer might want to create.

Two-channel stereo improves the illusion that the sound is originating in the immediate area of the reproducing system. The acceptable listening area is smaller, however: it is difficult to keep the sound image centered between the left and right speakers, so that the sound and the action stay together, as the listener moves in the room.

The AC-3 surround sound system is said to have 5.1 channels because there are left, right, center, left surround, and right surround channels, which make up the 5 channels. A sixth channel is reserved for the lower frequencies and consumes only 120 Hz of the bandwidth; it is referred to as the 0.1 or low-frequency enhancement (LFE) channel. The center channel restores the variety of listening positions possible with monophonic sound.

The AC-3 system is effective in providing either an enveloping (ambient) sound field or allowing precise placement and movement of special effects because of the channel separation afforded by the multiple speakers in the system.

For efficient and reliable interconnection of audio devices, standardization of the interface parameters is of critical importance. The primary interconnection scheme for professional digital audio systems is AES Audio.

7.3.2 AES Audio

Source: Standard Handbook of Audio and Radio Engineering

AES audio is a standard defined by the Audio Engineering Society and the European Broadcasting Union. Each AES stream carries two audio channels, which can be either a stereo pair or two independent feeds. The signals are pulse code modulated (PCM) data streams carrying digitized audio. Each sample is quantized to 20 or 24 bits, creating an audio sample word. Each word is then formatted to form a subframe, which is multiplexed with other subframes to form the AES digital audio stream. The AES stream can then be serialized and transmitted over coaxial or twisted-pair cable. The sampling rates supported range from 32 to 50 kHz. Common rates and applications include the following:

32 kHz—used for radio broadcast links

44.1 kHz—used for CD players

48 kHz—used for professional recording and production

Although 18-bit sampling was commonly used in the past, 20 bits has become prevalent today.

At 24 bits/sample, the theoretical S/N is 146 dB. This level of performance is generally reserved for high-end applications such as film recording and CD mastering. Table 7.3.1 lists the theoretical S/N ratios as a function of sampling bits for audio A/D conversion.

Of particular importance is that the AES format is designed to be independent of the audio conversion sample rate. The net data rate is exactly 64 times the sample rate, which is generally 48 kHz for professional applications. Thus, the most frequently encountered bit rate for AES3 data is 3.072 Mbits/s.
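
The arithmetic above can be checked with a minimal Python sketch. It assumes only the relationship stated in the text (net data rate = 64 × sample rate); the function name `aes3_net_data_rate` is illustrative and does not come from the standard.

```python
def aes3_net_data_rate(sample_rate_hz: int) -> int:
    """Return the AES3 net data rate in bits/s for a given sample rate.

    Uses the relationship given in the text: the net data rate is
    exactly 64 times the sample rate.
    """
    bits_per_sample_period = 64
    return bits_per_sample_period * sample_rate_hz


# Common sample rates listed in the text.
for rate_hz in (32_000, 44_100, 48_000):
    mbits = aes3_net_data_rate(rate_hz) / 1e6
    print(f"{rate_hz / 1000:g} kHz -> {mbits:.4f} Mbits/s")
```

For the 48 kHz professional rate this reproduces the 3.072 Mbits/s figure quoted above.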

The AES3-1992 standard document precisely defines the AES3 twisted pair interconnection scheme. The signal, which is transmitted on twisted pair copper cable in a balanced format, is biphase coded. Primary signal parameters include the following:

Output level can range from 2 to 10 V p-p

Source impedance 110 Ω

Receiver sensitivity 200 mV minimum

Input impedance is recommended to be 110 Ω

Interconnecting cable characteristic impedance 110 Ω

Electrical interface guidelines also have been set by the SMPTE and AES3 committees to permit transmission of AES3 data on coaxial cable. This single-ended interface is known as AES3-ID. The signal level, when terminated with 75 Ω, is 1 V p-p, ±20 percent. The source impedance is 75 Ω.

AES3 is inherently synchronous. A master local digital audio reference is normally used so that all audio equipment will be frequency- and phase-locked. The master reference can originate from the digital audio equipment in a single room or an external master system providing a reference signal for larger facilities.

Table 7.3.1 Theoretical S/N as a Function of the Number of Sampling Bits

Number of Sampling Bits    Resolution (number of quantizing steps)    Maximum Theoretical S/N
18                         262,144                                    110 dB
20                         1,048,576                                  122 dB
24                         16,777,216                                 146 dB
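
The table values are consistent with the widely used rule for an ideal N-bit quantizer driven by a full-scale sine wave, S/N ≈ 6.02N + 1.76 dB. The source does not state which formula was used, so the sketch below is an assumption that happens to reproduce the table; the function names are illustrative.

```python
import math


def quantizing_steps(bits: int) -> int:
    """Number of quantizing steps available with the given word length."""
    return 2 ** bits


def theoretical_snr_db(bits: int) -> float:
    """Ideal quantizer S/N in dB for a full-scale sine wave (assumed model).

    20*log10(2**bits) is about 6.02 dB per bit; 10*log10(1.5) is the
    roughly 1.76 dB term from the sine-to-quantization-noise power ratio.
    """
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (18, 20, 24):
    steps = quantizing_steps(bits)
    snr = theoretical_snr_db(bits)
    print(f"{bits} bits: {steps:,} steps, about {snr:.1f} dB")
```

Rounded to the nearest decibel, the results match the 110, 122, and 146 dB entries in Table 7.3.1.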


7.3.2a AES3 Data Format

The basic format structure of the AES data frames is shown in Figure 7.3.1. Each sample is carried by a subframe containing the following elements [1]:

20 bits of sample data

4 bits of auxiliary data, which may be used to extend the sample to 24 bits

4 additional bits of data

A preamble

Two subframes make up a frame, which contains one sample from each of the two channels. Frames are further grouped into 192-frame blocks, which define the limits of user data and channel status data blocks. A special preamble indicates the channel identity for each sample (X or Y preamble) and the start of a 192-frame block (Z preamble). To minimize the direct current (dc) component on the transmission line, facilitate clock recovery, and make the interface polarity insensitive, the data is channel coded in the biphase-mark mode.
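
The properties claimed for biphase-mark coding can be illustrated with a short Python sketch. It models the usual definition of the code (a level transition at the start of every bit cell, plus an extra mid-cell transition for a 1); the function names and the half-cell list representation are illustrative assumptions, and preambles are not modeled.

```python
def biphase_mark_encode(bits, start_level=0):
    """Encode data bits as line levels (0/1), two half-cells per bit."""
    level = start_level
    half_cells = []
    for bit in bits:
        level ^= 1                  # transition at the start of every bit cell
        half_cells.append(level)
        if bit:
            level ^= 1              # extra mid-cell transition marks a 1
        half_cells.append(level)
    return half_cells


def biphase_mark_decode(half_cells):
    """A bit is 1 when its two half-cells differ, 0 when they are equal."""
    return [int(half_cells[i] != half_cells[i + 1])
            for i in range(0, len(half_cells), 2)]


def longest_run(levels):
    """Length of the longest run of identical consecutive levels."""
    longest = run = 1
    for prev, cur in zip(levels, levels[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


data = [1, 0, 0, 1, 1, 0, 1, 0]
line = biphase_mark_encode(data)
inverted = [1 - level for level in line]    # swapped conductors

print(biphase_mark_decode(line) == data)      # decodes correctly
print(biphase_mark_decode(inverted) == data)  # polarity does not matter
print(longest_run(line))                      # never more than 2 half-cells
```

Because the information is carried by transitions rather than absolute levels, inverting the signal leaves the decoded data unchanged, and the guaranteed transition in every bit cell keeps runs short, which helps clock recovery and limits the dc component.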

The preambles specifically violate the biphase-mark rules for easy recognition and to ensure synchronization. When digital audio is embedded in the serial digital video data stream, the start of the 192-frame block is indicated by the Z bit, which corresponds to the occurrence of the Z-type preamble.

The validity bit indicates whether the audio sample bits in the subframe are suitable for conversion to an analog audio signal. User data is provided to carry other information, such as time code. Channel status data contains information associated with each audio channel.
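
As a rough illustration of the subframe elements described above, the following Python sketch packs one 24-bit sample word together with the validity, user, and channel status bits. The text only says "4 additional bits"; the slot ordering used here (sample least significant bit first, then validity, user, channel status, and a final even-parity bit) reflects the commonly documented AES3 layout and should be treated as an assumption to verify against the standard. The preamble is omitted, and the function names are hypothetical.

```python
def pack_subframe(sample_24bit, validity=0, user=0, channel_status=0):
    """Return the 28 payload bits of one subframe (preamble excluded).

    Assumed layout: 24 sample bits, least significant bit first (the
    4 lowest bits occupy the auxiliary positions), followed by the
    validity, user, and channel status bits, then an even-parity bit.
    """
    if not 0 <= sample_24bit < (1 << 24):
        raise ValueError("sample must be an unsigned 24-bit value")
    bits = [(sample_24bit >> i) & 1 for i in range(24)]
    bits += [validity & 1, user & 1, channel_status & 1]
    bits.append(sum(bits) & 1)      # parity bit makes the count of ones even
    return bits


def unpack_subframe(bits):
    """Check parity and recover the fields packed by pack_subframe."""
    if len(bits) != 28:
        raise ValueError("expected 28 payload bits")
    if sum(bits) % 2 != 0:
        raise ValueError("parity error")
    sample = sum(bit << i for i, bit in enumerate(bits[:24]))
    return {
        "sample": sample,
        "validity": bits[24],
        "user": bits[25],
        "channel_status": bits[26],
    }


payload = pack_subframe(0xABCDE5, user=1)
print(len(payload), unpack_subframe(payload))
```

Across a 192-frame block, the channel status bit of each subframe contributes one bit to that channel's 192-bit channel status block.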

There are three levels of implementation of the channel status data: minimum, standard, and enhanced. The standard implementation is recommended for use in professional video applications; the channel status data typically contains information about signal emphasis, sampling frequency, channel mode (stereo, mono, etc.), use of auxiliary bits (extend audio data to 24 bits or other use), and a CRC for error checking of the total channel status block.

Figure 7.3.1 AES audio data format structure. (From [1]. Used with permission.)


7.3.2b SMPTE 324M

SMPTE 324M (proposed at this writing) defines a synchronous, self-clocking serial interface for up to 12 channels of linearly encoded audio and auxiliary data [2]. The interface is designed to allow multiplexing of six two-channel streams compliant with AES3. Audio sampled at 48 kHz and clock-locked to video is the preferred implementation for studio applications. However, the 324M interface supports any frequency of operation supported by AES3, provided that all the audio channels are sampled by a common clock. Ideally, all the channels should be audio synchronous for guaranteed audio phase coherence. An audio channel is defined as being synchronous with another when the two channels are running from the same clock and the analog inputs are concurrently sampled.

The 324M standard is intended to provide a reliable method of distributing multiple cophased channels of digital audio around the studio without losing the initial relative sample-phase relationship. A mechanism is provided to allow more than one 12-channel stream to be realigned after a relative misalignment of up to ±8 samples.

The interface, intended to be compatible with the complete range of digital television scanning standards and standard film rates, can be used for distribution of multiple channels of audio in either a pre-mix or post-mix situation. In the post-mix case, channel assignment is defined in SMPTE 320M.

7.3.2c Audio Compression

Efficient recording and/or transmission of digital audio signals demands a reduction in the amount of information required to represent the aural signal [3]. The amount of digital information needed to accurately reproduce the original PCM samples taken of an analog input may be reduced by applying a digital compression algorithm, resulting in a digitally compressed representation of the original signal. (In this context, the term compression applies to the digital information that must be stored or recorded, not to the dynamic range of the audio signal.) The goal of any digital compression algorithm is to produce a digital representation of an audio signal which, when decoded and reproduced, sounds the same as the original signal, while using a minimum amount of digital information (bit rate) for the compressed (or encoded) representation. The AC-3 digital compression algorithm specified in the ATSC DTV system can encode from 1 to 5.1 channels of source audio from a PCM representation into a serial bit stream at data rates ranging from 32 to 640 kbits/s.

A typical application of the bit-reduction algorithm is shown in Figure 7.3.2. In this example, a 5.1 channel audio program is converted from a PCM representation requiring more than 5 Mbits/s (6 channels × 48 kHz × 18 bits = 5.184 Mbits/s) into a 384 kbits/s serial bit stream by the AC-3 encoder. Radio frequency (RF) transmission equipment converts this bit stream into a modulated waveform that is applied to a satellite transponder. The amount of bandwidth and power thus required by the transmission has been reduced by more than a factor of 13 by the AC-3 digital compression system. The received signal is demodulated back into the 384 kbits/s serial bit stream, and decoded by the AC-3 decoder. The result is the original 5.1 channel audio program.
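
The bit-rate figures in this example can be reproduced with a few lines of Python. The sketch uses only the numbers given in the text; the function name `pcm_bit_rate` is illustrative.

```python
def pcm_bit_rate(channels: int, sample_rate_hz: int, bits_per_sample: int) -> int:
    """Uncompressed PCM bit rate in bits/s."""
    return channels * sample_rate_hz * bits_per_sample


# Values from the example: 5.1 channels counted as 6 PCM channels,
# 48 kHz sampling, 18-bit samples, and a 384 kbits/s AC-3 bit stream.
pcm_rate = pcm_bit_rate(channels=6, sample_rate_hz=48_000, bits_per_sample=18)
ac3_rate = 384_000
reduction = pcm_rate / ac3_rate

print(f"PCM rate:  {pcm_rate / 1e6:.3f} Mbits/s")
print(f"AC-3 rate: {ac3_rate / 1e3:.0f} kbits/s")
print(f"Reduction: {reduction:.1f} to 1")
```

The computed ratio of 13.5 to 1 is the basis for the statement that the required bandwidth and power are reduced by more than a factor of 13.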

Digital compression of audio is useful wherever there is an economic benefit to be obtained by reducing the amount of digital information required to represent the audio signal. Typical applications include the following:
