Sound in the Time Domain

Representation and behaviour of sound waves

The physical (acoustical) definition of sound is a variation in pressure and density caused by the propagation of waves through a medium. Between about 20 Hz and 20 kHz, the human hearing system senses these waves as they cause the eardrum to move. This mechanical movement is transduced in the cochlea into electrochemical signals (nerve impulses), which are sent to the auditory region of the brain for analysis. Sound waves, being variations in air pressure over time, may be represented as a varying voltage or as a stream of data over time. This is a 'time/amplitude' representation of sound, also known as the time-domain representation. The amplitude represents the displacement of air molecules caused by the changes in air pressure. In the digital domain, amplitude is typically represented as a value between -1 and 1, where 1 and -1 represent the maximum positive and negative amplitudes of the signal and 0 represents zero amplitude.
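As a minimal sketch of the normalised digital representation described above, the following Python snippet maps integer samples onto the range -1 to 1. It assumes signed 16-bit samples, which is a common storage format but is not something the text specifies; the function name `pcm16_to_float` is hypothetical.

```python
def pcm16_to_float(samples):
    """Map signed 16-bit integer samples onto the range [-1.0, 1.0)."""
    # Assumption: samples are signed 16-bit integers (-32768..32767).
    # 32768 = 2**15 is the magnitude of the most negative 16-bit value,
    # so dividing by it keeps every result inside [-1.0, 1.0).
    return [s / 32768.0 for s in samples]


if __name__ == "__main__":
    # Most negative value, silence, and half of positive full scale.
    print(pcm16_to_float([-32768, 0, 16384]))
```

Dividing by 32768 rather than 32767 means the largest positive sample lands just below 1; other conventions exist, and which one is appropriate depends on the audio system in use.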

**Figure 1.1**. A simple sinusoidal waveform represented as varying amplitude over time.

The waveform in Fig. 1.1 is called a sine wave or sinusoid. Sine waves can be considered the fundamental building blocks of sound. The figure demonstrates that the amplitude varies over time, but that the pattern of variation repeats periodically.

**Figure 1.2**. A more complex waveform.

The waveform in Fig. 1.2 is more complicated than the sinusoid in Fig. 1.1. There are peaks and troughs of different amplitudes and, although the pattern does repeat itself over time (see if you can find it), it is harder to spot. Just as a sine wave behaves simply and sounds simple, this sound behaves with greater complexity and sounds more complex. Detailed, complex sounds that change over time often have no discernible features when viewed this closely: there may be no repeating pattern or behaviour that tells us something about the sound.

**Figure 1.3**. A time-domain plot of a drum kit over 2 seconds.

In Fig. 1.3 we are given a look at a sound over the course of about 2 seconds rather than 2 milliseconds. From this perspective, we can see how the overall amplitude of the sound changes over time; in particular, the parts with high amplitude can easily be identified as drum hits: they appear suddenly and drop in amplitude very quickly, as one would expect from striking a drum head. It would have been very difficult to tell what kind of instrument was being played had this sound been viewed over a range of only a few milliseconds. We should conclude that short and long time intervals show different types of information, and that selecting the right perspective to suit one's needs is important.

Sine waves, frequency and pitch

As indicated in Fig. 1.1, the sine wave has a periodic form that repeats every $T\,$ seconds; $T\,$ is known as the period, and one repetition of the pattern is called a cycle. The wave also has a maximum positive amplitude, $A\,$, and a maximum negative amplitude, $-A\,$. The frequency, $f\,$, of a sine wave is the number of cycles per second and is measured in hertz (Hz). We can obtain the frequency from the period using the following equation:

$$f={\frac {1}{T}}$$

Furthermore, we can express a sine wave with the following mathematical form (with angles in radians). This form may be useful to programmers interested in creating their own controllable sine functions in code:

$$p(t)=A\sin\left({\frac {2\pi t}{T}}\right)=A\sin(2\pi ft)$$
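The sine expression above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `sine_wave`, its parameter names, and the default sample rate of 44100 samples per second are choices made for this example and do not come from the text.

```python
import math


def sine_wave(frequency, amplitude=1.0, duration=1.0, sample_rate=44100):
    """Return samples of p(t) = A * sin(2 * pi * f * t), with t in seconds.

    sample_rate (samples per second) is an assumed parameter; the formula
    itself is continuous in t and is simply evaluated at t = n / sample_rate.
    """
    n_samples = int(round(duration * sample_rate))
    return [
        amplitude * math.sin(2.0 * math.pi * frequency * (n / sample_rate))
        for n in range(n_samples)
    ]


if __name__ == "__main__":
    samples = sine_wave(frequency=440.0, amplitude=0.5, duration=0.01)
    period = 1.0 / 440.0  # T = 1 / f, in seconds
    print(len(samples), "samples; period of", period, "seconds")
```

Because $f = 1/T$, passing `frequency` is equivalent to passing the period; a variant that takes `T` directly would evaluate `math.sin(2.0 * math.pi * t / T)` instead.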

Psychoacoustically, higher frequencies (e.g. above 1.5 kHz) are often associated with words such as 'brightness', whereas lower frequencies (e.g. below 200 Hz) are often associated with 'depth' or 'bass'. The intermediate range may be associated with the term 'warmth'. For example, an instrument such as an electric guitar played clean may be called 'bright' or 'sharp', whereas an acoustic double bass may be referred to as 'dark' and 'warm'. Being psychoacoustic, terms like these are not objective quantities that we can measure precisely, but they are often used to describe the timbre, or tone colour, of a particular sound. The amplitudes of the various frequencies present in a sound, and their evolution over time, are the major factors associated with timbre, and there are infinite shades of timbre that can be achieved through combinations of the different frequencies that make up a sound. In psychoacoustic terms, human hearing associates frequency with pitch, and particular frequencies correspond to particular notes in the standard Western scale:

| Wavelength (cm) | Frequency (Hz) | Note name |
|---|---|---|
| 156.82 | 220.0 | A3 |
| 139.71 | 246.94 | B3 |
| 131.87 | 261.63 | C4 |
| 117.48 | 293.66 | D4 |
| 104.66 | 329.63 | E4 |
| 98.79 | 349.23 | F4 |
| 88.01 | 392.0 | G4 |
| 78.41 | 440.0 | A4 |

**Figure 1.4**. The relationship between wavelength, frequency and note name.

Note that this table covers the range of one octave, from A3 to A4. Over that range the frequency doubles and the wavelength halves.
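The values in the table can be reproduced with a short Python sketch. Two assumptions are made here that the text does not state. First, wavelength is taken to be the speed of sound divided by frequency, with a speed of 345 m/s, which is the value the tabulated wavelengths are consistent with. Second, the note frequencies are taken to follow twelve-tone equal temperament with A4 tuned to 440 Hz. All function and constant names are hypothetical.

```python
SPEED_OF_SOUND = 345.0  # metres per second; assumed, inferred from the table


def wavelength_cm(frequency_hz, speed=SPEED_OF_SOUND):
    """Wavelength in centimetres: speed of sound divided by frequency."""
    return 100.0 * speed / frequency_hz


def equal_tempered_frequency(semitones_from_a4, a4=440.0):
    """Frequency of the note a given number of semitones above (+) or
    below (-) A4, assuming twelve-tone equal temperament."""
    return a4 * 2.0 ** (semitones_from_a4 / 12.0)


if __name__ == "__main__":
    # Semitone offsets from A4 for the notes listed in the table.
    notes = [("A3", -12), ("B3", -10), ("C4", -9), ("D4", -7),
             ("E4", -5), ("F4", -4), ("G4", -2), ("A4", 0)]
    for name, offset in notes:
        f = equal_tempered_frequency(offset)
        print(f"{wavelength_cm(f):7.2f} cm  {f:7.2f} Hz  {name}")
```

Moving down twelve semitones divides the frequency by two, which is the octave relationship between A4 and A3 noted above.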

Construction and deconstruction of sinusoids

Using Fourier analysis, sine waves can be considered the fundamental components of sound, since a single sine wave contains a single frequency. In Fourier analysis, combining sinusoids of different frequency, amplitude and phase can recreate the frequency spectrum of any sound. Conversely, complex sounds may be analyzed in terms of their component frequencies, amplitudes and phases.

Fig. 1.5 demonstrates the appearance of two sine waves summed together. The characteristics of both waves are combined in the resultant waveform. This technique is the basis of additive synthesis, which is covered later in the book. Furthermore, because of the way the sound is constructed, it is possible to filter out the two component frequencies from the whole; this is typically done by analysing the waveform in the frequency domain, which is covered in the subsequent chapter.
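The summation of sine waves described above can be sketched in Python as follows. The function names, the sample rate of 8000 samples per second, and the example frequencies are assumptions made for illustration only. The second function is a simple correlation against a reference sine, used here as a stand-in for proper frequency-domain analysis; it recovers the amplitude only when the signal spans a whole number of cycles of the probed frequency and the component has zero phase.

```python
import math


def sum_of_sines(components, duration=1.0, sample_rate=8000):
    """Sum sinusoids given as (frequency_hz, amplitude, phase_radians)."""
    n_samples = int(round(duration * sample_rate))
    signal = []
    for n in range(n_samples):
        t = n / sample_rate  # time of this sample in seconds
        signal.append(sum(a * math.sin(2.0 * math.pi * f * t + phase)
                          for f, a, phase in components))
    return signal


def sine_component_amplitude(signal, frequency, sample_rate=8000):
    """Estimate the amplitude of a zero-phase sine component by correlating
    the signal with a reference sine of the same frequency.

    For a component with non-zero phase this returns amplitude * cos(phase).
    """
    acc = sum(x * math.sin(2.0 * math.pi * frequency * i / sample_rate)
              for i, x in enumerate(signal))
    return 2.0 * acc / len(signal)


if __name__ == "__main__":
    # Two zero-phase sine waves with different frequencies and amplitudes.
    mix = sum_of_sines([(220.0, 1.0, 0.0), (330.0, 0.5, 0.0)])
    for probe in (220.0, 330.0, 440.0):
        print(probe, "Hz:", round(sine_component_amplitude(mix, probe), 6))
```

Probing at a frequency that is not present in the mixture gives a value close to zero, which is a small-scale version of separating the component frequencies from the whole.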