Sound can be recorded, stored, and played back using either digital or analog techniques. Both techniques introduce errors and distortions in the sound, and the two methods can be systematically compared. Musicians and listeners have argued over the superiority of digital versus analog sound recordings. Arguments for analog systems include the absence of fundamental error mechanisms which are present in digital audio systems, including aliasing and quantization noise. Advocates of digital point to the high levels of performance possible with digital audio, including excellent linearity in the audible band and low levels of noise and distortion.
Two prominent differences in performance between the two methods are the bandwidth and the signal-to-noise ratio (S/N ratio). The bandwidth of the digital system is determined, according to the Nyquist frequency, by the sample rate used. The bandwidth of an analog system is dependent on the physical and electronic capabilities of the analog circuits. The S/N ratio of a digital system may be limited by the bit depth of the digitization process, but the electronic implementation of conversion circuits introduces additional noise. In an analog system, other natural analog noise sources exist, such as flicker noise and imperfections in the recording medium. Other performance differences are specific to the systems under comparison, such as the ability for more transparent filtering algorithms in digital systems and the harmonic saturation and speed variations of analog systems.
The dynamic range of an audio system is a measure of the difference between the smallest and largest amplitude values that can be represented in a medium. Digital and analog differ in both the methods of transfer and storage, as well as the behavior exhibited by the systems due to these methods.
The dynamic range of digital audio systems can exceed that of analog audio systems. Consumer analog cassette tapes have a dynamic range of 60 to 70 dB. Analog FM broadcasts rarely have a dynamic range exceeding 50 dB. The dynamic range of a direct-cut vinyl record may surpass 70 dB. Analog studio master tapes can have a dynamic range of up to 77 dB. An LP made out of perfect diamond has an atomic feature size of about 0.5 nanometer, which, with a groove size of 8 microns, yields a theoretical dynamic range of 110 dB. An LP made out of perfect vinyl would have a theoretical dynamic range of 70 dB. Measurements indicate maximum actual performance in the 60 to 70 dB range. Typically, a 16-bit analog-to-digital converter may have a dynamic range of between 90 and 95 dB, whereas the signal-to-noise ratio (roughly the equivalent of dynamic range, noting the absence of quantization noise but presence of tape hiss) of a professional reel-to-reel ¼-inch tape recorder would be between 60 and 70 dB at the recorder's rated output.
The benefits of using digital recorders with greater than 16-bit accuracy can be applied to the 16 bits of audio CD. Stuart stresses that with the correct dither, the resolution of a digital system is theoretically infinite, and that it is possible, for example, to resolve sounds at −110 dB (below digital full-scale) in a well-designed 16-bit channel.
There are some differences in the behaviour of analog and digital systems when high level signals are present, where there is the possibility that such signals could push the system into overload. With high level signals, analog magnetic tape approaches saturation, and high frequency response drops in proportion to low frequency response. While undesirable, the audible effect of this can be reasonably unobjectionable. In contrast, digital PCM recorders show non-benign behaviour in overload; samples that exceed the peak quantization level are simply truncated, clipping the waveform squarely, which introduces distortion in the form of large quantities of higher-frequency harmonics. In principle, PCM digital systems have the lowest level of nonlinear distortion at full signal amplitude. The opposite is usually true of analog systems, where distortion tends to increase at high signal levels. A study by Manson (1980) considered the requirements of a digital audio system for high quality broadcasting. It concluded that a 16-bit system would be sufficient, but noted the small reserve the system provided in ordinary operating conditions. For this reason, it was suggested that a fast-acting signal limiter or 'soft clipper' be used to prevent the system from becoming overloaded.
With many recordings, high level distortions at signal peaks may be audibly masked by the original signal, thus large amounts of distortion may be acceptable at peak signal levels. The difference between analog and digital systems is the form of high-level signal error. Some early analog-to-digital converters displayed non-benign behaviour when in overload, where the overloading signals were 'wrapped' from positive to negative full-scale. Modern converter designs based on sigma-delta modulation may become unstable in overload conditions. It is usually a design goal of digital systems to limit high-level signals to prevent overload. To prevent overload, a modern digital system may compress input signals so that digital full-scale cannot be reached.
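The three overload behaviours described above (hard clipping, wrap-around, and soft limiting) can be illustrated with a minimal Python sketch for 16-bit samples. The function names are hypothetical, and the tanh curve is only one assumed shape for a 'soft clipper'; it is not taken from Manson's study or from any particular converter design.

```python
import math

FULL_SCALE = 32767    # largest positive 16-bit PCM value
MIN_VALUE = -32768    # most negative 16-bit PCM value


def hard_clip(value: int) -> int:
    """Truncate an out-of-range sample to the nearest representable level."""
    return max(MIN_VALUE, min(FULL_SCALE, value))


def wrap_around(value: int) -> int:
    """Two's-complement overflow: values past full scale reappear with opposite sign."""
    return (value + 32768) % 65536 - 32768


def soft_clip(value: float) -> int:
    """Assumed soft limiter: a tanh curve that approaches full scale smoothly."""
    return round(FULL_SCALE * math.tanh(value / FULL_SCALE))


if __name__ == "__main__":
    for v in (20000, 40000, 70000):
        print(v, hard_clip(v), wrap_around(v), soft_clip(v))
```

For an input of 40000, hard clipping holds the sample at 32767, while wrap-around turns it into −25536, a large error of the opposite sign, which is why wrapping was considered especially non-benign.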
Unlike analog duplication, digital copies are exact replicas that can be duplicated indefinitely and without generation loss, in principle. Error correction allows digital formats to tolerate significant media deterioration though digital media is not immune to data loss. Consumer CD-R compact discs have a limited and variable lifespan due to both inherent and manufacturing quality issues.
With vinyl records, there will be some loss in fidelity on each playing of the disc. This is due to the wear of the stylus in contact with the record surface. Magnetic tapes, both analog and digital, wear from friction between the tape and the heads, guides, and other parts of the tape transport as the tape slides over them. The brown residue deposited on swabs during cleaning of a tape machine's tape path is actually particles of magnetic coating shed from tapes. Sticky-shed syndrome is a prevalent problem with older tapes. Tapes can also suffer creasing, stretching, and frilling of the edges of the plastic tape base, particularly from low-quality or out-of-alignment tape decks.
When a CD is played, there is no physical contact involved as the data is read optically using a laser beam. Therefore, no such media deterioration takes place, and the CD will, with proper care, sound exactly the same every time it is played (discounting aging of the player and CD itself); however, this is a benefit of the optical system, not of digital recording, and the Laserdisc format enjoys the same non-contact benefit with analog optical signals. CDs suffer from disc rot and slowly degrade with time, even if they are stored properly and not played. M-DISC, a recordable optical technology which markets itself as remaining readable for 1,000 years, is available in certain markets, but as of late 2020 has never been sold in the CD-R format. (Sound could, however, be stored on an M-DISC DVD-R using the DVD-Audio format.)
For electronic audio signals, sources of noise include mechanical, electrical and thermal noise in the recording and playback cycle. The amount of noise that a piece of audio equipment adds to the original signal can be quantified. Mathematically, this can be expressed by means of the signal-to-noise ratio (SNR or S/N ratio). Sometimes the maximum possible dynamic range of the system is quoted instead.
With digital systems, the quality of reproduction depends on the analog-to-digital and digital-to-analog conversion steps, and does not depend on the quality of the recording medium, provided it is adequate to retain the digital values without error. Digital media capable of bit-perfect storage and retrieval have been commonplace for some time, since they were generally developed for software storage which has no tolerance for error.
The process of analog-to-digital conversion will, according to theory, always introduce quantization distortion. This distortion can be rendered as uncorrelated quantization noise through the use of dither. The magnitude of this noise or distortion is determined by the number of quantization levels. In binary systems this is determined by and typically stated in terms of the number of bits. Each additional bit adds approximately 6 dB in possible SNR, e.g. 24 × 6 = 144 dB for 24-bit quantization, 126 dB for 21-bit, and 120 dB for 20-bit. The 16-bit digital system of Red Book audio CD has 2^16 = 65,536 possible signal amplitudes, theoretically allowing for an SNR of 98 dB.
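The two figures quoted above (roughly 6 dB per bit, and 98 dB for 16 bits) can be reconciled with a short Python sketch. It assumes the standard textbook result for an ideal quantizer driven by a full-scale sine wave, SNR ≈ 6.02 N + 1.76 dB; the function names are illustrative only.

```python
import math


def rule_of_thumb_snr_db(bits: int) -> float:
    """The '6 dB per bit' approximation used in the text."""
    return 6.0 * bits


def ideal_sine_snr_db(bits: int) -> float:
    """Ideal quantizer, full-scale sine: 20*log10(2^N) + 10*log10(1.5)."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


if __name__ == "__main__":
    for bits in (16, 20, 21, 24):
        print(f"{bits:2d} bits: ~{rule_of_thumb_snr_db(bits):.0f} dB (rule of thumb), "
              f"{ideal_sine_snr_db(bits):.1f} dB (full-scale sine)")
```

The extra 1.76 dB comes from comparing the power of a sine wave with the power of uniformly distributed quantization error, which is why the 16-bit figure is 98 dB rather than 96 dB.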
Rumble is a form of noise caused by imperfections in the bearings of turntables. The platter tends to have a slight amount of motion besides the desired rotation, and the turntable surface also moves up, down and side-to-side slightly. This additional motion is added to the desired signal as noise, usually of very low frequencies, creating a rumbling sound during quiet passages. Very inexpensive turntables sometimes used ball bearings, which are very likely to generate audible amounts of rumble. More expensive turntables tend to use massive sleeve bearings, which are much less likely to generate offensive amounts of rumble. Increased turntable mass also tends to lead to reduced rumble. A good turntable should have rumble at least 60 dB below the specified output level from the pick-up. Because they have no moving parts in the signal path, digital systems are not subject to rumble.
Wow and flutter are a change in frequency of an analog device and are the result of mechanical imperfections. Wow is a form of flutter that occurs at a slower rate. Wow and flutter are most noticeable on signals which contain pure tones. For LP records, the quality of the turntable will have a large effect on the level of wow and flutter. A good turntable will have wow and flutter values of less than 0.05%, which is the speed variation from the mean value. Wow and flutter can also be present in the recording, as a result of the imperfect operation of the recorder. Owing to their use of precision crystal oscillators for their timebase, digital systems are not subject to wow and flutter.
For digital systems, the upper limit of the frequency response is determined by the sampling frequency. The choice of sampling frequency in a digital system is based on the Nyquist–Shannon sampling theorem. This states that a sampled signal can be reproduced exactly as long as it is sampled at a frequency greater than twice the bandwidth of the signal, the Nyquist rate. Therefore, a sampling frequency just above 40 kHz is mathematically sufficient to capture all the information contained in a signal having frequency components up to 20 kHz. The sampling theorem also requires that frequency content above the Nyquist frequency (half the sampling frequency) be removed from the signal before sampling it. This is accomplished using anti-aliasing filters which require a transition band to sufficiently reduce aliasing. The bandwidth provided by the 44,100 Hz sampling frequency used by the standard for audio CDs is sufficiently wide to cover the entire human hearing range, which roughly extends from 20 Hz to 20 kHz. Professional digital recorders may record higher frequencies, while some consumer and telecommunications systems record a more restricted frequency range.
Some analog tape manufacturers specify frequency responses up to 20 kHz, but these measurements may have been made at lower signal levels. Compact Cassettes may have a response extending up to 15 kHz at full (0 dB) recording level. At lower levels (−10 dB), cassettes are typically limited to 20 kHz due to self-erasure of the tape media.
The frequency response for a conventional LP player might be 20 Hz to 20 kHz, ±3 dB. The low-frequency response of vinyl records is restricted by rumble noise (described above), as well as the physical and electrical characteristics of the entire pickup arm and transducer assembly. The high-frequency response of vinyl depends on the cartridge. CD4 records contained frequencies up to 50 kHz. Frequencies of up to 122 kHz have been experimentally cut on LP records.
Digital systems require that all high-frequency signal content above the Nyquist frequency be removed prior to sampling. If this is not done, these ultrasonic frequencies "fold over" into frequencies in the audible range, producing a kind of distortion called aliasing. Aliasing is prevented in digital systems by an anti-aliasing filter. However, designing an analog filter that precisely removes all frequency content above or below a certain cutoff frequency is impractical. Instead, a sample rate is usually chosen which is above the Nyquist requirement. This solution is called oversampling, and allows a less aggressive and lower-cost anti-aliasing filter to be used.
Early digital systems may have suffered from a number of signal degradations related to the use of analog anti-aliasing filters, e.g., time dispersion, nonlinear distortion, ripple, temperature dependence of filters etc. Using an oversampling design and delta-sigma modulation, the less aggressive analog anti-aliasing filter can be supplemented by a digital filter. This approach has several advantages. The digital filter can be made to have a near-ideal transfer function, with low in-band ripple, and no aging or thermal drift.
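The "folding" of unfiltered ultrasonic content into the audible band can be computed directly. The following Python sketch assumes an ideal sampler with no anti-aliasing filter at all; the function name is illustrative.

```python
def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after sampling without any filtering.

    The spectrum repeats every sample_rate_hz and is mirrored about the
    Nyquist frequency (sample_rate_hz / 2), so the tone lands in 0..Nyquist.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    fs = 44100.0
    for tone in (10000.0, 25000.0, 30000.0):
        print(f"{tone:.0f} Hz sampled at {fs:.0f} Hz appears at "
              f"{alias_frequency(tone, fs):.0f} Hz")
```

A 30 kHz tone sampled at 44.1 kHz would appear at 14.1 kHz, well inside the audible range, whereas a 10 kHz tone is unaffected.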
Analog systems are not subject to a Nyquist limit or aliasing and thus do not require anti-aliasing filters or any of the design considerations associated with them. Instead, the limits of analog storage formats are determined by the physical properties of their construction.
CD quality audio is sampled at 44,100 Hz (Nyquist frequency = 22.05 kHz) and at 16 bits. Sampling the waveform at higher frequencies and allowing for a greater number of bits per sample allows noise and distortion to be reduced further. DAT can sample audio at up to 48 kHz, while DVD-Audio can be 96 or 192 kHz and up to 24 bits resolution. With any of these sampling rates, signal information is captured above what is generally considered to be the human hearing range.
Work done in 1981 by Muraoka et al. showed that music signals with frequency components above 20 kHz were only distinguished from those without by a few of the 176 test subjects. A perceptual study by Nishiguchi et al. (2004) concluded that "no significant difference was found between sounds with and without very high frequency components among the sound stimuli and the subjects... however, [Nishiguchi et al] can still neither confirm nor deny the possibility that some subjects could discriminate between musical sounds with and without very high frequency components."
In blind listening tests conducted by Bob Katz in 1996, recounted in his book Mastering Audio: The Art and the Science, subjects using the same high-sample-rate reproduction equipment could not discern any audible difference between program material identically filtered to remove frequencies above 20 kHz versus 40 kHz. This demonstrates that presence or absence of ultrasonic content does not explain aural variation between sample rates. He posits that variation is due largely to performance of the band-limiting filters in converters. These results suggest that the main benefit to using higher sample rates is that it pushes consequential phase distortion from the band-limiting filters out of the audible range and that, under ideal conditions, higher sample rates may not be necessary. Dunn (1998) examined the performance of digital converters to see if these differences in performance could be explained by the band-limiting filters used in converters and looking for the artifacts they introduce.
A signal is recorded digitally by an analog-to-digital converter, which measures the amplitude of an analog signal at regular intervals specified by the sampling rate, and then stores these sampled numbers in computer hardware. Numbers on computers represent a finite set of discrete values, which means that if an analog signal is digitally sampled without dither, the amplitude of the audio signal will simply be rounded to the nearest representation. This process is called quantization, and these small errors in the measurements are manifested aurally as low level noise or distortion. This form of distortion, sometimes called granular or quantization distortion, has been pointed to as a fault of some digital systems and recordings, particularly some early digital recordings, where the digital release was said to be inferior to the analog version. However, "if the quantisation is performed using the right dither, then the only consequence of the digitisation is effectively the addition of a white, uncorrelated, benign, random noise floor. The level of the noise depends on the number of the bits in the channel."
The range of possible values that can be represented numerically by a sample is determined by the number of binary digits used. This is called the resolution, and is usually referred to as the bit depth in the context of PCM audio. The quantization noise level is directly determined by this number, decreasing exponentially (linearly in dB units) as the resolution increases. With an adequate bit depth, random noise from other sources will dominate and completely mask the quantization noise. The Red Book CD standard uses 16 bits, which keeps the quantization noise 96 dB below maximum amplitude, far below a discernible level with almost any source material. The addition of effective dither means that, "in practical terms, the resolution is limited by our ability to resolve sounds in noise. ... We have no problem measuring (and hearing) signals of −110 dB in a well-designed 16-bit channel." DVD-Audio and most modern professional recording equipment allow for samples of 24 bits.
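The rounding process described above can be simulated to estimate the resulting noise level. The following Python sketch rounds a sine wave to a given bit depth without dither and measures the ratio of signal power to error power. The 997 Hz tone at a 48 kHz sampling rate and the function names are assumptions made for this illustration.

```python
import math


def quantize(x: float, bits: int) -> float:
    """Round x (nominal range -1.0..+1.0) to the nearest of 2**bits levels."""
    step = 2.0 / (2 ** bits)
    return round(x / step) * step


def measured_snr_db(bits: int, tone_hz: float = 997.0,
                    sample_rate_hz: float = 48000.0, n: int = 48000) -> float:
    """Signal-to-quantization-error ratio for an undithered near-full-scale sine."""
    step = 2.0 / (2 ** bits)
    amplitude = 1.0 - step          # stay just inside full scale
    signal_power = 0.0
    error_power = 0.0
    for k in range(n):
        x = amplitude * math.sin(2 * math.pi * tone_hz * k / sample_rate_hz)
        err = quantize(x, bits) - x
        signal_power += x * x
        error_power += err * err
    return 10 * math.log10(signal_power / error_power)


if __name__ == "__main__":
    for bits in (8, 12, 16):
        print(f"{bits} bits: {measured_snr_db(bits):.1f} dB")
```

Each additional bit should improve the measured figure by about 6 dB, in line with the resolution argument in the text.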
Analog systems do not necessarily have discrete digital levels in which the signal is encoded. Consequently, the accuracy to which the original signal can be preserved is instead limited by the intrinsic noise-floor and maximum signal level of the media and the playback equipment.
Since analog media is composed of molecules, the smallest microscopic structure represents the smallest quantization unit of the recorded signal. Natural dithering processes, like random thermal movements of molecules, the nonzero size of the reading instrument, and other averaging effects, make the practical limit larger than that of the smallest molecular structural feature. A theoretical LP composed of perfect diamond, with a groove size of 8 micron and a feature size of 0.5 nanometer, has a quantization that is similar to a 16-bit digital sample.
It is possible to make quantization noise audibly benign by applying dither. To do this, noise is added to the original signal before quantization. Optimal use of dither has the effect of making quantization error independent of the signal, and allows signal information to be retained below the least significant bit of the digital system.
Dither algorithms also commonly have an option to employ some kind of noise shaping, which pushes the frequency of much of the dither noise to areas that are less audible to human ears, lowering the level of the noise floor apparent to the listener.
Dither is commonly applied during mastering before final bit depth reduction, and also at various stages of DSP.
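The claim that dither preserves information below the least significant bit can be demonstrated with a small Python sketch. It assumes triangular (TPDF) dither spanning ±1 LSB, a common choice but not one specified in the text, and works in units of one LSB; the function names are illustrative.

```python
import random


def tpdf_dither(rng: random.Random) -> float:
    """Triangular-PDF noise spanning -1..+1 LSB (difference of two uniforms)."""
    return rng.random() - rng.random()


def mean_quantized(level_lsb: float, n: int, dithered: bool,
                   rng: random.Random) -> float:
    """Average quantizer output for a constant input given in LSB units."""
    total = 0
    for _ in range(n):
        noise = tpdf_dither(rng) if dithered else 0.0
        total += round(level_lsb + noise)   # quantize to whole LSBs
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    level = 0.3  # a signal smaller than half an LSB
    print("undithered mean:", mean_quantized(level, 100000, False, rng))
    print("dithered mean:  ", mean_quantized(level, 100000, True, rng))
```

Without dither, the 0.3 LSB input is always rounded to zero and is lost entirely. With dither, individual samples are noisy, but their average converges on 0.3 LSB, so the signal survives as information buried in a noise floor.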
One aspect that may degrade the performance of a digital system is jitter. This is the phenomenon of variations in time from what should be the correct spacing of discrete samples according to the sample rate. This can be due to timing inaccuracies of the digital clock. Ideally, a digital clock should produce a timing pulse at exactly regular intervals. Other sources of jitter within digital electronic circuits are data-induced jitter, where one part of the digital stream affects a subsequent part as it flows through the system, and power supply induced jitter, where noise from the power supply causes irregularities in the timing of signals in the circuits it powers.
The accuracy of a digital system is dependent on the sampled amplitude values, but it is also dependent on the temporal regularity of these values. The analog versions of this temporal dependence are known as pitch error and wow-and-flutter.
Periodic jitter produces modulation noise and can be thought of as being the equivalent of analog flutter. Random jitter alters the noise floor of the digital system. The sensitivity of the converter to jitter depends on the design of the converter. It has been shown that a random jitter of 5 ns may be significant for 16 bit digital systems.
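The sensitivity to random jitter can be estimated with a widely used approximation that is not taken from the studies cited here: for a full-scale sine wave of frequency f sampled with rms timing error σ, the signal-to-noise ratio is limited to about −20·log10(2πfσ). A minimal Python sketch under that assumption:

```python
import math


def jitter_limited_snr_db(freq_hz: float, jitter_rms_s: float) -> float:
    """Approximate SNR ceiling set by random sampling jitter on a full-scale sine."""
    return -20 * math.log10(2 * math.pi * freq_hz * jitter_rms_s)


def max_jitter_for_snr_s(freq_hz: float, snr_db: float) -> float:
    """Inverse relation: rms jitter at which the ceiling equals snr_db."""
    return 10 ** (-snr_db / 20) / (2 * math.pi * freq_hz)


if __name__ == "__main__":
    print(f"{jitter_limited_snr_db(20000.0, 5e-9):.1f} dB at 20 kHz with 5 ns rms jitter")
    print(f"{max_jitter_for_snr_s(20000.0, 98.0) * 1e12:.0f} ps keeps the ceiling at 98 dB")
```

Under this approximation, 5 ns of random jitter on a 20 kHz full-scale tone gives a ceiling of about 64 dB, well below the 98 dB theoretical figure for 16 bits. The error scales with signal frequency and level, so typical music signals are far less affected than this worst case.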
In 1998, Benjamin and Gannon researched the audibility of jitter using listening tests. They found that the lowest level of jitter to be audible was around 10 ns (rms). This was on a 17 kHz sine wave test signal. With music, no listeners found jitter audible at levels lower than 20 ns. A paper by Ashihara et al. (2005) attempted to determine the detection thresholds for random jitter in music signals. Their method involved ABX listening tests. When discussing their results, the authors commented that:
So far, actual jitter in consumer products seems to be too small to be detected at least for reproduction of music signals. It is not clear, however, if detection thresholds obtained in the present study would really represent the limit of auditory resolution or it would be limited by resolution of equipment. Distortions due to very small jitter may be smaller than distortions due to non-linear characteristics of loudspeakers. Ashihara and Kiryu  evaluated linearity of loudspeaker and headphones. According to their observation, headphones seem to be more preferable to produce sufficient sound pressure at the ear drums with smaller distortions than loudspeakers.
After initial recording, it is common for the audio signal to be altered in some way, such as with the use of compression, equalization, delays and reverb. With analog, this comes in the form of outboard hardware components, and with digital, the same is typically accomplished with plug-ins in a digital audio workstation (DAW).
A comparison of analog and digital filtering shows technical advantages to both methods. Digital filters are more precise and flexible. Analog filters are simpler, can be more efficient and do not introduce latency.
When altering a signal with a filter, the output signal may differ in time from the signal at the input, which is measured as its phase response. All analog equalizers exhibit this behavior, with the amount of phase shift differing in some pattern, and centered around the band that is being adjusted. Although this effect alters the signal in a way other than a strict change in frequency response, it is usually not objectionable to listeners.
Because the variables involved can be precisely specified in the calculations, digital filters can be made to objectively perform better than analog components. Other processing such as delay and mixing can be done exactly.
Digital filters are also more flexible. For example, the linear phase equalizer does not introduce frequency-dependent phase shift. This filter may be implemented digitally using a finite impulse response filter but has no practical implementation using analog components.
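The linear-phase property follows from the symmetry of the filter coefficients. The Python sketch below designs a small windowed-sinc low-pass FIR filter and checks that, once the constant delay of (N − 1)/2 samples is removed, its frequency response is purely real, meaning no frequency-dependent phase shift. The Hamming window, tap count, and function names are assumptions made for this illustration, not a description of any particular equalizer.

```python
import cmath
import math


def lowpass_fir(num_taps: int, cutoff: float) -> list[float]:
    """Windowed-sinc low-pass FIR; cutoff is a fraction of the sample rate (0..0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming
        taps.append(ideal * window)
    gain = sum(taps)
    return [h / gain for h in taps]   # normalize to unity gain at DC


def delay_compensated_response(taps: list[float], omega: float) -> complex:
    """Frequency response with the (N-1)/2-sample delay removed."""
    mid = (len(taps) - 1) / 2
    return sum(h * cmath.exp(-1j * omega * (n - mid)) for n, h in enumerate(taps))


if __name__ == "__main__":
    taps = lowpass_fir(31, 0.25)
    for omega in (0.2, 0.7, 1.5):
        print(omega, delay_compensated_response(taps, omega))
```

Because the taps are symmetric about the centre, the imaginary part of the compensated response is zero to within rounding error: every frequency is delayed by the same 15 samples.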
A practical advantage of digital processing is the more convenient recall of settings. Plug-in parameters can be stored on the computer, whereas parameter details on an analog unit must be written down or otherwise recorded if the unit needs to be reused. This can be cumbersome when entire mixes must be recalled manually using an analog console and outboard gear. When working digitally, all parameters can simply be stored in a DAW project file and recalled instantly. Most modern professional DAWs also process plug-ins in real time, which means that processing can be largely non-destructive until final mix-down.
Many plug-ins now incorporate analog modeling. Some audio engineers endorse them and feel that they compare equally in sound to the analog processes that they imitate. Analog-modeling plug-ins carry some benefits over their analog counterparts, such as the ability to remove noise from the algorithms and modifications to make the parameters more flexible. On the other hand, other engineers feel that the modeling is still inferior to the genuine outboard components and still prefer to mix "outside the box".
Subjective evaluation attempts to measure how well an audio component performs according to the human ear. The most common form of subjective test is a listening test, where the audio component is simply used in the context for which it was designed. This test is popular with hi-fi reviewers, where the component is used for a length of time by the reviewer who then will describe the performance in subjective terms. Common descriptions include whether the component has a bright or warm sound, or how well the component manages to present a spatial image.
Another type of subjective test is done under more controlled conditions and attempts to remove possible bias from listening tests. These sorts of tests are done with the component hidden from the listener, and are called blind tests. To prevent possible bias from the person running the test, the blind test may be done so that this person is also unaware of the component under test. This type of test is called a double-blind test. This sort of test is often used to evaluate the performance of lossy audio compression.
Critics of double-blind tests argue that they do not allow the listener to feel fully relaxed when evaluating the system component, so that the listener cannot judge differences between components as well as in sighted (non-blind) tests. Those who employ the double-blind testing method may try to reduce listener stress by allowing a certain amount of time for listener training.
Early digital audio machines had disappointing results, with digital converters introducing errors that the ear could detect. Record companies released their first LPs based on digital audio masters in the late 1970s. CDs became available in the early 1980s. At this time analog sound reproduction was a mature technology.
There was a mixed critical response to early digital recordings released on CD. Reviewers noticed that, compared to the vinyl record, the CD was far more revealing of the acoustics and ambient background noise of the recording environment. For this reason, recording techniques developed for analog disc, e.g., microphone placement, needed to be adapted to suit the new digital format.
Some analog recordings were remastered for digital formats. Analog recordings made in natural concert hall acoustics tended to benefit from remastering. The remastering process was occasionally criticised for being poorly handled. When the original analog recording was fairly bright, remastering sometimes resulted in an unnatural treble emphasis.
The Super Audio CD (SACD) format was created by Sony and Philips, who were also the developers of the earlier standard audio CD format. SACD uses Direct Stream Digital (DSD) based on delta-sigma modulation. Using this technique, the audio data is stored as a sequence of fixed amplitude (i.e. 1-bit) values at a sample rate of 2.8224 MHz, which is 64 times the 44.1 kHz sample rate used by CD. At any point in time, the amplitude of the original analog signal is represented by the relative preponderance of 1's over 0's in the data stream. This digital data stream can therefore be converted to analog by passing it through an analog low-pass filter.
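The idea that amplitude is carried by the density of 1's can be illustrated with a first-order delta-sigma modulator in Python. This is a deliberately simplified sketch: practical DSD encoders use higher-order modulators, and the function names here are illustrative only.

```python
CD_SAMPLE_RATE_HZ = 44100
DSD_SAMPLE_RATE_HZ = 64 * CD_SAMPLE_RATE_HZ   # 2,822,400 Hz


def delta_sigma_1bit(samples: list[float]) -> list[int]:
    """First-order delta-sigma modulator; inputs in -1.0..+1.0, outputs 0 or 1."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback           # accumulate the running error
        bit = 1 if integrator >= 0 else 0    # 1-bit quantizer
        feedback = 1.0 if bit else -1.0      # value the bit represents
        bits.append(bit)
    return bits


def ones_density(bits: list[int]) -> float:
    """Fraction of 1's; a crude stand-in for the analog low-pass filter."""
    return sum(bits) / len(bits)


if __name__ == "__main__":
    for level in (-0.5, 0.0, 0.5):
        bits = delta_sigma_1bit([level] * 1000)
        print(f"input {level:+.1f}: density of 1's = {ones_density(bits):.3f}")
```

A constant input of +0.5 produces a stream in which three quarters of the bits are 1, so averaging the stream (with 1 and 0 standing for +1 and −1) recovers the original level.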
The DVD-Audio format uses standard, linear PCM at variable sampling rates and bit depths, which at the very least match and usually greatly surpass those of standard CD audio (16 bits, 44.1 kHz).
In the popular Hi-Fi press, it had been suggested that linear PCM "creates [a] stress reaction in people", and that DSD "is the only digital recording system that does not [...] have these effects". This claim appears to originate from a 1980 article by Dr John Diamond. The core of the claim that PCM recordings (the only digital recording technique available at the time) created a stress reaction rested on using the pseudoscientific technique of applied kinesiology, for example by Dr Diamond at an AES 66th Convention (1980) presentation with the same title. Diamond had previously used a similar technique to demonstrate that rock music (as opposed to classical) was bad for your health due to the presence of the "stopped anapestic beat". Diamond's claims regarding digital audio were taken up by Mark Levinson, who asserted that while PCM recordings resulted in a stress reaction, DSD recordings did not. However, a double-blind subjective test between high resolution linear PCM (DVD-Audio) and DSD did not reveal a statistically significant difference. Listeners involved in this test noted their great difficulty in hearing any difference between the two formats.
The vinyl revival is in part because of analog audio's imperfection, which adds "warmth". Some listeners prefer such audio over that of a CD. Founder and editor Harry Pearson of The Absolute Sound journal says that "LPs are decisively more musical. CDs drain the soul from music. The emotional involvement disappears". Dub producer Adrian Sherwood has similar feelings about the analog cassette tape, which he prefers because of its "warmer" sound.
Those who favor the digital format point to the results of blind tests, which demonstrate the high performance possible with digital recorders. The assertion is that the "analog sound" is more a product of analog format inaccuracies than anything else. One of the first and largest supporters of digital audio was the classical conductor Herbert von Karajan, who said that digital recording was "definitely superior to any other form of recording we know". He also pioneered the unsuccessful Digital Compact Cassette and conducted the first recording ever to be commercially released on CD: Richard Strauss's Eine Alpensinfonie. The perception of analog audio being demonstrably superior was also called into question by music analysts following revelations that audiophile label Mobile Fidelity Sound Lab had been covertly using Direct Stream Digital files to produce vinyl releases marketed as coming from analog master tapes, with lawyer and audiophile Randy Braun stating that "These people who claim they have golden ears and can hear the difference between analog and digital, well, it turns out you couldn't."
While the words analog audio usually imply that the sound is described using a continuous signal approach, and the words digital audio imply a discrete approach, there are methods of encoding audio that fall somewhere between the two. Indeed, all analog systems show discrete (quantized) behaviour at the microscopic scale. While vinyl records and common compact cassettes are analog media and use quasi-linear physical encoding methods (e.g. spiral groove depth, tape magnetic field strength) without noticeable quantization or aliasing, there are analog non-linear systems that exhibit effects similar to those encountered on digital ones, such as aliasing and "hard" dynamic floors (e.g. frequency-modulated hi-fi audio on videotapes, PWM encoded signals).