the fallacy of stereo

STEREO: A MISUNDERSTANDING

THE THEORY, SOUND-SYSTEMS, AND PROBLEMS OF HEARING

copyright 1982 The Anstendig Institute

Revised 1984

There is a common misconception that the addition of stereophonic
sound-reproduction was the necessary, correct step in perfecting monophonic
recording. It is believed that, because we hear with two ears, sound should be
recorded with two microphones if it is to sound natural. It is also believed
that stereophony exists as a natural, scientific phenomenon. Neither belief is
correct. The attempt to reproduce the way sound is heard by means of
stereophonic sound reproduction is a misunderstanding that is the result of a
fault in logic. Since recording is a duplication of sounds, only the sounds can
be duplicated, not the manner in which they are heard. The introduction of
stereophony and its universal acceptance has had the unfortunate effect of
slowing progress in the improvement of recorded sound quality and keeping the
general level of musical experience substantially below that which is truly
possible, both through recordings and in live performance.

Hearing is classically accepted as the most important of the senses. Of all five
senses, the effects of hearing are the most powerful. It is humanity's chief
means of becoming familiar with and communicating emotions. Today, recordings
are the means through which the greater part of society is introduced to the
vast scale of human experiences that can be had through sound. It is important,
therefore, that society take a careful look at the universal use of stereophony
in sound reproduction.

I. THE FLAWED LOGIC

The word "stereo" is currently used as a blanket designation for all sound
reproduction. This is a misrepresentation. Stereo is only a means of achieving
an effect of directionality. In fact, it is only one of many ways directionality
can be sonically produced, and a very limited one at that. Stereo is limited to
producing only a frontal, horizontal plane, with no means of reproducing sounds
that come from above, below or behind, nor can it accurately reproduce depth.
(Impressions of depth are a combination of the arbitrary disposition of the
loudspeakers and the listener in relation to the listening room, which is
different in each situation. It is a form of auditory illusion, not an accurate
duplication of the depth of the recorded event.)

Most people have the impression that the stereo signal is a complete entity that
is made up of two incomplete halves of a complete signal, each of which
essentially contains only half of the sounds. That is not true. When two
microphones are used, each channel is a single, complete monophonic signal
documenting every bit of the particular sound event, but each from a slightly
different position (in theory, only about as far apart as our two ears, i.e.,
the width of a human head).

It is important to understand that there are no stereo sound sources. From any
given position in space, all sound sources are monophonic.[1] In live sound as
well as sound reproduction, the effects that produce the impression of dimension
and direction take place within the listener and not in the sound source(s). The
stereophonic signal does not, in itself, include the spatial, stereophonic
effect. It only includes two mono signals, which would produce no effect of
spatial dimension if they were played by themselves, played through two separate
speakers standing next to each other, or electrically combined and fed through
one speaker (played back monophonically). The spatial effect only occurs through
separation of the two signals in space in relation to the listener, and that
effect changes in relation to any change in position of the two speakers and the
listener. Live sounds may occur at various distances and in various directions
in relation to the hearer, but each one is always a separate monophonic sound
whether the source is stationary or moving. Sounds are only directional in
relation to the hearer. They are given directionality during the act of hearing,
which occurs after the sounds are produced or reproduced and therefore has
nothing to do with the manner in which the sounds are produced.
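
The claim that a stereo signal is nothing more than two complete mono signals, which can also be "electrically combined" into one, can be illustrated with a small sketch. The Python below is only an illustration under stated assumptions: the sample values are invented, and `downmix_to_mono` is a hypothetical helper that averages the two channels, which is one common way of combining them and not necessarily what any particular piece of equipment does.

```python
# Illustrative sketch only: sample values and function names are hypothetical.

def downmix_to_mono(left, right):
    """Combine two channels into one by averaging each pair of samples."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    return [(l + r) / 2 for l, r in zip(left, right)]

# Each channel is a complete mono recording of the same event,
# taken from a slightly different position (invented values).
left_channel = [0.0, 0.5, 1.0, 0.5, 0.0]
right_channel = [0.0, 0.25, 0.5, 1.0, 0.5]

# The combined signal is again a single mono signal of the same length.
mono = downmix_to_mono(left_channel, right_channel)
print(mono)
```

Nothing in either list, or in the combined list, encodes a direction. Under these assumptions, any spatial impression can only arise from where the two signals are played back relative to the listener.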

Stereo is based on the premise that, because sound is heard with two ears, the
correct way to reproduce sound is to simulate the way it is heard, i.e., by
recording two separate signals, using two microphones separated by a distance
equivalent to the width of a human head. That is a misunderstanding of the
realities. Stereophony as thus defined is an attempt to reproduce the way sound
is heard. This is illogical and impossible. Human hearing could never be
duplicated in the recording process because hearing consists of more than just
two ears. The shape of the ears plays a role in distinguishing the direction of
sounds, and the rest of the body also plays a role in the hearing and
experiencing of sound. None of these aspects of hearing can be duplicated by
microphones.

The hearing experience takes place only in the hearer and only after the sound
has been reproduced by the sound system. This phenomenon is incidental to and
completely separate from both the production of the sounds and the
characteristics of the sounds. What comes out of the speaker is a duplication
(more accurately, an approximation) of the sound as it was produced by the
source and colored by the space in which it was produced. It is not, nor can it
ever be, a duplication of the hearing process. In fact, the shape of the sound
source and the materials of which it is made determine the characteristics of a
sound. Any recording, whether in stereo, quad, or any other mode, can only
duplicate the sound as produced by the sound source, not as heard by a listener.
The characteristics of the sound source and of the sound itself are what must
determine the technical means used to record it. How a sound happens to be heard
is completely incidental to and has no bearing on the production or accurate
reproduction of that sound.

Stereophony should play no role in considerations regarding sound quality in the
construction or evaluation of components, even those meant for stereo
reproduction. All auditioning and evaluating of the accuracy of components,
especially loudspeakers, whether by the manufacturer or the buyer, should be
done with a mono signal, with no attempt to reproduce spatial effects. All
aspects of the rest of the sound system, except the depiction of space, should
also be auditioned monophonically. The only function of the electrical
components of a sound-reproduction system and the loudspeakers is to produce an
acoustic signal that as closely as possible resembles the electrical signal fed
into it by the source. Nothing more! In fact, it is impossible for a sound
system to do anything more than that. The signal itself does not, and cannot,
include any effects, such as the depiction of space. Those effects take place in
the listener after the sounds have already been reproduced monophonically.
Technically the aim is to reproduce two entirely monophonic signals as
accurately as possible. The two channels should be kept completely separate from
each other, all the way through the sound-system until they have been reproduced
monophonically in space by the loudspeakers.

The fact that the two signals of stereo are cut in the same record groove and
that most components have two channels that share the same power supply and
therefore have some interaction is merely an economic compromise. If it were
possible, each channel should be absolutely independent from beginning to end.
But that is generally impossible, because, to perfectly synchronize the
channels, the two signals have to be combined somewhere. Either the two channels
are combined in the record groove or as parallel tracks on the same tape. On any
systems that would be practical for the end-user, some interaction of the
signals is unavoidable either in the signals on the record groove itself, in the
needle as it traces the signals, in cross-talk between the channels of the
recorder, or in the sound system.

The problem of designing a sound-system, including building a loudspeaker, is to
arrive at the most accurate possible reproduction of each separate signal that
is fed through it, whether that signal is a monophonic signal or one channel of
a stereo recording. Even if two or more signals are ultimately to be combined to
produce spatial effects, the only way to assure that each signal will be
reproduced as accurately as possible is to reproduce each signal as separately
as possible. As will be shown in Section V, the effects of combining two signals
distract the listener from the more important qualities of sound. Therefore, all
system development, testing, and evaluation of sound-quality should be done
monophonically, even if stereo is desired. Especially with loudspeakers, any
technical decisions of design, such as the size and shape of the speaker or how
the drivers are mounted in the speaker, should be arrived at only with the need
for accurate rendition of a single signal in mind. Practices such as mounting
the drivers unsymmetrically in stereo-pairs within the cabinet have nothing to
do with how accurately that speaker will reproduce sound and can, in fact,
compromise the sound-quality if the preferred position of the drivers for stereo
listening is not the ideal position for accurate reproduction of a single
signal.

III. IT IS IMPOSSIBLE TO KNOW THE REAL SPATIAL RELATIONS OF A STEREO RECORDING

In order to know if a system's reproduction of spatial relationships is
accurate, one would have to know if the reproduction matches those relationships
exactly as they were at the microphones during the recording. Since heads are
differently shaped and no one can be in exactly the same place as the
microphones, the spatial effects of direction, depth, etc., will be different
for each person in the room. Even the engineer, listening with speakers or
earphones, who decides on the microphone placement and mixes the signals to his
liking, is only deciding subjectively how he wants the impressions of space. The
monitoring equipment has already changed the spatial relationships and made them
different from the spatial relationships at the microphones. And those
relationships in the monitoring booth will be different from every other
listening room.

An attempt to achieve precise reproduction of the spatial dimensions of a sound
event by means of stereo is therefore doomed from the beginning. All that can be
achieved is a particular spatial effect that may be preferred by the particular
listener but cannot lay claim to being a reproduction of the original.
Therefore, the prevalent procedure of evaluating sound-quality on the basis of
the reproduction of such spatial effects as "soundstage", "imaging",
"dimensionality" (terms currently used in professional circles) or on the basis
of impressions of height, width, or depth is futile, since it can never be
known whether the reproduction matches the original. All that is possible is to
prefer a certain sound-system's reproduction of spatial dimensions over that of
another system, but it is not possible to know when the reproduction corresponds
to the original, even if the listener had been in the room in which the sound
originated.

The characteristics of sound are so bound up with the size, shape, dimensions
and materials of the source that they can only be reproduced exactly by
duplicating the entire original physical situation. That would mean the same
musicians in the same hall (or an exact duplication of the hall), sitting in
exactly the same positions, etc., which is an impossibility. Therefore,
absolutely exact reproduction of the spatial characteristics of a sound by
another sound medium is impossible. It certainly cannot be achieved by
differently shaped objects of different reflectivity, i.e., speakers, in a
differently-shaped space of a different reflectivity, i.e., the listening room.
Thus, and definitively, any attempt at reconstruction of the spatial
characteristics of a sound source can only be a flawed approximation, which the
listener can never be sure is the way the original sounded.

Furthermore, in stereo reproduction, there is only one very small area,
equidistant from the speakers, within which the volume of the two separate
channels is balanced. The equalization, i.e., the loudness of the different
frequencies (highs, lows, middle, etc.) in relation to each other, can also be
different in various parts of a room. But a room's equalization can be
compensated for during playback and is a variable that has to be adjusted anyway
for differing volume levels in relation to an individual listener's hearing at
the time of the playback.[2] The one perfect area for the listener relative to the
two stereo speakers is a small area in the exact center between and in front of
the speakers, which extends only a small distance front to back. In any other
positions, not only are the stereo balances wrong, but part of the content is
missing. Obviously, for larger numbers of people (theater productions, movies,
etc.), mono sound reproduction is more accurate for the bulk of the audience; it
is, in fact, the only non-flawed possibility of reproducing the entire musical
content.
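
How quickly the balance between the two channels drifts away from the center position can be estimated with a rough sketch. The Python below rests on assumptions that are not part of the text: ideal point-source speakers in a free field (inverse-square law, no room reflections), a speed of sound of 343 m/s, and invented speaker and listener positions. The function name `channel_imbalance` is hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for dry air at about 20 degrees C

def channel_imbalance(listener, left_speaker, right_speaker):
    """Return (level difference in dB, arrival-time difference in ms).

    Positive values mean the left channel is louder / arrives earlier.
    Assumes free-field point sources: level falls off with distance only.
    """
    d_left = math.dist(listener, left_speaker)
    d_right = math.dist(listener, right_speaker)
    level_db = 20 * math.log10(d_right / d_left)
    delay_ms = (d_right - d_left) / SPEED_OF_SOUND_M_S * 1000
    return level_db, delay_ms

# Speakers 2 m apart; listener 2 m in front of the line joining them.
left = (-1.0, 0.0)
right = (1.0, 0.0)

for x in (0.0, 0.5, 1.0):
    level_db, delay_ms = channel_imbalance((x, 2.0), left, right)
    print(f"listener at x={x:+.1f} m: level {level_db:+.2f} dB, "
          f"delay {delay_ms:+.2f} ms")
```

Only the position on the center line gives zero difference in both level and arrival time. In this simplified model, every step to one side makes the nearer channel both louder and earlier.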

IV. THE POINT OF ALL MUSIC IS TO EXPRESS SOMETHING

The expressive content of sounds is contained in the dynamic variations of the
sounds. In fact, it is the dynamic content of the sound. The presentation of the
dynamic subtleties is, therefore, the most important problem of sound
reproduction.[3] Problems of instability in the sound, which can plague the stereo
spatial effect relative to the listener's position in the room, do not occur in
the dynamic content of the sounds, which remains the same throughout the room.
No matter how the balance of frequencies or stereo imaging may be changed, the
sounds retain their dynamic-expressive character relative to each other as they
flow in time.

Until the advent of stereo, spatial relationships were unimportant, even
undesirable in the bulk of the world's music. In most classical music, the
introduction of directional effects in the sound-reproduction distracts the
listener from the important factors that actually contain the musical
experience. The most important aspects of sound, especially those of classical
music, have nothing to do with spatial effects and can be reproduced
satisfactorily in mono.

A stereo signal introduces extraneous "effects" that distract from the more
important dynamic aspects of music. Except for the pickup cartridge, stereo
effects have nothing to do with the quality of the system components. The reason
is that, in the sonic arts, spatial relationships are a very insignificant
component of sound and are particularly insignificant in music. In most
classical music, they can be eliminated without at all degrading the quality of
the artistic experience.

The reason spatial effects distract from the expressive qualities of music lies
in the limitations of human consciousness. Most people can only concentrate on
one thing at a time, which, in music, is usually the melodic line. Few can
concentrate on two things at a time. Since our consciousnesses are too limited
to be simultaneously aware of all the components of music, concentrating on
spatial effects distracts from the important aspects of music.

To understand why the stereo-spatial aspects of music-reproduction have been
accorded such predominance, to the point of obscuring the truly important
aspects of music, one must know that the easiest-to-hear aspects of sound are
the directions the sounds are coming from. The most difficult-to-hear aspects
are the subtle expressive nuances.[4] Many people cannot hear subtle expressive
nuances. Few are oriented towards listening for those nuances and practically no
one takes pains to be sure they are hearing them correctly. Furthermore,
long-playing record-playing equipment has, without exception, not as yet been
able to reproduce the finest nuances of records. The record-listening public has
not, therefore, experienced nuances as fine as they can be. It is taken for
granted that they are hearing exactly the same nuances as in the original.

In controlled situations, our institute has found that, although they do
experience something, many people are incapable of accurately hearing expressive
nuances either live or reproduced. They experience either a coarser form of the
actual emotion of the performances or a completely different emotion.[5] Even
those capable of hearing fine nuances cannot hear them the moment they sit down
to listen, especially with recordings. It takes quite a while for most people to
settle down enough physically to begin to register the subtleties of the music
and to experience the emotional content. To understand why, one must realize
that what is heard is not the sound vibrations coming from the sound source;
what is heard is the vibrations of the hearer's own body when it is caused to
vibrate by the soundwaves striking it. Therefore, any nuances finer than the
vibrational state of the body itself are not heard. Essentially, unless the body
is in a physical state that is as fine as the music being listened to, the music
is filtered through, and degraded by, the coarseness in the way the body is
vibrating. This point is crucial to understanding why spatial effects figure
prominently in most people's considerations of sound reproduction. Besides being
easy to hear, spatial effects do not demand a particularly great refinement of
body. Being able to notice and make out spatial dimensions and directional
effects impresses listeners who are not hearing the full content of the music,
and gives them the impression that they are getting something out of the
recording, when they are actually missing the point of the music.

If, from the beginning of a listening session, one would carefully observe what
aspects of the music one becomes progressively aware of, one will notice that,
besides notes and words, the first things one is able to hear are the simple
spatial relationships (right, left, center, etc.). The last thing one is able to
hear is the expressive, i.e., the emotional, content. The notes and spatial
relationships can be called the "informational" aspects of sound, while the
expressive content can be called the "experiential" aspect.[6] The point to be
made is that, without the experiential aspects, there really is no music, and
that a distortion or change in the expressive content of a recorded performance
is tantamount to changing the words in a sentence so that they mean something
totally different from what the writer expressed. In other words, a complete
falsification. On the other hand, it makes no difference to the quality or
intensity of the way one experiences the expressive content of the music if the
so-called "sound stage" is changed to give one or another impression of height,
depth, and width, nor does it matter if the orchestra seems to be spread out in
front of the listener (unless the music was specifically written for stereo, or
has some of the expressive content contained in the directions of the sounds.
The Beatles' album Sgt. Pepper's Lonely Hearts Club Band has excellent
examples of both).

The spreading out of the sound in space is totally unimportant to and contrary
to the aims of most music written before stereo became popular. In their
orchestration, composers took great pains to create particular sound colorings
by blending together the sounds of different instruments. Halls were designed so
the sounds would thoroughly blend together before reaching the listener. When a
conductor has balanced his orchestra, there is no need for separation of the
instruments by spreading them out in differing directions in order to hear the
different voices; whatever is supposed to be heard can be differentiated even
from so far away that all the sounds of the orchestra essentially come from the
same direction. Similarly, if a recording of such a well-balanced performance is
correctly equalized to match the original, the balance that the conductor has
achieved can be heard in mono, without the supposed help of stereo “separation”.
This is an important point for the music-loving public because it means that
older recordings of such excellent performances can, to a great degree, be
restored since it is mainly their imbalances in the frequencies that obscure
their detail and not a lack of stereo effects.

One must assume that composers know what their music should sound like, but,
originally, composers were singularly unimpressed by stereo. Virgil Thomson went
so far as to call it a "technological pretext" giving the recording companies
"another excuse for recording the standard works all over again" (A Virgil
Thomson Reader, p. 144). Another composer has mentioned that stereo is an excuse
to sell new, more expensive equipment. No composer whom I asked or with whom I
listened to music was the slightest bit interested in the depiction of spatial
effects.

V. CONSCIOUSNESS IS LIMITED

Few people can concentrate on more than one thing at a time; but music consists
of many things happening all at once. In fact, music is the ultimate
consciousness-expander because, if you are not a Mozart, there is almost always
more to be aware of than is humanly possible. Even with a single melodic line
there are both the notes and the expression to be conscious of. For all but a
very few particularly "gifted" individuals, consciously registering the
expressive content of music demands every bit of concentration, awareness, and
poise that can be mustered, especially when the expression is as fine and
delicate as it should be in most classical music. In the finest ear-training and
conducting classes, which even included seasoned professionals, there were
enormous differences in sensibility to nuance and expressive content.
Particularly interesting is that neither the ability to recognize tones (perfect
pitch) nor extraordinary memories that allowed students to write down, from
memory, anything the teacher dictated, was of help in hearing the expressive
content. For example, many conductors (and other musicians too) with amazing
ears for recognizing notes and hearing mistakes were and are strikingly
deficient in expressive interpretive qualities. In most of these cases, the
orientation towards the informational (mental) aspects of music takes up all of
their powers of concentration and keeps them from registering the nuances of
expression. Therefore, the addition of artificial informational material, such
as the spatial effects of stereo, will distract most people from the more
important experiential aspects of music.

VI. THE BODY IS SENSITIVE TO DIRECTIONAL IMBALANCES IN SOUND PRESSURE LEVEL

While the depiction of directional effects has little effect on the experience
of most fine music, monophonic reproduction with only one loudspeaker is not the
solution. That is because the body is sensitive to unequal sound-pressure
levels, i.e., whether or not the sounds around it are of equal strength
(volume). The body itself, which is highly sensitive to physical imbalances, has
to recreate the vibrations produced by the sound-source, and this happens most
effectively when the whole body is equally subjected to those vibrations. Music
coming predominantly from one side creates an uncomfortable feeling of imbalance
that is especially disturbing and distracting when the body is in the requisite
relaxed, sensitive state necessary to hear fine musical nuances. Our tests have
shown that music from four equidistant speakers, arranged as in quadraphonic
listening, is the best arrangement, whether they play mono, stereo, or quad. The
sound-pressure level is then most evenly distributed around the body.

The body's sensitivity to the lateral balance of sound is one reason why stereo
seems to many to be superior to mono with one speaker. In stereo, when the
listener is located exactly between the two speakers that are balanced for
volume level, the sounds at least come from both sides. But in this respect,
mono is still preferable, because the sound from both sides has the same volume
level, while it varies in stereo. Because the musical experience is
predominantly physical, mono with at least four speakers surrounding the
listener is the most effective way to experience recorded music.

VII. EPILOGUE

Originally, stereo was thought to be the next necessary step in perfecting
sound-reproduction. But it was not. Monophonic sound-reproduction was still
gravely flawed when stereo was introduced. The first step should have been to
perfect monophonic sound-reproduction. Some companies were well on their way
towards doing so. The last monophonic Mercury recordings were very close. It
remained mainly for playback techniques and equipment to be perfected in order
to retrieve the information which was in the grooves.

The introduction of stereo halted progress by introducing a whole new set of
problems, namely the preservation and reproduction of two signals
simultaneously. The state of the technique at that time was not able to combine
two signals and still preserve the quality already achieved in monophonic
recordings, especially not in phonograph pick-up cartridges. Sound quality,
particularly in the playback, deteriorated markedly.

It is an individual's prerogative to want sound-reproduction that includes some
sort of depiction of the placement of sounds in space. But to call stereophony
accurate sound reproduction is a falsification. Stereo is an extraneous effect
added to sound, a special phenomenon similar to 3-D in photography and cinema:
both stereo and 3-D are effects that may be interesting, even "kicky", but they
are only effects and have little to do with the way we really hear or see.

Since hearing the expressive content demands all of most listeners'
concentration, the addition of other effects, such as those of stereophony,
keeps the listeners from experiencing the real content of the music if they pay
attention to those effects. Such distracting sound-reproduction has been the
rule for over three decades among laymen and even among professionals, most of
whom use recordings to help study scores (with the prevalence of recorded sound
in our society, those who do not use recordings outright for study still
cannot avoid listening to and being influenced by recordings). The legacy of
stereophonic sound-reproduction is a loss of sensitivity to and awareness of
delicate, fine interpretative nuance in music. A full understanding of this fact
must be cause for considerable alarm, because music is the flagship of a
society. It leads, serves as an example for, and sets the tone of every other
civilized pursuit within that society. It is the best civilization has and must
be preserved at the highest possible levels.

[1] Even sounds consisting of many combined sound sources, as in recording
techniques using many microphones, are monophonic. With multi-miking, each
microphone documents the complete monophonic event from that microphone's
position. Each channel of the stereo signal plays a monophonic signal consisting
of the combined signals from its microphones. But the use of many microphones is
not even stereo. It is a whole new technique that has nothing to do with either
the natural spatial relationships of the original sounds or the principles of
stereo.

[2] See our papers on sound equalization.

[3] For most recordings, even digital, it is necessary to compress the overall
dynamic range of the performance. That should be done section by section, i.e.,
the louder sections should all be reduced the same amount and the softer
sections all raised in volume by the same amount. In that way, the dynamic
subtleties within each section of the music will be preserved. Automatic dynamic
range expanders are not desirable because they will expand and compress the
dynamic range of everything, even the dynamics within a single melody, thus
changing the whole expressive content of the performance.

[4] Various papers of The Anstendig Institute deal with the problems of hearing
fine nuances, particularly those due to the fact that the body must be vibrating
as finely as the nuances or they will be changed and degraded by the vibrations
of the body itself.

[5] The author's insights into the hearing of expressive nuance come from many
years in the ear-training classes of some of the finest music schools and long
testing with volunteers at The Anstendig Institute.

[6] This is explained in other papers of The Anstendig Institute, particularly
“Hearing: The Informational and the Experiential”.
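
The section-by-section compression described in the note on dynamic range can be sketched as follows. This is only an illustration under stated assumptions, not a description of any actual mastering procedure: the amplitudes, section boundaries, and gain values are invented, and `apply_section_gains` is a hypothetical helper.

```python
# Illustrative sketch only: all values and names here are hypothetical.

def apply_section_gains(samples, sections):
    """Scale each section of a recording by one fixed gain.

    `sections` is a list of (start, end, gain) tuples with `end` exclusive.
    Samples outside every listed section are left unchanged.
    """
    out = list(samples)
    for start, end, gain in sections:
        for i in range(start, end):
            out[i] = samples[i] * gain
    return out

# A soft passage followed by a loud passage (invented amplitudes).
performance = [0.02, 0.04, 0.03, 0.8, 1.0, 0.6]
sections = [
    (0, 3, 2.0),  # soft section: every sample raised by the same factor
    (3, 6, 0.5),  # loud section: every sample reduced by the same factor
]

compressed = apply_section_gains(performance, sections)

# The overall range (loudest / softest sample) shrinks, while the ratio
# between samples inside any one section stays exactly as it was.
original_range = max(performance) / min(performance)
compressed_range = max(compressed) / min(compressed)
print(original_range, compressed_range)
```

Because each section receives a single constant gain, the relative dynamics inside a phrase are untouched. An automatic processor that changes its gain continuously would alter those relationships as well.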

Papers on related subjects are available free of charge on request.

The Anstendig Institute is a non-profit, tax-exempt, research institute that was
founded to investigate the vibrational influences in our lives and to pursue
research in the fields of sight and sound; to provide material designed to help
the public become aware of and understand vibrational influences; to instr

