
High Fidelity: Is hifi system voicing a matter of taste? Part 2.

01-20-2019 | By Jeff Day

In Part 1 I spent some time musing about what high-fidelity means in relation to musicality, and now in Part 2 I’ll explore some ideas about what high-fidelity means in terms of non-musical audiophile-style sonics that many audiophiles cherish.

As a recap, in Part 1 I described audiophile-style sonics as how well a stereo system reproduces the non-musical artifacts of the recording process, things like “transparency” (the ability to aurally ‘see’ into the recording), “resolution” (the amount of detail in the audio signal that is audibly presented), “soundstage” (the three dimensions of the recorded space in width, height and depth), “soundspace” (the acoustic sense of ‘space’ of the recording venue), and “imaging” (the ability to localize instruments and musicians on the soundstage).

I’ll use the same criterion I did for musicality, where high-fidelity describes how close a stereo system comes to making recorded music sound like live music in terms of timbre, tone color, melody, harmony, rhythm, tempo, and dynamics. I’ll then apply that same principle to the non-musical audiophile-style sonics descriptors of transparency, resolution, soundstage, soundspace, and imaging. It makes for a very interesting and challenging discussion indeed!

Friends John LaChapelle (left) & Larry Coryell (right) playing jazz. Photo by Dr. Kannan Krishnaswami.

Perhaps one of the first questions to be considered is, “Do the non-musical audiophile-style parameters of transparency, resolution, soundstage, soundspace, and imaging have any relation to live music?”

I suppose the answer would be a qualified, “Yes.”


Let’s talk about “soundstage” first, which audiophiles use to describe the three dimensions of the recorded musical performance in width, height, and depth.

If we consider a live music performance for an audience at some particular venue, then the soundstage refers to an actual place with a size that can be defined, like the small auditorium in the photo above.

The venue can be very large, as with Simon and Garfunkel’s The Concert in Central Park, where they played for a crowd of more than 500,000 people. It could be a sizable concert hall like the Grand Hall of the Moscow Conservatory, where Bob Fine and Wilma Cozart of Mercury Living Presence fame recorded the Osipov State Russian Folk Orchestra for the album Balalaika Favorites. Or it could be as intimate as the home studio of Hans Georg Brunner-Schwer, where he recorded the Oscar Peterson Trio’s Exclusively for My Friends albums in a series of private concerts.

The Concert in Central Park

No home stereo system I am aware of is going to provide a soundstage that will accurately replicate the actual width, height, and depth of the venues for The Concert in Central Park or the Balalaika Favorites albums.

So, in the sense of an accurate representation of a live concert reference, the soundstage heard in the home listening room will not be high-fidelity because it can’t match the size of the original venue. 

Balalaika Favorites

Yes, those performances will still be enjoyable and amazing, even though what we hear is a miniature representation of the original performance venue rather than an accurate recreation of its size.

In a larger home listening room, one might be able to hear Exclusively for My Friends recreated with a soundstage that is a fair representation of the dimensions of the venue it was recorded in, making it possible to hear the musical performance on your stereo in relative high-fidelity to the original performance.

Exclusively for My Friends

What about studio recordings? Where the studio is the venue, as on Exclusively for My Friends, where the musicians played together for a small audience in a studio, you might be able to get believable fidelity to the original height, width, and depth of the venue with your home stereo in a larger listening room.

But what about those studio albums where one musician was recorded in Paris, another in Los Angeles, and another in New York, and then the individual recordings were combined by a recording engineer to create an album?

In that sort of album the “soundstage” is an artificial construct that doesn’t represent any real venue. By definition there is no reference to refer back to, so it can’t be high-fidelity in reference to an actual venue.

When it comes to the soundstage parameter loved by audiophiles there are a limited number of albums where a home stereo has the ability to reproduce the height, width, and depth of a reference musical venue in true high-fidelity.


In audiophile terms “imaging” refers to the ability of a listener to clearly identify the sound “image” of the musicians in the three dimensions of the height, width, and depth of the soundstage.

The ability of a listener to identify the location of a sound in direction and distance is referred to as sound localization.

Sound localization has been studied extensively by scientists and engineers, and there are too many factors involved to articulate all of its aspects in a short article like this one, but allow me to point out a few interesting aspects related to sound localization.  

Sound localization occurs as a function of a sound’s three-dimensional position in width (azimuth or horizontal angle), height (elevation or vertical angle), and distance (for stationary sounds).

Sound localization in the width dimension (the horizontal angle) can be described by duplex theory, which says that as the sound from our hypothetical guitar player reaches our ears, which are in two different positions, there are small differences in arrival time and intensity that each ear hears, which allows us as a listener to localize the guitar’s sound in the width dimension.

So, if you hear the guitar in the middle of the stage it is because its sound is reaching both ears at the same time and with the same intensity. If you hear the guitar on the left of the stage its sound is reaching your left ear first and with greater intensity, and then reaching your right ear slightly later and with lower intensity. If you hear the guitar on the right of the stage it is because its sound is reaching your right ear first and with greater intensity, and then reaching your left ear slightly later and with lower intensity.
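For readers who like to see numbers, the arrival-time half of duplex theory can be roughly sketched with Woodworth's spherical-head approximation, ITD = (r / c)(θ + sin θ). This is only an illustration and not part of the discussion above: the head radius of 8.75 cm, the speed of sound of 343 m/s, and the function name are all assumptions, and the formula covers only the time cue for a distant source, not the intensity cue.

```python
import math

# Assumed values for illustration only
HEAD_RADIUS_M = 0.0875        # roughly an average adult head radius
SPEED_OF_SOUND_M_S = 343.0    # air at about 20 degrees C


def interaural_time_difference(azimuth_deg):
    """Approximate arrival-time difference between the ears, in seconds.

    Uses Woodworth's spherical-head formula for a distant source.
    azimuth_deg: 0 is straight ahead, positive is to the listener's right,
    negative is to the left; intended for angles from -90 to +90 degrees.
    A positive result means the sound reaches the right ear first.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


if __name__ == "__main__":
    for angle in (-90, -30, 0, 30, 90):
        itd_us = interaural_time_difference(angle) * 1e6
        print(f"azimuth {angle:+4d} deg -> ITD {itd_us:+7.1f} microseconds")
```

With these assumed values, a guitar dead center gives no time difference at all, while a guitar fully off to one side gives a difference on the order of two thirds of a millisecond.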

Larry playing a solo. Photo courtesy of Dr. Kannan Krishnaswami.

Sound localization in the height dimension (the vertical angle) works differently from width localization. A sound directly above or below center reaches both ears at nearly the same time and with nearly the same intensity, so we rely largely on the way the outer ears, head, and torso filter the sound, which changes its frequency balance depending on the vertical angle it arrives from.

Localization of a sound’s distance is a complicated affair, with factors such as the direct sound to reflected sound ratio, loudness, frequency, time delay, movement, and level differences, all playing a role.

All of these aspects of sound localization are occurring simultaneously as we listen to our guitar player, and our perception of the localization we hear will change depending on how far we are from the guitar player.

Let me share an example of listening to a classical trio in a small 350-seat auditorium. Because I was there early for their practice session, I was able to sit in a couple of different locations in the auditorium and compare how I heard them on the stage from each location.

When I was sitting in the front row, ten or so feet away from the performers, who were seated in a row horizontally across the stage in front of me, I could close my eyes and fairly accurately point to their locations across the width of the stage.  

The further away from the musicians I sat, the less accurately I was able to identify their exact positions on the stage, and at my furthest point from them their musical “images” on the stage became more diffuse and harder to pin down.

Why is that? My ability to localize the individual instruments changed as the result of the distance, and as I got further away there was a decrease in loudness, an attenuation of high frequencies, the ratio of the direct signal to the reverberated signal changed (more reverberation), as well as a decreased horizontal angle to my ears for the width dimension. All of these factors worked together to reduce my ability to localize their exact position on the stage.
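Two of those factors, the drop in loudness and the shrinking horizontal angle, are easy to put rough numbers on. The sketch below is a simplification under stated assumptions: it treats the direct sound as falling off with the inverse-square law (about 6 dB per doubling of distance, ignoring the room's reverberation), and the 6-foot spacing between performers and the 60-foot back seat are hypothetical figures, not measurements from that auditorium.

```python
import math


def direct_level_change_db(near_distance, far_distance):
    """Change in direct-sound level when moving from near to far.

    Assumes a simple inverse-square falloff (free field, direct sound only).
    Both distances must use the same unit. A negative result means quieter.
    """
    return 20.0 * math.log10(near_distance / far_distance)


def subtended_angle_deg(separation, distance):
    """Horizontal angle between two performers as seen by a listener
    seated on the center line, 'distance' away from the pair."""
    return math.degrees(2.0 * math.atan((separation / 2.0) / distance))


if __name__ == "__main__":
    separation_ft = 6.0  # hypothetical spacing between the outer two players
    for seat_ft in (10.0, 30.0, 60.0):
        angle = subtended_angle_deg(separation_ft, seat_ft)
        level = direct_level_change_db(10.0, seat_ft)
        print(f"{seat_ft:4.0f} ft: angle {angle:5.1f} deg, "
              f"direct level {level:+5.1f} dB relative to 10 ft")
```

Under these assumptions the performers span a wide angle from the front row and only a few degrees from the back of the room, which is consistent with the images becoming harder to separate as I moved away.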

With live music our ability to localize instruments at a venue will be altered depending on how far away we are sitting from the musicians, and our ability to accurately localize the sound of a musician will decrease with distance.

When listening to recorded music our ability to localize the positions of the musicians will be influenced by the type and number of microphones used, and where they are placed to record the musical performance.

If we compare our live music reference to our recorded music listening experience, how well does the fidelity of our recorded music to the live music reference hold up?

Given that our ability to localize sound depends on distance (and other factors) for live music, and our recorded music experience is influenced by a number of microphone factors, they are rather different experiences in many cases, so fidelity to the live event may be relatively low.


There’s another term that audiophiles use, “soundspace”, which describes the sense of spaciousness of a venue, or the sense of spaciousness in a recording of a musical performance.

The sense of spaciousness in a music venue can be natural, resulting from the sound of a vocal or instrument reflecting off a “live” surface, and then decaying as the sound is absorbed in the room by seating, people, and air.

A reverberant old cathedral like York Minster Cathedral could be an example of this kind of space.

Evensong in York Minster Cathedral. Photo by Allan Engelhardt, shared under a Creative Commons license via Wikipedia.

This sense of spaciousness is due to reverberation. For a concert auditorium, a reverberation time of around 2 seconds is often considered optimum, although the ideal value depends on the size of the room and the kind of music being performed.
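As a rough illustration of where a reverberation time comes from, Sabine's classic formula estimates it from the room's volume and how much sound its surfaces absorb: RT60 ≈ 0.161 × V / A, with V in cubic meters and A the total absorption in square meters. The sketch below is only that, a sketch: the function name and the hall dimensions and absorption coefficients in the example are made up for illustration, and real halls need measurement rather than a one-line estimate.

```python
def sabine_rt60(volume_m3, surfaces):
    """Estimate reverberation time (seconds) with Sabine's formula.

    volume_m3: room volume in cubic meters.
    surfaces: iterable of (area_m2, absorption_coefficient) pairs, where the
    coefficient runs from 0 (fully reflective) to 1 (fully absorptive).
    """
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    if total_absorption <= 0:
        raise ValueError("total absorption must be positive")
    return 0.161 * volume_m3 / total_absorption


if __name__ == "__main__":
    # Hypothetical small hall: all figures are illustrative assumptions.
    hall_volume = 3000.0  # cubic meters
    hall_surfaces = [
        (600.0, 0.05),   # hard walls and ceiling, quite reflective
        (300.0, 0.60),   # occupied seating, quite absorptive
        (200.0, 0.10),   # wooden stage and floor
    ]
    print(f"Estimated RT60: {sabine_rt60(hall_volume, hall_surfaces):.2f} s")
```

The design point is simply that more volume lengthens the reverberation, and more absorption (seating, people, drapes) shortens it.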

If a particular venue doesn’t have a reverberation time that is optimum, then it can be artificially created by positioning microphones so that they capture a combination of direct and reflected sound that will give you the desired reverberation time.

One time while visiting my parents in Boise, Idaho, we went to hear a concert performed by Dave Brubeck at the Cathedral of the Rockies. Dave’s daughter and son-in-law were both involved in music at the cathedral as their day jobs, with one playing the pipe organ and the other directing the choir, and their children were involved as musicians as well. That is why Dave chose the cathedral for the concert: it was a family get-together that also served as a benefit concert for the church.

Prior to the start of the concert they were moving microphones around until they achieved a combination of direct sound and reflected sound that provided the desired reverberation time in the main chapel of the cathedral. You can read more about this kind of approach to achieving a sense of spaciousness HERE.  

So, the “soundspace” effect of spaciousness that audiophiles enjoy in recorded music can actually exist at a venue, and when it does then it can be a reference to establish fidelity.

When spaciousness is artificially created by adjusting the reverberation time for a recording through the placement of microphones (or by other means), then it is no longer representative of the actual spaciousness heard at a venue, and would not be considered high-fidelity in reference back to our venue’s actual sound, even though it may sound more pleasing to the listener.


Now let’s talk about the term “resolution” as audiophiles refer to it, where it describes the amount of recorded detail or nuance that you can aurally perceive.

How much resolution of detail you’ll hear from our un-amplified guitarist will depend in part on how close you are sitting to the guitarist.

If you are sitting close to our guitarist you’ll hear a certain amount of resolved detail, and the further away you sit, the less detail and nuance you’ll hear.

With distance, our perception of the amount of detail and nuance heard from a musician and an instrument changes in the same sort of way that we discussed for image localization. As we move further away there is a decrease in loudness, so the softer sounds of nuance can diminish; there is an attenuation of high frequencies, so high-frequency detail diminishes; and the ratio of the direct signal to the reverberated signal shifts toward more reverberation. So, the type and amount of detail that we perceive changes with distance.
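The shift toward more reverberation can be sketched with the idea of a room's "critical distance", the distance at which the direct sound and the reverberant sound are about equally loud. A common rule-of-thumb approximation is d_c ≈ 0.057 × √(Q × V / RT60), in meters. The sketch below assumes that approximation, an idealized uniform reverberant field, and an omnidirectional source (Q = 1); the room volume and reverberation time in the example are hypothetical, and the function names are mine.

```python
import math


def critical_distance_m(volume_m3, rt60_s, directivity=1.0):
    """Rule-of-thumb critical distance in meters.

    volume_m3: room volume in cubic meters.
    rt60_s: reverberation time in seconds.
    directivity: source directivity factor Q (1.0 = omnidirectional, assumed).
    """
    return 0.057 * math.sqrt(directivity * volume_m3 / rt60_s)


def direct_to_reverberant_db(distance_m, critical_m):
    """Direct-to-reverberant ratio in dB under the idealized model where
    direct sound follows the inverse-square law and the reverberant level
    is the same everywhere. 0 dB at the critical distance."""
    return 20.0 * math.log10(critical_m / distance_m)


if __name__ == "__main__":
    # Hypothetical small hall: illustrative values only.
    d_c = critical_distance_m(volume_m3=3000.0, rt60_s=1.5)
    print(f"Critical distance: about {d_c:.1f} m")
    for seat_m in (1.0, 3.0, 9.0):
        ratio = direct_to_reverberant_db(seat_m, d_c)
        print(f"listener at {seat_m:.0f} m: direct/reverberant {ratio:+.1f} dB")
```

In this idealized picture a listener inside the critical distance hears mostly the instrument itself, while a listener well beyond it hears mostly the room, which is one way of describing why fine detail seems to fade with distance.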

In terms of accuracy (a given measurement’s nearness to a known reference), the amount of resolved detail our listeners will perceive will be dependent upon the distance from the guitar, and there will be one “real” amount of resolution at each distance from the guitarist.

At one distance from the guitarist, say 10 feet, the listeners will hear a particular amount and kind of resolved detail, and at 30 feet they will hear a different kind and amount of resolved detail, and so forth. 

Now let’s add in another complication, that of the recording process.

Microphone type and placement will have quite a large effect on the amount of resolution that will be perceived by a listener of a recording, so we could say that high-fidelity “resolution” for a recording is producing the same amount of detail in a recording at a given distance that our hypothetical acoustic listeners would hear at the same distance.

If a microphone is placed closer to our musician so as to produce much more detail than our acoustic listeners would hear at a given listening distance, or vice versa, then you could not really consider the resulting recording to be high-fidelity in relation to the original acoustic music event used as the reference.

Let’s use our guitarist as an example. If you place one microphone very near the sound hole, and one microphone very near the neck, you’ll hear a lot more detail and nuance than a person would perceive listening acoustically from 10 or 20 feet away, so what you hear would be something different than high-fidelity.  


Let’s look at one last term that audiophiles like to use, “transparency”, which refers to how far a listener can aurally “see” into a recording.

I think audiophiles’ “transparency” is sort of the opposite of what a recording engineer is referring to when they say a recording sounds “muddy”, which means a recording lacks clarity and detail and has poor separation between instruments. A transparent recording, then, is one that possesses clarity, detail, and good separation of instruments.

In other words, audiophile “transparency” actually refers to a combination of factors, like resolution, spaciousness, imaging, and soundstage.

So, does transparency have any basis in live music? I would say a qualified “yes”, but audiophiles rarely think about the transparency traits of live music, and when audiophiles refer to transparency they are generally talking about a recording and how a stereo reproduces it.

However, that sense of transparency will depend on the acoustics of the music venue, the reverberation time, the amount of background noise, the placement of the musicians, the equalization of the sound, as well as all those factors we discussed for resolution, spaciousness, imaging, and soundstage, and so there is a link to reality when talking about transparency.

A recording engineer might say muddiness in a recording is caused by, “… too many different sources piling up and masking each other in the critical bass and low mid-range areas”, making it more of a frequency issue, which you can read more about in the discussion HERE.

High Fidelity

Ok, I’ve muddied the waters enough by musing about the relationship of high-fidelity to audiophile-style sonics like soundstage, imaging, resolution, soundspace, and transparency, but let’s cut to the chase and consider what it means for us as listeners who desire high-fidelity.

My take on high-fidelity for musicality traits versus audiophile-style sonics traits is that for the musicality traits of timbre, tone color, melody, harmony, rhythm, tempo, and dynamics, there is an actual definable reference for our stereos to reproduce, and how close they come to reproducing those traits of live music determines whether they are high-fidelity or not.

In terms of accuracy the “error bars” are relatively small, and as per my example in Part 1, if a stereo makes a steel string guitar sound like a nylon string guitar it could not be considered high-fidelity.

I will posit that while the non-musical audiophile-style sonic traits of soundstage, imaging, resolution, soundspace, and transparency, actually do exist in live music to some extent, accurately reproducing them in recorded music is very difficult, and there will be few - if any - stereos that can realistically achieve high-fidelity to the live music, and then only on a very limited number of recordings.

As we discussed earlier, no home stereo will be able to achieve high-fidelity to the actual height, width, and depth dimensions of the soundstages when playing back The Concert in Central Park or Balalaika Favorites albums, for example, although for a smaller, more intimate venue as recorded in Exclusively for My Friends a reasonable approximation might be possible in a larger listening room.

So, for the soundstage criterion, most of the time we will be hearing “miniatures” of the recording venue, and we have to accept that they will not be high-fidelity in the literal sense.

The same holds true for imaging, where we will be hearing “miniatures” of the actual size of the musicians and instruments on many albums, which would not be considered high-fidelity to the actual size of instruments and musicians in life.

Achieving high-fidelity in resolution is more likely than with soundstage and imaging, but if a stereo system presents more or less detail than the believable range you would hear in live music, then it would not be considered high-fidelity.

Well, you get the idea: high-fidelity in terms of audiophile-style sonics is less likely than it is with musicality.

I posit that for most recordings the idea of audiophile-style sonics being high-fidelity is really not applicable. Instead, it would better serve us as listeners to think of those recordings as works of sonic art that deviate from reality and from high-fidelity, and to admire and enjoy them for what they are: works of audio art created by talented recording and mastering engineers, the best of which can often be as emotionally evocative and enjoyable as live music.

So, is hifi system voicing a matter of taste?

So, to answer the question, "Is hifi system voicing a matter of taste?", I would say "In part."

For the fundamentals of musicality, I believe a worthy voicing goal is to attempt to achieve high-fidelity to live music as closely as possible.

For audiophile-style sonics, since high-fidelity to live music is already highly unlikely, the voicing goal is more a matter of taste, with my personal preference being a presentation that is natural sounding and consonant with live music to the degree possible, rather than calling attention to those traits by exaggerating them.

In Part 3 of these musings on high-fidelity, I’ll explore what affects a listener’s perceptions of what they hear. Not every listener hears the same thing when listening to a common source of sound. Sometimes one listener hears things very differently from what most other people hear from that source of sound or music. Other times a listener can’t perceive any differences at all when listening to something, even when most others can. Why is that?

As always, thanks for stopping by, and may the tone be with you!
