Sound Image vs. Soundstage — Which Type of Speaker (& System) Do You Prefer?

In audio circles, when people try to put the sound quality of equipment into words, speakers and audio systems have long been described, as a matter of convenience, as either “sound image type” or “soundstage type.” These two categories cannot be divided by any sharp line, but by encountering a wide variety of speakers and audio components, one gradually gets a feel for the tendencies involved. So this time I want to try explaining what a sound image is and what a soundstage is, as a shared understanding for audio enthusiasts.

audiopro Image12 Vienna Acoustics Mozart Signature T-2

What is a Sound Image?

A sound image refers to the phenomenon in a stereo audio system where a point or mass with a certain sense of presence appears between the left and right speakers, serving as a visual substitute for instruments, vocals, and so on. This is what is called the sound image, or stereo image. In the case of a simple recording with equal left-right balance and a predominance of direct sound, the basic expectation is that the voice or instrument will be positioned at the centre point between the two speakers — primarily along the line connecting the two tweeters.

Stereo image: sound image localisation, centre localisation
Centre imaging — the visual concept of a point source

In recordings of live performances that contain a large proportion of indirect sound, the instruments no longer all appear at the centre: each one’s position shifts left or right according to its placement, and a sense of front-to-back depth also becomes visually apparent. With an instrument like a piano, which is not a point source to begin with (the point where the sound originates changes with pitch), the imaging shifts left or right and the sense of depth varies depending on where the recording microphone is placed. With an instrument like the violin, where the performer moves their body considerably while playing, the image may sometimes waver and shift visually as well.

Even in electronic music where no acoustic source exists, it is possible during mixing and mastering to alter the L/R balance of vocals and individual instruments, deliberately spreading the images left and right so that sound sources do not overlap. The size of the resulting sound image depends on the speaker and the system, but those considered to have good imaging in audiophile terms tend strongly towards a pinpoint, near-point-source quality. When the image is slightly broader, people describe it as being able to make out the size of the mouth; when it sounds still larger, expressions like “life-sized such-and-such” are used by way of analogy.
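
As a rough illustration of how an L/R balance change moves an image between the speakers, the sketch below uses a constant-power pan law, one common convention in mixing software. It is an assumption chosen for illustration, not a description of how any particular recording was made; the function name `constant_power_pan` and the position scale from -1.0 to +1.0 are hypothetical.

```python
import math

def constant_power_pan(sample, position):
    """Split one mono sample into a (left, right) pair.

    position: -1.0 = hard left, 0.0 = centre, +1.0 = hard right
    (hypothetical scale chosen for this sketch).
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and 1.0")
    # Map the position onto an angle from 0 (left) to pi/2 (right).
    angle = (position + 1.0) * math.pi / 4.0
    # cos/sin gains keep left^2 + right^2 constant, so the perceived
    # loudness stays roughly steady as the image moves across the stage.
    left = sample * math.cos(angle)
    right = sample * math.sin(angle)
    return left, right

# Equal gains in both channels place the image at the centre point.
centre_left, centre_right = constant_power_pan(1.0, 0.0)
print(centre_left, centre_right)
```

With equal left and right levels the image sits at the centre; moving `position` towards either end shifts it towards that speaker.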

With a reasonably accurate setup in a modern pure audio system, this kind of visual sound image does appear between the two speakers — it is one of the basics of speaker placement, and a fundamental milestone in 2-channel stereo or multi-channel surround reproduction. In headphone systems, the sound image in most cases localises as small images within and around the head.

What is a Soundstage?

A soundstage refers not only to the sound images between the left and right speakers, but to the entire stage on which the music is performed appearing — principally between the two speakers — as a space. The size of the stage imaging that is created here depends on the directivity of the speakers and the choice and placement of audio components, and in some cases it is possible to extend it well beyond the space between the two speakers, spreading broadly in all directions — up, down, left, right, and in depth. In the case of a live recording, this sense of spatial spread is cultivated by the reverberations of the concert hall or live stage that was captured in the recording. It is also possible to recreate a soundstage artificially after the fact — making it sound as though a wide stage is actually there — by making full use of DSP during playback, or electrical echo and effects processing applied during recording.
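
One simple way DSP can change the apparent width of a stage is mid/side processing: the signal common to both channels (mid) is left alone, and the difference between the channels (side) is scaled. The sketch below is a minimal illustration under that assumption, not the method used by any particular product; `adjust_stereo_width` and its `width` parameter are hypothetical names.

```python
def adjust_stereo_width(left, right, width):
    """Scale the side (difference) signal of a stereo pair.

    left, right: equal-length lists of samples
    width: 1.0 leaves the signal unchanged, 0.0 collapses it to mono,
           values above 1.0 exaggerate the left/right difference.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2.0            # what both channels share
        side = (l - r) / 2.0 * width   # what differs, scaled by width
        out_left.append(mid + side)
        out_right.append(mid - side)
    return out_left, out_right

# Hypothetical two-sample signal: the first sample is hard left,
# the second is already centred and so is unaffected by width.
wide_left, wide_right = adjust_stereo_width([1.0, 0.5], [0.0, 0.5], 2.0)
print(wide_left, wide_right)
```

Only material that differs between the channels is affected, which is why centred voices tend to stay put while the surrounding ambience spreads.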

Incidentally, reproducing this stage image with greater clarity, accuracy, and breadth has become the implicit trend in modern pure audio and high-end audio. It is not possible to draw a clear line between soundstage type and sound image type, but please think of it as representing a relative tendency along a single continuum.

How Do You Read 音場 — “Onjō” or “Onba”?

The word 音場 (soundstage) can be read two ways, “onjō” and “onba,” and neither is wrong. That said, personally I am more familiar with “onjō.” People have always said “onjōkan” rather than “onbakan,” haven’t they? I think this is something that settled naturally from years of “onjō” being the conventional reading in audio publications and the like.

There is also a difference in nuance between “onba,” used as an engineering term or in recording environments with a concrete, real-world meaning, and “onjō,” used when putting into words a sensory, invisible audiophile concept. There is good reason to maintain that distinction, and I had always assumed it was why “onjō” naturally won out. Since the word refers to a soundstage whose visual boundaries are ambiguous, I personally feel that “onjō,” read as a single flowing compound rather than stopping short at “on-ba,” better suits the atmosphere and feel of the concept. Besides, “onba” as an expression is something I only started hearing in recent years, on YouTube and the like.

What is a Sound Image Type Speaker?

This refers to a speaker that, when reproducing music through a stereo audio system, prioritises the direct sound components contained within the music — and sometimes reproduces them in a way that further emphasises the direct sound. Its characteristic is that it renders the visual sound image with density and crispness, giving a strong sense of presence and tangibility to the performance right in front of you.

JBL L100 Classic 75

The advantage of a sound image type audio system is that, assuming a given amount of acoustic energy, more of that energy is directed towards projecting the sound image — making it easier to achieve a musical presentation that feels alive and tangible. Because the listener’s attention is drawn to the direct aspects of the performance, the musical image is easier to grasp, and it becomes easier to concentrate on the music itself. On the technical side, approaches vary widely: using drivers with sharp directivity and excellent response and control, effectively absorbing sound within the cabinet, and eliminating unwanted resonance and box colouration as much as possible, among others.

When the design philosophy is taken to the extreme of putting absolutely everything into projecting the direct sound, the trade-off for the speaker’s assertive emphasis on direct sound can be that fine indirect sounds and reverberation components become relatively subdued — resulting in a sonic character that is thin in spatial and atmospheric quality and lacking in musical detail. This tendency is liable to be particularly pronounced in lower-priced products where resolution is insufficient.

What is a Soundstage Type Speaker?

Conversely, by emphasising the indirect sound components or slowing the decay of reverb, the reverberant elements contained within the music itself are amplified, and this is what gives rise to speakers known as soundstage type — ones that produce a rich sense of resonance across a wide space. Technically speaking, designs that particularly emphasise this aspect are frequently found in satellite speakers for multi-channel surround systems, and similarly, high-end audio speakers with multiple drive units are also aimed at enlarging the stage image through acoustic amplification created by subtle time delays and phase shifts between the drivers. Drivers with wide directivity, the classic bass-reflex port design that amplifies bass with a time delay, and the addition of out-of-phase components resulting from more complex driver arrangements are all means of creating an artificial sense of soundstage. A sonic approach that uses a resonant cabinet with minimal damping material to exploit box colouration is also a technical direction commonly found in soundstage type speakers.

Vienna Acoustics T-2

The weakness of the soundstage type is that, assuming the same amount of acoustic energy, a large share of that energy goes not to the sound image but to the reverberation components needed for soundstage reproduction, and the trade-off is that the tangibility of the direct sound tends to become thin. In designs that put everything into emphasising reverberation, as with some satellite speakers, the sound image can seem relatively weak and hollow. A similar sonic tendency can also be observed when an amplifier with insufficient power reserves is connected to the speaker.

Also, I have used speakers as the representative example here because of the large influence they have, but in practice, each audio component and accessory has, to a greater or lesser extent, a vector leaning more towards soundstage or more towards sound image. For this reason, the expressive tendency of the whole system can also be tilted more towards soundstage, or more towards sound image, depending on the overall system setup and combination of components.

A Brief History of Soundstage Reproduction

Concert hall

From the late 1960s through the 1980s, soundstage reproduction was already a critically important criterion of evaluation among classical music audiophiles. In a certain sense, it seems that the pursuit of how faithfully to recreate the atmosphere of a concert hall was even more intense than it is today.

· The classification of soundstage type and sound image type
· Reproduction of hall tone
· Instrument localisation and sense of depth
· Natural spread of the stereo image
· The sense of presence — as though the performers were actually there

It is surprising that such concepts already existed at that time, but on reflection it makes sense: the many celebrated classical recordings from around 1960 onwards, when stereo recording was still young, show even on today’s audio systems that three-dimensional sonic spaces were being captured clearly and in detail.

In terms of concrete approaches to recording and reproduction, people were already experimenting with things such as:

· Exploring speaker placement, angle, and listening position
· The equilateral triangle listening position
· How to create a three-dimensional soundstage from simple two-channel reproduction
· A tendency among classical listeners to prefer soundstage type speakers with a broad spread
· The importance placed on the spatial openness that valve amplifiers bring

All of this trial and error was already underway at the time. As a result, minimal-microphone stereo recording techniques such as the Decca Tree came to be highly regarded, and in the 1970s there was even a period when quadraphonic sound, meaning four-channel systems using formats such as the discrete CD-4 and the matrix SQ, was being tried out. In the analogue record era, the recording format itself was directly tied to soundstage reproduction, and labels like Decca and Mercury Living Presence poured their passion into capturing the atmosphere of the concert hall, from microphone placement through to the entirety of their recording technique. What is particularly interesting is the possibility that, before the advent of digital, sensitivity to what one might call “the spatial continuity of analogue” may have been keener.

Classical music listeners of the 1960s and 70s may not have used the word “soundstage” very much, but they were pursuing essentially the same thing. The expressions used at the time included things like:

“Hall tone,” “space,” “three-dimensionality”
“Presence,” “realism”
“Spread of the sound,” “depth”

and so on. Even in audio criticism of the day, expressions like “depth of the soundstage” and “clarity of the space” were already in use, and these are essentially synonymous with what we now call “soundstage quality.”

In stereo reproduction of classical music, given the relationship between the concert hall and the performers arranged within it, the more faithful the recording, the more naturally it leads towards reproducing the stage without exaggeration. In jazz and pop recordings, however, the stage is inherently small and close, and from early on it was normal for the stereo image to be exaggerated or processed. From this emerges a different approach to reproduction: a stereo system oriented not towards recreating soundstage or sound image, but towards immersing the listener in the heat and energy of the music and the performance. The style of playing large-diameter speakers at close range, once pursued by audiophiles of an earlier era, is one approach optimised for getting the most out of that kind of source material.

From the Arrival of Digital Recording to the Present Day

Around 1980, digital recording — newly arrived in the recording industry — became mainstream in no time at all, and with the introduction of the CD, digitally recorded albums began to spread rapidly. In the analogue record era, recreating the spatial sound of a recording studio required considerable equipment, and was a high hurdle even for enthusiasts. But as digital equipment became more widespread from that point on, it gradually became known that even relatively affordable home audio systems were capable of soundstage reproduction.

Recording studio

From around this time, a shift began: away from the style of receiving the direct sound head-on from the large-diameter speakers that had been the mainstream up to that point, such as JBL, Altec, and Tannoy, and towards a sound in which an ordered stage floats up from empty space. This new approach began to permeate a wider audience of audio enthusiasts through in-store demonstrations and the pages of audio magazines. With the consumer market’s recognition of high phase-accuracy monitor speakers represented by the B&W Matrix, along with Yamaha’s DSP technology, Dolby Surround, Bose, and others, three-dimensional soundstage reproduction gradually came to be celebrated as the new standard of high fidelity that the latest audio equipment of the day had to offer.

It was perhaps from the 1990s onwards that a consciously soundstage-first approach became conspicuous in pure audio. In the 90s the market still had a mixture of both old and new design philosophies, but from my own experience, I have the impression that around the year 2000, the new products in consumer audio shifted quite suddenly from sound image type to soundstage type speakers and components.

In modern pure audio, to truly draw out the stereo imaging potential that audio equipment is capable of, one needs approaches that in a sense place the equipment ahead of the living environment — such as pulling both speakers well away from the rear wall to avoid a collapse of the soundstage through diffuse reflections, and not placing a television or audio rack between them, and so forth. But once you go that far, it inevitably diverges from the approach I have been pursuing for years here at the “miniature garden” AUDIO STYLE — that of bringing the essence of the concert hall into the living room. Because it inverts the natural hierarchy between the inhabitant, the music, and the equipment in a living space, this is precisely where one must “know when enough is enough.” A moderate approach and a sense of balanced proportion matter in all things — or something to that effect, at any rate.

On the Non-existence of Sound Image and Soundstage in Live Performance

From here I want to write from the experience of someone who has attended more classical music concerts than can be counted, has performed himself, and has also experienced live jazz at close quarters.

Opera house / Piano concerto

Having written all of the above, I should point out that the concepts of sound image and soundstage are, in the first place, concepts created on the recording and playback side; they do not exist in live performance itself, which predates all of this. (I mention this deliberately, because many people are unaware of it.)

A soundstage does exist in the sense that a hall or stage has its own acoustics, merging everything into a single whole. But the clearly defined sound images that audio people talk about, and the neatly separated, well-ordered soundstage of individual instruments, are things you basically cannot experience in real, live music. In a reverberant concert hall, the direct sound is greatly blurred even at close range, and indirect sound makes up a far larger share of what reaches the audience than direct sound does. This is the reality. Strictly speaking, therefore, if one is truly seeking original sound reproduction or a faithful recreation of live performance, the very appearance of a clearly defined sound image with sharp boundaries is itself something of a nonsense to begin with.
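
The dominance of indirect sound in a hall can be estimated with the critical distance, the distance from a source at which direct and reverberant sound are equally strong. The sketch below uses the standard approximation derived from Sabine's reverberation formula. The hall volume and reverberation time are hypothetical round numbers, not measurements of any real venue, and the function name `critical_distance` is likewise an assumption for illustration.

```python
import math

def critical_distance(volume_m3, rt60_s, directivity=1.0):
    """Estimate the critical distance in metres.

    volume_m3:   room volume in cubic metres
    rt60_s:      reverberation time (RT60) in seconds
    directivity: directivity factor Q of the source (1.0 = omnidirectional)
    """
    if volume_m3 <= 0 or rt60_s <= 0 or directivity <= 0:
        raise ValueError("all arguments must be positive")
    # Sabine: equivalent absorption area A = 0.161 * V / RT60 (metric units).
    absorption_area = 0.161 * volume_m3 / rt60_s
    # Critical distance: d_c = sqrt(Q * A / (16 * pi)).
    return math.sqrt(directivity * absorption_area / (16.0 * math.pi))

# Hypothetical large hall: 20,000 cubic metres, 2.0 s reverberation time.
hall = critical_distance(20000.0, 2.0)
print(f"critical distance: {hall:.1f} m")
```

With these assumed figures the estimate comes out under 6 m, so most seats in such a hall would sit well inside the reverberant field, where indirect sound outweighs direct sound.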

Recording engineers exercise their ingenuity within the mechanisms of microphone capture, mixing, and 2-channel stereo playback to get as close as possible to the live performance at the source. So why does this artificial concept of “sound image and soundstage” exist within recorded music at all? The answer lies in the fundamental technical constraints that arise from the nature of recording technology and the mechanism of playback. At the most basic level, the method of capturing with stereo microphones and reproducing through two speakers is, by design, ill-suited to three-dimensional spatial audio reproduction. This premise needs to be understood first.

From the Flowering of Stereo Recording in the 1960s to Today’s Three-Dimensional Spatial Audio…

In recording studios, engineers have been experimenting all along — working within the technical and sonic limitations of stereo recording — with recording and mixing techniques aimed at reproducing the reality of the stage and concert hall, in order to bring recorded music as close as possible to live performance. And at some point, by making full use of multi-microphone, multi-track recording, it became possible to actively construct sound images and soundstages, and to recreate the acoustics of a hall or stage in a simulated fashion. From there, they began to realise that by further visually emphasising separation and localisation, they could conjure something for listeners that felt convincingly like a real acoustic space — an illusion.

Some recordings from that era went too far with the cut-and-paste experimentalism and the exaggeration became excessive, but in classical music recording, a style of spatial reproduction that seemed to swallow a concert hall whole became mainstream fairly early on, and by the 1960s this had already become the standard for major label recordings. There are also quite a few cases where the acoustics of what sounds like a large concert hall are actually an acoustic reconstruction from studio sessions — Abbey Road Studios being one example. The history of recording technology is, in a sense, a history of trial and error in this act of re-creation, and of the accumulated technical knowledge it has produced.

At the same time, on the listener and consumer audio side, this artificially created “precise localisation” and “separation of instruments” became the guiding principle of sound quality, and the gradually clearer and crisper sound — more vivid than live performance itself — came to be embraced and pursued as the standard of high fidelity by the audio industry and audiophiles alike. The outcome was that even the concept of “the art of playback” came to be created as a cultural construct.

When you take all of this into account, strictly speaking, a recorded work is technically no longer live performance. As a listener, one naturally wants to re-experience the essence of the original live performance as fully as possible — yet in reality, there is absolutely no escaping the various acoustic modifications and recreations brought about by recording, mixing, and speaker playback. A realistic sound image, pinpoint localisation, a three-dimensional acoustic space — all of this is ultimately the product of an electrically constructed virtual reality, layered transformation upon transformation.

This virtual-realistic audio technology, built up over years of advances in recording, playback, and equipment performance, leads us towards the modern audio equipment and high-end audio that presents music more clearly and grandly than any real acoustic sound. Yes — as an inevitable consequence of technical advancement, the balance has tilted: from drawing out the live music that was once captured in a recording, towards the pursuit of high fidelity as pseudo-recreated through acoustic technology. And with the advent of digital sources, the protagonist of audio has, slowly but surely, been replaced — from music with a living pulse, to a manufactured high fidelity.

Whether that direction is good or bad ultimately comes down to the difference in one’s position: whether one is a music listener who takes music as their starting point, or an audiophile who takes high fidelity as theirs. It may look as though both are doing the same thing through their audio equipment, but in truth the two differ in their fundamental, essential purpose. And it is in carrying this paradox — this irreconcilable contradiction and dilemma within an endless trial and error — that the audio hobby is at once sometimes painful, and sometimes a great deal of fun.

— In Summary —

That said, as someone who takes live performance as their touchstone — as a music fan first — there is a point I want to convey. An approach that pursues sound image and soundstage excessively through audio equipment is, like applying heavy retouching to a photograph that was already realistic, not something that contributes in a particularly good direction to musical reproduction. I think this is the crux of the matter for listeners whose goal is the music itself rather than the sound quality. Given the technical constraints of multi-microphone mixed recording and 2-channel stereo playback, on any decent system, some degree of sound image and soundstage will emerge regardless of what you do or don’t do. Take that as it comes, enjoy it in moderation, being careful not to obstruct the natural flow of the music, and hold on to a sense of balanced proportion. Accept the ambiguity of the world, and let yourself be carried by the natural flow that the marriage of playback space and music produces.

The more one chases that artificial, unnatural direction — the hyper-crisp, hyper-dynamic, three-dimensional re-creation of a soundstage — and the more one pursues a manufactured “high fidelity,” the further one drifts from the richness and colour of the music one originally wanted to hear. Keep this firmly in mind. Simply knowing this, I have always thought, is what allows one to remain a music lover while still enjoying the audiophile world — holding both together without contradiction.

List of comments (1)

  • Happy New Year, and it has been a while.

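    >In the current classification of speakers into “sound image type” and “soundstage type,”
    >I sense a historical and cultural background somewhat similar to
    >the “balanced connection” versus “unbalanced connection” discussion around headphones.

    I would like to keep these two separate.
    Balanced versus unbalanced is fundamentally a plain technical matter, so the two are technically different things before any question of better, worse, or preference arises. The “soundstage type” and “sound image type” labels for speakers, by contrast, put into easily understood words, as a sensory evaluation, the sonic tendencies and biases of character that result from how a product is voiced. The former is about technology; the latter is the verbalisation of subjective perception.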

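    >Words for technical thinking and for organising ideas in the pursuit of sound quality
    >I do not think they were concepts meant to divide up the ways of enjoying music itself.

    Historically, I do not think anything in particular has been divided from the 1960s to the present. Both sound image and soundstage are concepts that were already put into words in the early days of stereo, born for the purpose of sensory evaluation, but at bottom they are just one fragment of the many verbalisations created to explain invisible sound quality in writing.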

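    >Nowadays, I feel they are increasingly discussed with the nuance not of “which is correct”
    >but of “which lets you enjoy music more comfortably.”

    The answer to “which is correct” is: “it depends on the equipment and environment, on the music being listened to, and on the user’s preference.” They are relative indicators, but not opposing concepts, and reducing them to a binary of correct or incorrect is fundamentally mistaken. The same goes for “which lets you enjoy music more comfortably”: “it depends on personal preference, and on what kind of music you listen to.” They were never on opposing axes in the past, and they are not in opposition today either.

    Because technical constraints make it realistically impossible to optimise uniformly for every recording, the best answer is probably for each person, in their own environment, to pursue a system that makes the music they like sound pleasing.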

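    >In the process, the words drifted away from technology and science,
    >and as a result attention has turned too much towards “sound quality” itself,
    >so that I feel the original purpose of enjoying music may sometimes have become hard to see.

    This is the exact opposite of the world I have seen. Among audiophiles whose attention is fixed on sound quality itself, it seems to me that many are, if anything, the type who cling to technology and science. Conversely, music fans and performers with little interest in technology or science also have little interest in the skewed, unnatural sound quality that audiophile standards produce. That does not mean sound quality is unimportant to them. They care a great deal about musical expression in live performance and the sound quality it requires, and would like recorded and reproduced sound to stay as close to that side as possible.

    I have long felt a strong unease about audio pursuits that have been reduced to technical thinking, which is why I deliberately wrote that neither sound image nor soundstage clearly exists in real music. My question is: has an extreme orientation towards high sound quality not lost real music as its very footing?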

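    >the original purpose of enjoying music has become hard to see

    At the very least, simplifying living, breathing music through electrical engineering and reducing it to nothing but top-heavy, tunnel-visioned technical argument will never make anyone able to enjoy music. Music is first and foremost a world of the senses, so whether or not one can make subjective, sensory value judgements and assessments for oneself is the starting point, and it is everything.

    To enjoy music is to feel. Moreover, precisely because hearing belongs to the realm of sensibility, it can be tied to abstract words and verbalised. I also have a personal attachment to abstract expression in language, because literary expression can be one of the few means of conveying to others, even in some small measure, the ambiguous and complex workings of sound that simplified electrical technology alone cannot measure.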
