● Print the Quiet · 9 / 9

What's in the Tape That Isn't in the Score

The closer: tone is the irreducible record of a body in a room making a decision in time, and that decision has to belong to someone.

By Jason Colapietro, writing as Johnny Suede · A Suede Social long read · 11 min

This series began with Jeff Buckley standing at a microphone at Bearsville Studios in September 1993, singing a Leonard Cohen song into the air of a converted barn. It has worked through the loudness war, through the religion of the compressor, through the lost art of clean tones, through the room as instrument, through the comp-versus-take question, through analog tape and its physical constraints, through the steady lengthening of the recorded vocal chain over three decades.

The argument running underneath all eight pieces has been the same. Tone is the part of recorded music that is not in the score. The chord chart of Buckley's Hallelujah is Leonard Cohen's. The melody is Cohen's. The lyric is mostly Cohen's, filtered through John Cale's editorial choices. What is Buckley's, and the reason listeners still seek out his version in numbers that beat the songwriter's original by two to one on Spotify, is everything that is not on the lead sheet. The arpeggio Buckley invented to translate Cale's piano figure to guitar. The capo position. The 6/8 pulse. The U87 four to six inches from his mouth in a 35-by-35-foot wooden cathedral. The right hand on the strings. The half-second hesitation before "the holy or the broken Hallelujah." The slight bend on "cold" in the second verse. The thumbnail catching the wound G string at the second-verse pickup. The breath at 3:20.

None of that is on the lead sheet. All of it is the record. That is the thesis. This piece, the ninth, is the closer.

✦ ◆ ✦

The argument in one paragraph

Tone is the irreducible record of a body in a room making a decision in time. It is the part of music that is not in the score because the score, by definition, abstracts away the body, the room, the decision, and the time. The lead sheet says G major. The recording says G major as played by this human, on this instrument, at this volume, in this room, at this temperature, on this day, by these hands, in this emotional state, with this breath, at this distance from this microphone, through this preamp, onto this medium. Every one of those qualifiers is tonal information. None of them is in the score. The recording is the sum of the score plus all of the qualifiers. The qualifiers are what we call tone. The qualifiers are also, currently, the part of music that no AI system has been able to infer from the notes — because the information was never in the notes. It was in the breath.

✦ ◆ ✦

What the series has actually established

A short recap of the technical and historical points that, taken together, support the thesis. Anyone who has read the eight prior pieces can skip this section. Anyone arriving fresh: this is what the receipts say.

On dynamic range. Grace was mastered in 1994 at roughly DR11 to DR13, solidly in the healthy band. The average pop release today sits around DR8 to DR10, an improvement on the 2008 trough of DR4 to DR6 but still below the analog-era baseline. Records mastered with dynamic range intact allow the listener's nervous system to read dynamic change as emotional information. Records crushed to a flat wall remove that channel of communication. The loudness war was a forty-year wound to popular music's emotional bandwidth. Streaming-era loudness normalization flipped the incentive, but mastering practice is changing slowly.
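For readers who want to see what a number like DR11 is gesturing at, here is a minimal sketch in Python. It computes crest factor, the ratio of peak level to RMS level in decibels, which is only a rough cousin of the DR figures quoted above and not the algorithm any published DR meter uses. The function names and the test signals are illustrative assumptions, not measurements of Grace or any other record.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB. A rough proxy for dynamic range, not a DR meter."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if peak == 0 or rms == 0:
        raise ValueError("a silent signal has no defined crest factor")
    return 20 * math.log10(peak / rms)

def test_tone(n=1000, cycles=10):
    """Hypothetical test signal: a full-scale sine, whole cycles only."""
    return [math.sin(2 * math.pi * cycles * i / n) for i in range(n)]

def crush(samples, gain):
    """Push the level up, then clip at full scale: a crude stand-in for a hard limiter."""
    return [max(-1.0, min(1.0, gain * s)) for s in samples]

if __name__ == "__main__":
    tone = test_tone()
    print("untouched:", round(crest_factor_db(tone), 2), "dB")
    print("crushed:  ", round(crest_factor_db(crush(tone, 4.0)), 2), "dB")
```

The untouched sine sits near 3 dB. Pushed into the ceiling, the same tone measures lower, because the peaks stop moving while the average keeps rising. That is the loudness war in miniature.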

On compression. Compression, used well, is the throat of the recording: the tool that most directly shapes how the body of a sound arrives at the listener's body. Used badly, it removes that body. The difference between breath and crush is a difference of judgment, not of equipment. Andy Wallace pulled two to four decibels of gain reduction on the Grace master bus. Modern pop pulls six to ten decibels on the vocal alone before the signal reaches the master bus.
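The gain-reduction figures above are easier to feel with the arithmetic in front of you. What follows is a sketch of the textbook hard-knee static curve, assuming a compressor described only by a threshold and a ratio. The specific threshold and ratio values are hypothetical, chosen to land inside the ranges quoted, and are not Wallace's settings or anyone else's.

```python
def gain_reduction_db(level_db, threshold_db, ratio):
    """Static hard-knee gain reduction in dB for a signal at level_db.

    Below the threshold nothing happens. Above it, every `ratio` dB of input
    yields 1 dB of output, so the reduction is the overshoot times (1 - 1/ratio).
    """
    if ratio < 1:
        raise ValueError("ratio must be 1 or greater")
    overshoot = level_db - threshold_db
    if overshoot <= 0:
        return 0.0
    return overshoot * (1.0 - 1.0 / ratio)

if __name__ == "__main__":
    # Hypothetical gentle bus setting: 2:1, signal 6 dB over the threshold.
    print(gain_reduction_db(level_db=-4.0, threshold_db=-10.0, ratio=2.0))
    # Hypothetical heavy vocal setting: 4:1, signal 12 dB over the threshold.
    print(gain_reduction_db(level_db=-6.0, threshold_db=-18.0, ratio=4.0))
```

Under those assumed settings the gentle case removes 3 dB and the heavy case removes 9 dB. The equipment is the same formula both times; the judgment is in the two numbers you feed it.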

On clean tones. Clean is the courtroom. It exposes everything the player does, which is why mediocre players reach immediately for distortion. The clean Telecaster on Hallelujah is the precondition for the take. If the tone had been dirty, the same chords would not have done what they did.

On the room. Bearsville Studio A is on the record as much as the Telecaster or the U87. The 35-foot ceiling, the wooden envelope, the early reflections, the late tail — these are physical conditions that no plugin reverb completely captures because no plugin reverb is a coupled physical system. Bedroom production has, by necessity, replaced room behavior with software. The software is genuinely close. It is not the same.

On takes. Buckley cut more than twenty full performances of Hallelujah at Bearsville. Wallace assembled the released master from three to five of them. The comp was at the line and verse level, not the syllable level. The causal chain of phrasing inside each take is intact. That is structurally different from modern phoneme comping, which breaks the causal chain even when the result is technically perfect. Listeners cannot articulate the difference. Their nervous systems read it.

On tape. Magnetic tape is itself a multi-stage automatic processing chain (soft saturation, low-order harmonic generation, gentle HF rolloff at hot levels, low-end head-bump, micro pitch modulation) applied to nearly every track of nearly every record from 1955 to roughly 1995, for free, without an engineer choosing it. What we call "the warm sound of analog records" is the audible residue of that chain. The transition to digital in the late 1990s removed the chain. Modern tape emulation plugins recover 90 to 95 percent of that residue.
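One link in that chain, soft saturation producing low-order harmonics, can be sketched in a few lines. This is not a model of tape. It assumes a plain tanh curve as a stand-in for any soft saturator, and the drive value, the test tone, and the function names are all illustrative. A symmetric curve like tanh adds odd harmonics only, and whether that matches a given machine is outside what this sketch can say.

```python
import math

def test_tone(n=1000, cycles=10):
    """Hypothetical test signal: a full-scale sine, whole cycles only."""
    return [math.sin(2 * math.pi * cycles * i / n) for i in range(n)]

def soft_saturate(samples, drive=2.0):
    """tanh waveshaper, normalized so a full-scale input stays at full scale."""
    return [math.tanh(drive * s) / math.tanh(drive) for s in samples]

def harmonic_amplitude(samples, harmonic, cycles=10):
    """Amplitude of one harmonic of the test tone, via a single DFT bin."""
    n = len(samples)
    k = harmonic * cycles
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / n

if __name__ == "__main__":
    clean = test_tone()
    warm = soft_saturate(clean)
    for h in (1, 2, 3):
        print(h, round(harmonic_amplitude(clean, h), 4), round(harmonic_amplitude(warm, h), 4))
```

The clean tone has energy only at the fundamental. The saturated one grows a third harmonic that was never in the input, which is the point: the medium adds information the score did not contain.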

On the vocal chain. Buckley's 1994 chain had four to five stages between mouth and master. A modern major-label vocal chain has fifteen or more. Each added stage trades fidelity-to-source for control. The aggregate cost is the audible body of the performer.

These are the receipts. Taken together, they describe a forty-year set of decisions in the recording industry that have, on net, removed body from records and replaced it with surface. The body is the tone. The surface is the chain.

✦ ◆ ✦

What AI is good at, and what it currently isn't

A serious version of the AI conversation, without the marketing voice or the doomerism.

What current AI music systems are genuinely good at: chord progressions, melodic plausibility, surface timbre approximation, lyric generation, structural arrangement (intro, verse, chorus, bridge). The 2025 work from Suno, Udio, Riffusion, and similar systems is, by any reasonable measure, excellent at the score. If the score is what you want — a competent chord progression, a passable melody, a serviceable lyric, a finished-sounding production — you can have one in three seconds.

What current AI music systems are noticeably less good at: the qualifiers. The body in the room. The decision in time. The breath at 3:20. AI vocals generated as of 2026 tend to be locally plausible at the phrase level and globally disconnected — they have not learned the causal chain across phrases, the way line four of a verse is a response to how line three just felt. AI guitar tones, especially clean ones, tend to sound like averages of training data rather than like a specific player in a specific room with a specific instrument. AI mastering defaults to brick-walled output because the training data taught it that brick-walled output is what finished records sound like.

The shorthand: AI is excellent at what is in the score and approximate at what is not in the score. The thing it is approximate at is what we have been calling, for nine pieces, tone.

Will it stay that way? Probably not. The models are improving fast, and the gap is narrowing. By 2030, AI systems will almost certainly produce vocals with convincing causal chains across phrases. They will almost certainly produce guitar tones that are indistinguishable from a specific player in a specific room. They will almost certainly handle dynamic mastering with the judgment of a skilled engineer. The technical limitation is real and it is also temporary.

But the interesting question is not the technical one. The interesting question is what tone means when it can be inferred from the notes — when the qualifiers can be generated, in a plausible way, by the same system that wrote the score.

✦ ◆ ✦

What tone means when the body is optional

Here is the position this series will end on, for the record.

Tone is the irreducible record of a body in a room making a decision in time. When the body, the room, and the decision can be synthesized from training data, tone is still tone — but the question of which tone, and whose, becomes a different question. It becomes a question of attribution, provenance, and chosen constraint, rather than a question of physical capture.

This is a real shift, not a marketing one. The 1994 Bearsville recording of Hallelujah could only have been made by Jeff Buckley, in that room, on that day. The 2026 AI-generated recording of Hallelujah in the Buckley style could be made by anyone, anywhere, with a prompt. The technical capture is no longer the bottleneck. The bottleneck is now the decision: whose body, whose room, whose breath, whose constraint.

What this series has tried to argue — and what we, at Suede Labs, think matters about it — is that the decision is the thing that has to be preserved, attributed, and honored, because the body that made it is the source of the tone. If we let the decision be untraceable, the tone becomes generic. If we honor the decision — if we trace it back to a specific person, in a specific moment, choosing specific constraints — the tone remains tone, even when it is synthesized.

In practice, this means three things:

One: the people who made the decisions deserve attribution. Cohen wrote the song. Cale chose the lyric edit. Buckley invented the arpeggio. Wallace caught the take. Each of those decisions is a contribution to the record we hear, and each of those contributions should be tracked. This is what programmable IP and creator-ownership infrastructure are for, in the most honest version of why they matter. Not as a token mechanism. As a way to keep the line from a piece of music back to the bodies that made the decisions inside it.

Two: constraints should be chosen, not erased. The 1994 chain had four to five stages because Wallace chose a constraint set. The 2026 chain has fifteen or more because most of those stages were added without anyone choosing them. They accreted as defaults, as habits, as algorithmic suggestions accepted without consideration. The discipline of choosing constraints, the way Jack White chose tape and the way Daptone chose all-analog, is itself a tonal act. Every default in your DAW that you accept without thinking is a small piece of tonal authorship you have given away. Every default you change is a piece of tonal authorship you have kept.

Three: the records that survive will be the ones that had bodies in rooms making decisions. This is empirical, not aspirational. The records that last thirty years and become the records people put on at weddings and funerals are records with audible decisions inside them. Grace. Pink Moon. The Nightfly. Court and Spark. Astral Weeks. Blue. What's Going On. Revolver. The frictionless, decision-free, AI-defaulted track is unlikely to be the one anyone returns to in 2056. The track with a body audible inside it — synthetic or organic — is the one that has a chance.

✦ ◆ ✦

A small confession

Let me be plain about something the previous pieces have danced around.

This series has a position. The position is not that AI music is bad. The position is that the part of music that listeners respond to most deeply (the tone, the body, the decision) is the part that is hardest to fake and easiest to lose. We have lost it, partly, over thirty years of mastering loud, of comping at the syllable, of replacing rooms with plugins, of adding stages to the vocal chain until the singer is barely audible behind the engineering. AI did not cause that loss. The loss preceded AI by a long time. The question AI raises is whether we will lose it further by default, or whether we will use this moment, as the technology accelerates, to remember what tone is and to design our tools, our platforms, and our value systems to keep it.

The whole Print the Quiet argument is that tone is a value system. It is the choices about what to leave in (the breath, the room, the wobble) and what to take out (the redundant compressor, the unnecessary correction, the safety overdubs that protected nothing). Those choices are not technical. They are aesthetic, and aesthetic choices are moral. When a producer chooses to leave the breath in, they are making a small moral claim about what listening is for: that listening is a meeting between two bodies, and that the meeting requires the bodies to be audible.

When the breath comes out, the meeting becomes a meeting between a listener and a surface. The listener can still enjoy the surface. Many surfaces are beautiful. But the meeting is different. Something specific has been removed.

✦ ◆ ✦

Coda — the breath at 3:20

Go back to Hallelujah, one more time.

Three minutes and twenty seconds into the song, Buckley's voice drops to almost nothing. The verse before the final chorus. He is, by this point in the take, deep enough into the performance that his body knows where it is going. The guitar arpeggio continues underneath, the Telecaster's clean Vibrolux signal pulsing in 6/8, the Quadraverb halo extending the sustain. And in the small gap between the verse's last word and the chorus's first one, he takes a breath.

You can hear it. The U87 four to six inches from his mouth picks it up plainly. The breath has shape. It is not silent. It is not a noise the engineer should have gated out. It is the sound of his body preparing for the climb back. It is part of the record.

If you played that breath to a person who had never heard the song, and asked them what it was, they would say: someone is breathing. They would not say: that is a tonal artifact of close-miking with a large-diaphragm condenser at a specific proximity in a wood-paneled tracking room with light analog tape compression. They would say: someone is breathing.

That is what tone is. The sound of a body, audible inside a record. The thing that makes a listener, who has never met Buckley, feel like they have met Buckley. The thing that lets a song written by Leonard Cohen and arranged by John Cale, sung by a twenty-six-year-old in a barn in upstate New York in September 1993, become, thirty-three years later, the version that two generations of listeners return to when they want to feel that the music is real.

You can synthesize the breath. The synthesis will get closer every year. The breath, generated, will eventually be indistinguishable from the breath, captured.

When it is, the question we will face is not whether the synthesis sounds like Buckley. The question will be: whose decision was the breath? Whose body, in what room, made the choice to leave it in?

If the answer is no one's — if the breath is a default that the algorithm added because the training data taught it to — then the breath is no longer a meeting between two bodies. It is a meeting between a listener and a probability distribution.

If the answer is someone's — if a specific person, at a specific moment, made the decision to keep that breath in — then the breath is still a meeting. The body is still in the record.

That, more than any technical detail in any of the previous eight pieces, is what Print the Quiet has been trying to point at.

Tone is the part of music that is not in the score. The part that is not in the score is the body. The body is in the room making a decision in time. The decision has to belong to someone, and the someone has to be traceable, or the meeting is no longer a meeting.

Print the quiet. Keep the breath. Trace the decision.

That is the work.

That is the receipt.

✦ ◆ ✦

Print the Quiet, a Suede Social series on tone, dynamics, and the parts of music that don't fit on a lead sheet, ends here. Thank you for reading. — JC