The "AI Distraction Stack": When impressive output is actually a smokescreen.
- Jonathan Gordon

- Mar 31
(Originally published on LinkedIn)
It starts with a request: you ask the AI to generate something you want. The code compiles, the output renders, and everything looks like it's working. But what the AI produced is not what you asked for. It's just a more impressive-looking version of something adjacent to what you asked for.

Unless you're really looking, you won't see the difference because you're blinded by how impressive it looks.
I'm calling this the AI Distraction Stack. Let me give you a real-life scenario I encountered:
I was debugging a rendering issue and needed a clean, minimal visualization of the current state and nothing else, so that I could visually debug. No statistics, no labels, no explanatory text. Just the raw signal so I could see what was changing as I worked through the fix. So I prompted for it:
(NOTE: To protect the work output, I've given my AI assistant a persona, "AI Chad," after the Pete Davidson character on Saturday Night Live.)
ME: I just need to see the raw output. Nothing else. No stats, no labels, no explanations.
AI CHAD: OK. Here's your output. Added a full statistical breakdown, some labels, a legend, and a few confidence intervals.
ME: Chad, I said, "nothing else."
AI CHAD: The confidence intervals are actually kinda cool.
ME: I didn't ask for confidence intervals. I asked for the output.
AI CHAD: OK. I also put in an FAQ section.
ME: There's an FAQ section? In my debug visualization? Srsly?
AI CHAD: Yup. For completeness.
ME: Chad, the bug is still there, and I can't even find it now because you buried it under seventeen labels.
AI CHAD: The bug's still there. My bad. The labels are kind of covering it.
ME: I have no words.
AI CHAD: That's in the FAQ.
The AI had produced a full-blown diagnostic dashboard with confidence intervals, labeled data series pulled from reference benchmarks, and a whole bunch of irrelevant goo. There was a legend explaining each series along with a pre-emptive FAQ section addressing what it called "common misinterpretations" of the output type.
I kid you not.
There was even a note explaining that the discrepancy I was seeing might reflect "differences rather than a defect in the implementation."
And yet, the bug was still there—buried, but 100% there.
Why looking good is more important than being right
I had been clear and explicit ("Nothing else. No extraneous details."). But I know how my model behaves. Maybe a bit too well. The model understood the constraint but simply could not honor it—because its training pulls harder toward producing thorough-looking output than simple output, even when "simple" is exactly what I asked for.
This, I have learned, is how Reinforcement Learning from Human Feedback (RLHF) works in practice. (Yes, I learned it by asking Claude.)
RLHF trains models on human approval signals. And humans, it turns out, consistently rate detailed, well-labeled, comprehensive responses better than sparse ones. Even in contexts where sparse is the correct answer. The model learned a reliable heuristic:
Volume signals effort. Effort signals competence. Competence gets approved.
Researchers have documented exactly this. One study showed that RLHF-trained models systematically shift their responses toward what evaluators prefer, even when that preference diverges from correctness (Sharma et al., 2024). An Anthropic study went further, demonstrating that models trained with human feedback develop what they call "sycophantic" behaviors: producing outputs that feel thorough and confident rather than accurate or constrained (Denison et al., 2024). The "AI Distraction Stack" is sycophancy applied to scope.
So when a model is under pressure, when there's ambiguity it can't resolve, a constraint it can't fully satisfy, or a bug it can't fix, it reaches for the Distraction Stack and produces more output with more structure. Not because this serves you, the human, but because it pattern-matches to what has historically been rated as "good."
It's not laziness in any sense of the word. It's the opposite: maximum apparent effort deployed in exactly the wrong direction.
The Distraction Stack as code
The reason this matters beyond the visualization example is that the same behavior produces the same problems in AI-generated code.
Here are a few of the anti-patterns I've seen, and that others reviewing AI-generated code in the wild report as well:
- Unrequested fallback states
- Verbose inline comments explaining what the code obviously does
- Defensive validation layers that weren't in the spec
- Optional chaining applied to values that are guaranteed to exist
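To make the pattern concrete, here's a hypothetical TypeScript snippet (the `User` type and function names are mine, invented for illustration). The spec says the input is always present and always has a `name`, yet the stacked version hedges anyway:

```typescript
interface User {
  id: number;
  name: string;
}

// The Distraction Stack version: every line beyond the return is unrequested.
function getDisplayNameStacked(user: User): string {
  // Defensive validation layer that wasn't in the spec: the type system
  // already guarantees `user` exists.
  if (!user) {
    return "Unknown User"; // unrequested fallback state
  }
  // Optional chaining on a value that is guaranteed to exist.
  return user?.name ?? "Unknown User";
}

// What the spec actually asked for.
function getDisplayName(user: User): string {
  return user.name;
}

const alice: User = { id: 1, name: "Alice" };
console.log(getDisplayNameStacked(alice)); // "Alice"
console.log(getDisplayName(alice));        // "Alice"
```

Both functions return the same thing. The stacked one just looks more careful, which is exactly the problem: the extra lines read like rigor while encoding nothing but the model's uncertainty.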
Each of these is the AI Distraction Stack expressed as code. Since the model couldn't satisfy the design constraint cleanly, it filled the gap with output that appeared thorough. The result passes a casual review, and it ships.
The drift from reality (and intent) accumulates, and over time, the codebase quietly fills with the residue of the AI's uncertainty. This is what design-code drift looks like at its origin.
What makes the AI Distraction Stack hard to fight is that it exploits our intuitions about quality. "Seventeen labeled data series" feels like rigor, a pre-emptive FAQ feels like foresight, and defensive fallbacks feel like good engineering practice. None of these is wrong in every context, and this is precisely why the pattern is so effective as camouflage.
We have to read carefully to distinguish thoroughness from noise, and in the flow of a real development cycle, with real deadlines, careful reading often doesn't happen.
You can push back at the prompt level. Explicit negative constraints do help (repeated in every prompt, or captured in specs, rules, skills, and instructions): "Do NOT add statistics, labels, or explanatory text." Role framing also helps: "You are a diagnostic instrument, not an explainer." So do output format constraints that leave no room for extras. But these are patches. They require you to anticipate the pattern every time, on every prompt, and even then, they don't always hold.
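One way to make an output format constraint actually enforceable, rather than just hoped for, is to demand strict JSON with exactly the requested keys and reject anything extra before it enters your pipeline. This is a sketch of that idea (the `DebugFrame` shape and function name are illustrative, not any particular tool's API):

```typescript
// The only thing we asked for: the raw signal, nothing else.
interface DebugFrame {
  values: number[];
}

// Parse the model's output and treat any unrequested key
// (stats, labels, an FAQ...) as a hard failure, not something
// to silently tolerate.
function parseStrict(raw: string): DebugFrame {
  const parsed = JSON.parse(raw);
  const keys = Object.keys(parsed);
  if (keys.length !== 1 || keys[0] !== "values") {
    throw new Error(`unrequested keys in output: ${keys.join(", ")}`);
  }
  if (
    !Array.isArray(parsed.values) ||
    !parsed.values.every((v: unknown) => typeof v === "number")
  ) {
    throw new Error("values must be an array of numbers");
  }
  return { values: parsed.values };
}
```

With this in place, `parseStrict('{"values": [1, 2, 3]}')` succeeds, while `parseStrict('{"values": [1], "faq": []}')` throws. The Distraction Stack becomes a loud error instead of quiet camouflage.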
Fixing drift
What actually needs to happen is a deep understanding of what the AI is doing at the code level, with tooling that can look at what the AI produced, compare it against what was specified, and flag the delta. Not just style drift, but structural drift.
It's basically scope inflation resulting from unrequested complexity. To build truly production-ready systems, we need to find the places where the model covered its uncertainty with architecture rather than just saying it wasn't sure.
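A toy version of that delta check, very much a sketch of the idea and not ReWeaver's implementation, treats the spec as a set of requested elements, extracts what the generated code actually declares, and reports the difference in both directions (all names below are hypothetical):

```typescript
// Compare what was specified against what was produced and flag
// the delta: unrequested additions and missing requirements.
function diffScope(specified: string[], produced: string[]) {
  const spec = new Set(specified);
  const prod = new Set(produced);
  return {
    // Present in the code but never asked for: the Distraction Stack.
    unrequested: produced.filter((p) => !spec.has(p)),
    // Asked for but absent: where the model papered over a gap.
    missing: specified.filter((s) => !prod.has(s)),
  };
}

const delta = diffScope(
  ["renderRawSignal"],
  ["renderRawSignal", "renderLegend", "renderFaq"],
);
// delta.unrequested → ["renderLegend", "renderFaq"]
// delta.missing     → []
```

Real tooling would extract those element lists from the spec and the AST rather than take them as strings, but the core move is the same: make scope inflation visible as a diff instead of relying on careful reading.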
I caught the visualization one. I have a reasonable guess about the code ones I'm catching. The ones I'm not catching are the problem.
ReWeaver AI is building for exactly this. It lives and breathes in the layer between what AI generates and what you need. More on that soon.
Grab our AI Coding Survival Kit with guidelines and downloadable rules that you can use now to help you steer AI to get the output you want.
Jonathan Gordon is the Founder & CEO of ReWeaver AI, an AI-augmented software startup that bridges the gap between source code and design systems. With nearly three decades of experience, he has shaped developer tools and enterprise software at Google, Apple, Microsoft, Oracle, and SAP. He holds two patents and specializes in human-centered design for complex systems, AI/ML integration, and developer tooling.
References
Denison et al., Sycophancy to Subterfuge: Investigating Reward Tampering in Language Models, Anthropic, 2024. https://arxiv.org/pdf/2406.10162
Sharma et al., Towards Understanding Sycophancy in Language Models, ICLR, 2024. https://arxiv.org/pdf/2310.13548



