The design-code roundtrip that isn't

  • Writer: Jonathan Gordon
  • Mar 6
  • 7 min read

Updated: May 6

Figma, Claude, and Codex just announced bidirectional design-code workflows. Here's what they actually shipped — and the pain it leaves completely unsolved.


Black and white image of a road on a mountainside that goes into fog.
Photo Credit: James Lynch | Vecteezy

Takeaways

  • The “roundtrip” model is full of guesses at every step.

  • AI writes code faster than anyone can check it. Drift is multiplying rapidly.

  • The more you use the loop, the further you drift.

  • AI generates; it doesn't maintain.


The demos look seamless. A Figma frame becomes React code in seconds. A running app becomes editable Figma layers. The narrative writes itself: design and code, finally in sync, AI-powered, bidirectional, round-trip.

None of that is actually happening. And the gap between the story and the reality is exactly where your next six months of pain is going to come from.


What Figma's Design-Code Roundtrip Actually Does

What it does: The design-to-code direction—get_design_context—reads the node tree of a Figma frame and passes that data to an LLM, which generates code.

What it doesn’t do: By default, it doesn't read your codebase, doesn't know your components, and doesn't know your design tokens. It generates fresh code from design data, probabilistically, as if your repository were a blank page. Figma’s own developer documentation is direct about this: without manually maintained Code Connect mappings, “the model is guessing” [1].


What it does: Figma’s Code Connect feature is a real step toward solving the guessing problem. It lets teams manually map Figma components to their corresponding code components, so the LLM gets import paths and usage snippets instead of guessing from scratch. That is genuinely useful. 

What it doesn’t do: Code Connect is a manually maintained mapping layer: someone has to create each mapping, keep it current as the codebase evolves, and extend it every time a new component ships. It tells the LLM which component to use. It does not tell you whether the component used last week still matches the design updated yesterday. The mapping is a hint to the generator, not a check on what the generator produced.


What it does: The code-to-design direction—generate_figma_design—opens a browser, loads your running app, takes a screenshot of the rendered DOM, and uses AI to reverse-engineer approximate Figma layers from those pixels.

What it doesn’t do: It's not reading your source code. It is not reading your design system. The layers it creates are not linked to your components, your variables, or your tokens. They are a screenshot-derived shadow of what the running UI looked like at the time of capture.

The "roundtrip" is design data → LLM generates code → browser renders it → screenshot → AI infers approximate layers. Every arrow is a probabilistic guess. Nothing is compared. Nothing is reconciled. Error accumulates at every step.
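
To make the accumulation concrete, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that each arrow in the loop independently preserves a fixed fraction of the original design intent. The 95% figure and the function name are hypothetical, not measurements of any tool.

```python
# Illustrative model only: each arrow in the loop is treated as an independent
# step that preserves a fixed fraction of design intent. All numbers are assumptions.

# Arrows per pass: LLM generates code, browser renders, screenshot, AI infers layers.
STEPS_PER_LOOP = 4


def fidelity_after(loops: int, per_step_fidelity: float) -> float:
    """Fraction of the original intent left after `loops` full passes."""
    return per_step_fidelity ** (STEPS_PER_LOOP * loops)


if __name__ == "__main__":
    for loops in (1, 3, 12):
        kept = fidelity_after(loops, 0.95)
        print(f"{loops:>2} loops: {kept:.0%} of intent preserved")
```

Under these assumed numbers, a single pass keeps roughly four fifths of the intent, and twelve passes keep less than a tenth. Real tools will not behave this uniformly, but the shape of the curve is the point: small per-step losses compound.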

What it does: This is a fast iteration tool for building new things.

What it doesn’t do: It only operates on what you’re building right now, in this session, starting from scratch. It has nothing to say about the thousands of components already running in your production codebase. And it has no mechanism—none—for detecting whether what just got generated matches your actual design system, or whether it will still match it in three months.


The Pain That's Actually Coming

Here is what is happening right now in every team that has adopted AI coding assistants, whether they can see it yet or not.

Drift is accelerating invisibly.

AI coding has multiplied how quickly new code gets written. It has not accelerated anyone's ability to verify that the code is aligned with the design system. According to GitClear’s analysis of 153 million lines of code, code churn (lines revised or reverted within two weeks of being written) was projected to double in 2024 compared to its 2021, pre-AI baseline [2]. Their follow-up study of 211 million lines found that refactoring activity dropped from 25% of all code changes to under 10% between 2021 and 2024, while copy-pasted code rose from 8% to over 12% in the same period [3].

Google’s 2024 DORA report, drawing on a decade of data from over 39,000 practitioners, found that every 25% increase in AI adoption was associated with a 7.2% reduction in delivery stability [4]. There is more code, faster—but it's less reliable, less consistent, and harder to maintain.

The adoption numbers make this urgent. The Stack Overflow 2025 Developer Survey found that 80% of professional developers are now actively using AI tools, up from 44% in 2023 [6]. This is not a coming wave. It is already the default way software gets written.


Roundtrip tools make this worse, not better.

When generate_figma_design captures your running UI as Figma layers, those layers are not your design system. They are a pixel-derived approximation of one moment in your app's rendered state. If you make changes to those layers and then use get_design_context to generate more code from them, you are now iterating on a probabilistic approximation of an approximation — progressively further from your canonical design intent with every pass through the loop.

The more a team uses these tools, the faster the semantic connection between the design system and the codebase degrades. The demos show a beautiful first iteration. Nobody shows iteration twelve, after six engineers have each run through the loop with their own prompts, their own Figma selections, and their own AI-generated output.


No one knows what drifted. Or where. Or by how much.

This is the specific pain that is coming (and it’s already here for teams that moved fast with AI over the last year). Technical debt is now the number one frustration reported by developers, cited at twice the rate of the next biggest pain point [5]. The tools generating the debt are only moving faster.

Manual auditing is time-intensive archaeology: tracing changes, reconciling competing truths, deciding which side is authoritative. Even then, this arduous process finds only a fraction of the actual drift, and by the time it's finished, new drift has already accumulated.

Inspection is the wrong response to drift because by the time it can be detected, the damage is already distributed. What teams need is governance—continuous, structural enforcement that makes alignment the default, not the exception.


What Actually Solves This

The solution to drift is not better generation. It is not a smarter screenshot-to-layers pipeline. It is not a more capable LLM that approximates your token system slightly less often.

The solution is semantic comparison between canonical representations, continuous detection of divergence regardless of which side changed, and governed propagation of resolutions, with a human in control of every change that crosses the boundary between design and code.

This is what ReWeaver AI is building.


Drift governance: making the invisible visible.

ReWeaver AI continuously compares what your design system specifies against what your codebase actually implements — semantically, not just visually. It understands that different representations can mean the same thing, and it knows when those equivalences break. When the implementation diverges from the design intent, ReWeaver AI surfaces it: what drifted, where, and by how much.
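
As a toy illustration of what "different representations can mean the same thing" looks like in practice, the sketch below normalizes token values to a canonical form before comparing them. It is not ReWeaver AI's implementation. The token names, the `canonical` and `find_drift` helpers, and the 16px root font size are all assumptions made for the example.

```python
import re

ROOT_FONT_PX = 16  # assumption: 1rem equals 16px in this hypothetical project


def canonical(value: str):
    """Reduce a token value to a canonical form so equivalent spellings compare equal."""
    v = value.strip().lower()
    # Hex colors (#rgb or #rrggbb) become an (r, g, b) tuple.
    m = re.fullmatch(r"#([0-9a-f]{3}|[0-9a-f]{6})", v)
    if m:
        h = m.group(1)
        if len(h) == 3:
            h = "".join(ch * 2 for ch in h)
        return ("color", tuple(int(h[i:i + 2], 16) for i in (0, 2, 4)))
    # rgb(r, g, b) becomes the same (r, g, b) tuple.
    m = re.fullmatch(r"rgb\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)", v)
    if m:
        return ("color", tuple(int(g) for g in m.groups()))
    # Lengths in px or rem become a pixel count.
    m = re.fullmatch(r"(\d+(?:\.\d+)?)(px|rem)", v)
    if m:
        number = float(m.group(1))
        return ("length_px", number * ROOT_FONT_PX if m.group(2) == "rem" else number)
    # Anything unrecognized is compared as plain text.
    return ("raw", v)


def find_drift(design: dict, code: dict) -> dict:
    """Return tokens whose design and code values differ in meaning, or exist on one side only."""
    drift = {}
    for name in sorted(design.keys() | code.keys()):
        d, c = design.get(name), code.get(name)
        if d is None or c is None or canonical(d) != canonical(c):
            drift[name] = (d, c)
    return drift


if __name__ == "__main__":
    design = {"color.primary": "#FFFFFF", "space.md": "1rem", "radius.sm": "4px"}
    code = {"color.primary": "rgb(255, 255, 255)", "space.md": "16px", "radius.sm": "6px"}
    print(find_drift(design, code))
```

With these sample values, the color and spacing tokens are spelled differently but mean the same thing, so only `radius.sm` is reported. A plain string comparison would flag all three.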

This works regardless of which side changed. A designer updates a component; ReWeaver AI knows the code hasn’t followed. An engineer ships a divergence; ReWeaver AI knows the design spec wasn’t updated. Neither direction is invisible. And it works across your entire production codebase — not just the component you’re currently looking at.
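
One common way to tell which side moved is a three-way comparison against the last state both sides agreed on. The sketch below assumes such a baseline has been recorded. The `Drift` enum and `classify` function are hypothetical names for illustration and do not describe how ReWeaver AI works internally.

```python
from enum import Enum


class Drift(Enum):
    IN_SYNC = "in sync"
    DESIGN_CHANGED = "design changed, code has not followed"
    CODE_CHANGED = "code changed, design spec not updated"
    BOTH_CHANGED = "both changed, needs a human decision"


def classify(baseline: str, design: str, code: str) -> Drift:
    """Compare both sides against the last agreed value to see which one moved."""
    if design == code:
        return Drift.IN_SYNC  # the two sides agree, whatever the baseline was
    if code == baseline:
        return Drift.DESIGN_CHANGED  # only the design moved away from the baseline
    if design == baseline:
        return Drift.CODE_CHANGED  # only the code moved away from the baseline
    return Drift.BOTH_CHANGED  # both moved, and in different directions


if __name__ == "__main__":
    print(classify(baseline="8px", design="12px", code="8px").value)
    print(classify(baseline="8px", design="8px", code="10px").value)
```

In practice the values compared here would be canonical forms rather than raw strings, so that equivalent spellings do not register as changes.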

Governed bidirectional sync: closing the gap on human terms.

Detection without resolution is a reporting tool. ReWeaver AI goes further. When drift is detected, it can propose a resolution — a code change that reflects the updated design, or a design update that reflects what shipped in code — using your team’s actual components, tokens, and conventions. Every proposed change goes to a human for review before anything crosses the boundary between design and code.

The automation level is configurable — some teams review everything, others automate low-severity categories — but the principle is constant: the AI surfaces and proposes, the human governs.
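
A configurable automation level can be pictured as a simple routing policy. The following is a minimal sketch with hypothetical severity names, a hypothetical `Policy` type, and a hypothetical `route` function. It is not ReWeaver AI's actual configuration format.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed severity scale, ordered from least to most serious.
SEVERITIES = ("low", "medium", "high")


@dataclass(frozen=True)
class Policy:
    # Highest severity that may be applied without review.
    # None means every proposed change goes to a human.
    auto_apply_up_to: Optional[str] = None


def route(severity: str, policy: Policy) -> str:
    """Decide whether a proposed resolution is auto-applied or sent for human review."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    if policy.auto_apply_up_to is None:
        return "human_review"
    if SEVERITIES.index(severity) <= SEVERITIES.index(policy.auto_apply_up_to):
        return "auto_apply"
    return "human_review"


if __name__ == "__main__":
    review_everything = Policy()
    automate_low = Policy(auto_apply_up_to="low")
    for sev in SEVERITIES:
        print(sev, route(sev, review_everything), route(sev, automate_low))
```

The default here is the strictest setting, so a team has to opt in to automation rather than opt out of it.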

That's the distinction that matters. Generation tools give you AI-produced output in both directions. What’s missing is the layer that knows whether that output is still aligned with your design system tomorrow — and that can close the gap when it isn’t, on your terms.

Use Every Tool You Have

Use Figma's MCP tools. Use Claude Code. Use Codex. Use Cursor. Generate as fast as you can. The velocity is real, and there is no reason to leave it on the table.

Just understand what those tools do and don't do. They are giving you a fast starting point for new code. They are not maintaining the semantic integrity of your design system against your production codebase. They are not detecting the drift that accumulates every time any engineer generates anything. They are not closing that gap in a way that is repeatable, governed, or grounded in your actual design intent.

This is what's missing. This is the pain that's coming — quietly, invisibly, compounding — for every team that is moving fast right now without it.

ReWeaver AI is the part of this workflow that nobody has built yet. The part that comes after generation, and makes generation sustainable.


Grab the AI Drift Prevention Toolkit, with downloadable rules you can use right now to steer AI and get the output you want.

--------------------

References

  1. Figma Developer Docs, "Structure your Figma file for better code." Figma, 2025.

  2. Harding, B. & Kloster, M., "Coding on Copilot: 2023 Data Suggests Downward Pressure on Code Quality." GitClear, January 2024.

  3. Harding, B., "AI Copilot Code Quality: 2025 Data Suggests 4x Growth in Code Clones." GitClear, 2025.

  4. Google DORA Team, "2024 Accelerate State of DevOps Report." Google Cloud, October 2024.

  5. Stack Overflow, "2024 Developer Survey: AI." Stack Overflow, July 2024. 

  6. Stack Overflow, "2025 Developer Survey." Stack Overflow, 2025.

-------------------

Jonathan Gordon is the founder/CEO of ReWeaver AI. He has worked as a user-focused software designer leading design and engineering teams at Google, Microsoft, Oracle, Facebook, SAP, and others. 


 
 
 
