After Vibe Code, Vibe Clean: Turning ‘It Runs’ Back Into ‘I Understand’

0. Background: AI Makes Writing Code Too Easy — and Making a Mess Too Easy

AI coding tools like Cursor and Claude Code make shipping features incredibly fast: you describe what you want, the tool fills in a ton of implementation detail, and your code quickly “just runs.”

But the flip side is equally obvious: these tools also make it way too easy to turn your system into a mess. This is especially true in dynamic languages like Python, where the language gives you enormous freedom:

  • Passing data around as dicts faces almost no friction
  • Fields can appear or disappear at any time
  • The same concept can go by many aliases
  • Defaults and fallbacks can be sprinkled in anywhere
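
That freedom is literal. A two-line illustration (the field names are mine, just for show):

```python
payload = {"question": "hi"}

payload["lang"] = "zh"       # a field appears out of nowhere
del payload["question"]      # and another disappears, with no complaint from the language

print(payload)  # {'lang': 'zh'}
```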

And when AI writes code, one of its favorite strategies is: add more compatibility layers and stronger fallbacks to make it “work right now.” This is extremely effective in the short term, but over time it makes the system harder and harder to maintain.

Contrast this with a strongly-typed language like Rust, where the situation is often the opposite: you can’t easily “just throw a dict at it and move on,” because the compiler forces you to answer some questions:

  • What does this data actually look like (what’s its structure)?
  • Which fields are required and which are optional?
  • Will this data be modified along the way? Who’s allowed to modify it?
  • How are errors represented?

So I’ve come to believe more and more that, beyond vibe coding (rapidly churning out features), we also need vibe cleaning: in an environment like Python where it’s “too easy to paper over things,” intentionally writing the data shapes, invariants, and failure semantics back into the code.

1. A Generic Example: Why Do LLM Workflows Keep Getting Thicker?

Imagine you’re building a small LLM workflow that turns user questions into answers. It’s broken into several steps:

  • Parse: extract inputs (question, user info, preferences)
  • Retrieve (optional): fetch relevant document snippets
  • Generate: produce a draft answer
  • Verify: check / self-critique / format validation
  • Finalize: output the final structure (body, citations, confidence, error info)

For convenience (and to make it easier for AI to participate in coding), the interfaces usually end up looking like:

async def step(request: dict) -> dict:
    ...

This feels great at the start: flexible, fast to iterate on, add fields whenever you want. But as features pile up, you quickly hit maintenance problems: what shape is the data at each step? Which fields are guaranteed to exist? What structure does a failure return?

If these questions don’t have clear answers, the system starts behaving in a way that’s roughly correct but hard to explain: changes feel risky, and regressions get expensive.

2. Observation: Python + AI Naturally Produces “dict Parsing Everywhere”

In Python, when you use AI to add features, you often see this pattern emerge:

input_data = request.get("input", {})
ctx = request.get("ctx", {})
params = request.get("params", {})

question = (
    input_data.get("q")
    or input_data.get("question")
    or ctx.get("last_question")
    or ""
)

lang = params.get("lang") or ctx.get("lang") or "zh"

This kind of code is essentially doing two things:

  • Giving the same concept multiple source paths (q / question / last_question …)
  • Burying priority and defaulting strategies inside local implementations (potentially different at each step)

In a dynamic language, this is easy to “get running”; with AI assistance, it’s even easier to keep copying and expanding. The result: every feature added brings more “compatibility branches” and “fallbacks,” and the system keeps getting thicker.

3. Root Cause: The Problem Isn’t Just Vibe Coding — It’s That the Language Lets You Make Invariants Implicit

Here’s the key thing to grasp: the complexity really comes from implicit invariants.

An invariant means: for the system to run correctly, certain constraints must always hold, such as:

  • question must be a non-empty string
  • citations is a list and each item must carry a url
  • Failures must carry an error_code, otherwise the caller can’t make decisions

In Rust, you’re generally forced to encode these into the type system (Option<T>, Result<T, E>, whether struct fields are optional, borrowing / mutability). In Python, you can simply not write them — then paper over the gaps with or "", get(..., {}), try/except: pass.

This is the crux: Python allows you to not answer the question “what is the data contract?” and still keep writing. And AI tools, driven by a “local correctness” incentive, naturally tend to spread this papering-over approach globally.

When structural information lives not in the code but in the reader’s head, comprehensibility drops fast as functionality grows.
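In Python you can add the same forcing function by hand. A minimal sketch (the `Question` type is mine, not from any library): encode the invariant in the constructor so violations fail loudly instead of being papered over with `or ""`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str

    def __post_init__(self) -> None:
        # The invariant from above, made explicit:
        # a non-empty string, or no object at all.
        if not isinstance(self.text, str) or not self.text.strip():
            raise ValueError("question must be a non-empty string")


Question("why is the sky blue?")   # fine
# Question("")                     # raises ValueError instead of silently propagating ""
```

Once the invariant lives in the type, no downstream step needs its own defensive fallback for it.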

4. Corollary: Why Systems Drift Toward “Add Only, Never Reuse”

This chain is remarkably stable:

  • Steps communicate via dict → dict; fields are optional and mutable
  • A new requirement arrives; the fastest fix is to accept one more input format and add one more fallback
  • More fallbacks → field semantics get blurrier (synonym fields / alias fields / half-baked fields coexist)
  • Blurrier semantics → you’re less confident reusing existing modules (you don’t know what shape they depend on)
  • Eventually, every iteration tends toward “write yet another branch that runs”

The conclusion is direct: the system gets messy not because you’re writing fast, but because every iteration adds more uncertainty.

5. Vibe Cleaning: Adding Back What Rust Would Have Forced You to Do

If you think of Rust’s strong typing as “mandatory upfront decision-making,” then what vibe cleaning does in Python is add those decisions back after the fact:

  • What is the data structure?
  • Which fields are required, which are optional?
  • Where is data allowed to be modified, and where isn’t it?
  • How are errors expressed, and how should callers handle them?

Below are the three most universally applicable ways to do this.

5.1 Entry Convergence: Centralize dict Parsing in One Place

Goal: the main flow no longer calls .get() everywhere guessing at fields. You only handle aliases / defaults / priorities in a single entry parser.

from dataclasses import dataclass
from typing import Optional, Any


@dataclass(frozen=True)
class StepRequest:
    question: str
    lang: str = "zh"
    user_id: Optional[str] = None
    raw: dict[str, Any] | None = None


def parse_request(request: dict) -> StepRequest:
    input_data = request.get("input", {})
    ctx = request.get("ctx", {})
    params = request.get("params", {})

    question = (
        input_data.get("q")
        or input_data.get("question")
        or ctx.get("last_question")
        or ""
    ).strip()

    lang = (params.get("lang") or ctx.get("lang") or "zh").strip()

    return StepRequest(
        question=question,
        lang=lang,
        user_id=ctx.get("user_id"),
        raw=request,
    )

The value here is very much like Rust: you’ve at least expressed the data contract centrally at the entry point.
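The downstream payoff is that later steps can take the typed object directly and stop re-parsing dicts. A sketch (with `StepRequest` restated minimally so the snippet stands alone, and a hypothetical `generate` step):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StepRequest:
    question: str
    lang: str = "zh"
    user_id: Optional[str] = None


def generate(req: StepRequest) -> str:
    # No .get() guessing here: the fields and their defaults
    # are guaranteed by the type, not by local fallbacks.
    return f"[{req.lang}] answering: {req.question}"


print(generate(StepRequest(question="why is the sky blue?")))
# [zh] answering: why is the sky blue?
```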

5.2 Semantic Convergence: Turn “One Field, Many Shapes” Into a Single Shape

For example, if citations comes in multiple representations, normalize once so the main flow only touches one structure.

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Citation:
    url: str
    title: str = ""
    snippet: str = ""


def normalize_citations(x: Any) -> list[Citation]:
    if not x:
        return []
    if isinstance(x, list) and x and isinstance(x[0], str):
        return [Citation(url=u) for u in x]
    if isinstance(x, list) and x and isinstance(x[0], dict):
        return [
            Citation(
                url=i.get("url", ""),
                title=i.get("title", ""),
                snippet=i.get("snippet", ""),
            )
            for i in x
        ]
    return []

This is essentially manually creating a “type boundary” in Python.
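The boundary property is easy to spot-check (normalizer condensed here so the snippet runs on its own): however citations arrive, the main flow sees exactly one shape.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Citation:
    url: str
    title: str = ""
    snippet: str = ""


def normalize_citations(x: Any) -> list[Citation]:
    if not x:
        return []
    if isinstance(x, list) and isinstance(x[0], str):
        return [Citation(url=u) for u in x]
    if isinstance(x, list) and isinstance(x[0], dict):
        return [Citation(url=i.get("url", "")) for i in x]
    return []


# A bare-URL list and a dict list collapse into the same structure:
assert normalize_citations(["https://example.com"]) == \
       normalize_citations([{"url": "https://example.com"}])
```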

5.3 Fixed Error Semantics: Failures Should Have Structure Too

In Rust, Result<T, E> forces you to handle failures explicitly; in Python, it’s easy to mix nulls, exceptions, and dict flags. The goal of cleaning is to make failure semantics fixed.

from dataclasses import dataclass


@dataclass(frozen=True)
class StepResult:
    ok: bool
    data: dict
    error_code: str = ""
    error_message: str = ""


def ok(data: dict) -> StepResult:
    return StepResult(ok=True, data=data)


def fail(code: str, msg: str, data: dict | None = None) -> StepResult:
    return StepResult(
        ok=False,
        data=data or {},
        error_code=code,
        error_message=msg,
    )

This way, callers don’t need to guess by checking “empty or not,” and don’t need try/except everywhere.
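Caller-side, the pattern looks like this (restating `StepResult` and `fail` so the snippet is self-contained; the error code is a made-up example):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepResult:
    ok: bool
    data: dict
    error_code: str = ""
    error_message: str = ""


def fail(code: str, msg: str) -> StepResult:
    return StepResult(ok=False, data={}, error_code=code, error_message=msg)


result = fail("EMPTY_QUESTION", "question must be a non-empty string")

if not result.ok:
    # Branch on a stable code, not on guessing from empty
    # strings or catching bare exceptions.
    if result.error_code == "EMPTY_QUESTION":
        print("ask the user to rephrase")
```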

6. Verification: Did You Really Become More Maintainable?

Whether vibe cleaning is effective can be checked with three metrics:

  • Entry convergence: is .get()-style parsing basically confined to a single entry point?
  • Main-flow branch reduction: have the if/elif chains in the main flow (caused by “shape uncertainty”) significantly decreased?
  • Consistent failure paths: are failures always expressed in the same structure? Do callers no longer infer from nulls?

These metrics measure: how much does the reader have to mentally fill in? The less they have to fill in, the cheaper maintenance becomes.
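Entry convergence, in particular, can be approximated mechanically. A rough heuristic sketch (the regex and the module strings are mine, not a standard tool): count dict-probing calls per module; outside the entry parser, the count should trend toward zero.

```python
import re

# Hypothetical heuristic: how often does a module probe dicts blindly?
PROBE = re.compile(r"\.get\(")


def dict_probe_count(source: str) -> int:
    return len(PROBE.findall(source))


entry_module = 'question = input_data.get("q") or ctx.get("last_question")'
core_module = "answer = generate(req)"

print(dict_probe_count(entry_module))  # 2 -- parsing is allowed here
print(dict_probe_count(core_module))   # 0 -- the main flow stays clean
```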

7. Conclusion: AI Makes Coding Faster, But Doesn’t Automatically Make Systems Clearer

Strongly-typed languages use a compiler to force you to make decisions up front. Python’s freedom + AI’s “local correctness” tendency naturally push systems toward “it runs but it’s hard to understand.”

The point of vibe cleaning is: after you’ve enjoyed the speed of vibe coding, add back the data contracts that should have been explicit all along, and pull the system back from drifting into unmaintainability.

Of course, you might ask: if I keep praising languages like Rust, why don’t I write my own code in Rust? The honest answer is that Rust has its own pain: it’s complex to write, I’m not that strong at it, and most recent agent tooling has far better library support in Python.
