Rewriting is a beautiful trap

Many developers have looked at a legacy codebase and thought: "We should just rewrite this." The old code is messy, hard to change, and nobody fully understands it anymore. A rewrite promises a clean slate: modern patterns, better performance, no baggage. It's easy to get excited about.

But here's my take: the mess you're looking at is not an accident. It's the accumulated weight of every real-world problem the system had to solve. Those weird conditionals, those oddly specific error handlers, those features that seem pointless — they exist because a customer hit an edge case, or a production incident forced a quick fix, or a requirement changed halfway through a release.

When you rewrite, you throw all of that away. And then you slowly rediscover why it was there in the first place.

Why developers love rewrites

It's worth being honest about this. Rewrites feel good. You get to pick your own tech stack, design clean interfaces, and work without years of accumulated constraints. For a while, everything moves fast because you're only solving the problems you already know about.

But that speed is an illusion. The old system isn't slow because the code is bad. It's slow because it handles hundreds of edge cases you forgot existed. The new system feels fast because it doesn't handle them yet.

The migration trap

A few years back, we decided to rewrite our legacy API gateways. The old system had real issues — poor performance, mixed data plane and control plane, no support for modern protocols. The pitch was bold and simple: throw away the old stuff, build a next-generation gateway, and migrate everyone over.

Sounds great in a planning meeting. Here's what actually happens:

You don't want to run both systems side by side. That means double the engineering effort and double the infrastructure cost. So the plan becomes: migrate existing customers to the new platform, then shut down the old one.

But those customers depend on legacy features. Not features we chose to keep — features our customers built their own products on top of. They've shipped multiple releases around our legacy APIs. They can't just switch overnight, and in many cases, they can't even control the deprecation timeline on their end.

So now you need feature parity before you can migrate anyone. You're rebuilding the same behaviors in the new system, except without years of production hardening. And you're doing it under pressure, because every month you run both systems, you're burning money and splitting your team's focus.
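To make "feature parity" concrete, here is a minimal sketch of a parity check: replay recorded requests through both implementations and list every case where the new one disagrees with the old one. Everything in it is hypothetical and for illustration only, including the names `find_parity_gaps`, `legacy_route`, and `new_route` and the trailing-slash quirk. It is not how our gateways actually work.

```python
from typing import Callable, Iterable

Handler = Callable[[dict], dict]


def find_parity_gaps(requests: Iterable[dict], legacy: Handler, candidate: Handler) -> list:
    """Return (request, legacy_response, candidate_response) for every mismatch."""
    gaps = []
    for request in requests:
        expected = legacy(request)
        try:
            actual = candidate(request)
        except Exception as exc:
            # A crash in the new system is also a parity gap.
            actual = {"error": type(exc).__name__}
        if actual != expected:
            gaps.append((request, expected, actual))
    return gaps


def legacy_route(request: dict) -> dict:
    # Hypothetical quirk: the old system ignores trailing slashes,
    # and some customer integrations rely on that.
    path = request["path"].rstrip("/") or "/"
    return {"status": 200 if path == "/orders" else 404}


def new_route(request: dict) -> dict:
    # The "clean" rewrite matches paths exactly and forgets the quirk.
    return {"status": 200 if request["path"] == "/orders" else 404}


recorded = [{"path": "/orders"}, {"path": "/orders/"}, {"path": "/missing"}]
gaps = find_parity_gaps(recorded, legacy_route, new_route)
for request, expected, actual in gaps:
    print(f"{request['path']}: legacy={expected} new={actual}")
```

Each entry in `gaps` is a behavior someone may depend on. The list tends to be longer than anyone expects, which is why parity work dominates the schedule.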

The good news: so far, we've migrated our largest, highest-throughput customers to the new gateway. It was a long and painful process, and we're still moving the rest over. So the rewrite wasn't a failure, but it was nowhere near as clean or quick as we'd hoped.

When rewriting actually makes sense

I'm not saying rewrites are always wrong. They work when the conditions are right:

You can afford to break compatibility. This is the big one. If you're building a new version of a product and you're willing to tell customers "v2 is different, some old features are gone," a rewrite can work. But if you need full backwards compatibility, the rewrite just becomes a reimplementation: all the cost, none of the freedom.

The problem has fundamentally changed. If the old system was designed for a completely different scale, or a different set of requirements that no longer apply, refactoring might not be enough. Sometimes the architecture itself is the bottleneck, and you genuinely need a new foundation.

You have a hard cutover date. Running two systems in parallel kills teams. If you can't commit to a date where the old system dies, don't start the rewrite.

Notice the pattern: rewrites work when you're building something genuinely new, not when you're trying to recreate what you already have but cleaner.

Most of the time, refactor

The boring truth is that incremental refactoring almost always wins. Not because it's more fun — it's not. Nobody gets to announce a shiny new platform. But it has one massive advantage: you never stop shipping.

You can refactor a module, deploy it, and validate it in production the same week. A rewrite might take six months before you even know if the approach works. And if it doesn't, you've lost those six months.

The best refactoring I've seen follows a simple pattern: identify the worst part of the system, draw a clean boundary around it, improve what's inside that boundary, and repeat. Over time, the system gets better without ever putting customers at risk.
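As a rough illustration of "draw a clean boundary," here is a minimal sketch. The quota example, the class names, and the "limit of 0 means unlimited" quirk are all invented for this post. The point is only the shape: callers depend on an interface, the old behavior moves behind it unchanged, and the cleaned-up version has to match it before it takes over.

```python
from typing import Protocol


class QuotaPolicy(Protocol):
    """The boundary: callers depend on this interface, not on an implementation."""

    def allows(self, used: int, limit: int) -> bool: ...


class LegacyQuotaPolicy:
    """The old behavior, moved behind the boundary without changes."""

    def allows(self, used: int, limit: int) -> bool:
        # Hypothetical quirk: a limit of 0 has always meant "unlimited".
        if limit == 0:
            return True
        return used < limit


class RefactoredQuotaPolicy:
    """The improved code inside the same boundary; the quirk is now explicit."""

    UNLIMITED = 0

    def allows(self, used: int, limit: int) -> bool:
        return limit == self.UNLIMITED or used < limit


class Gateway:
    """Caller code: it only knows about the boundary."""

    def __init__(self, quota: QuotaPolicy) -> None:
        self._quota = quota

    def handle(self, used: int, limit: int) -> str:
        return "forwarded" if self._quota.allows(used, limit) else "rejected"


# Swap implementations only after they agree on known cases, quirks included.
cases = [(0, 10), (10, 10), (11, 10), (5, 0)]
old_results = [Gateway(LegacyQuotaPolicy()).handle(u, l) for u, l in cases]
new_results = [Gateway(RefactoredQuotaPolicy()).handle(u, l) for u, l in cases]
assert old_results == new_results
```

Because `Gateway` never changes, each swap is a small, reversible deploy rather than a migration.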

The real question

Before you pitch a rewrite, ask yourself this: are you trying to solve a real architectural problem, or are you just tired of looking at ugly code?

If it's the first one, make sure the conditions are right — you can break compatibility, the requirements have changed, and you have a clear plan to kill the old system. If it's the second one, refactor. Ugly code that works is worth more than beautiful code that doesn't exist yet.