The Rewrite Trap: Why Starting From Scratch Almost Never Works
Your legacy codebase is painful. The temptation to rewrite it from scratch is real. But rewrites fail far more often than they succeed - and the reasons are predictable.
At some point, every engineering team looks at their codebase and says the words: “We should just rewrite it.”
The reasoning always sounds solid. The current system is a mess. Nobody understands the billing module. The framework is three major versions behind. Every new feature takes twice as long as it should. A clean rewrite would fix all of that. Fresh start. Modern stack. Proper architecture this time.
It’s one of the most seductive ideas in software. It’s also one of the most dangerous.
Why rewrites feel like the answer
Legacy systems have a specific kind of pain that’s hard to articulate to anyone who hasn’t lived it. You open a file and find code that was written by someone who left two years ago, solving a problem nobody remembers, using a pattern nobody else follows. There are comments like “temporary fix - TODO” dated 2021. The test suite covers 30% of the codebase, and half of those tests are broken.
Working in code like this feels like wading through mud. Every change requires archaeology. Every deployment feels like defusing a bomb. And when someone proposes a clean rewrite, it sounds like the obvious solution.
But here’s what teams consistently underestimate: that messy code is encoding years of business logic, edge cases, and hard-won lessons. Every weird conditional, every strange workaround, every “why would anyone do this” moment - there’s usually a reason. You just don’t know what it is yet.
The predictable failure pattern
Rewrites follow a remarkably consistent trajectory.
Months 1-3: Euphoria. The new codebase is clean. Progress is fast. The team is excited. Everything that was hard in the old system is easy in the new one. Leadership gets optimistic timelines.
Months 4-6: Discovery. The team starts hitting edge cases they didn’t know existed. That weird billing logic? Turns out it handles seven different pricing tiers that were added over three years of customer negotiations. The strange data migration script? It compensates for a vendor API that returns inconsistent formats. Every “simple” feature from the old system reveals hidden complexity.
Months 7-12: The squeeze. The new system was supposed to be done by now. It isn’t. Meanwhile, the old system still needs maintenance. Bug fixes. Security patches. Customer requests. Now the team is split between two codebases, shipping half as fast on both. Leadership starts asking uncomfortable questions.
Month 12+: The reckoning. Either the rewrite ships with fewer features than the original (and customers notice), or it gets quietly shelved, and the team goes back to the old system, demoralised and behind schedule.
Joel Spolsky called this “the single worst strategic mistake that any software company can make” in his essay “Things You Should Never Do, Part I,” back in 2000. Twenty-six years later, teams are still making it.
The real problem with rewrites
The failure isn’t technical. Teams don’t fail because they picked the wrong framework or made bad architecture decisions. They fail because of three fundamental issues.
You can’t spec what you don’t understand
A rewrite assumes you know exactly what the current system does. You almost certainly don’t. Production systems accumulate behaviour over years - not just the features in the backlog, but the implicit behaviours that customers depend on without anyone documenting them.
We’ve seen rewrites miss things like timezone handling for a single client in a non-standard region, a CSV export that three enterprise customers built their entire reporting around, and rate-limiting logic that was added after an incident nobody documented.
These aren’t in any spec. They live in the code. And you won’t discover they’re missing until a customer calls.
Two systems are worse than one bad system
The moment you start a rewrite, you’re maintaining two codebases. Every bug fix has to be considered for both. Every new feature either goes in the old system (which feels like a waste) or waits for the new one (which delays value to customers).
This split attention is brutal. Your best engineers want to work on the new system. Your customers need the old one to keep working. The result is that both systems get worse - the old one from neglect, the new one from rushing.
The moving target problem
While you’re rewriting, the business doesn’t stop. New requirements come in. Market conditions change. Competitors ship features. By the time your rewrite is “done,” the target has moved. The new system is already behind on the things that matter today, because it was designed around yesterday’s requirements.
What to do instead
The alternative isn’t “do nothing.” If your codebase is genuinely painful, ignoring it isn’t a strategy either. But there are approaches that work far more reliably than the big-bang rewrite.
Strangle the monolith
The strangler fig pattern is one of the best-established approaches to modernising legacy systems. Instead of rewriting everything at once, you build new functionality alongside the old system and gradually route traffic to the new components.
Need to modernise the payment system? Build a new payment service. Route new transactions through it. Migrate old ones over time. The old monolith shrinks piece by piece, and at no point are you maintaining a half-finished replacement.
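To make the routing idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a production gateway: `StranglerRouter`, `legacy_monolith`, `new_payment_service`, and the `/payments` path prefix are all hypothetical names invented for this example, and in a real system this decision usually lives in a reverse proxy or API gateway rather than in application code.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


def legacy_monolith(request: dict) -> dict:
    """Hypothetical stand-in for the old system; it handles everything by default."""
    return {"handled_by": "legacy", "path": request["path"]}


def new_payment_service(request: dict) -> dict:
    """Hypothetical stand-in for the first extracted component."""
    return {"handled_by": "payments-v2", "path": request["path"]}


class StranglerRouter:
    """Sends migrated path prefixes to new handlers and everything else to legacy."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, prefix: str, handler: Handler) -> None:
        """Start routing requests under `prefix` to the new handler."""
        self._migrated[prefix] = handler

    def rollback(self, prefix: str) -> None:
        """Send a prefix back to the legacy system if the new component misbehaves."""
        self._migrated.pop(prefix, None)

    def handle(self, request: dict) -> dict:
        path = request["path"]
        # Longest matching prefix wins, so a narrow migration overrides a broad one.
        for prefix in sorted(self._migrated, key=len, reverse=True):
            if path.startswith(prefix):
                return self._migrated[prefix](request)
        return self._legacy(request)


router = StranglerRouter(legacy_monolith)
router.migrate("/payments", new_payment_service)
```

The detail worth noticing is `rollback`: because the old system is never switched off, moving a slice of traffic back is a one-line change rather than an emergency.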
It’s slower than a rewrite sounds. It’s faster than a rewrite actually takes.
Invest in the boundaries
Often, the most painful part of a legacy system isn’t the code itself - it’s the coupling. Everything depends on everything else. Change one thing, break three others.
Before rewriting anything, invest in creating clean interfaces between components. Add an API layer between the frontend and backend. Extract shared database access into services. Once you have boundaries, you can replace individual pieces without touching the rest.
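As a rough sketch of what a boundary can look like in code, the Python example below puts a small interface in front of legacy data access. Everything here is assumed for illustration: `CustomerStore`, both implementations, the `EMAIL_ADDR` column, and `build_receipt_header` are hypothetical, and a plain dict stands in for the legacy database.

```python
from typing import Dict, Protocol


class CustomerStore(Protocol):
    """The boundary: callers depend on this interface, not on legacy tables."""

    def get_email(self, customer_id: int) -> str: ...


class LegacyCustomerStore:
    """Adapter around the old data access (a dict stands in for legacy rows)."""

    def __init__(self, rows: Dict[int, dict]) -> None:
        self._rows = rows

    def get_email(self, customer_id: int) -> str:
        # Legacy quirk stays behind the boundary: padded, mixed-case values.
        return self._rows[customer_id]["EMAIL_ADDR"].strip().lower()


class InMemoryCustomerStore:
    """A replacement implementation that can be swapped in later."""

    def __init__(self) -> None:
        self._emails: Dict[int, str] = {}

    def add(self, customer_id: int, email: str) -> None:
        self._emails[customer_id] = email

    def get_email(self, customer_id: int) -> str:
        return self._emails[customer_id]


def build_receipt_header(store: CustomerStore, customer_id: int) -> str:
    """Calling code only knows about the interface."""
    return f"Receipt for {store.get_email(customer_id)}"
```

Once callers like `build_receipt_header` depend only on the interface, replacing the legacy implementation no longer means touching every place that reads a customer record.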
Fix the feedback loops first
Many “we need a rewrite” conversations are actually “our development process is broken” conversations in disguise. If deploys take an hour, automate them with a CI/CD pipeline. If you’re afraid to change code, add tests to the areas you change most. If onboarding takes months, write documentation for the critical paths.
These improvements compound. A team that can deploy in minutes and has confidence in their test suite can improve a legacy codebase surprisingly fast - without the risk of a rewrite.
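One way to add tests to code nobody fully understands is a characterization test: record what the code does today, then fail whenever that behaviour changes. The Python sketch below shows the idea under loud assumptions. `legacy_shipping_fee`, its special case for Norway, and the helper names `record` and `diff` are all invented for this example, and a real project would typically store the snapshot in a file and run the comparison from its test runner.

```python
import json
from typing import Callable, Dict, List, Sequence, Tuple

Case = Tuple[float, str]


def legacy_shipping_fee(weight_kg: float, country: str) -> float:
    """Hypothetical legacy function with an unexplained special case."""
    fee = 5.0 + 1.5 * weight_kg
    if country == "NO" and weight_kg > 10:
        fee += 12.0  # nobody remembers why; the snapshot pins it anyway
    return round(fee, 2)


def refactored_shipping_fee(weight_kg: float, country: str) -> float:
    """A 'cleaned up' version that silently dropped the special case."""
    return round(5.0 + 1.5 * weight_kg, 2)


def record(func: Callable[..., float], cases: Sequence[Case]) -> Dict[str, float]:
    """Capture current behaviour as a snapshot keyed by the JSON-encoded inputs."""
    return {json.dumps(case): func(*case) for case in cases}


def diff(snapshot: Dict[str, float], func: Callable[..., float]) -> List[str]:
    """Return the inputs for which `func` no longer matches the snapshot."""
    changed = []
    for key, expected in snapshot.items():
        if func(*json.loads(key)) != expected:
            changed.append(key)
    return changed


CASES: List[Case] = [(2.0, "DE"), (12.0, "NO"), (12.0, "SE")]
SNAPSHOT = record(legacy_shipping_fee, CASES)
```

Calling `diff(SNAPSHOT, refactored_shipping_fee)` flags the heavy Norwegian parcel, which is exactly the kind of undocumented behaviour a rewrite tends to lose. The test does not say whether the special case is correct, only that it exists and that removing it is a decision rather than an accident.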
Know when the rewrite is actually right
To be clear: sometimes a rewrite is the correct call. If the system is on a platform that’s literally end-of-life and unsupported, if the team has zero knowledge of the original language and can’t hire for it, or if the system is small enough that a rewrite is genuinely a few weeks of work - then yes, rewrite.
The test is simple. If you can list every feature, every edge case, and every integration the current system handles, and the rewrite timeline is under three months, it might work. If you’re estimating in quarters and the spec is “everything it currently does, but better” - you’re walking into the trap.
The uncomfortable truth
The desire to rewrite is usually an emotional response to frustration, dressed up in technical arguments. And frustration is valid - working in a painful codebase is genuinely demoralising.
But the answer to “this code is hard to work with” is almost never “throw it away and start over.” It’s “make it incrementally less hard to work with.” That’s less exciting. It doesn’t look as good in a planning deck. Nobody gets to greenfield a new architecture.
It does, however, actually work. And in software, shipping beats planning every time.