When we started building at Autobutler, nobody planned for the app to be 11 years old and still mostly monolithic. But that's what happened. The Rails app kept working, kept scaling, and kept shipping features. By the time I joined in 2022, we were handling millions of requests daily across a complex two-sided marketplace with payments, logistics, and matching, all in a single codebase.
When I looked at how other companies were doing it, they were either running 50-person teams with separate services or spending all their time on orchestration and debugging network calls. We did neither. This is what worked for us.
Separate ownership before you separate code
The real problem with monoliths isn't codebase size. It's ownership. When nobody owns anything, changes ripple everywhere. Three people touch a payment file, all with different ideas about how it should work, and suddenly you're shipping inconsistent behavior.
We fixed this without touching the deployment model:
- Domain directories. We reorganized the codebase from "app/controllers" and "app/models" to "app/domains/bookings", "app/domains/payments", etc. Every domain owns its models, controllers, and services. It's clear who "owns" what.
- CODEOWNERS file. We added a CODEOWNERS file that required at least one person from each domain to approve changes to their directory. This isn't burdensome; it's clarifying. It made it obvious that changing the booking logic required booking team approval (there's a sketch after this list).
- Public interfaces. Each domain has a public interface file that documents what other parts of the codebase can use. We treat internal implementation details as private, which prevents tight coupling. A sketch of one of these files follows the list.
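To make the ownership mapping concrete, here's roughly what it looks like in a CODEOWNERS file. The team handles and domain names below are illustrative, not our actual setup; the point is that each domain directory maps to exactly one reviewing team.

```
# .github/CODEOWNERS (illustrative handles, not our real teams)
# A change under a domain directory requires a review from that domain's team.
/app/domains/bookings/    @example-org/bookings
/app/domains/payments/    @example-org/payments
/app/domains/logistics/   @example-org/logistics

# Shared plumbing still gets a platform review.
/db/migrate/              @example-org/platform
/config/                  @example-org/platform
```

GitHub and GitLab can both be configured to block merges until the matching code owners approve, which is what turns the boundary from advisory into real.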
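And here's a minimal sketch of a domain's public interface file. The module and method names are invented for illustration; what matters is that other domains call these entry points and nothing deeper.

```ruby
# app/domains/payments/public_interface.rb
# Hypothetical sketch: the only entry points other domains are allowed to call.
# Gateways, ledger models, and retry logic stay private to the payments domain.
module Payments
  module PublicInterface
    module_function

    def charge_booking(booking_id:, amount_cents:, currency:)
      Payments::ChargeBooking.new.call(
        booking_id: booking_id,
        amount_cents: amount_cents,
        currency: currency
      )
    end

    def refund(payment_id:, reason: nil)
      Payments::Refund.new.call(payment_id: payment_id, reason: reason)
    end
  end
end
```

Plain Ruby doesn't enforce this at runtime; we rely on review and convention, though tools like Packwerk exist if you want automated boundary checks.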
This is what you get from an organizational boundary without paying the cost of a service boundary. No network calls, no distributed transactions, but clarity about who can change what.
A real example
We caught a bug where the payments team was adding a new field to bookings without checking with the bookings team. With CODEOWNERS, this now requires approval. Would we have caught it eventually? Maybe. But we caught it immediately, before it shipped.
Optimize the feedback loop first, not the architecture
When people say "our monolith doesn't scale," they often mean "I wait 20 minutes for tests to run" or "I can't see what's broken without looking at 15 different services." It's not usually a throughput problem. It's a velocity problem.
We made three investments that matter way more than breaking up the code:
- Fast tests. We split our test suite by domain. Booking tests don't run payment tests. Payment tests don't run notification tests. We went from one huge test suite that took 45 minutes to parallel domain runs that finish in 6-8 minutes each. This alone cut deploy cycle time in half (the rake-task sketch after this list shows the shape).
- Preview environments. Every pull request gets a full replica of production (database, Redis, the works) for 30 minutes. Engineers can actually see their changes in context. This caught a thousand bugs that wouldn't show up in local dev.
- Observability that actually works. We added error tracking (we use Sentry), logs structured by domain, and a dead-simple dashboard that shows "what changed in the last 10 minutes?" When something breaks, you know immediately which deploy caused it and which domains are affected. The domain-tagging sketch after this list shows one way to wire that up.
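A sketch of how the per-domain test split can work, assuming each domain keeps its specs under spec/domains/<name>. The rake task below is illustrative, not our actual CI config; the real win is that CI runs one job per domain in parallel.

```ruby
# lib/tasks/domain_specs.rake
# Hypothetical sketch: one spec task per domain so CI can fan them out in parallel.
DOMAINS = %w[bookings payments logistics notifications].freeze

namespace :spec do
  namespace :domain do
    DOMAINS.each do |domain|
      desc "Run only the #{domain} specs"
      task domain do
        sh "bundle exec rspec spec/domains/#{domain}"
      end
    end
  end
end
```

Each CI job then invokes a single spec:domain:* task, so wall-clock time is the slowest domain's suite rather than the sum of all of them.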
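For the "structured by domain" part, one lightweight approach (a sketch under assumptions, not necessarily how our dashboard is wired) is to tag every request's logs and Sentry events with the owning domain, derived from the controller's namespace.

```ruby
# app/controllers/concerns/domain_tagging.rb
# Hypothetical sketch: tag logs and error reports with the owning domain
# so "what broke in the last 10 minutes, and whose is it?" is one query.
module DomainTagging
  extend ActiveSupport::Concern

  included do
    around_action :tag_with_domain
  end

  private

  def tag_with_domain
    # Payments::InvoicesController => "payments"
    domain = self.class.module_parent_name&.underscore || "unknown"
    Sentry.set_tags(domain: domain) if defined?(Sentry)
    Rails.logger.tagged("domain=#{domain}") { yield }
  end
end
```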
Teams that pull apart their monolith without fixing these things end up with slow tests and bad observability across ten services instead of one. That's not an improvement.
When you actually need to split the monolith (hint: it's organizational)
After we got ownership and tooling right, we asked: do we need to split this? The answer was: not yet. And we're four years past the point where conventional wisdom said we'd have to.
Here's the signal we actually pay attention to: if two teams cannot move independently within the same codebase, then you have a problem. Not because of the code—because of the coordination cost. One team's deploy can break another team's code. Changes require complex coordination. The deployment process becomes a bottleneck.
If that's not happening, don't split.
The pragmatic parts that actually matter
Beyond ownership and tooling, a few technical patterns buy you a lot of headroom:
- Async jobs for slow work. Anything that takes more than 200ms gets kicked to a background job, so the user doesn't wait. We use Sidekiq with straightforward retry logic. This is boring Rails stuff that works (sketched after this list).
- Read replicas for specific queries. We have a replica that serves all analytics and reporting reads. The main database handles transactions. You don't need separate services for this (there's a replica-routing sketch below).
- Caching like you mean it. Fragment caching for views, low-level caching for expensive queries, whole-object caching for data that doesn't change. Most Rails apps can double their throughput with better caching (an example follows the list).
- Database constraints and indexes. Foreign key constraints enforced by the database itself, not just validations in the ORM. Indexes on the columns you actually filter by. This isn't glamorous, but it prevents a thousand bugs. There's a migration sketch after the list.
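A sketch of the background-job pattern, assuming Sidekiq 6.3+ (older versions spell the include Sidekiq::Worker); the job, mailer, and queue names are made up.

```ruby
# app/domains/notifications/jobs/send_receipt_job.rb
# Hypothetical sketch: anything slow (email, PDFs, third-party calls)
# runs here so the request that triggered it returns immediately.
class SendReceiptJob
  include Sidekiq::Job # Sidekiq::Worker on older versions

  sidekiq_options queue: "notifications", retry: 5

  def perform(booking_id)
    booking = Booking.find(booking_id)
    ReceiptMailer.with(booking: booking).receipt_email.deliver_now
  end
end

# At the call site, instead of doing the work inline:
#   SendReceiptJob.perform_async(booking.id)
```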
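The read-replica routing is built into Rails 6+ multi-database support, so there's no service split involved. A sketch, assuming a replica entry exists in config/database.yml next to the primary:

```ruby
# app/models/application_record.rb
# Sketch: route writes to the primary and opt-in reads to the replica.
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  connects_to database: { writing: :primary, reading: :replica }
end

# Reporting and analytics code opts into the replica explicitly:
ApplicationRecord.connected_to(role: :reading) do
  # Heavy aggregate queries run here and never touch the primary.
  Booking.where(created_at: 30.days.ago..).group(:status).count
end
```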
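And a sketch of the low-level caching, using Rails.cache.fetch with a short TTL; the model, cache key, and expiry are illustrative.

```ruby
# Hypothetical sketch: cache an expensive aggregate instead of recomputing it
# on every request. Views get the same treatment with the built-in `cache`
# helper wrapped around slow partials (fragment caching).
class MechanicStats
  def self.for_region(region_id)
    Rails.cache.fetch(["mechanic_stats", region_id], expires_in: 15.minutes) do
      # Only runs on a cache miss; the result is stored for 15 minutes.
      Mechanic.where(region_id: region_id).group(:status).count
    end
  end
end
```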
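Finally, the constraints-and-indexes point as a plain migration sketch; the table and column names are invented.

```ruby
# db/migrate/20240101000000_add_booking_constraints.rb
# Hypothetical sketch: let the database itself enforce integrity,
# and index the columns queries actually filter and join on.
class AddBookingConstraints < ActiveRecord::Migration[7.0]
  def change
    add_foreign_key :payments, :bookings   # orphaned payments become impossible
    add_index :payments, :booking_id
    add_index :bookings, [:mechanic_id, :status]
  end
end
```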
None of this requires rewriting your architecture. It's just being intentional about how you use the tools.
The honest part: some teams do need to split
If you've got 50+ engineers working on the same Rails app, coordination costs get real. Different teams deploy at different velocities. Testing becomes expensive because everything affects everything. At that scale, service boundaries make sense.
But most companies never get there. And the ones that do usually wish they'd made the organizational boundaries clearer before they split the code. Because splitting the code doesn't magically make coordination easier—you just have network calls on top of the same organizational problems.
If you're dealing with a Rails app that feels like it's falling apart at the seams, the answer probably isn't a rewrite. Talk to us and we can walk through what's actually slowing you down.