Why I stopped mass-applying "best practices" and started asking "best for what"
Microservices, event sourcing, CQRS, DDD. I had been cargo-culting patterns from companies 100x our size. The question “who benefits from this complexity?” changed my architectural decisions.
In my first year as Head of Engineering at FinanceOps, I made architectural decisions by looking at what successful companies did and copying their patterns. Netflix uses microservices, so I considered breaking our monolith into services. Stripe uses event sourcing, so I started designing our reconciliation engine with event sourcing. Various conference talks praised CQRS, so I separated our read and write models. Twelve months later, I had built a system that was architecturally impressive and operationally miserable.
This phase is where the title finally started to feel expensive. It also builds on what I learned earlier in “Learning to delegate when every task feels faster to do yourself.” Hiring, planning, founder conversations, and bad weeks in production all piled into the same calendar. A lot of the systems thinking I kept in lifeos and flowscape showed up here too: clarity is not paperwork, it is how you stop uncertainty from leaking into people.
The Cargo Cult Trap
The problem with “best practices” is that the label implies universality. If something is a best practice, it should be best for everyone. But architectural patterns are not universal truths. They are solutions to specific problems at specific scales with specific constraints. Microservices solve the problem of deploying and scaling independent components when you have dozens of teams. Event sourcing solves the problem of auditing every state change when regulatory requirements demand it. CQRS solves the problem of optimizing read and write patterns independently when they have dramatically different performance characteristics.
We had four engineers, one deployment pipeline, one database, and one team. We did not have the problem that microservices solve. We did not have the regulatory requirement that event sourcing addresses. Our read and write patterns were not different enough to justify separate models. I had adopted solutions to problems we did not have, and each solution brought operational complexity that we absolutely did have.
The event-sourced reconciliation engine was the worst offender. Instead of updating a row in a table when a transaction matched, we appended an event to an event store and projected the current state into a read model. This meant every read query required replaying events or maintaining a separate materialized view. When the materialized view fell out of sync, which happened roughly once a week, we had to rebuild it from the event stream. Debugging production issues required understanding both the event stream and the projection logic. A simple “why does this transaction show the wrong status” question turned into a two-hour event replay investigation.
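To make the read-path cost concrete, here is a deliberately simplified in-memory sketch, not our real schema; the names are illustrative. The point is structural: an event-sourced read has to replay history (or trust a projection that can drift), while a plain table read is a single lookup.

```typescript
// Illustrative only: in-memory stand-ins for an event store and the
// plain table that eventually replaced it. Names are hypothetical.
type MatchEvent = { txId: string; status: "matched" | "unmatched" };

const eventStore: MatchEvent[] = [];

// Event-sourced read: the current status only exists by replaying every
// event for the transaction (or via a projection that can fall out of sync).
function statusViaReplay(txId: string): string | undefined {
  let status: string | undefined;
  for (const e of eventStore) {
    if (e.txId === txId) status = e.status;
  }
  return status;
}

// Plain-table read: one row holds the current state; nothing to replay,
// nothing to rebuild when it drifts.
const transactions = new Map<string, string>();

function statusViaTable(txId: string): string | undefined {
  return transactions.get(txId);
}

eventStore.push({ txId: "tx-1", status: "unmatched" });
eventStore.push({ txId: "tx-1", status: "matched" });
transactions.set("tx-1", "matched");
// Both report "matched" for tx-1, but only one had to scan history to say so.
```

Every debugging session paid that replay tax twice: once to find the events, and once to check whether the projection agreed with them.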
The Question That Changed Everything
The turning point was a Friday afternoon in March 2025 when Raj and I were debugging yet another materialized view inconsistency. After 90 minutes of tracing events, Raj looked at me and said, “Who actually benefits from this complexity?”
I did not have a good answer. The event sourcing pattern was not serving a regulatory requirement because our compliance team had never asked for it. It was not serving a performance requirement because the read model we were materializing was slower than a direct database query would have been. It was not serving an audit requirement because we had a separate audit logging system that already captured every state change. The event sourcing was serving my desire to build the architecture I had read about, not the architecture our product needed. So we unwound each pattern, one at a time:
- Microservices: reverted to a modular monolith with clear domain boundaries but a single deployable unit
- Event sourcing: replaced with standard PostgreSQL tables with an audit log trigger for compliance
- CQRS: consolidated back to a single model with database views for read-optimized queries
- Domain-Driven Design: kept the domain modeling concepts but dropped the tactical patterns like aggregates and value objects that added boilerplate without value at our scale
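The event-sourcing replacement is worth spelling out, because it shows how little machinery the compliance story actually required. A hedged sketch of what that migration looked like in shape, with hypothetical table and trigger names, not our real schema: a normal UPDATE-able table, plus a trigger that copies every status change into an append-only audit table.

```typescript
// Hypothetical migration text; all identifiers are illustrative.
// Writes go back to ordinary UPDATEs; the trigger gives compliance an
// append-only history without an event store or projections.
const auditMigration = `
CREATE TABLE transaction_audit (
  id         bigserial PRIMARY KEY,
  tx_id      text NOT NULL,
  old_status text,
  new_status text NOT NULL,
  changed_at timestamptz NOT NULL DEFAULT now()
);

CREATE FUNCTION log_status_change() RETURNS trigger AS $$
BEGIN
  INSERT INTO transaction_audit (tx_id, old_status, new_status)
  VALUES (NEW.id, OLD.status, NEW.status);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER transactions_audit
AFTER UPDATE OF status ON transactions
FOR EACH ROW EXECUTE FUNCTION log_status_change();
`;
```

Reads became single-row lookups, writes became single-row updates, and the audit trail maintained itself at the database layer.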
The Decision Framework I Use Now
Before adopting any architectural pattern, I now ask five questions. If I cannot answer all five with specific, concrete answers, the pattern stays on the shelf.
- What specific problem does this pattern solve? Not a category of problem. The exact problem in our system.
- Who experiences that problem today? An actual person on the team or an actual client, not a hypothetical future user.
- What is the operational cost of this pattern? Not just the implementation cost but the ongoing debugging, monitoring, and maintenance cost.
- What is the simplest alternative that addresses the same problem? Could a database index, a caching layer, or a code refactor solve it without a new architectural pattern?
- At what scale does this pattern become necessary? If the answer is 10x or 100x our current scale, defer it.
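The five questions fit in your head, but the gate they form can be sketched as code. This is a toy encoding, the field names are mine and it is not a tool we actually run, but it makes the default visible: any missing answer, any viable simpler alternative, or any "only needed at 10x" answer means defer.

```typescript
// Toy encoding of the five questions; field names are illustrative.
type PatternProposal = {
  problem: string;                   // the exact problem in our system
  affectedToday: string[];           // real people hitting it now
  operationalCost: string;           // ongoing debugging/monitoring burden
  simplerAlternative: string | null; // targeted fix that would also work
  scaleMultipleNeeded: number;       // 1 = needed now, 10 = needed at 10x
};

function decide(p: PatternProposal): "adopt" | "defer" {
  if (!p.problem || p.affectedToday.length === 0) return "defer";
  if (!p.operationalCost) return "defer"; // cost unexamined? not ready
  if (p.simplerAlternative !== null) return "defer";
  if (p.scaleMultipleNeeded >= 10) return "defer";
  return "adopt";
}
```

Run our old event-sourcing decision through it and it defers on three of the five questions at once; that asymmetry is the point. The framework is biased toward "no" because adopting a pattern is reversible only at great cost.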
The simplest-alternative question is the most powerful. Most problems that feel like they need an architectural pattern actually need a targeted fix. Slow queries need indexes, not CQRS. Deployment coupling needs better module boundaries, not microservices. State tracking needs an audit table, not event sourcing.
What I Kept
Not everything was thrown out. Some practices earned their place because they solved real problems we had.
- GitOps with ArgoCD: solved real deployment coordination problems on a team where anyone could deploy
- Kafka for the reconciliation pipeline: solved real async processing needs for batch workloads
- TypeScript strict mode: caught real bugs in financial calculations that loose mode missed
- Structured logging with Loki: solved real debugging needs for distributed processing
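The strict-mode line deserves one concrete illustration. The function below is invented, but the failure mode is real: with `strictNullChecks` off, `Map.get` is typed as returning the value type alone, so a missing fee key slips past the compiler and surfaces as `NaN` in a financial total. Strict mode types the result as `number | undefined` and forces the guard.

```typescript
// Hypothetical fee-summing helper; the guard is what strict mode forces.
function totalFeeCents(fees: Map<string, number>, keys: string[]): number {
  let total = 0;
  for (const key of keys) {
    const fee = fees.get(key); // typed number | undefined under strict mode
    if (fee === undefined) {
      // Without strictNullChecks, `total += fees.get(key)` compiles fine
      // and silently produces NaN when a key is missing.
      throw new Error(`unknown fee code: ${key}`);
    }
    total += fee;
  }
  return total;
}

const fees = new Map([["wire", 2500], ["ach", 25]]);
// totalFeeCents(fees, ["wire", "ach"]) → 2525
```

A `NaN` that reaches a reconciliation report is far more expensive than the thrown error, which is exactly the trade strict mode buys you.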
Operator mode means you inherit every downstream consequence. The code path is only half the story; the other half is how the decision warps planning, trust, and execution speed. I kept relearning that lesson while building portfolio, pipeline-sdk, and dotfiles.
The difference between a best practice and cargo culting is whether you can explain the specific problem it solves for your specific team at your specific scale. If the answer starts with “in case we need to” or “companies like Netflix do it,” you are cargo culting.
Simplifying our architecture was the most productive thing I did in Q1 2025. We removed thousands of lines of code. Debugging time dropped. New engineers ramped up faster because there was less to learn. The system was not less capable. It was more capable because the team could understand it, modify it, and operate it without fighting the complexity tax of patterns designed for companies with ten times our headcount.