portfolio Anshul Bisen

When your SLOs and your sales team disagree, the SLOs lose

An enterprise prospect required 99.99% uptime. Our SLOs were 99.9%. Engineering leaders who refuse to engage with commercial reality get bypassed.

A Fortune 500 enterprise prospect was ready to sign. Seven-figure annual contract. The kind of deal that changes a startup’s trajectory. One problem: their procurement team required 99.99% uptime in the service level agreement. Our internal SLOs were set at 99.9%.

The difference between 99.9% and 99.99% looks small on paper. It is the difference between roughly 8.76 hours of downtime per year and 52.6 minutes. In engineering terms, it is the difference between a reasonably reliable system and one that requires redundancy across availability zones, automated failover, and an on-call rotation that can respond in minutes, not hours.
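The downtime figures follow directly from the availability percentages. A minimal sketch of the arithmetic, assuming a 365-day year and a hypothetical helper name (`allowed_downtime_hours` is illustrative, not something from our codebase):

```python
# Downtime permitted per year by an availability target.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def allowed_downtime_hours(availability_pct: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime an availability target allows over a period."""
    return period_hours * (1 - availability_pct / 100)


three_nines = allowed_downtime_hours(99.9)
four_nines = allowed_downtime_hours(99.99)

print(f"99.9%:  {three_nines:.2f} hours per year")        # 8.76 hours
print(f"99.99%: {four_nines * 60:.1f} minutes per year")  # 52.6 minutes
```

Each additional nine cuts the downtime budget by a factor of ten, which is why the jump from three nines to four changes the architecture and not just the alerting thresholds.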

This is where the real argument actually lives.

Leadership got more concrete for me once I realized release engineering and infrastructure are really trust systems. It also builds on what I learned earlier in “The Prisma-to-Drizzle migration we almost did and why we stayed on Prisma.” The infrastructure stack, ctrlpane, and even my dotfiles all orbit the same idea now: the best teams move fast because the defaults are stable, not because the heroics are impressive.

The infrastructure mess that made the lesson stick.

The Engineering Instinct Is Wrong

My first instinct was to push back: we are at 99.9%, getting to 99.99% is a major infrastructure investment, so let sales negotiate the SLA down. That is the standard engineering leader response, and it is wrong.

It is wrong because engineering does not exist in a vacuum. The sales team is not asking for 99.99% because they enjoy making engineering’s life harder. They are asking because the customer requires it, and the customer requires it because they are processing financial transactions through our platform. For a Fortune 500 company, nearly nine hours of downtime a year translates to millions in unprocessed transactions. Their requirement is reasonable.

Engineering leaders who refuse to engage with commercial reality do not win the argument. They get bypassed. The CEO signs the SLA without engineering input. The sales team learns to stop consulting engineering. And the engineering team ends up contractually committed to a target they had no part in defining.

The Negotiation That Worked

Instead of pushing back, I engaged. I sat down with the sales lead and the prospect’s technical team and had a different conversation. Not “we cannot do 99.99%” but “here is what 99.99% requires, what it costs, and how we get there.”

  • Multi-AZ deployment with automated failover: 6 weeks of engineering work and an estimated $3,200/month increase in infrastructure cost.
  • Reduced deployment windows with blue-green deployment automation: 3 weeks of engineering work, already on our roadmap for Q1.
  • On-call rotation with 5-minute response SLA during business hours: one additional SRE hire, at approximately $150K fully loaded annual cost.
  • Monthly SLA reporting with automated monitoring: 2 weeks of engineering work using existing Grafana infrastructure.

Total investment: roughly 11 weeks of engineering time and $190K in annual recurring cost (about $38K in added infrastructure plus the SRE hire). The contract was worth significantly more than that annually. The math was clear.
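The totals can be checked by adding up the line items. A small sketch using only the figures listed above; the tuple layout is illustrative, and it assumes the SRE hire adds no engineering weeks of its own:

```python
# Line items from the proposal: engineering weeks plus recurring cost in USD.
work_items = [
    # (name, engineering weeks, added monthly cost, added annual cost)
    ("Multi-AZ deployment with automated failover", 6, 3_200, 0),
    ("Blue-green deployment automation", 3, 0, 0),
    ("On-call rotation (one additional SRE)", 0, 0, 150_000),
    ("Monthly SLA reporting", 2, 0, 0),
]

total_weeks = sum(weeks for _, weeks, _, _ in work_items)
annual_recurring = sum(monthly * 12 + annual
                       for _, _, monthly, annual in work_items)

print(total_weeks)       # 11
print(annual_recurring)  # 188400, which rounds to the $190K quoted
```

Putting the numbers in this shape is what made the conversation with sales possible: the cost side of the deal became a figure that could be compared directly against the contract value.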

Where We Pushed Back

We did not accept 99.99% for everything. The negotiation focused on specificity:

  • Payment processing API: 99.99% uptime. This is the customer-facing critical path.
  • Reconciliation reporting: 99.9% uptime. Reports can be delayed without impacting transactions.
  • Admin dashboard: 99.5% uptime. Internal tooling does not need the same guarantees as the transaction pipeline.
  • Maintenance windows: excluded from uptime calculations, provided we give 48 hours’ advance notice.

By segmenting the SLA by service tier, we committed to 99.99% only where it mattered and maintained more relaxed targets everywhere else. The customer’s procurement team accepted this because their actual concern was transaction processing reliability, not dashboard availability.
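The tiering and the maintenance exclusion can be expressed as a small check. This is a sketch under stated assumptions: the service keys are hypothetical names, `downtime_minutes` means unplanned downtime outside announced maintenance, and excluded maintenance is removed from the measured period entirely. The actual contract language may define the calculation differently.

```python
# Uptime targets per service tier, taken from the tiers above.
# The dictionary keys are hypothetical service names.
SLA_TARGETS = {
    "payment_api": 99.99,
    "reconciliation_reporting": 99.9,
    "admin_dashboard": 99.5,
}


def measured_uptime_pct(period_minutes: float,
                        downtime_minutes: float,
                        excluded_maintenance_minutes: float = 0.0) -> float:
    """Uptime percentage with announced maintenance removed from the period."""
    counted = period_minutes - excluded_maintenance_minutes
    if counted <= 0:
        raise ValueError("maintenance cannot cover the whole period")
    return 100 * (counted - downtime_minutes) / counted


def meets_sla(service: str,
              period_minutes: float,
              downtime_minutes: float,
              excluded_maintenance_minutes: float = 0.0) -> bool:
    """True if the measured uptime reaches the tier's target."""
    uptime = measured_uptime_pct(period_minutes, downtime_minutes,
                                 excluded_maintenance_minutes)
    return uptime >= SLA_TARGETS[service]


# A 30-day month with a 60-minute announced maintenance window
# and 10 minutes of unplanned downtime.
MONTH = 30 * 24 * 60
print(meets_sla("payment_api", MONTH, 10, 60))               # False
print(meets_sla("reconciliation_reporting", MONTH, 10, 60))  # True
```

The same ten minutes of downtime breaches the payment tier and sits comfortably inside the reporting tier, which is the practical payoff of not promising four nines everywhere.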

What I Learned

The experience changed how I think about the relationship between engineering and sales. Before this deal, I viewed SLA negotiations as sales making promises engineering had to keep. After this deal, I view them as a collaboration where engineering quantifies the cost and sales quantifies the value.

Alignment usually looks like constraint made explicit.

By the time I wrote this, the lesson was bigger than the tool or incident. The job had become setting defaults a team could trust, then proving those defaults in systems like ftryos and pipeline-sdk. That is leadership work, not just technical taste.

When SLOs and sales disagree, the right response is not to defend the SLOs. It is to price the gap. Engineering leaders who can translate reliability requirements into cost and timeline become strategic partners. Those who just say no become obstacles.

We signed the deal. The infrastructure improvements we made for this one customer benefited every customer. Our actual uptime improved across the board. The SRE hire made our on-call rotation sustainable. The blue-green deployment automation increased our deployment frequency.

The enterprise customer did not just buy our product. They bought the forcing function that made our infrastructure better. That is the part nobody talks about when they complain about enterprise requirements. Sometimes the demanding customer is doing you a favor by making you build what you should have built anyway.

The disagreement between SLOs and sales targets is a proxy for a deeper organizational tension: engineering optimizes for reliability while sales optimizes for feature velocity. Resolving the tension requires making the tradeoff explicit at the leadership level instead of letting it play out as passive-aggressive ticket prioritization between teams.

We built a shared dashboard that showed both SLO burn rate and feature delivery velocity on the same screen. When leadership could see both metrics together, the conversations shifted from blame to prioritization. The SLO framework did not eliminate the tension. It made the tension productive by giving both sides a shared language and a shared view of the data, and that shared view is what changed how we prioritized work.
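For readers unfamiliar with burn rate, here is one common definition: the observed error rate divided by the error rate the SLO allows. This is a minimal sketch, not the query behind our dashboard; the function name and the request counts are made up for illustration.

```python
def burn_rate(bad_events: int, total_events: int, slo_pct: float) -> float:
    """Observed error rate divided by the error rate the SLO permits.

    A value of 1.0 spends the error budget exactly over the SLO period;
    values above 1.0 exhaust it early.
    """
    if total_events == 0:
        return 0.0  # no traffic in the window, so nothing was burned
    error_rate = bad_events / total_events
    error_budget = 1 - slo_pct / 100
    return error_rate / error_budget


# Hypothetical window: 30 failed requests out of 100,000 against a 99.99% SLO.
rate = burn_rate(bad_events=30, total_events=100_000, slo_pct=99.99)
print(f"{rate:.1f}x")  # 3.0x: the budget is being spent three times too fast
```

A number like this is easy to place next to a delivery metric, which is what lets a leadership conversation weigh reliability work against feature work on the same terms.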