Designing a multi-tenant PostgreSQL schema that does not make you cry at 3am
Shared database, shared schema, row-level security. That was the bet. Here is how it played out.
When you build a B2B fintech product, multi-tenancy is not optional. Every client expects their data to be isolated, their queries to be fast, and their neighbors to be invisible. The architecture decision that determines all of this is the tenancy model you choose at the database level. Get it wrong early and you are rewriting your entire data layer six months later. I spent a full week evaluating three approaches before writing a single migration.
In that first stretch at FinanceOps, I was still learning how to wear the Head of Engineering title without hiding behind it. This post also builds on what I learned earlier in “What nobody tells you about building fintech: compliance is a product feature.” The only credibility that mattered was whether the decision survived contact with real money, ugly edge cases, and the next person I would eventually hire. That same bias toward strict boundaries later shaped how I approached ftryos and pipeline-sdk: make correctness boring before you make the API clever.
The Three Models and Why I Rejected Two
There are exactly three ways to do multi-tenancy in PostgreSQL, and the internet will argue endlessly about which one is correct. Here is what I actually evaluated.
- Database-per-tenant: Every client gets their own PostgreSQL database. Maximum isolation, nightmare operations. Managing connections, migrations, and backups across dozens of databases with a team of one is not feasible. I rejected this in under an hour.
- Schema-per-tenant: Every client gets their own PostgreSQL schema within a shared database. Good isolation, but migrations must run against every schema independently. Connection pooling becomes complicated because each connection needs to SET search_path before every query. At 50 tenants this is manageable. At 500, it breaks.
- Shared schema with row-level security: All tenants share the same tables. A tenant_id column on every table, enforced by RLS policies. One migration path, one connection pool, one set of indexes. Isolation enforced by the database engine itself, not by application code.
I chose shared schema with RLS because it gives us operational simplicity at scale while enforcing isolation at the database level rather than trusting application code to always remember the WHERE clause.
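Concretely, the shared-schema model means every tenant-scoped table carries a tenant_id column and indexes it first. A sketch of what that looks like for the payments table (column names beyond tenant_id are illustrative, not our exact schema):

```sql
-- Every tenant-scoped table carries tenant_id; the composite primary key
-- means tenant_id leads every index lookup.
CREATE TABLE payments (
    id           UUID        NOT NULL DEFAULT gen_random_uuid(),
    tenant_id    UUID        NOT NULL,
    amount_cents BIGINT      NOT NULL,
    created_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (tenant_id, id)
);
```

Leading with tenant_id keeps every per-tenant query on an index path, which matters for the performance numbers later in this post.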
The RLS Implementation
Every table in our schema includes a tenant_id column. Every query in our system is automatically scoped to the current tenant by PostgreSQL itself. The application sets the tenant context once at the start of each request, and RLS policies do the rest.
```sql
-- Enable RLS on the payments table
ALTER TABLE payments ENABLE ROW LEVEL SECURITY;

-- Policy: tenants can only see their own rows
CREATE POLICY tenant_isolation ON payments
  USING (tenant_id = current_setting('app.current_tenant_id')::UUID);

-- Force RLS even for table owners
ALTER TABLE payments FORCE ROW LEVEL SECURITY;
```

The middleware sets the tenant context using a PostgreSQL session variable at the beginning of every database transaction. This is the only place in the application that touches tenant scoping. Every query after that point is automatically filtered.

```typescript
// Middleware: set tenant context for every request
async function setTenantContext(tenantId: string) {
  await db.execute(
    sql`SELECT set_config('app.current_tenant_id', ${tenantId}, true)`
  );
}
```

The true parameter in set_config makes the setting transaction-local. It automatically resets when the transaction ends. No risk of tenant context leaking between requests, even with connection pooling.
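To see why the transaction-local flag matters with pooled connections, here is a toy in-memory model of the set_config semantics. The FakeSession class is illustrative, not a real driver API; it only mimics the rule that transaction-local settings are discarded when the transaction ends, while session-level settings would survive on a reused pooled connection.

```typescript
// Toy model of PostgreSQL session settings: transaction-local values
// (set_config(..., true)) vanish at COMMIT/ROLLBACK; session-level ones persist.
class FakeSession {
  private session = new Map<string, string>();
  private txLocal: Map<string, string> | null = null;

  begin(): void {
    this.txLocal = new Map();
  }

  // isLocal mirrors set_config's third argument
  setConfig(key: string, value: string, isLocal: boolean): void {
    if (isLocal) {
      if (!this.txLocal) throw new Error("local set_config outside a transaction");
      this.txLocal.set(key, value);
    } else {
      this.session.set(key, value);
    }
  }

  currentSetting(key: string): string | undefined {
    return this.txLocal?.get(key) ?? this.session.get(key);
  }

  commit(): void {
    this.txLocal = null; // transaction-local settings are discarded here
  }
}

const conn = new FakeSession();
conn.begin();
conn.setConfig("app.current_tenant_id", "tenant-a", true);
console.log(conn.currentSetting("app.current_tenant_id")); // "tenant-a" inside the tx
conn.commit();
console.log(conn.currentSetting("app.current_tenant_id")); // undefined: nothing leaks to the next request
```

Had the middleware passed false instead, the tenant id would stick to the physical connection and the next request served from the pool could inherit the wrong tenant.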
The Batch Operation Edge Case That Almost Broke Everything
RLS worked flawlessly for three months. Then I built the nightly reconciliation job. This job processes payments across all tenants in a single batch for efficiency. It runs as a system process, not as any specific tenant. And because RLS was enforced even for the table owner, the system process could not see any rows at all.
The naive fix would be to disable RLS for the system role. That terrified me. A single misconfigured connection could expose every tenant’s data. Instead, I created a dedicated system role with the BYPASSRLS attribute, carefully scoped grants, and a separate connection pool that only the batch job can use.
```sql
-- System role for batch operations
CREATE ROLE financeops_system BYPASSRLS;
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO financeops_system;
```

- The system role connection string is stored separately from the application connection string.
- Only batch jobs running on the internal network can use the system pool.
- Every batch operation logs the system role usage for audit compliance.
- The system role has no DELETE permission anywhere. Batch jobs never delete data.
This was the hardest design decision in the entire tenancy model. The principle is clear: RLS should be the default for everything, and bypass should require explicit, audited, minimal-privilege access. Any shortcut here is a data breach waiting to happen.
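Here is a sketch of how that gating can look in application code. The pool names, the job registry, and the audit log are illustrative assumptions, not the exact FinanceOps implementation; the point is that the bypass pool is unreachable unless a caller identifies itself as a registered batch job, and every use leaves an audit record.

```typescript
// Sketch: hand out the BYPASSRLS connection pool only to registered batch
// jobs, and record every use for the audit trail.
type Pool = { name: string };

const tenantPool: Pool = { name: "app_pool" };    // RLS-enforced application role
const systemPool: Pool = { name: "system_pool" }; // BYPASSRLS role, separate credentials

const registeredBatchJobs = new Set(["nightly_reconciliation"]);
const auditLog: string[] = [];

function getPool(jobName?: string): Pool {
  if (!jobName) return tenantPool; // normal request path: always RLS-scoped
  if (!registeredBatchJobs.has(jobName)) {
    throw new Error(`job ${jobName} is not allowed to bypass RLS`);
  }
  auditLog.push(`${new Date().toISOString()} system pool used by ${jobName}`);
  return systemPool;
}
```

A request handler never passes a job name, so it can only ever reach the RLS-enforced pool. The bypass path is opt-in, allow-listed, and auditable by construction.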
Performance Considerations
The concern I hear most often about RLS is performance. Adding a WHERE clause to every query must be slow, right? In practice, with a proper composite index on (tenant_id, primary_key) the overhead is negligible. Our benchmark showed less than 2% query time increase for RLS-filtered queries compared to manual WHERE clauses.
The real performance win is that RLS prevents accidental full-table scans. If a developer forgets a tenant filter in a manual WHERE clause, the query scans every row in the table. With RLS, the planner always includes the tenant filter, so even a sloppy query is scoped correctly. The database protects you from your own mistakes at 3am when you are debugging a production issue and writing ad-hoc queries.
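The difference is easy to demonstrate with a small in-memory analogy (hypothetical data and function names, not our actual query layer): a raw query with a forgotten tenant filter returns every tenant's rows, while an RLS-style wrapper scopes even a sloppy query.

```typescript
// In-memory analogy for RLS: every query runs through a wrapper that
// injects the tenant filter, so a forgotten WHERE clause cannot cross tenants.
interface Payment { tenantId: string; amountCents: number }

const payments: Payment[] = [
  { tenantId: "tenant-a", amountCents: 1000 },
  { tenantId: "tenant-a", amountCents: 2500 },
  { tenantId: "tenant-b", amountCents: 9900 },
];

// A manual query with a forgotten tenant filter: sees every row.
function rawQuery(predicate: (p: Payment) => boolean): Payment[] {
  return payments.filter(predicate);
}

// The RLS analogy: the tenant filter is applied no matter what the query says.
function rlsQuery(tenantId: string, predicate: (p: Payment) => boolean): Payment[] {
  return payments.filter((p) => p.tenantId === tenantId && predicate(p));
}

const sloppy = (p: Payment) => p.amountCents > 0; // "forgot" the tenant filter
console.log(rawQuery(sloppy).length);             // 3: all tenants exposed
console.log(rlsQuery("tenant-a", sloppy).length); // 2: scoped automatically
```

In real PostgreSQL the scoping happens in the planner rather than in application code, which is exactly why it also covers the ad-hoc queries you type by hand at 3am.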
Looking back, this is one of those builder-phase decisions that bought me leadership credibility before I had any leadership title equity. I was still proving I could be trusted with the boring, consequential calls. That instinct carried straight into ftryos and pipeline-sdk.
Multi-tenancy is not a feature you add to an existing schema. It is a foundation you build everything on. Choose your model in week one and design every table, every index, and every query around it.