Obsidian·RegTech / Reporting·April 2025

Multi-Tenant Reporting Platform with Row-Level Security

Jurisdiction-precise reports for 180+ tenants in DACH, with guaranteed data isolation at the DB level.

180+
Tenants
12k
Reports/Month
<200ms
Query P95
DB-Level
Isolation
ReactNode.jsTimescaleDBAWSRLS
Engineers
Sarah Chen · Christopher Schmitt
Duration
11 weeks · schema-to-scale
Shape
2 audit weeks + 3 build sprints
Context
Frankfurt · RegTech

The Problem

Obsidian delivered regulatory reports to 180+ financial institutions in DACH — banks, asset managers, insurers. Every tenant had different jurisdictions (DE, AT, CH, LI, LU), different reporting standards (BaFin, FMA, FINMA), different data depths.

The existing platform was single-tenant replicated: every new customer = separate DB instance, separate deployment pipeline, separate monitoring. Onboarding time: three weeks. Cost per tenant: four-figure monthly, just for infrastructure.

Management had two pains: scaling (every new customer produced linear cost) and compliance fear (if a data leak happened anywhere in that web, it would be the end). The quotes coming in: either an expensive enterprise multi-tenant suite for six figures per year, or a complete custom rewrite with a consulting army.

We proposed a third path: Postgres-native Row-Level Security.

Discovery: Why RLS and not application-layer?

In week 1 we analyzed the existing platform. Every query already had a WHERE tenant_id = ? filter — well-intended, but enforced in the application layer. A forgotten filter in a new endpoint = data leak. A hacked session token = data from other tenants readable.

We evaluated three options:

  1. Physical isolation — as before, one DB per tenant. Secure, but expensive and slow
  2. Application-layer tenancy — the current model, cleaned up. Cheap but fragile
  3. DB-level RLS — Postgres-native Row-Level Security. Secure like option 1, cheap like option 2

Option 3 had a risk: most teams have no experience with RLS. Debug sessions can get confusing ("why do I see no data?" — "because tenant context is missing"). But once cleanly set up, isolation is physically impossible to bypass, not just conventional.

Architecture

180+ tenants
  ↓
React dashboard (white-label per tenant)
  ↓
Node.js API
  ↓ SET app.tenant_id = <from JWT>
  ↓
Postgres with RLS policies per table
  ↓ USING (tenant_id = current_setting('app.tenant_id')::uuid)
  ↓
TimescaleDB hypertables (time series)
  ↓
EXCLUDE constraints (no overlapping periods)

RLS policy example:

CREATE POLICY tenant_isolation ON jurisdiction_reports
  USING (tenant_id = current_setting('app.tenant_id')::uuid);

Every query from the backend sets the tenant context at the start of the connection. Postgres then automatically filters every WHERE clause — forgetting is impossible. If someone gets direct DB access (which they shouldn't), they can still only see the scope of the set tenant.

"BaFin audit passed on the first attempt. That was the number-one fear before — that we'd be sent back after two months of review. Now we have a mechanism that isn't just compliance-conform, it's compliance-by-default." — CTO, Obsidian

Build: eleven weeks, focused on migration safety

Weeks 1–2 — Schema Audit Analyze every table: does tenant_id already exist? Where not, how do we add it cleanly? Which tables must definitely have RLS (reports, customer data) vs. which can stay shared (master data, locales)?

Weeks 3–5 — RLS rollout on new DB Schema build with RLS from day one. TimescaleDB extensions for time-series hypertables. EXCLUDE constraints for jurisdiction + period (no duplicate reports per client+period). Testing with synthetic data from multiple tenants — could any tenant see data from another? Answer: no, no matter how we tried.

Weeks 6–8 — Per-tenant migration Parallel run: new + old stack, data in both systems. Tenant-by-tenant migration with read comparison. If reports came out 100% identical: cut over. On deviations: analyze delta (mostly null handling in old data).

Weeks 9–11 — White-label dashboards + monitoring React dashboard with per-tenant themes (logo, colors, accent). Admin console for internal support. Monitoring: RLS violations as critical alerts (should never happen, but if it does → immediate escalation).

Result

  • 180+ tenants on one DB — previously 180+ separate instances
  • Onboarding 3 weeks → 15 minutes — a new tenant = INSERT into a table
  • 12,000 reports/month — P95 query latency under 200ms
  • Zero data-leak incidents in 9 months of production
  • BaFin audit passed first attempt

What we learned

RLS is underrated. Many teams write their own tenant isolation in the application layer — fragile, error-prone, and hard to audit. Postgres-native RLS moves isolation to where it belongs: into the engine that holds the data. Audit explanation becomes trivial: "Postgres RLS policies are enforced by the DB engine, not by application code."

Performance was not an issue — contrary to prejudice. RLS has a reputation as "slow" because many benchmarks test edge cases. With correct indexing (always tenant_id first in composite index) overhead is < 5%. At our 12k reports/month: unmeasurable.

EXCLUDE constraints are gold. That was our discovery during the build: Postgres can guarantee via EXCLUDE USING gist that two reports for the same tenant + same jurisdiction + overlapping periods cannot coexist. The application used to do that with complex logic — now the DB handles it atomically.

Monitor before migrating. The most important tip: before you turn on RLS in production, set up alerts that log every query without tenant_context. We had a test endpoint in week 2 of the migration that ran without context. Alert pinged immediately. Fix in 10 minutes. Without the alert we would have missed it.

Similar project?

Let's talk about it.

Start a project