Sometimes Postgres Isn't the Answer: Pomerium's New File-Based Storage Backend

blog

Pomerium v0.31 introduces a new file-based, Raft-clustered storage backend for its Databroker, offering a simpler and more scalable alternative to Postgres.

Pomerium v0.31 is a significant release, introducing a custom local file storage backend with a clustered mode implemented via Raft. This new architecture provides automatic failover and recovery, serving as a powerful alternative to our existing Postgres backend. Common wisdom holds that Postgres is sufficient for most tasks, warning against the perceived "over-engineering" of modern microservice systems. Yet we found ourselves moving away from Postgres and developing a custom database solution. How did this shift come about?

First, let's establish why Pomerium requires a storage backend at all.

The Databroker

Pomerium operates as an identity-aware access proxy, centralizing authentication and authorization for upstream applications. Its architecture comprises several key components:

  • The proxy service handles incoming requests and matches them against defined route policies.
  • The authorize service enforces access policies for these routes for every request.
  • If a route necessitates user login, requests are redirected to the authenticate service.
  • The authenticate service initiates an OIDC login with an identity provider.
  • Upon successful login, a session is established containing the user's identity details (user ID, claims, etc.).
  • This session is then stored in the databroker service, and the user is redirected back to the originating request.
  • The subsequent request is processed by the proxy service, which calls the authorize service. This service now locates the new session (via a cookie) and leverages the identity information to make precise access decisions.

The Databroker is indispensable to this process, serving as the storage location for sessions and other critical data. But was it always necessary?

Early versions of Pomerium, in fact, operated without a Databroker, storing all identity information directly within the user’s session cookie. This approach, however, presented several deficiencies:

  • Cookie Size Limitations: While user IDs and most claims are small, some claims can be quite large. For instance, users in large organizations may belong to thousands of groups, making it impractical to store extensive group ID arrays in cookies. Furthermore, some identity providers embed substantial data, like an entire avatar image, directly within claims.
  • Delayed Policy Enforcement: Although OIDC login provides initial claims, many identity providers offer additional directory data via separate APIs (e.g., Google, Microsoft Entra). This data might not be present in OIDC claims, or if it is, it's only updated during user logins. Optimal policy enforcement requires faster updates; if a user is removed from a group, access should be revoked within minutes, not hours or days. This necessitates background directory synchronization, independent of user login events.
  • External Context Data: Beyond directory sync, other sources of external context data are crucial for robust authorization decisions.
  • Lack of Session Management: Data stored in cookies is opaque; there's no way to monitor active sessions or programmatically delete them.

These limitations prompted our shift to storing sessions and other data in a Databroker — a service designed to support key-value lookups, querying, streaming synchronization, distributed leases, and change notifications.

Initially, this was built on groupcache, but caching inherently introduces volatility and potential inconsistencies. We then transitioned to Redis and ultimately to Postgres. As our data storage needs grew more sophisticated, particularly with requirements for indexing complex data like CIDR for GeoIP lookups, we narrowed our support to Postgres.

Postgres Problems

On paper, Postgres is an exceptional database. It boasts a rich feature set, including CIDR indexing, first-class JSON types, and LISTEN/NOTIFY functionality for supporting multiple Databroker instances. Its data is easily explorable for debugging, and it's widely available across diverse environments. This flexibility is vital for Pomerium, which can be deployed in various ways—from Kubernetes ingress controllers to bare-metal VMs and even air-gapped environments.

However, the reality of managing Postgres introduces its own set of challenges. Choosing Postgres as a primary database within a SaaS company often means accepting significant maintenance burdens: backups, scaling, storage management, and on-call responsibilities. Alternatively, opting for a managed Postgres instance can simplify things, but it's not without its caveats.

At scale, Pomerium can generate a substantial volume of reads and writes to a Postgres instance. Each user session must be written to the database and replicated across Pomerium instances. Directory data also requires storage and replication, which for some customers can mean tens or hundreds of thousands of users and/or groups. Managed databases are often not provisioned to handle such intense loads. They might be deployed with surprisingly modest hardware, or located in distant regions, leading to significant latency across the internet. Furthermore, shared database instances can experience unexpected slowdowns entirely unrelated to Pomerium's operations.

These problems are not insurmountable; a seasoned DBA would undoubtedly know how to configure a Postgres environment for optimal performance. Yet, DBAs are increasingly rare. Many companies lack the institutional knowledge required to effectively operate a Postgres database, leading to inevitable issues with Pomerium deployments and placing us in the position of providing expertise for infrastructure we know little about. Resolving these challenges can be time-consuming and frustrating for all parties involved.

This led us to re-evaluate our reliance on Postgres as our primary storage backend. We sought a solution that would be simpler to operate and configure while still providing all the necessary features.

Towards a File-Based Storage Backend with Automatic Failover

Beyond Postgres, Pomerium also features an in-memory backend, primarily used for testing. This in-memory version utilizes a B-Tree for data storage and supports CIDR indexing via the bart package. However, it suffered from two major deficiencies compared to the Postgres version:

  1. Lack of Persistence: Data is lost upon Pomerium restarts, leading to user logouts and missing external context data until repopulated (which can take considerable time for large directories).
  2. Single-Instance Limitation: Unlike Postgres, which supports multiple Pomerium instances, the in-memory backend can only be used by a single Databroker. While a single Databroker can serve multiple Pomerium proxies (in split-service mode), this still leaves a single point of failure.

Persistence

To address the persistence issue, we introduced a file-based storage backend built on a local key-value store. Many options exist, such as SQLite or Go-native key-value stores like bbolt or Badger. SQLite typically requires cgo (unless using modernc.org/sqlite), and given our minimal requirements (largely simple key-value pairs without complex relational data), we opted for Pebble instead.

Pebble is built upon log-structured merge-trees, a design employed in many databases. It serves as CockroachDB's storage engine, known for its high performance and rigorous testing.

The Databroker maintains a set of keyspaces, each uniquely identified by a single-byte prefix:

const (
	prefixUnusedKeySpace = iota
	prefixLeaseKeySpace
	prefixMetadataKeySpace
	prefixOptionsKeySpace
	prefixRecordChangeKeySpace
	prefixRecordChangeIndexByTypeKeySpace
	prefixRecordKeySpace
	prefixRecordIndexByTypeVersionKeySpace
	prefixRegistryServiceKeySpace
)

Each keyspace contains a sorted list of keys and values, enabling efficient data retrieval. For example, records are written to the record keyspace as follows:

// record:
//   keys: prefix-record | {recordType as bytes} | 0x00 | {recordID as bytes}
//   values: {record as proto}

Pomerium heavily relies on two primary operations: loading current values for a given record type and loading changes as they occur. Because Pebble stores keys in sorted order, both operations are highly efficient, though they involve some write amplification to support both use cases.

Due to Pomerium's design, state can always be reconstructed from other sources. Users can simply log in again to restore sessions, and directory data can be re-synchronized. This flexibility allows us to relax strict reliability requirements. The goal of our persistence is to minimize downtime rather than prevent catastrophic failure, eliminating the need for complex backups and allowing users to determine their preferred level of reliability (e.g., whether to use a Kubernetes persistent volume).

Replication

To enable multi-instance support, we adopted the Raft protocol. Raft achieves distributed consensus for multi-node clusters by ensuring that a quorum of nodes must be available for a leader to be elected.

Go offers two mature Raft implementations: hashicorp/raft and etcd-io/raft, obviating the need for us to develop our own.

However, achieving full consistency comes with a performance cost. Moreover, it provides a stronger guarantee than what we strictly require. If some writes are lost, the consequence might be a user being logged out or directory data becoming stale; otherwise, the system will self-recover. Therefore, we chose to relax the consistency guarantees.

Instead of replicating all data via the Raft state machine, we use Raft solely as a leader election mechanism. Once a leader is established, requests from any followers are forwarded to the leader. Additionally, followers replicate the state from the leader, ensuring they are prepared to take over if necessary.

Should the leader fail, a new leader will be elected, possessing the replicated state of the previous leader. While it might not be entirely up-to-date (though in practice, replication is usually fast, and writes are rarely lost), the system remains resilient.

Services utilizing the Databroker employ standard gRPC load balancing and can connect to any available node. If a service was connected to a failed leader, it will reconnect to a different node. Whether this new node is the current leader or a follower is irrelevant, as the request will be handled either way.

This architecture means that all Databroker write requests are ultimately handled by a single leader node. Is this an issue? Most of the time, no. The Databroker can manage a significant volume of requests, and Pomerium can often be scaled more effectively by segmenting workloads into separate clusters. Future enhancements may explore a multi-leader approach using key ranges.

Conclusion

With the introduction of the file-based storage backend and Databroker clustering, we believe we've achieved an optimal balance between ease of use and advanced features. Performance is at least on par with the Postgres version, while management is significantly simpler. The new file-based storage backend for the Databroker is available in Pomerium v0.31. It can be configured using the following options:

databroker_storage_type: file
databroker_storage_connection_string: file:///var/pomerium/databroker

Detailed instructions for setting up Raft can be found in the documentation.


Caleb Doxsey Software Engineer November 6, 2025