Database

Database replication

An introduction to database replication and change data capture.


Replication is the process of copying changes from your database to another location. It's also referred to as change data capture (CDC): capturing all the changes that occur to your data.

Use cases#

You might use database replication for:

  • Analytics and data warehousing: Replicate your operational database to analytics platforms for complex analysis without impacting your application's performance.
  • Data integration: Keep your data synchronized across different systems and services in your tech stack.
  • Backup and disaster recovery: Maintain up-to-date copies of your data in different locations.

Replication methods#

Supabase supports three replication methods. Choose based on whether you need another Supabase Postgres database, a managed pipeline to an external system, or full control over your own logical replication setup.

Read replicas#

Read replicas are additional Supabase Postgres databases kept in sync with your primary database. Use them when you want read-only query capacity, lower latency in another region, or to isolate analytical reads from application writes while staying inside Supabase Postgres.

External replication#

External replication is powered by Supabase ETL. It uses Postgres logical replication under the hood and provides a managed Dashboard workflow for replicating data from Supabase Postgres to external data systems.

Supported destinations#

External replication currently supports BigQuery as the managed destination. We are working on new destinations, and this table will be updated as support expands.

DestinationInsertUpdateDeleteTruncateSchema changeDescription
BigQuery✅ Supported✅ Supported✅ Supported✅ Supported✅ SupportedManaged replication to Google BigQuery for analytics and reporting.

Manual replication#

Manual replication uses the same underlying Postgres logical replication features as external replication, but you configure and operate the pieces yourself. Use this path when you want to connect tools such as Airbyte, Estuary, Fivetran, Materialize, Stitch, AWS DMS, or another system that supports Postgres logical replication.

For realtime features and syncing data to clients (browsers, mobile apps), see Realtime.

Realtime also uses Postgres changes, but it is intended for broadcasting database updates to clients rather than maintaining a copy of your database in another system.

Concepts and terms#

Write-Ahead Log (WAL)#

Postgres uses a system called the Write-Ahead Log (WAL) to manage changes to the database. As you make changes, they are appended to the WAL, which is a series of files (also called "segments") where the file size can be specified. Once one segment is full, Postgres will start appending to a new segment. After a period of time, a checkpoint occurs and Postgres synchronizes the WAL with your database. Once the checkpoint is complete, then the WAL files can be removed from disk and free up space.

Logical replication and WAL#

Logical replication is a method of replication where Postgres uses WAL files to transmit changes to another Postgres database, or to a system that supports reading WAL files.

LSN#

LSN is a Log Sequence Number that identifies a position in the WAL. It is often used to determine the progress of replication in subscribers and calculate the lag of a replication slot.

Logical replication architecture#

When setting up logical replication, three key components are involved:

  • publication - A set of tables on your primary database that will be published
  • replication slot - A slot used for replicating the data from a single publication. The slot, when created, will specify the output format of the changes
  • subscription - A subscription is created from an external system (i.e. another Postgres database) and must specify the name of the publication. If you do not specify a replication slot, one is automatically created

Logical replication output format#

Logical replication is typically output in two forms, pgoutput and wal2json. The output method is how Postgres sends changes to any active replication slot.

Logical replication configuration#

When using logical replication, Postgres keeps WAL files around for longer than it otherwise needs them. If the files are removed too quickly, then your replication slot can become inactive or lost if the database receives a large number of changes in a short time.

In order to mitigate this, Postgres has many options and settings that can be tweaked to manage the WAL usage effectively. Not all of these settings are user configurable as they can impact the stability of your database. For those that are, these should be considered as advanced configuration and not changed without understanding that they can cause additional disk space and resources to be used, as well as incur additional costs.

SettingDescriptionUser-facingDefault
max_replication_slotsMax count of replication slots allowedNo
wal_keep_sizeMinimum size of WAL files to keep for replicationNo
max_slot_wal_keep_sizeMax WAL size that can be reserved by replication slotsNo
checkpoint_timeoutMax time between WAL checkpointsNo