
Cephalon.Data.Postgres

Maturity: M2 · Ownership: provider-managed · Family: data-and-cdc · See audit, matrix.

Cephalon.Data.Postgres is the PostgreSQL provider-native CDC companion pack for Cephalon. It proves that the shared Cephalon.Data CDC execution/runtime catalog family also fits logical-replication streaming with slot-backed durable progress, publication/table ownership validation, and module-preserving capture ownership truth without a PostgreSQL-specific registry in Cephalon.Engine.

  • contributes configured provider-native PostgreSQL logical-replication captures through PostgresLogicalReplicationCaptureOptions and keeps those descriptors on the shared /engine/cdc-captures* catalog with provider = "postgresql" and mode = "logical-replication"
  • publishes the capability keys data.postgresql and data.relational-store, plus data.cdc.postgresql when CDC captures are configured, all introspectable at runtime through the manifest
  • publishes the provider-native CDC execution graph postgresql-logical-replication-capture-flow, hosted execution postgresql-logical-replication-capture-pump, and execution runtime postgresql-logical-replication-capture-pump when PostgreSQL CDC captures are configured
  • runs a provider-native hosted service that validates publication/table ownership, classifies logical-replication slot lifecycle posture, optionally recreates inactive invalidated slots when configured, streams one bounded logical-replication batch per configured capture, stages outbox publications, confirms replication-slot progress only after outbox stage success, and reports runtime posture through the shared ICdcCaptureRuntimeReporter surface
  • keeps durable PostgreSQL CDC checkpoints on the logical replication slot itself, using confirmed flush position as the provider-native acknowledgement boundary instead of a Cephalon-managed checkpoint table
  • preserves authored capture ownership through CdcCaptureDescriptor.SourceModuleId while surfacing metadata.contributorModuleId = "postgres-data" when the provider pack contributes the descriptor on behalf of another module

Source files:

  • Configuration/PostgresDataOptions.cs
  • Configuration/PostgresLogicalReplicationCaptureOptions.cs
  • Modules/PostgresDataModule.cs
  • Registration/PostgresDataEngineBuilderExtensions.cs
  • Services/IPostgresLogicalReplicationTransport.cs
  • Services/PostgresLogicalReplicationCaptureHostedService.cs
  • Services/PostgresLogicalReplicationCheckpointToken.cs
  • Services/PostgresLogicalReplicationExecutionRuntimeContributor.cs
  • Services/PostgresLogicalReplicationTransport.cs
  • Services/PostgresDataRuntimeIds.cs

This pack sits on top of Cephalon.Data, not in place of it. Cephalon.Data still owns the runtime-neutral CDC descriptor catalog, capture-side execution binding, shared runtime-state catalog, additive execution-runtime catalog, external runtime reporting seam, and the shared /engine/cdc-* plus snapshot truth model. Cephalon.Data.Postgres adds the PostgreSQL-specific logical-replication reader, slot/publication validation, bounded transaction batching, and provider-native hosted execution loop needed to project truthful PostgreSQL behavior into those shared surfaces.

That keeps the engine honest. The shared data-cdc-capture-pump still exists as the generic in-process runtime, but it simply ignores captures whose effective owner resolves to postgresql-logical-replication-capture-pump. The PostgreSQL pack becomes the third concrete provider-native CDC runner after MongoDB and SQL Server, and the first slot-backed logical-streaming relational runner on the same ownership/topology model.
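The ownership split described above can be pictured as a filter on the effective runtime id. The following is a minimal sketch, assuming a hypothetical CaptureBinding record (not a Cephalon type) that pairs a capture id with its resolved owner. Only the two runtime ids are taken from the text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a capture plus its resolved runtime owner.
public sealed record CaptureBinding(string CaptureId, string EffectiveRuntimeId);

public static class CaptureOwnership
{
    // Runtime ids as named in this document.
    public const string GenericPumpId = "data-cdc-capture-pump";
    public const string PostgresPumpId = "postgresql-logical-replication-capture-pump";

    // Returns only the captures whose effective owner is the given runtime,
    // so each runner processes its own captures and ignores the rest.
    public static IReadOnlyList<CaptureBinding> OwnedBy(
        IEnumerable<CaptureBinding> captures, string runtimeId) =>
        captures
            .Where(c => string.Equals(c.EffectiveRuntimeId, runtimeId, StringComparison.Ordinal))
            .ToList();
}
```

Under this assumption, the generic pump would call OwnedBy with GenericPumpId and never see a capture bound to the PostgreSQL runner.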

The slice stays intentionally scoped. Cephalon.Data.Postgres does not claim general-purpose IReadStore or IWriteStore dispatch, EF Core integration, PostgreSQL-backed outbox or inbox storage, publication management beyond validating one declared publication/table mapping, or multi-table orchestration beyond one configured capture per publication/table/slot path. If a host wants relational read/write persistence today, Cephalon.Data.EntityFramework remains the honest baseline. Cephalon.Data.Postgres only claims provider-native CDC capture plus durable slot-backed progress truth.

    engine.AddData(options =>
    {
        options.EnableCdcExecution = true;
    });

    engine.AddPostgresData(
        connectionString: "Host=localhost;Port=5432;Database=app;Username=postgres;Password=postgres",
        databaseName: "app",
        configure: options =>
        {
            options.CdcCaptures.Add(new PostgresLogicalReplicationCaptureOptions
            {
                Id = "orders-cdc",
                DisplayName = "Orders CDC",
                SourceModuleId = "orders",
                PublicationName = "cephalon_orders_publication",
                SlotName = "cephalon_orders_slot",
                TableSchema = "public",
                TableName = "orders",
                OutboxId = "tenant-event-outbox",
                ChannelId = "orders",
                MessageType = "OrdersChanged"
            });
        });

For configuration-driven hosts, prefer the options overload and bind from Engine:Data:Postgres:

    engine.AddData(options =>
    {
        options.EnableCdcExecution = true;
    });

    engine.AddPostgresData(options =>
    {
        configuration.GetSection(PostgresDataOptions.SectionPath).Bind(options);
        options.ConnectionStringName ??= "Postgres";
        options.DatabaseName = "app";
    });

Matching configuration:

    {
      "ConnectionStrings": {
        "Postgres": "Host=localhost;Port=5432;Database=app;Username=postgres;Password=postgres"
      },
      "Engine": {
        "Data": {
          "Postgres": {
            "ConnectionStringName": "Postgres",
            "DatabaseName": "app",
            "CdcCaptures": [
              {
                "Id": "orders-cdc",
                "DisplayName": "Orders CDC",
                "SourceModuleId": "orders",
                "PublicationName": "cephalon_orders_publication",
                "SlotName": "cephalon_orders_slot",
                "TableSchema": "public",
                "TableName": "orders",
                "OutboxId": "tenant-event-outbox",
                "ChannelId": "orders",
                "MessageType": "OrdersChanged",
                "InitialPosition": "slot-consistent-point",
                "CreateSlotIfMissing": true,
                "RecreateSlotIfInvalidated": false,
                "MaxChangesPerRead": 128,
                "MaxAwaitTimeSeconds": 5,
                "PollingIntervalSeconds": 5
              }
            ]
          }
        }
      }
    }

ConnectionStringName and ConnectionString are mutually exclusive. If both are set, the pack fails fast during service resolution. If neither is set, the PostgreSQL provider cannot start.
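The mutual-exclusion rule above can be sketched as a small resolver. This is an illustration only, not the pack's implementation: the PostgresConnectionResolution class, its Resolve method, and the dictionary standing in for the root ConnectionStrings section are all hypothetical, and the exception types are assumptions.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class PostgresConnectionResolution
{
    // Hypothetical helper: 'named' stands in for the root ConnectionStrings section.
    public static string Resolve(
        string? connectionStringName,
        string? connectionString,
        IReadOnlyDictionary<string, string> named)
    {
        bool hasName = !string.IsNullOrWhiteSpace(connectionStringName);
        bool hasInline = !string.IsNullOrWhiteSpace(connectionString);

        // Both set: ambiguous, so fail fast instead of guessing a winner.
        if (hasName && hasInline)
            throw new InvalidOperationException(
                "Set either ConnectionStringName or ConnectionString, not both.");

        // Neither set: the provider has nothing to connect to.
        if (!hasName && !hasInline)
            throw new InvalidOperationException(
                "Set ConnectionStringName or ConnectionString.");

        if (hasInline)
            return connectionString!;

        // Named lookup: a missing or empty entry is also a startup failure.
        if (!named.TryGetValue(connectionStringName!, out var resolved)
            || string.IsNullOrWhiteSpace(resolved))
            throw new InvalidOperationException(
                $"Connection string '{connectionStringName}' was not found.");

        return resolved;
    }
}
```

Failing at resolution time rather than at first use keeps a misconfigured host from starting with a half-working capture loop.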

Configuration options (Engine:Data:Postgres)

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| ConnectionStringName | string? | null | Root ConnectionStrings key to resolve for PostgreSQL |
| ConnectionString | string? | null | Inline PostgreSQL connection string |
| DatabaseName | string | required | Operator-facing database name for the configured logical-replication captures |
| CdcCaptures | PostgresLogicalReplicationCaptureOptions[] | [] | Contribute provider-native PostgreSQL logical-replication captures |

PostgreSQL capture options (Engine:Data:Postgres:CdcCaptures[])

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| Id | string | required | Stable CDC capture id |
| DisplayName | string | Id | Operator-facing capture name |
| Description | string | generated | Human-readable capture description |
| SourceModuleId | string | required | Module id that owns the capture surface |
| SourceId | string | {provider}:{database}/{schema}.{table} | Logical upstream source id when it should differ from the watched table path |
| PublicationName | string | required | PostgreSQL publication that must publish the tracked table |
| SlotName | string | required | PostgreSQL logical replication slot used for durable progress |
| TableSchema | string | "public" | Schema name of the tracked table |
| TableName | string | required | Table name of the tracked table |
| OutboxId | string | required | Linked outbox id that receives publications |
| ChannelId | string | required | Logical outbox channel id for emitted publications |
| MessageType | string | required | Logical message type for emitted publications |
| EventFormat | string | "postgresql-logical-replication-event" | Operator-facing event format projected on the descriptor |
| InitialPosition | string | "slot-consistent-point" | Initial position when the slot must be created; supported values are slot-consistent-point and latest-available |
| CreateSlotIfMissing | bool | true | Creates the logical replication slot when it does not exist yet |
| RecreateSlotIfInvalidated | bool | false | Drops and recreates the logical replication slot when PostgreSQL reports that the existing slot is invalidated and inactive |
| MaxChangesPerRead | int | 128 | Maximum number of captured changes to stage during one provider-native iteration |
| MaxAwaitTimeSeconds | int | 5 | Maximum number of seconds to await committed WAL messages during one provider-native iteration |
| PollingIntervalSeconds | int | 5 | Polling interval in seconds for one provider-native loop iteration |
| ResourceIds | string[] | ["{database}.{schema}.{table}"] | Explicit logical resources observed by the capture |
| Tags | string[] | ["cdc", "postgresql", "provider-native"] | Operator-facing tags on the descriptor |
| Metadata | Dictionary<string,string> | {} | Additional operator-facing metadata merged onto the descriptor |
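Several of the capture options above interact: SourceId has a derived default, InitialPosition accepts two values, and the two slot flags decide what happens to a missing or invalidated slot. The sketch below illustrates one plausible reading of those rules. CaptureOptionRules and SlotAction are hypothetical names, "postgresql" as the {provider} value is taken from the provider field used elsewhere in this document, and failing when a slot is missing with CreateSlotIfMissing = false is an assumption.

```csharp
#nullable enable
using System;

// Hypothetical outcome of inspecting a slot before streaming starts.
public enum SlotAction { UseExisting, Create, Recreate, Fail }

public static class CaptureOptionRules
{
    // Default SourceId pattern from the table: {provider}:{database}/{schema}.{table}
    public static string DefaultSourceId(string database, string schema, string table) =>
        $"postgresql:{database}/{schema}.{table}";

    // Only the two documented values are accepted; null or blank falls back to the default.
    public static string NormalizeInitialPosition(string? value)
    {
        var v = string.IsNullOrWhiteSpace(value) ? "slot-consistent-point" : value.Trim();
        return v is "slot-consistent-point" or "latest-available"
            ? v
            : throw new ArgumentException($"Unsupported InitialPosition '{v}'.", nameof(value));
    }

    // Assumed decision table for the slot flags.
    public static SlotAction DecideSlotAction(
        bool exists, bool invalidated, bool active,
        bool createSlotIfMissing, bool recreateSlotIfInvalidated)
    {
        if (!exists)
            return createSlotIfMissing ? SlotAction.Create : SlotAction.Fail;

        if (invalidated)
            // Recreation applies only to slots that are invalidated AND inactive.
            return recreateSlotIfInvalidated && !active ? SlotAction.Recreate : SlotAction.Fail;

        return SlotAction.UseExisting;
    }
}
```

With the defaults (CreateSlotIfMissing = true, RecreateSlotIfInvalidated = false), this reading creates a missing slot but refuses to touch an invalidated one, which leaves the destructive path opt-in.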

When CdcCaptures are configured:

  • each capture is published through /engine/cdc-captures* with provider = "postgresql", mode = "logical-replication", and an executionBinding whose authored and requested runtime id is postgresql-logical-replication-capture-pump
  • the execution runtime is published through /engine/cdc-capture-runtimes* and snapshot.CdcCaptureExecutionRuntimes with executionOwnership = host-managed, executionTopology = provider-native, and acknowledgementMode = provider-native
  • the same runtime publishes through /engine/execution-graphs, /engine/hosted-executions, /engine/runtime-story, and snapshot under postgresql-logical-replication-capture-flow plus postgresql-logical-replication-capture-pump
  • the hosted runner validates that the declared publication publishes the declared table, validates slot type/plugin/database ownership, optionally creates the logical replication slot, can drop and recreate an inactive invalidated slot when RecreateSlotIfInvalidated = true, reads one bounded committed pgoutput batch per iteration, stages one outbox message per captured change, and confirms slot progress only after the linked outbox has accepted the batch
  • each staged outbox message uses deterministic id {commitLsn}:{ordinal}, content type application/vnd.cephalon.postgresql.logical-replication+json, headers for provider, cdcCaptureId, databaseName, schemaName, tableName, publicationName, slotName, and operation, plus metadata for sourceId, eventFormat, and checkpointToken
  • the runtime-state surface keeps typed freshness, lag, pending-publication posture, checkpoint, change id, last operation type, reporter metadata, failure-kind metadata, and slot lifecycle metadata such as slotLifecycleState, slotLifecycleAction, slotResumeMode, slotRestartLsn, slotConfirmedFlushLsn, slotWalStatus, and slotInvalidationReason on the same /engine/cdc-captures/runtime* catalog instead of inventing a PostgreSQL-specific monitor
  • durable checkpoint tokens are serialized as slotName|commitLsn|transactionEndLsn, runtime metadata keeps replicationCheckpointSource = "slot-confirmed-flush-lsn", and restart or resume posture stays grounded in the slot’s confirmed flush position instead of a Cephalon-managed checkpoint table
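The acknowledgement order in the list above (stage to the outbox first, confirm the slot second) is the part that is easiest to get wrong, so here is a minimal sketch of one iteration. Everything except the two string formats is hypothetical: CapturedChange, CaptureIteration, and the stage and confirmFlush delegates are stand-ins, not the pack's API. LSNs are kept as opaque strings, and confirming the TransactionEndLsn of the last change in the batch is an assumption.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical shape of one decoded change inside a committed transaction.
public sealed record CapturedChange(
    string CommitLsn, string TransactionEndLsn, int Ordinal, string Operation);

public static class CaptureIteration
{
    // Deterministic outbox message id: {commitLsn}:{ordinal}
    public static string MessageId(CapturedChange change) =>
        $"{change.CommitLsn}:{change.Ordinal}";

    // Durable checkpoint token: slotName|commitLsn|transactionEndLsn
    public static string CheckpointToken(string slotName, CapturedChange change) =>
        $"{slotName}|{change.CommitLsn}|{change.TransactionEndLsn}";

    // Runs one bounded iteration. Returns the checkpoint token of the last
    // change, or null when the batch is empty.
    public static string? Run(
        string slotName,
        IReadOnlyList<CapturedChange> batch,
        Action<IReadOnlyList<(string Id, CapturedChange Change)>> stage,
        Action<string> confirmFlush)
    {
        if (batch.Count == 0)
            return null;

        var messages = new List<(string Id, CapturedChange Change)>(batch.Count);
        foreach (var change in batch)
            messages.Add((MessageId(change), change));

        // If staging throws, confirmFlush is never reached: the slot keeps its
        // previous confirmed position and the same batch is read again later.
        stage(messages);

        var last = batch[batch.Count - 1];
        confirmFlush(last.TransactionEndLsn);
        return CheckpointToken(slotName, last);
    }
}
```

Because the ids are derived from the commit LSN and ordinal, a batch that is re-read after a failed iteration produces the same ids again, which presumably lets the outbox side recognise repeats.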

When PostgresDataModule is active, the following capability keys appear in the runtime manifest:

| Capability key | When registered |
| --- | --- |
| data.postgresql | Always |
| data.relational-store | Always |
| data.cdc.postgresql | CdcCaptures.Count > 0 |
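The registration rule in the table reduces to one conditional. A minimal sketch, assuming a hypothetical PostgresCapabilities helper (the real module presumably registers these keys through the manifest rather than returning a list):

```csharp
using System.Collections.Generic;

public static class PostgresCapabilities
{
    // Returns the capability keys expected for a given number of configured captures.
    public static IReadOnlyList<string> For(int cdcCaptureCount)
    {
        // Always registered while the module is active.
        var keys = new List<string> { "data.postgresql", "data.relational-store" };

        // Registered only when at least one CDC capture is configured.
        if (cdcCaptureCount > 0)
            keys.Add("data.cdc.postgresql");

        return keys;
    }
}
```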

This pack intentionally still does not claim:

  • IReadStore / IWriteStore dispatch backed directly by PostgreSQL
  • Entity Framework or DbContext integration
  • PostgreSQL-backed IOutbox, IInbox, or event-dispatch storage
  • publication or table creation/migration orchestration beyond validating one declared publication/table path and optionally recreating an inactive invalidated slot
  • logical decoding plugins beyond the shipped pgoutput path
  • multi-table slot orchestration, fan-out, or low-code generation beyond one configured capture per table path
  • automatic slot failover, multi-consumer lease orchestration, or external edge execution ownership beyond the shared /engine/cdc-* topology surfaces

These remain later slices so the current provider claim stays truthful.