Package io.debezium.relational
Class RelationalSnapshotChangeEventSource<P extends Partition,O extends OffsetContext>
java.lang.Object
io.debezium.pipeline.source.AbstractSnapshotChangeEventSource<P,O>
io.debezium.relational.RelationalSnapshotChangeEventSource<P,O>
- All Implemented Interfaces:
ChangeEventSource, SnapshotChangeEventSource<P,O>, AutoCloseable
public abstract class RelationalSnapshotChangeEventSource<P extends Partition,O extends OffsetContext>
extends AbstractSnapshotChangeEventSource<P,O>
Base class for SnapshotChangeEventSource implementations for relational databases with or without a schema history.
A transaction is managed by this base class; subclasses should not roll back or commit this transaction. They are free to use nested transactions or savepoints, though.
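To illustrate the savepoint option mentioned above, here is a minimal sketch in plain JDBC. `SavepointSketch`, `SqlWork`, and `runInSavepoint` are hypothetical names that are not part of Debezium; only `Connection.setSavepoint()` and `Connection.rollback(Savepoint)` are standard `java.sql` calls. The sketch assumes auto-commit is already off on the connection, as it would be inside a transaction owned by someone else.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Savepoint;

/** Hypothetical helper for illustration only; not part of Debezium. */
final class SavepointSketch {

    /** A unit of work that may fail with an SQLException. */
    interface SqlWork {
        void run(Connection connection) throws SQLException;
    }

    /**
     * Runs the work after setting a savepoint on a connection whose outer
     * transaction is owned by the caller. If the work fails, only the changes
     * made since the savepoint are undone. The outer transaction is never
     * committed or rolled back here.
     *
     * @return true if the work succeeded, false if it was rolled back to the savepoint
     */
    static boolean runInSavepoint(Connection connection, SqlWork work) throws SQLException {
        Savepoint savepoint = connection.setSavepoint();
        try {
            work.run(connection);
            return true;
        }
        catch (SQLException e) {
            // Partial rollback: the enclosing transaction stays open.
            connection.rollback(savepoint);
            return false;
        }
    }
}
```

The helper deliberately never calls `commit()` or the no-argument `rollback()`, since those would end the transaction that the base class manages.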
- Author:
- Gunnar Morling
-
Nested Class Summary
Nested Classes
Modifier and Type / Class / Description
static class RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P extends Partition,O extends OffsetContext>
    Mutable context which is populated in the course of snapshotting.
Nested classes/interfaces inherited from class io.debezium.pipeline.source.AbstractSnapshotChangeEventSource
AbstractSnapshotChangeEventSource.SnapshotContext<P extends Partition,O extends OffsetContext>, AbstractSnapshotChangeEventSource.SnapshottingTask
Nested classes/interfaces inherited from interface io.debezium.pipeline.source.spi.ChangeEventSource
ChangeEventSource.ChangeEventSourceContext
-
Field Summary
Fields
Modifier and Type / Field / Description
protected final Clock clock
private final RelationalDatabaseConnectorConfig connectorConfig
protected final EventDispatcher<P,TableId> dispatcher
private final JdbcConnection jdbcConnection
private final MainConnectionProvidingConnectionFactory<? extends JdbcConnection> jdbcConnectionFactory
private static final org.slf4j.Logger LOGGER
private final RelationalDatabaseSchema schema
static final Pattern SELECT_ALL_PATTERN
private final SnapshotProgressListener<P> snapshotProgressListener
Fields inherited from class io.debezium.pipeline.source.AbstractSnapshotChangeEventSource
LOG_INTERVAL
-
Constructor Summary
Constructors
Constructor / Description
RelationalSnapshotChangeEventSource(RelationalDatabaseConnectorConfig connectorConfig, MainConnectionProvidingConnectionFactory<? extends JdbcConnection> jdbcConnectionFactory, RelationalDatabaseSchema schema, EventDispatcher<P, TableId> dispatcher, Clock clock, SnapshotProgressListener<P> snapshotProgressListener)
-
Method Summary
Modifier and Type / Method / Description
protected boolean additionalColumnFilter(P partition, TableId tableId, String columnName)
    Additional filter handling for preparing column names for the snapshot select.
addSignalingCollectionAndSort(Set<TableId> capturedTables)
protected void connectionCreated(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
    Executes steps which have to be taken just after the database connection is created.
protected abstract O copyOffset(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
private Queue<JdbcConnection> createConnectionPool(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> ctx)
private void createDataEvents(ChangeEventSource.ChangeEventSourceContext sourceContext, RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, Queue<JdbcConnection> connectionPool)
private Callable<Void> createDataEventsForTableCallable(ChangeEventSource.ChangeEventSourceContext sourceContext, RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, EventDispatcher.SnapshotReceiver<P> snapshotReceiver, Table table, boolean firstTable, boolean lastTable, int tableOrder, int tableCount, String selectStatement, OptionalLong rowCount, Queue<JdbcConnection> connectionPool, Queue<O> offsets)
protected void createSchemaChangeEventsForTables(ChangeEventSource.ChangeEventSourceContext sourceContext, RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, AbstractSnapshotChangeEventSource.SnapshottingTask snapshottingTask)
private void determineCapturedTables(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> ctx)
protected abstract void determineSnapshotOffset(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, O previousOffset)
    Determines the current offset (MySQL binlog position, Oracle SCN etc.), storing it into the passed context object.
private Optional<String> determineSnapshotSelect(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, TableId tableId)
    Returns a valid query string for the specified table, either given by the user via snapshot select overrides or defaulting to a statement provided by the DB-specific change event source.
private void doCreateDataEventsForTable(ChangeEventSource.ChangeEventSourceContext sourceContext, P partition, O offset, EventDispatcher.SnapshotReceiver<P> snapshotReceiver, Table table, boolean firstTable, boolean lastTable, int tableOrder, int tableCount, String selectStatement, OptionalLong rowCount, JdbcConnection jdbcConnection)
public SnapshotResult<O> doExecute(ChangeEventSource.ChangeEventSourceContext context, O previousOffset, AbstractSnapshotChangeEventSource.SnapshotContext<P, O> snapshotContext, AbstractSnapshotChangeEventSource.SnapshottingTask snapshottingTask)
    Executes this source.
protected String enhanceOverriddenSelect(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, String overriddenSelect, TableId tableId)
    This method is overridden for Oracle to implement the "as of SCN" predicate.
protected abstract Set<TableId> getAllTableIds(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
    Returns all candidate tables; the current filter configuration will be applied to the result set, resulting in the effective set of captured tables.
protected ChangeRecordEmitter<P> getChangeRecordEmitter(P partition, O offset, TableId tableId, Object[] row, Instant timestamp)
    Returns a ChangeRecordEmitter producing the change records for the given table row.
protected Clock getClock()
protected abstract SchemaChangeEvent getCreateTableEvent(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, Table table)
    Creates a SchemaChangeEvent representing the creation of the given table.
getPreparedColumnNames(P partition, Table table)
    Prepares a list of columns to be used in the snapshot select.
getSnapshotConnectionFirstSelect(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, TableId tableId)
protected abstract Optional<String> getSnapshotSelect(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, TableId tableId, List<String> columns)
    Returns the SELECT statement to be used for scanning the given table, or an empty value if the table will be streamed from but not snapshotted.
protected String getSnapshotSelectOverridesByTable(TableId tableId)
protected Instant getSnapshotSourceTimestamp(JdbcConnection jdbcConnection, O offset, TableId tableId)
    Gets the source.ts_ms value from the database for the snapshot data of the given table. For PostgreSQL it is globally static for all tables, since the PostgreSQL snapshot process turns auto-commit off.
private Threads.Timer getTableScanLogTimer
protected Collection<TableId> getTablesForSchemaChange(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
protected void lastSnapshotRecord(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
protected abstract void lockTablesForSchemaSnapshot(ChangeEventSource.ChangeEventSourceContext sourceContext, RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
    Locks all tables to be captured, so that no concurrent schema changes can be applied to them.
protected void postSnapshot
protected Statement readTableStatement(JdbcConnection jdbcConnection, OptionalLong tableSize)
    Allow per-connector query creation to override for best database performance depending on the table size.
protected abstract void readTableStructure(ChangeEventSource.ChangeEventSourceContext sourceContext, RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, O offsetContext)
    Reads the structure of all the captured tables, writing it to RelationalSnapshotChangeEventSource.RelationalSnapshotContext.tables.
protected void releaseDataSnapshotLocks(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
    Releases all locks established in order to create a consistent data snapshot.
protected abstract void releaseSchemaSnapshotLocks(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
    Releases all locks established in order to create a consistent schema snapshot.
private void rollbackTransaction(Connection connection)
protected OptionalLong rowCountForTable(TableId tableId)
    Returns a statistics-based number of records for the given table, if the connector is able to provide one.
private void setSnapshotMarker(OffsetContext offset, boolean firstTable, boolean lastTable, boolean firstRecordInTable, boolean lastRecordInTable)
toTableIds(Set<TableId> tableIds, Pattern pattern)
protected void tryStartingSnapshot(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
Methods inherited from class io.debezium.pipeline.source.AbstractSnapshotChangeEventSource
aborted, close, completed, delaySnapshotIfNeeded, determineDataCollectionsToBeSnapshotted, execute, getSnapshottingTask, prepare
-
Field Details
-
LOGGER
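private static final org.slf4j.Logger LOGGER
-
SELECT_ALL_PATTERN
static final Pattern SELECT_ALL_PATTERN
-
connectorConfig
private final RelationalDatabaseConnectorConfig connectorConfig
-
jdbcConnection
private final JdbcConnection jdbcConnection
-
jdbcConnectionFactory
private final MainConnectionProvidingConnectionFactory<? extends JdbcConnection> jdbcConnectionFactory
-
schema
private final RelationalDatabaseSchema schema
-
dispatcher
protected final EventDispatcher<P,TableId> dispatcher
-
clock
protected final Clock clock
-
snapshotProgressListener
private final SnapshotProgressListener<P> snapshotProgressListener
-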
-
Constructor Details
-
RelationalSnapshotChangeEventSource
public RelationalSnapshotChangeEventSource(RelationalDatabaseConnectorConfig connectorConfig, MainConnectionProvidingConnectionFactory<? extends JdbcConnection> jdbcConnectionFactory, RelationalDatabaseSchema schema, EventDispatcher<P, TableId> dispatcher, Clock clock, SnapshotProgressListener<P> snapshotProgressListener)
-
-
Method Details
-
doExecute
public SnapshotResult<O> doExecute(ChangeEventSource.ChangeEventSourceContext context, O previousOffset, AbstractSnapshotChangeEventSource.SnapshotContext<P, O> snapshotContext, AbstractSnapshotChangeEventSource.SnapshottingTask snapshottingTask) throws Exception
Description copied from class: AbstractSnapshotChangeEventSource
Executes this source. Implementations should regularly check via the given context if they should stop. If that's the case, they should abort their processing and perform any clean-up needed, such as rolling back pending transactions, releasing locks, etc.
- Specified by:
doExecute in class AbstractSnapshotChangeEventSource<P extends Partition,O extends OffsetContext>
- Parameters:
context - contextual information for this source's execution
previousOffset - previous offset restored from Kafka
snapshotContext - mutable context information populated throughout the snapshot process
snapshottingTask - immutable information about what tasks should be performed during snapshot
- Returns:
- an indicator to the position at which the snapshot was taken
- Throws:
Exception
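The "regularly check if they should stop" contract described for doExecute can be sketched as a cooperative loop. This is an illustration only: `StopCheckSketch` and `processWhileRunning` are hypothetical names, the `BooleanSupplier` stands in for whatever running check the context object offers, and the `Runnable` stands in for clean-up such as rolling back pending transactions or releasing locks.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a cooperative stop check; not Debezium code. */
final class StopCheckSketch {

    /**
     * Processes rows one at a time, checking the running flag before each row.
     * When the flag turns false, clean-up runs once and processing is aborted.
     *
     * @return the rows that were processed before completion or abort
     */
    static List<String> processWhileRunning(Iterator<String> rows, BooleanSupplier isRunning, Runnable cleanUp) {
        List<String> processed = new ArrayList<>();
        while (rows.hasNext()) {
            if (!isRunning.getAsBoolean()) {
                // Stop requested: clean up and abort instead of finishing the scan.
                cleanUp.run();
                break;
            }
            processed.add(rows.next());
        }
        return processed;
    }
}
```

Checking before every row keeps the reaction time short; a real implementation might check less often if the check itself is costly.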
-
createConnectionPool
private Queue<JdbcConnection> createConnectionPool(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> ctx) throws SQLException
- Throws:
SQLException
-
createSnapshotConnection
- Throws:
SQLException
-
connectionCreated
protected void connectionCreated(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext) throws Exception
Executes steps which have to be taken just after the database connection is created.
- Throws:
Exception
-
toTableIds
-
addSignalingCollectionAndSort
- Throws:
Exception
-
determineCapturedTables
private void determineCapturedTables(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> ctx) throws Exception
- Throws:
Exception
-
getAllTableIds
protected abstract Set<TableId> getAllTableIds(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext) throws Exception
Returns all candidate tables; the current filter configuration will be applied to the result set, resulting in the effective set of captured tables.
- Throws:
Exception
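The relationship between candidate tables and captured tables described for getAllTableIds can be sketched as a plain filter step. This is a simplified illustration: `CapturedTablesSketch` and `capturedTables` are hypothetical names, table identifiers are modelled as strings rather than `TableId`, and the predicate stands in for the connector's filter configuration.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch of deriving captured tables from candidates; not Debezium code. */
final class CapturedTablesSketch {

    /**
     * Applies the table filter to all candidate tables and returns the
     * effective set of captured tables, preserving the candidates' iteration order.
     */
    static Set<String> capturedTables(Set<String> candidates, Predicate<String> tableFilter) {
        Set<String> captured = new LinkedHashSet<>();
        for (String tableId : candidates) {
            if (tableFilter.test(tableId)) {
                captured.add(tableId);
            }
        }
        return captured;
    }
}
```

Under this split, the database-specific part only has to enumerate tables; the decision about which of them to capture stays in one shared place.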
-
lockTablesForSchemaSnapshot
protected abstract void lockTablesForSchemaSnapshot(ChangeEventSource.ChangeEventSourceContext sourceContext, RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext) throws Exception
Locks all tables to be captured, so that no concurrent schema changes can be applied to them.
- Throws:
Exception
-
determineSnapshotOffset
protected abstract void determineSnapshotOffset(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, O previousOffset) throws Exception
Determines the current offset (MySQL binlog position, Oracle SCN etc.), storing it into the passed context object. Subsequently, the DB's schema (and data) will be read at this position. Once the snapshot is completed, a StreamingChangeEventSource will be set up with this initial position to continue with stream reading from there.
- Throws:
Exception
-
readTableStructure
protected abstract void readTableStructure(ChangeEventSource.ChangeEventSourceContext sourceContext, RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, O offsetContext) throws Exception
Reads the structure of all the captured tables, writing it to RelationalSnapshotChangeEventSource.RelationalSnapshotContext.tables.
- Throws:
Exception
-
releaseSchemaSnapshotLocks
protected abstract void releaseSchemaSnapshotLocks(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext) throws Exception
Releases all locks established in order to create a consistent schema snapshot.
- Throws:
Exception
-
releaseDataSnapshotLocks
protected void releaseDataSnapshotLocks(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext) throws Exception
Releases all locks established in order to create a consistent data snapshot.
- Throws:
Exception
-
createSchemaChangeEventsForTables
protected void createSchemaChangeEventsForTables(ChangeEventSource.ChangeEventSourceContext sourceContext, RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, AbstractSnapshotChangeEventSource.SnapshottingTask snapshottingTask) throws Exception
- Throws:
Exception
-
getTablesForSchemaChange
protected Collection<TableId> getTablesForSchemaChange(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
-
getCreateTableEvent
protected abstract SchemaChangeEvent getCreateTableEvent(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, Table table) throws Exception
Creates a SchemaChangeEvent representing the creation of the given table.
- Throws:
Exception
-
createDataEvents
private void createDataEvents(ChangeEventSource.ChangeEventSourceContext sourceContext, RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, Queue<JdbcConnection> connectionPool) throws Exception
- Throws:
Exception
-
copyOffset
protected abstract O copyOffset(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
-
tryStartingSnapshot
protected void tryStartingSnapshot(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
-
getSnapshotSourceTimestamp
protected Instant getSnapshotSourceTimestamp(JdbcConnection jdbcConnection, O offset, TableId tableId)
Gets the source.ts_ms value from the database for the snapshot data of the given table. For PostgreSQL it is globally static for all tables, since the PostgreSQL snapshot process turns auto-commit off. For MySQL it is static per table and might be about a second behind the start timestamp of the select statement.
-
createDataEventsForTableCallable
private Callable<Void> createDataEventsForTableCallable(ChangeEventSource.ChangeEventSourceContext sourceContext, RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, EventDispatcher.SnapshotReceiver<P> snapshotReceiver, Table table, boolean firstTable, boolean lastTable, int tableOrder, int tableCount, String selectStatement, OptionalLong rowCount, Queue<JdbcConnection> connectionPool, Queue<O> offsets)
-
doCreateDataEventsForTable
private void doCreateDataEventsForTable(ChangeEventSource.ChangeEventSourceContext sourceContext, P partition, O offset, EventDispatcher.SnapshotReceiver<P> snapshotReceiver, Table table, boolean firstTable, boolean lastTable, int tableOrder, int tableCount, String selectStatement, OptionalLong rowCount, JdbcConnection jdbcConnection) throws InterruptedException
- Throws:
InterruptedException
-
setSnapshotMarker
private void setSnapshotMarker(OffsetContext offset, boolean firstTable, boolean lastTable, boolean firstRecordInTable, boolean lastRecordInTable)
-
lastSnapshotRecord
protected void lastSnapshotRecord(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext)
-
rowCountForTable
protected OptionalLong rowCountForTable(TableId tableId)
Returns a statistics-based number of records for the given table, if the connector is able to provide one.
-
getTableScanLogTimer
-
getChangeRecordEmitter
protected ChangeRecordEmitter<P> getChangeRecordEmitter(P partition, O offset, TableId tableId, Object[] row, Instant timestamp)
Returns a ChangeRecordEmitter producing the change records for the given table row.
-
determineSnapshotSelect
private Optional<String> determineSnapshotSelect(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, TableId tableId)
Returns a valid query string for the specified table, either given by the user via snapshot select overrides or defaulting to a statement provided by the DB-specific change event source.
- Parameters:
tableId - the table to generate a query for
- Returns:
- a valid query string or empty if table will not be snapshotted
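The override-or-default lookup described for determineSnapshotSelect can be sketched as follows. This is a simplified illustration, not the actual implementation: `SnapshotSelectSketch` and `determineSelect` are hypothetical names, table identifiers are strings, the map stands in for the user's snapshot select overrides, and the function stands in for the DB-specific default statement.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch of choosing a snapshot select; not Debezium code. */
final class SnapshotSelectSketch {

    /**
     * Prefers a user-supplied override for the table. Without an override, the
     * connector-specific default is used, which may be empty when the table
     * should not be snapshotted.
     */
    static Optional<String> determineSelect(String tableId,
                                            Map<String, String> overrides,
                                            Function<String, Optional<String>> defaultSelect) {
        String override = overrides.get(tableId);
        if (override != null) {
            return Optional.of(override);
        }
        return defaultSelect.apply(tableId);
    }
}
```

Returning `Optional` rather than `null` makes the "not snapshotted" case explicit for the caller.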
-
getSnapshotSelectOverridesByTable
-
getPreparedColumnNames
Prepares a list of columns to be used in the snapshot select. The selected columns are based on the column include/exclude filters; if all columns are excluded, the list will contain all the primary key columns.
- Returns:
- list of snapshot select columns
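The primary-key fallback described for getPreparedColumnNames can be sketched like this. It is an illustration under simplifying assumptions: `PreparedColumnsSketch` and `preparedColumns` are hypothetical names, columns are plain strings, and a single predicate stands in for the combined include/exclude filters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical sketch of preparing snapshot select columns; not Debezium code. */
final class PreparedColumnsSketch {

    /**
     * Keeps the columns accepted by the filter. If the filter rejects every
     * column, falls back to the primary key columns so the select is never empty.
     */
    static List<String> preparedColumns(List<String> columns,
                                        List<String> primaryKeyColumns,
                                        Predicate<String> columnFilter) {
        List<String> selected = columns.stream()
                .filter(columnFilter)
                .collect(Collectors.toList());
        return selected.isEmpty() ? new ArrayList<>(primaryKeyColumns) : selected;
    }
}
```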
-
additionalColumnFilter
protected boolean additionalColumnFilter(P partition, TableId tableId, String columnName)
Additional filter handling for preparing column names for the snapshot select.
-
enhanceOverriddenSelect
protected String enhanceOverriddenSelect(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, String overriddenSelect, TableId tableId)
This method is overridden for Oracle to implement the "as of SCN" predicate.
- Parameters:
snapshotContext - snapshot context, used for getting offset SCN
overriddenSelect - conditional snapshot select
- Returns:
- the enhanced select statement. By default, it just returns the original select statement.
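The hook pattern behind enhanceOverriddenSelect can be sketched with a pass-through default and one override. Everything here is hypothetical and simplified: `BaseSelectEnhancer` and `AsOfScnSelectEnhancer` are made-up names, the SCN and table name are passed directly instead of through a snapshot context and `TableId`, and the text-based rewrite is not necessarily how the Oracle connector does it. The `AS OF SCN` clause itself is Oracle's flashback query syntax.

```java
/** Hypothetical stand-in for the base class hook; not Debezium code. */
class BaseSelectEnhancer {

    /** Default behaviour: return the overridden select unchanged. */
    String enhanceOverriddenSelect(long snapshotScn, String overriddenSelect, String tableName) {
        return overriddenSelect;
    }
}

/** Hypothetical Oracle-style override that pins the query to a given SCN. */
class AsOfScnSelectEnhancer extends BaseSelectEnhancer {

    /**
     * Inserts an "AS OF SCN" clause right after the first occurrence of the
     * table name. This naive text match assumes the table name appears exactly
     * once and in the same spelling; a statement without it is returned as-is.
     */
    @Override
    String enhanceOverriddenSelect(long snapshotScn, String overriddenSelect, String tableName) {
        int start = overriddenSelect.indexOf(tableName);
        if (start < 0) {
            return overriddenSelect;
        }
        int end = start + tableName.length();
        return overriddenSelect.substring(0, end)
                + " AS OF SCN " + snapshotScn
                + overriddenSelect.substring(end);
    }
}
```

Keeping the default as a pass-through means connectors without a comparable feature do not need to override the hook at all.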
-
getSnapshotSelect
protected abstract Optional<String> getSnapshotSelect(RelationalSnapshotChangeEventSource.RelationalSnapshotContext<P, O> snapshotContext, TableId tableId, List<String> columns)
Returns the SELECT statement to be used for scanning the given table, or an empty value if the table will be streamed from but not snapshotted.
-
getSnapshotConnectionFirstSelect
-
readTableStatement
protected Statement readTableStatement(JdbcConnection jdbcConnection, OptionalLong tableSize) throws SQLException
Allow per-connector query creation to override for best database performance depending on the table size.
- Throws:
SQLException
-
rollbackTransaction
-
getClock
-
postSnapshot
- Throws:
InterruptedException
-