Class SnapshotProcessor

java.lang.Object
io.debezium.connector.cassandra.AbstractProcessor
io.debezium.connector.cassandra.SnapshotProcessor

public class SnapshotProcessor extends AbstractProcessor
This reader is responsible for initial bootstrapping of a table, which entails converting each row into a change event and enqueueing that event to the ChangeEventQueue.

IMPORTANT: Currently, the OffsetWriter records a table in the offset.properties file (with filename "" and position -1) only once that table's snapshot has completed. This means that if the SnapshotProcessor is terminated midway, upon restart it will skip all the tables that are already recorded in offset.properties.
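The restart behavior described above can be illustrated with a small model. This is only a sketch, not Debezium code: the class SnapshotRestartSketch, its in-memory set standing in for offset.properties, and the failAt parameter are all hypothetical. It also assumes that a table interrupted midway is attempted again on the next run, which follows from it never having been recorded.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of "record a table only after its snapshot completes".
final class SnapshotRestartSketch {

    // Stand-in for the table names recorded in offset.properties.
    private final Set<String> completedTables = new LinkedHashSet<>();

    /**
     * Snapshots the given tables in order, skipping those already recorded.
     * If failAt names a table, the run stops midway through that table,
     * before it is recorded. Returns the tables attempted in this run.
     */
    List<String> run(List<String> tables, String failAt) {
        List<String> attempted = new ArrayList<>();
        for (String table : tables) {
            if (completedTables.contains(table)) {
                continue; // already recorded, so skipped after a restart
            }
            attempted.add(table);
            if (table.equals(failAt)) {
                break; // terminated midway: this table is NOT recorded
            }
            completedTables.add(table); // recorded only on completion
        }
        return attempted;
    }

    Set<String> completedTables() {
        return completedTables;
    }
}
```

In this model, a first run over ks.a, ks.b, ks.c that stops inside ks.b records only ks.a; a second run skips ks.a and attempts ks.b and ks.c.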

  • Field Details

  • Constructor Details

  • Method Details

    • initialize

      public void initialize()
      Description copied from class: AbstractProcessor
      Override initialize to initialize resources before starting the processor
      Overrides:
      initialize in class AbstractProcessor
    • destroy

      public void destroy()
      Description copied from class: AbstractProcessor
      Override destroy to clean up resources after stopping the processor
      Overrides:
      destroy in class AbstractProcessor
    • process

      public void process() throws IOException
      Description copied from class: AbstractProcessor
      The actual work the processor is doing. This method will be executed in a while loop until the processor stops or encounters an exception.
      Specified by:
      process in class AbstractProcessor
      Throws:
      IOException
    • snapshot

      void snapshot() throws IOException
      Fetches all new tables that have not yet been snapshotted, then iterates through them and snapshots each one.
      Throws:
      IOException
    • getTablesToSnapshot

      private Set<com.datastax.oss.driver.api.core.metadata.schema.TableMetadata> getTablesToSnapshot()
      Return a set of TableMetadata for tables that have not been snapshotted but have CDC enabled.
    • takeTableSnapshot

      private void takeTableSnapshot(com.datastax.oss.driver.api.core.metadata.schema.TableMetadata tableMetadata) throws IOException
      Runs a SELECT query on the given table and processes each row in the result set by converting the row into a ChangeRecord and enqueueing it to the ChangeEventQueue.
      Throws:
      IOException
    • generateSnapshotStatement

      private static com.datastax.oss.driver.api.core.cql.SimpleStatement generateSnapshotStatement(com.datastax.oss.driver.api.core.metadata.schema.TableMetadata tableMetadata)
      Builds the SELECT query statement for execution. For every non-primary-key column, the TTL and WRITETIME are also queried, and the execution time is queried once per row.

      For example, for a table t with columns a, b, and c, where a is the partition key, b is the clustering key, and c is a regular column, the generated query looks like the following:

           SELECT now() as execution_time, a, b, c, TTL(c) as c_ttl, WRITETIME(c) as c_writetime FROM t;
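
As a rough illustration of how such a query string could be assembled, here is a minimal sketch using plain strings rather than the driver's SimpleStatement. The class SnapshotStatementSketch and the method buildSelect are hypothetical, and the "_ttl" and "_writetime" alias suffixes are taken from the example above. Identifier quoting is omitted, and the ordering of the TTL and WRITETIME selections for tables with several regular columns is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; it mirrors the documented example, not the actual implementation.
final class SnapshotStatementSketch {

    static String buildSelect(String table,
                              List<String> primaryKeyColumns,
                              List<String> regularColumns) {
        List<String> selections = new ArrayList<>();
        // Execution time is selected once per row.
        selections.add("now() as execution_time");
        // All columns are selected by name, primary key columns first.
        selections.addAll(primaryKeyColumns);
        selections.addAll(regularColumns);
        // TTL and WRITETIME are only selected for non-primary-key columns.
        for (String col : regularColumns) {
            selections.add("TTL(" + col + ") as " + col + "_ttl");
            selections.add("WRITETIME(" + col + ") as " + col + "_writetime");
        }
        return "SELECT " + String.join(", ", selections) + " FROM " + table + ";";
    }
}
```

Calling buildSelect("t", List.of("a", "b"), List.of("c")) yields the same query text as the example above.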
       
    • processResultSet

      private void processResultSet(com.datastax.oss.driver.api.core.metadata.schema.TableMetadata tableMetadata, com.datastax.oss.driver.api.core.cql.ResultSet resultSet) throws IOException
      Process the result set from the query. Each row is converted into a ChangeRecord and enqueued to the ChangeEventQueue.
      Throws:
      IOException
    • extractRowData

      private static CassandraSchemaFactory.RowData extractRowData(com.datastax.oss.driver.api.core.cql.Row row, Collection<com.datastax.oss.driver.api.core.metadata.schema.ColumnMetadata> columns, Set<String> partitionKeyNames, Set<String> clusteringKeyNames, Object executionTime)
      This function extracts the relevant row data from Row and updates the maximum writetime for each row.
    • getType

      private static CassandraSchemaFactory.CellData.ColumnType getType(String name, Set<String> partitionKeyNames, Set<String> clusteringKeyNames)
    • readExecutionTime

      private static Object readExecutionTime(com.datastax.oss.driver.api.core.cql.Row row)
    • readCol

      private static Object readCol(com.datastax.oss.driver.api.core.cql.Row row, String col, com.datastax.oss.driver.api.core.metadata.schema.ColumnMetadata cm)
    • readColTtl

      private static Object readColTtl(com.datastax.oss.driver.api.core.cql.Row row, String col)
    • calculateDeletionTs

      private static long calculateDeletionTs(Object executionTime, Object ttl)
      It is not possible to query the deletion time via CQL, so it is instead calculated as the execution time (in milliseconds) plus the TTL (in seconds).
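
Because the two inputs use different units, the TTL has to be converted before the addition. The following is a minimal sketch of that arithmetic, not the actual method: the class DeletionTsSketch and the method deletionTsMillis are hypothetical, the inputs are taken as a plain long and int rather than the Object parameters in the signature above, and the result is assumed to be in milliseconds, which the description does not state.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper showing only the unit conversion and addition.
final class DeletionTsSketch {

    /**
     * Adds a TTL given in seconds to an execution time given in
     * milliseconds since the epoch. The result is in milliseconds (assumed).
     */
    static long deletionTsMillis(long executionTimeMillis, int ttlSeconds) {
        return executionTimeMillis + TimeUnit.SECONDS.toMillis(ttlSeconds);
    }
}
```

For an execution time of 1700000000000 ms and a TTL of 3600 s, the TTL converts to 3600000 ms, giving 1700003600000 ms.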
    • ttlAlias

      private static String ttlAlias(String colName)
    • withQuotes

      private static String withQuotes(String s)
    • tableName

      private static String tableName(com.datastax.oss.driver.api.core.metadata.schema.TableMetadata tm)