Frequently asked questions for the Change log business goal
Find answers to frequently asked questions about the Data Replication Change log business goal.
Contents
- What is the Change log business goal
- What is the difference between the Copy scenario and the Change log scenario?
- From which version can I use the Change log business goal?
- Which source-target database combinations can I use in Change log scenarios?
- Are all Change log types available for all source and target data stores?
- Does Change log behavior vary depending on the source database?
- What is the
column in the target table?Primary_Key
- Can I use the Change log business goal to replicate tables without primary keys?
- What is the DDL Change log type configuration parameter?
- What is the Log and apply DDL Change log type?
- What is the Schema evolution with new table Change log type?
- What is the Schema evolution in same change log table Change log type?
- After the mapping is created, can I review the Change log type that I selected?
- Which tables are cleaned up when I select the Change log metadata table cleanup option?
- Are metadata tables exclusive to a specific replication asset?
- Which Change log columns can I include in the Change log table?
- Which DDL commands can I use in a Change log operation?
- What is a Change log table?
- Which tables are created as part of a Change log operation?
- What is the
table?CHANGE_LOG_DDL
- Is a
table automatically created for each replication?CHANGE_LOG_DDL
- How is the
table named?CHANGE_LOG_DDL
- Backed up tables are created in the format
. Is this
the DDL type number for the corresponding DDL?sequenceCounter
- What does each column and identifier from the backed up table mean?
- What happens to metadata tables when a Change log replication is stopped or deleted?
- Is the
table always created?CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
- What information does the the
table contain?CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
- What information does the
metadata table contain?CHANGE_LOG_DDL_TYPE_TO_DESCRIPTION
- If I run multiple DDLs in the same transaction, are all of those DDLs recorded in the
table?CHANGE_LOG_DDL
- How can I correlate the information that is present in the
table and theCHANGE_LOG_DDL
table?CHANGE_LOG
- How can I correlate the information present in the
table and theCHANGE_LOG
table?CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
- How are DML operations recorded in Change log scenarios?
- How are DDL operations recorded in Change log scenarios?
- Do all DDL commands get applied to the target table?
- Are all constraint modifications applied to the target tables?
- If a column is set to null, can it be changed to a nonnull value in the Change log table?
- If my metadata tables grow, how can they be cleaned up?
- What happens when a source table is truncated during a Change log operation?
What is the Change log business goal?
The Change log business goal is a standardized method of documenting and tracking modifications that are made to database schemas, tables, or data over time.
By using the Change log business goal, you can preserve consistency, guarantee compatibility with earlier versions of your data, and ease schema migrations between various environments, such as development, testing, and production.
What is the difference between the Copy scenario and the Change log scenario?
The Copy scenario maintains an identical copy of the source table at the target database. The Change log scenario maintains a history of the flow of operations, both DML and DDL, that are performed on the source database.
The three significant differences between the Copy scenario and the Change log scenario are:
- Primary keys
- Multiple instances of the same key can appear in the Change Log scenario, which means that something that is a primary key in a Copy scenario becomes a simple index in a Change Log scenario.
- See the following example, where multiple rows in the
table have the same primary key value as the source table.CHANGE_LOG
- Handling of before and after images
- Insert records have no before, so the target database has no insert operation for a before image.
- Delete records have no after, so the target database has no insert operation for an after image.
- Update records have both before and after segments, so the target database has both images inserted.
- See the following example to see how before and after images are handled for DMLs in the Change log.
- Table structure
- The Change log scenario expands before and after columns or rows based on the Change log type.
- In the Copy scenario, the target table structure is the same as the original table structure.
- See the following example, where the
table provides more information about the operations that were performed. For the Copy scenario, both source and target tables have identical data and structure.CHANGE_LOG
Example:
If you want to replicate data using the
Change log type for the following SQL operations:delete_insert_pair
INSERT INTO SOURCE_TABLE (ID, NAME) VALUES (1, 'abc');
INSERT INTO SOURCE_TABLE (ID, NAME) VALUES (2, 'xyz');
UPDATE SOURCE_TABLE SET NAME ='upd' WHERE ID = 1;
DELETE FROM SOURCE_TABLE WHERE ID = 2;
The following table is your source table and ID is the primary key:
ID | Name |
---|---|
1 | upd |
For a Copy scenario, the target table would look like:
ID | Name |
---|---|
1 | upd |
For a Change log scenario with the SQL query:
SELECT ID, NAME, OPTYPE, SEQUENCE, TRANSACTION_ID FROM TARGET_TABLE ORDER BY TRANSACTION_ID ASCENDING, SEQUENCE ASCENDING
the target table would look like:
ID | Name | OPTYPE | SEQUENCE | TRANSACTION_ID |
---|---|---|---|---|
1 | abc | Insert | 1 | 001 |
2 | xyz | Insert | 2 | 002 |
1 | abc | UpdateBefore | 3 | 003 |
1 | upd | UpdateAfter | 4 | 003 |
2 | xyz | Delete | 5 | 004 |
From which version can I use the Change log business goal?
You can use the Change log business goal from Cloud Pak for Data version 5.1.2 and later.
Which source-target database combinations can I use in Change log scenarios?
You can use the Change log business goal for the following combinations of source and target data stores:
Source data store | Target data store |
---|---|
PostgreSQL | Db2 Warehouse on Cloud |
Db2 | Db2 Warehouse on Cloud |
Db2 for z/OS | Db2 Warehouse on Cloud |
Oracle XStream | Db2 Warehouse on Cloud |
PostgreSQL | Db2 on Cloud |
Db2 | Db2 on Cloud |
Db2 for z/OS | Db2 on Cloud |
Oracle XStream | Db2 on Cloud |
PostgreSQL | Kafka |
Db2 | Kafka |
Db2 for z/OS | Kafka |
Oracle XStream | Kafka |
Are all Change log types available for all source and target data stores?
With Kafka targets, you can use only the Log and apply DDL Change log type.
With Db2 targets, you can use all three Change log types.
With watsonx.data targets, you can't use any Change log types.
Does Change log behavior vary depending on the source database?
No, there is no change in behavior based on the source database that you choose.
What is the Primary_Key
column in the target table?
Primary_Key
The Primary key in the target table is the
column. The SEQUENCE
column is an incremental counter column, starting with 1, that holds information regarding the order in which the operations
are applied to the source table. When you run a Change log replication, a primary key column in the source table becomes a non-primary key column in the target table.SEQUENCE
Can I use the Change log business goal to replicate tables without primary keys?
Yes, with the Change log business goal, you can replicate tables with or without primary keys.
What is the DDL Change log type configuration parameter?
The DDL Change log type configuration parameter indicates how to apply the schema changes to the target table. Based on the parameter, the schema changes can be applied without any attempt to retain the schema history, or after taking a backup of the schema.
There are 3 parameter values:
- Log_and_apply_ddl
- Does not retain the schema history.
- Schema_evolution_with_new_table
- Retains schema history by backing up the table for all DDLs.
- Schema_evolution_in_same_change_log_table
- Retains schema history by backing up the table only for DDLs that change table attributes like drop, rename table, otherwise columns are backed up within the same change log table.
What is the Log and apply DDL Change log type?
With the Log and apply DDL Change log type, schema changes are applied without any attempt to retain the schema history. This means that the DDL is replicated, and the previous state isn't preserved. For example, when a column is renamed, all previous values for that column and any values that were inserted after the column was renamed are all stored together in the renamed column.
This Change log type applies to relational databases and Kafka with or without Schema Registry.
What is the Schema evolution with new table Change log type?
With the Schema evolution with new table Change log type, each time that a table with a new structure is created in the source database, a new change log table is also created in the target database. The previous table is preserved and data is replicated to the new table from that point on.
The backed up table name format is
.<TABLENAME>_<OPERATION>_<SEQUENCE_COUNTER>
This type is currently not supported in Kafka, since it requires permission of autocreate topics. Interested users can raise an enhancement request.
What is the Schema evolution in same change log table Change log type?
With the Schema evolution in same change log table Change log type, any changes to the source table are reflected in one change log table.
If a column attribute changes, the column is renamed to
and a new column with a new name is added with new attributes.<COLUMNNAME>_<OPERATION>_<SEQUENCE_COUNTER>
If a table is dropped or renamed, the current state
is backed up as
and a new table is created with new name. Rows from prior table aren't present in the new table. All new operations are replicated to this new table.<TABLENAME>_<OPERATION>_<SEQUENCE_COUNTER>
After the mapping is created, can I review the Change log type that I selected?
Yes. In the Data Replication UI, hover over your replication asset and click Edit to see the current configuration of your replication asset. By navigating to the target options page, you can see the DDL type that you chose and change your selections.
Which tables are cleaned up when I select the Change log metadata table cleanup option?
The
table, which is exclusive to each replication run, is cleaned up.CHANGE_LOG_DDL_<JOB_RUN_ID>
The
and CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
tables are cleaned up only if the position table does not exist. If the position table exists, the metadata tables are shared and they therefore
can't be dropped.CHANGE_LOG_DDL_TYPE_TO_DESCRIPTION
Are metadata tables exclusive to a specific replication asset?
Only the
table is exclusive to a specific replication asset. The CHANGE_LOG_DDL_<JOB_RUN_ID>
and CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
tables are shared across replications.CHANGE_LOG_DDL_TYPE_TO_DESCRIPTION
Which Change log columns can I include in the Change log table?
You can include the following columns. Only the
column is required. All others are optional.SEQUENCE
SEQUENCE
- A monotonically increasing inter-transaction sequence counter that serves as a primary key for your change log target table.
OPTYPE
- An indication of what kind of DML operation happened. The possible options are:
-
- INSERT
- UPDATE_BEFORE
- UPDATE_AFTER
- DELETE
TRANSACTION_ID
- The numeric value that is used to identify each individual operation.
TIMESTAMP
- The source commit timestamp for each operation.
-
Which DDL commands can I use in a Change log operation?
You can use a total of 14 DDL commands in a Change log operation. The options are:
- CREATE_TABLE
- DROP_TABLE
- RENAME_TABLE
- ADD_COLUMN
- DROP_COLUMN
- RENAME_COLUMN
- ALTER_COLUMN
- ADD_PRIMARY_KEY
- DROP_PRIMARY_KEY
- ADD_CHECK_CONSTRAINT
- DROP_CHECK_CONSTRAINT
- ADD_UNIQUE_KEY_CONSTRAINT
- DROP_UNIQUE_KEY_CONSTRAINT
- PATCH_REPLICATION
What is a Change log table?
When you use the Change log business goal, the mapped target table is referred to as the Change log table.
Which tables are created as part of a Change log operation?
The following tables are created in the target database as part of a Change log operation:
- The Change log table
- The mapped target table.
- The
tableCHANGE_LOG_DDL_<JOB_RUN_ID>
- Created for each replication run.
- The
tableCHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
- Shared across all replications to the target database and created for each schema.
- Created only if it does not already exist in the target schema.
- The
tableCHANGE_LOG_DDL_TYPE_TO_DESCRIPTION
- Shared across all replications to the target database and is created for each schema.
- Created only if it does not already exist in the target schema.
- Not applicable to non-relational targets such as Kafka Target.
What is the CHANGE_LOG_DDL
table?
CHANGE_LOG_DDL
The
table is a metadata table that is created during a Change log replication. It stores information about the various DDLs operations that are performed at the source in the order that they are applied. Information
that is stored in the table includes:CHANGE_LOG_DDL
- sequence_id
- An automatically generated integer that is used to identify the order in which individual replications occurred.It is unrelated to the
that is used in the Change log table.SEQUENCE_COUNTER
- ddl_type
- The type of DDL that was replicated. See the DDL type table.
- ddl_info
- The complete DDL in SQL format.
- ddl_object
- The name of the object where the DDL modification occurred.
- timestamp
- The time when the DDL replication occurred. If the replication was patched, this column will contain only the
timestamp.patch_detection
Is a CHANGE_LOG_DDL
table automatically created for each replication?
CHANGE_LOG_DDL
If you choose a Db2 target, the
table is automatically created for each run.CHANGE_LOG_DDL
If you choose a Kafka target, you must specify whether you want to record DDL information in a topic. To record DDL information, you can specify a topic name in the Data Replication UI where the information can be stored.
How is the CHANGE_LOG_DDL
table named?
CHANGE_LOG_DDL
The
table is named in the following format: CHANGE_LOG_DDL
, where the CHANGE_LOG_DDL_<JOB_RUN_ID>
is the unique ID that is associated with the replication run. It can be easily found from
the Data Replication UI after replication starts.JOB_RUN_ID
Backed up tables are created in the format <tableName>_<ddlTypeName>_<sequenceCounter>
. Is this sequenceCounter
the DDL type number for the corresponding DDL?
<tableName>_<ddlTypeName>_<sequenceCounter>
sequenceCounter
No,
indicates that it is the fifth DDL that was applied and caused a backup of the table, and the DDL was applied to the
T1 table.T1_RENAMED_5
What does each column and identifier from the backed up table mean?
The Change log table is renamed in the format:
.<TABLENAME>_<OPERATION>_<SEQUENCE_COUNTER>
- OPERATION
- Indicates the type of DDL operation that caused the table to be backed up and renamed.
- SEQUENCE_COUNTER
- Indicates the order of the DDL that was applied. If sequence counter is
, then this was the nth DDL that was applied to the source database during this run.n
DDL | OPERATION |
---|---|
COLUMNADDED | Add Column |
COLUMNRENAMED | Rename Column |
COLUMNDROPPED | Drop Column |
COLUMNALTERED | Alter Column |
RENAMED | Rename Table |
DROPPED | Drop Table |
TRUNCATED | Truncate Table |
What happens to metadata tables when a Change log replication is stopped or deleted?
When a Change log replication is stopped or deleted, the system checks whether the
option is selected. If it is selected, metadata tables are cleaned up based on as follows:CHANGE_LOG_METADATA_CLEANUP
- The
table for the specific replication run is cleaned up.CHANGE_LOG_DDL
- The
andCHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
tables are cleaned up only if the position table does not exist. If the position table exists, the metadata tables are shared and they therefore can't be dropped.CHANGE_LOG_DDL_TYPE_TO_DESCRIPTION
Is the CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
table always created?
CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
When you choose a Db2 target, this table is created in the target database, shared across all replications to the database, and is created for each schema. The table is created for the first replication to the database or if it does not exist in the target database.
When you choose a Kafka target, you can specify the topic where this information needs to be stored for each replication. Therefore, you can choose to share the topic across replications or create and specify different topics for each replication.
What information does the the CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
table contain?
CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
The
table contains mapping information between the source CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
and its TRANSACTION_ID
for each run. This table logs information only if your Change log uses
both the COMMIT_TIMESTAMP
and TRANSACTION_ID
columns.COMMIT_TIMESTAMP
The table contains 3 columns:
JOB_RUN_ID
- The Run ID of the current replication.
TRANSACTION_ID
- The Transaction ID for each operation.
COMMIT_TIMESTAMP
- The Commit timestamp for each operation.
What information does the CHANGE_LOG_DDL_TYPE_TO_DESCRIPTION
metadata table contain?
CHANGE_LOG_DDL_TYPE_TO_DESCRIPTION
The
table contains a description of the CHANGE_LOG_DDL_TYPE_TO_DESCRIPTION
that is mentioned in ddl_type
table. This information is not applicable to non-relational targets such as Kafka.CHANGE_LOG_DDL
The table contains two columns:
ddl_type
- A numeric column that indicates a unique ddl type referred to in the change log table.
- Description
- Describes the DDL operation that corresponds to the ddl_type.
If I run multiple DDLs in the same transaction, are all of those DDLs recorded in the CHANGE_LOG_DDL
table?
CHANGE_LOG_DDL
Yes, regardless of whether multiple DDLs are run as part of the same transaction or separate transactions, each DDL that is run in the source is recorded in the
table.CHANGE_LOG_DDL
How can I correlate the information that is present in the CHANGE_LOG_DDL
table and the CHANGE_LOG
table?
CHANGE_LOG_DDL
CHANGE_LOG
Both tables have a timestamp column,
for the DDL_TIMESTAMP
table and CHANGE_LOG_DDL
for the COMMIT_TIMESTAMP
table. By combining the information on both these tables based on the
timestamp column, you can correlate the information and have a better understanding of the order of operations.CHANGE_LOG
If you to correlate with transaction IDs:
- Combine information from the
table and theCHANGE_LOG_DDL
table to get the correct timestamp order.CHANGE_LOG
- Map the timestamps to the
to get the transaction IDs of the operations in the order they were performed.CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
How can I correlate the information present in the CHANGE_LOG
table and the CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
table?
CHANGE_LOG
CHANGE_LOG_TXID_TO_COMMIT_TIMESTAMP
If you want to correlate DML operations, you can use the
or TRANSACTION_ID
columns.SEQUENCE_ID
For example, if you perform Transaction 1, which inserts a row, and Transaction 2, which deletes the same row, you can sort by
to determine that the row doesn't exist now.TRANSACTION_ID
Or, if you perform Transaction 1, which inserts and deletes a row, and Transaction 2, which inserts and updates the same row, you can sort by operation
to know the current value of the row, and see the evolution that
it went through.SEQUENCE_ID
How are DML operations recorded in Change log scenarios?
Each DML operation that is performed on the source database or table is applied as a new record to the target change log table, along with additional information based on which Change log column was selected.
Example:
You want to perform the following sequence of SQL operations:
INSERT INTO SOURCE_TABLE (ID, NAME) VALUES (1, 'abc');
INSERT INTO SOURCE_TABLE (ID, NAME) VALUES (2, 'xyz');
UPDATE SOURCE_TABLE SET NAME ='upd' WHERE ID = 1;
DELETE FROM SOURCE_TABLE WHERE ID = 2;
The target table will look like:
ID | Name | OPTYPE | SEQUENCE | TRANSACTION_ID |
---|---|---|---|---|
1 | abc | Insert | 1 | 001 |
2 | xyz | Insert | 2 | 002 |
1 | abc | UpdateBefore | 3 | 003 |
1 | upd | UpdateAfter | 4 | 003 |
2 | xyz | Delete | 5 | 004 |
How are DDL operations recorded in Change log scenarios?
In the Change log scenario, each DDL operation that is performed in the source database or table is recorded in the
metadata table. Additional information about the type, order, time, and the SQL command are also
applied.CHANGE_LOG_DDL
The DDLs are applied to the target table based on the DDL Change log type.
Do all DDL commands get applied to the target table?
No, not all DDL commands get applied to the target tables.
or Add
commands of the Primary key, Unique key, and Check constraints are not replicated to the target
tables. Instead, those operations are only recorded in the Drop
table.CHANGE_LOG_DDL
Are all constraint modifications applied to the target tables?
No, not all constraint modifications are applied to the target tables. Primary, Unique, and Check constraints are not replicated to the target tables. Instead, these operations in the source
table are recorded in the
table.CHANGE_LOG_DDL
If a column is set to null, can it be changed to a nonnull value in the Change log table?
If you choose the Schema evolution in same change log table type, you can never change a column to a nonnull value after you set it to null.
If you choose the Log and apply DDL or Schema evolution with new table type, you can change a column to a nonnull value after you set it to null.
If my metadata tables grow, how can they be cleaned up?
You need to define a cleanup strategy for your metadata tables. They can be purged if they take up too much space, but it is up to you to decide on the cleanup strategy.
What happens when a source table is truncated during a Change log operation?
If you choose the Log and apply DDL type, the
table is not truncated. All rows remain in the CHANGE_LOG
table even though the source table is truncated.CHANGE_LOG
If you choose the Schema evolution with new table or Schema evolution in the same change log table types, a backup of the
table is taken, and this backup table is renamed: CHANGE_LOG
.
Then, a new table is created with the same name and structure as the table before backup, and a row is inserted in the <TABLENAME>_TRUNCATED_<SEQUENCE_COUNTER>
table, which states that CHANGE_LOG_DDL
was run.TRUNCATE
Parent topic: Setting a business goal for a Data Replication asset