Skip to content

RAC: cross-thread DDL not applied before processing DML from other threads #1

@rophy

Description

@rophy

Version: 16414786 (master)

Description

In RAC multi-thread mode, when a DDL statement (e.g., ALTER TABLE ADD COLUMN) is executed on node 1, the DDL redo records are only in thread 1's redo logs. If subsequent DML on node 2 uses the new schema (e.g., inserts with the new column), OLR fails because it may process thread 2's redo before encountering the DDL in thread 1's redo.

OLR should interleave redo processing across threads in SCN order, so that a DDL at SCN X in thread 1 is applied before DML at SCN Y (where Y > X) in thread 2.

Steps to reproduce

  1. On a 2-node Oracle RAC, create a table with 3 columns
  2. Insert rows on node 1 (thread 1 redo) and commit
  3. ALTER TABLE ADD COLUMN on node 1 (thread 1 redo)
  4. Insert rows on node 2 using the new column (thread 2 redo) and commit
  5. Capture archive redo logs from both threads
  6. Run OLR in batch mode against both threads' archives

SQL scenario (rac-ddl-cross-node.rac.sql):

-- @SETUP
CREATE TABLE TEST_RAC_DDL (id NUMBER PRIMARY KEY, name VARCHAR2(100), val NUMBER);
ALTER TABLE TEST_RAC_DDL ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;

-- @NODE1
INSERT INTO TEST_RAC_DDL VALUES (1, 'Alice', 100);
INSERT INTO TEST_RAC_DDL VALUES (2, 'Bob', 200);
COMMIT;
ALTER TABLE TEST_RAC_DDL ADD (email VARCHAR2(200));

-- @NODE2
INSERT INTO TEST_RAC_DDL VALUES (3, 'Charlie', 300, 'charlie@test.com');
UPDATE TEST_RAC_DDL SET email = 'alice@test.com' WHERE id = 1;
COMMIT;

-- @NODE1
UPDATE TEST_RAC_DDL SET val = 250, email = 'bob@test.com' WHERE id = 2;
DELETE FROM TEST_RAC_DDL WHERE id = 1;
COMMIT;

Expected result

OLR processes all 6 DML operations correctly, detecting the DDL (ALTER TABLE ADD COLUMN) from thread 1's redo before processing thread 2's post-DDL DML.

Actual result

OLR starts processing thread 2's archive first and encounters an INSERT with 4 columns, but the schema only has 3 columns (DDL in thread 1 hasn't been read yet):

processing redo log: group: 0 scn: 3409107 to 3409343 seq: 49 path: /data/redo/2_49_1226247091.arc offset: 1024
ERROR 50060 runtime error, aborting further redo log processing: table: OLR_TEST.TEST_RAC_DDL: referring to invalid column id(3), xid: 0x0014.01f.000000c8, offset: 11592

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions