For transactional (non-bulk) data load in DataStage using Oracle connectors, it is always the best option to use the "update then insert" mode. As shown in this post, the "update then insert" mode is as fast as the "insert then update" mode when it is 100% inserts, and outperform other modes when updates has higher percentage. Even if you know that the incoming data is 100% inserts, "update then insert" is still a better choice. Because your job may fail in the middle, and the data has been partially loaded. Now, if you restart the job, it is no longer 100% inserts.
In order to use the "update then insert" mode, you need to specify key column(s) in an Oracle connector stage, so that it knows which columns should be matched in where clause while generating update statement at run-time. Alternatively, you can provide insert/update statements yourself.
Writing insert/update statements by hand is tedious, especially if you have a lot of tables. The following SQL scripts generate insert/update statements automatically. You may paste the output into Oracle connector stages directly. Or you may wrap the scripts in a simple parallel job, and save the output into a text file. Use a sequence job to loop through hundreds of tables.
For update statement:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
with col_info as ( | |
select | |
c.owner, | |
c.table_name, | |
c.column_name, | |
c.column_id, | |
pc.column_name as PK_COL | |
from | |
all_tab_cols c join all_constraints p on | |
c.owner = p.owner and | |
c.table_name = p.table_name and | |
p.constraint_type = 'P' | |
left join all_cons_columns pc on | |
p.constraint_name = pc.constraint_name and | |
p.owner = pc.owner and | |
c.column_name = pc.column_name | |
where | |
c.owner = '#OWNER#' and | |
c.table_name = '#TGT_TABLE#' | |
order by c.column_id | |
) select | |
'UPDATE #TGT_TABLE# SET ' || | |
listagg(case when pk_col is null then column_name || ' = ORCHESTRATE.' || column_name end, ', ') within group (order by column_id) || | |
' WHERE ' || | |
listagg(case when pk_col is not null then pk_col || ' = ORCHESTRATE.' || pk_col end, ' AND ') within group (order by column_id) as UPD_SQL | |
from col_info |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
with col_info as ( | |
select | |
c.owner, | |
c.table_name, | |
c.column_name, | |
c.column_id, | |
pc.column_name as PK_COL | |
from | |
all_tab_cols c join all_constraints p on | |
c.owner = p.owner and | |
c.table_name = p.table_name and | |
p.constraint_type = 'P' | |
left join all_cons_columns pc on | |
p.constraint_name = pc.constraint_name and | |
p.owner = pc.owner and | |
c.column_name = pc.column_name | |
where | |
c.owner = '#OWNER#' and | |
c.table_name = '#TGT_TABLE#' | |
order by c.column_id | |
) select | |
'INSERT INTO #TGT_TABLE# (' || | |
listagg(column_name, ', ') within group (order by column_id) || | |
') VALUES (' || | |
listagg('ORCHESTRATE.' || column_name, ', ') within group (order by column_id) || | |
')' as INS_SQL | |
from col_info |