Datascience ML and AI: Incremental Aggregation in Informatica

Tuesday, 16 September 2014

Incremental Aggregation in Informatica

Using Incremental Aggregation

If you can capture changes from the source that affect less than half the target, you can use incremental aggregation to optimize the performance of Aggregator transformations.

When you use incremental aggregation, you apply captured changes in the source to aggregate calculations in a session. The Integration Service updates the target incrementally, rather than processing the entire source and recalculating the same calculations every time you run the session.

You can increase the index and data cache sizes to hold all data in memory without paging to disk.

Process :

Step 1: Create Source Table & Data

Create table in source database

CREATE TABLE SRC_INVOICE_INCR_AGG

( CUSTOMER_KEY NUMBER,

INVOICE_KEY NUMBER,

AMOUNT NUMBER,

TRANSACTION_DATE DATE

)

Insert data into source table

Insert into SRC_INVOICE_INCR_AGG (CUSTOMER_KEY,INVOICE_KEY,AMOUNT,TRANSACTION_DATE) values (101,201,5000,to_timestamp('12-SEP-14','DD-MON-RR HH.MI.SSXFF AM'));

Insert into SRC_INVOICE_INCR_AGG (CUSTOMER_KEY,INVOICE_KEY,AMOUNT,TRANSACTION_DATE) values (101,203,1000,to_timestamp('11-SEP-14','DD-MON-RR HH.MI.SSXFF AM'));

Insert into SRC_INVOICE_INCR_AGG (CUSTOMER_KEY,INVOICE_KEY,AMOUNT,TRANSACTION_DATE) values (102,205,1000,to_timestamp('10-SEP-14','DD-MON-RR HH.MI.SSXFF AM'));

Insert into SRC_INVOICE_INCR_AGG (CUSTOMER_KEY,INVOICE_KEY,AMOUNT,TRANSACTION_DATE) values (102,202,6000,to_timestamp('13-SEP-14','DD-MON-RR HH.MI.SSXFF AM'));

Insert into SRC_INVOICE_INCR_AGG (CUSTOMER_KEY,INVOICE_KEY,AMOUNT,TRANSACTION_DATE) values (103,204,2000,to_timestamp('12-SEP-14','DD-MON-RR HH.MI.SSXFF AM'));

Insert into SRC_INVOICE_INCR_AGG (CUSTOMER_KEY,INVOICE_KEY,AMOUNT,TRANSACTION_DATE) values (104,206,10000,to_timestamp('14-SEP-14','DD-MON-RR HH.MI.SSXFF AM'));

Import table using source analyzer

Step 2: Create Target Table

Create target table in target database

CREATE TABLE TGT_INVOICE_INCR_AGG

( CUSTOMER_KEY NUMBER,

AMOUNT NUMBER,

PRIMARY KEY (CUSTOMER_KEY)

)

Import table using Target Designer

Step 3: Create mapping with transformations

Create a mapping with the name m_INCR_AGG
Drag and drop source and target into work area
Double click on source qualifier → Go to Properties tab → Click on SQL QUERY
Provide below sql query by changing red color repository schema name as per our repository

SELECT SRC_INVOICE_INCR_AGG.CUSTOMER_KEY, SRC_INVOICE_INCR_AGG.AMOUNT

FROM

SRC_INVOICE_INCR_AGG

where Transaction_DATE >(select NVL(max(END_TIME),to_date('01-01-1900','DD-MM-YYYY')) from B205rep.OPB_WFLOW_RUN

where workflow_name ='$PMWorkflowName')

Click on Validate → click on OK → Click on OK
Create Aggregator Transformation as shown below and connect to target
Save the Mapping

Step 4: Create Workflow and Session

Create workflow and session as shown below

Double click on session task provide source and target connection accordingly
Click on properties tab of session task → select Incremental aggregation
Save the workflow

Step 5: Run workflow and Observe output

Select Start task → click on run
observe target data
Cache created
Observe target table output
Add two records to source table by executing below scripts

INSERT INTO "SDBU"."SRC_INVOICE_INCR_AGG" (CUSTOMER_KEY, INVOICE_KEY, AMOUNT, TRANSACTION_DATE) VALUES ('103', '208', '1500', TO_DATE('21-SEP-14', 'DD-MON-RR'));

INSERT INTO "SDBU"."SRC_INVOICE_INCR_AGG" (CUSTOMER_KEY, INVOICE_KEY, AMOUNT, TRANSACTION_DATE) VALUES ('105', '207', '4000', TO_DATE('20-SEP-14', 'DD-MON-RR'));

Commit;

Run again workflow and observe output
103 record updated from cache as it is already available and cache updated
105 record new cache group created

Pages

Tuesday, 16 September 2014

Incremental Aggregation in Informatica

No comments:

Post a Comment