In this tutorial we will learn a new SAP Data services transform, known as Table Comparison Transform and we will see how we may use this transform to implement "slowly changing dimension" (SCD) Type - I. Like before, we have added a video tutorial at the end to perform a little hands-on.

This article is part of our comprehensive data services tutorial "Learning SAP Data Services - Online Tutorial", you may want to check that tutorial first if you have not already done so.

Table Comparison Transform

This is one of the SAP BusinessObjects Data Services native transform (For a full list of all transforms, you can see details of SAP Data services transforms) that helps to perform a fundamental ETL operation - comparing two data sets.

Comparing two data sets, this transform can generate the difference between them as a resultant dataset with each row of the result flagged as INSERT, UPDATE, or DELETE.

While loading data to a target table, this transform can be used to ensure rows are not duplicated in a target table and hence is very helpful to load a dimension table.

Implementing SCD Type1 dimension loading

In the video below we have demonstrated how to implement SCD - I. We strongly encourage you to watch the video along with reading this text. The basic logic for implementing SCD-I is, we will compare each record coming from source to the records in target and if the record exists in target, we will update the target record with the values from the source record. However, if the record does not exist, we will insert the record to target.

"Table Comparison Transform" help to compare the source records with target. You have to provide the source data (input schema) and the columns based on which you want to perform the comparison. Generally we specify the primary key column (or all the columns forming the composite primary key) so that it can perform the comparison based on the primary key column. Further to this, we can also specify "Compare Columns" - which is a list of columns that we intend track the changes of. As an example, let's say we do not want to update a customer's record even if a certain column has changed in the source and only want to update the target record if some other columns (e.g. column X, Y and Z) have changed. In that case, we have to put column X, Y and Z in "Compare Columns" list.

Note on the Comparison Method

There are three methods for accessing the comparison table namely Row-by-row select, Cached comparison table and Sorted input. Below is the brief on when to select which option.

  • Row-by-row select option is best if the target table is large compared to the number of rows the transform will receive as input. In this case for every input row the transform fires a SQL to lookup the target table.
  • Cached comparison table option is best when we are comparing the entire target table. DS uses pageable cache as the default. If the table fits in the available memory, we can change the Cache type property of the dataflow to In-Memory.
  • Sorted input option is best when the input data is pre sorted based on the primary key columns. DS reads the comparison table in the order of the primary key columns using sequential read only once. NOTE: The order of the input data set must exactly match the order of all primary key columns in the Table_Comparison transform.


Have a question on this subject?

Ask questions to our expert community members and clear your doubts. Asking question or engaging in technical discussion is both easy and rewarding.

Are you on Twitter?

Start following us. This way we will always keep you updated with what's happening in Data Analytics community. We won't spam you. Promise.

  • DENSE RANK in Data Services

    In this article, we will learn how to implement DENSE RANK operation in SAP Data Services.

  • Data Services Scenario Questions Part 7

    In this tutorial we will discuss some scenario based questions and their solutions using SAP Data Services. This article is meant mainly for Data Services beginners.

  • Why SAP Data Services

    SAP BusinessObjects Data Services delivers a single enterprise-class solution for Data Integration, Data Quality, Data Profiling, and Text Data Processing that allows us to integrate, transform, improve, and deliver trusted data to critical...

  • Data Services Metadata Query Part 3

    This article is a continuation of the previous topic related to Data Services Metadata Query. Let us explore more into the Data Services Repository Metadata.

  • Data Services Scenario Questions Part 5

    In this tutorial we will discuss some scenario based questions and their solutions using SAP Data Services. This article is meant mainly for Data Services beginners.

  • SAP Data Services Custom Function

    We can create our own functions by writing script functions in SAP Data Services scripting language using the smart editor. User-script functions return a value.

  • Real Time Jobs in Data Services

    SAP Data Services provides Realtime Data Integration features. Data Services can generate realtime event messages and can also listen to input XML messages. Real Time Job means, some message with content will trigger the Job or Dataflow processing. A RealTime...

  • Data Services Scenario Questions Part 3

    In this tutorial we will discuss some scenario based questions and their solutions using SAP Data Services. This article is meant mainly for Data Services beginners.

  • Fools Guide to BODS - Introductory Tutorial

    This article is a guide to the readers who are totally ignorant about the tool SAP Business Objects Data Services. It is presented in a very simple way so that readers would be familiar with common terms, terminologies and would be able to run...

  • Map_CDC_Operation in Data Services

    Map CDC Operation is used to support relational or mainframe changed-data capture (CDC). This transform reads the source DML changes i.e. INSERT/UPDATE/DELETE and reflects the same in the target system. So the transform maps all the source rows...