In this tutorial we will learn a new SAP Data services transform, known as Table Comparison Transform and we will see how we may use this transform to implement "slowly changing dimension" (SCD) Type - I. Like before, we have added a video tutorial at the end to perform a little hands-on.

This article is part of our comprehensive data services tutorial "Learning SAP Data Services - Online Tutorial", you may want to check that tutorial first if you have not already done so.

Table Comparison Transform

This is one of the SAP BusinessObjects Data Services native transform (For a full list of all transforms, you can see details of SAP Data services transforms) that helps to perform a fundamental ETL operation - comparing two data sets.

Comparing two data sets, this transform can generate the difference between them as a resultant dataset with each row of the result flagged as INSERT, UPDATE, or DELETE.

While loading data to a target table, this transform can be used to ensure rows are not duplicated in a target table and hence is very helpful to load a dimension table.

Implementing SCD Type1 dimension loading

In the video below we have demonstrated how to implement SCD - I. We strongly encourage you to watch the video along with reading this text. The basic logic for implementing SCD-I is, we will compare each record coming from source to the records in target and if the record exists in target, we will update the target record with the values from the source record. However, if the record does not exist, we will insert the record to target.

"Table Comparison Transform" help to compare the source records with target. You have to provide the source data (input schema) and the columns based on which you want to perform the comparison. Generally we specify the primary key column (or all the columns forming the composite primary key) so that it can perform the comparison based on the primary key column. Further to this, we can also specify "Compare Columns" - which is a list of columns that we intend track the changes of. As an example, let's say we do not want to update a customer's record even if a certain column has changed in the source and only want to update the target record if some other columns (e.g. column X, Y and Z) have changed. In that case, we have to put column X, Y and Z in "Compare Columns" list.

Note on the Comparison Method

There are three methods for accessing the comparison table namely Row-by-row select, Cached comparison table and Sorted input. Below is the brief on when to select which option.

  • Row-by-row select option is best if the target table is large compared to the number of rows the transform will receive as input. In this case for every input row the transform fires a SQL to lookup the target table.
  • Cached comparison table option is best when we are comparing the entire target table. DS uses pageable cache as the default. If the table fits in the available memory, we can change the Cache type property of the dataflow to In-Memory.
  • Sorted input option is best when the input data is pre sorted based on the primary key columns. DS reads the comparison table in the order of the primary key columns using sequential read only once. NOTE: The order of the input data set must exactly match the order of all primary key columns in the Table_Comparison transform.


Have a question on this subject?

Ask questions to our expert community members and clear your doubts. Asking question or engaging in technical discussion is both easy and rewarding.

Are you on Twitter?

Start following us. This way we will always keep you updated with what's happening in Data Analytics community. We won't spam you. Promise.

  • Data Services Flatfiles Tips

    Often we come across scenarios where we have the flat file definition in an excel sheet and we need to create corresponding File Format in SAP Data Services. Alternatively we import file format definition from a Sample Source file.

  • Auditing in SAP Data Services

    A proper data reconciliation process must be in place in any data Extraction-Transformation-Load (ETL) process. A successful reconciliation process should only indicate whether or not the data is correct. But data reconciliation is not easy....

  • Real Time Jobs in Data Services

    SAP Data Services provides Realtime Data Integration features. Data Services can generate realtime event messages and can also listen to input XML messages. Real Time Job means, some message with content will trigger the Job or Dataflow processing. A RealTime...

  • Web Service Call in SAP Data Services

    This article demonstrates, how to implement external web services call in SAP Data Services.

  • Data Services Scenario Questions Part 7

    In this tutorial we will discuss some scenario based questions and their solutions using SAP Data Services. This article is meant mainly for Data Services beginners.

  • Data Services Scenario Questions Part 1

    In this tutorial we will discuss some scenario based questions and their solutions using SAP Data Services. This article is meant mainly for Data Services beginners.

  • Text Data Processing using SAP Data Services

    This article deals with Text Data Processing using SAP Business Objects Data Services with the intension of Text Analytics. SAP BODS provides a single ETL platform for both Structured and Unstructured data as well as Data Quality, Data Profiling...

  • How to use Data Services Reverse Pivot Transformation

    In this article, we will learn how to use SAP Data Services Reverse Pivot Transform. The Reverse Pivot transformation combines data from several rows into one row by creating new columns. For each unique value in a pivot axis column and each...

  • Fools Guide to BODS - Introductory Tutorial

    This article is a guide to the readers who are totally ignorant about the tool SAP Business Objects Data Services. It is presented in a very simple way so that readers would be familiar with common terms, terminologies and would be able to run...

  • Fools Guide to BODS - Registering Repository to CMC

    In our earlier article, we have learnt how to create a repository in BODS. Once that part is done, The final process before launching the BODS application is registering the repository with...