In this "DWBI Concepts' Original article", we put Oracle database and Informatica PowerCentre to lock horns to prove which one of them handles data SORTing operation faster. This article gives a crucial insight to application developer in order to take informed decision regarding performance tuning. Original Research

Which is the fastest? Informatica or Oracle?

Informatica is one of the leading data integration tools in today’s world. More than 4,000 enterprises worldwide rely on Informatica to access, integrate and trust their information assets with it. On the other hand, Oracle database is arguably the most successful and powerful RDBMS system that is trusted from 1980s in all sorts of business domain and across all major platforms. Both of these systems are bests in the technologies that they support. But when it comes to the application development, developers often face challenge to strike the right balance of operational load sharing between these systems.

Think about a typical ETL operation often used in enterprise level data integration. A lot of data processing can be either redirected to the database or to the ETL tool. In general, both the database and the ETL tool are reasonably capable of doing such operations with almost same efficiency and capability. But in order to achieve the optimized performance, a developer must carefully consider and decide which system s/he should be trusting with for each individual processing task.

In this article, we will take a basic database operation – Sorting, and we will put these two systems to test in order to determine which does it faster than the other, if at all.

Which sorts data faster? Oracle or Informatica?

As an application developer, you have the choice of either using ORDER BY in database level to sort your data or using SORTER TRANSFORMATION in Informatica to achieve the same outcome. The question is – which system performs this faster?

Test Preparation

We will perform the same test with different data points (data volumes) and log the results. We will start with 1 million records and we will be doubling the volume for each next data points. Here are the details of the setup we will use,

  1. Oracle 10g database as relational source and target
  2. Informatica PowerCentre 8.5 as ETL tool
  3. Database and Informatica setup on different physical servers using HP UNIX
  4. Source database table has no constraint, no index, no database statistics and no partition
  5. Source database table is not available in Oracle shared pool before the same is read
  6. There is no session level partition in Informatica PowerCentre
  7. There is no parallel hint provided in extraction SQL query
  8. The source table has 10 columns and first 8 columns will be used for sorting
  9. Informatica sorter has enough cache size

We have used two sets of Informatica PowerCentre mappings created in Informatica PowerCentre designer. The first mapping m_db_side_sort will use an ORDER BY clause in the source qualifier to sort data in database level. Second mapping m_Infa_side_sort will use an Informatica sorter to sort data in informatica level. We have executed these mappings with different data points and logged the result.

Result

The following graph shows the performance of Informatica and Database in terms of time taken by each system to sort data. The time is plotted along vertical axis and data volume is plotted along horizontal axis.

Performance Graph

Verdict

The above experiment demonstrates that Oracle database is faster in SORT operation than Informatica by an average factor of 14%.

Assumption

  1. Average server load remains same during all the experiments
  2. Average network speed remains same during all the experiments

1. This data can only be used for performance comparison but cannot be used for performance benchmarking.
2. This data is only indicative and may vary in different testing conditions.

To know the Informatica and Oracle performance comparison for JOIN operation, please click here


Have a question on this subject?

Ask questions to our expert community members and clear your doubts. Asking question or engaging in technical discussion is both easy and rewarding.

Are you on Twitter?

Start following us. This way we will always keep you updated with what's happening in Data Analytics community. We won't spam you. Promise.

  • Informatica Reject File - How to Identify rejection reason

    When we run a session, the integration service may create a reject file for each target instance in the mapping to store the target reject record. With the help of the Session Log and Reject File we can identify the cause of data rejection in the...

  • What is Active Lookup Transformation

    Informatica 9x allows us to configure Lookup transformation to return multiple rows. So now we can retrieve multiple rows from a lookup table thus making Lookup transformation an Active transformation type.

  • Loading Flatfiles delimited by comma and double quotes

    In this article let us take up a very trivial but an important aspect that we as DW developers usual face. This is related to loading flat file sources. Whenever we have flat file sources we usual ask source systems for a specific type of field...

  • Stop Hardcoding- Follow Parameterization Technique

    This article tries to minimize hard-coding in ETL, thereby increasing flexibility, reusability, readabilty and avoides rework through the judicious use of Informatica Parameters and Variables.

  • Comparing Performance of SORT operation (Order By) in Informatica and Oracle

    In this "DWBI Concepts' Original article", we put Oracle database and Informatica PowerCentre to lock horns to prove which one of them handles data SORTing operation faster. This article gives a crucial insight to application developer in order to...

  • Informatica Join Vs Database Join

    In this yet another "DWBI Concepts' Original article", we test the performance of Informatica PowerCentre 8.5 Joiner transformation versus Oracle 10g database join. This article gives a crucial insight to application developer in order to take...

  • Implementing Informatica Incremental Aggregation

    Using incremental aggregation, we apply captured changes in the source data (CDC part) to aggregate calculations in a session. If the source changes incrementally and we can capture the changes, then we can configure the session to process those...

  • Using Informatica Normalizer Transformation

    Normalizer transformation is a native transformation in Informatica that can ease many complex data transformation requirements. Learn how to effectively use normalizer in this tutorial.

  • Implementing SCD2 in Informatica Using ORA_HASH at Source

    In this article we shall see how we can implement SCD type2 in Informatica using ORA_HASH, which is an ORACLE function that computes hash value for a given expression. We can use this feature to find the existence of any change in any of the SCD...

  • Calling C executable from Java Transform

    We are going to do is, to call C++ Executable from Informatica, using Passive Java Transform and capture the output of the C++ using Java and write the result to corresponding target column.