In this "DWBI Concepts' Original article", we put Oracle database and Informatica PowerCentre to lock horns to prove which one of them handles data SORTing operation faster. This article gives a crucial insight to application developer in order to take informed decision regarding performance tuning. Original Research

Which is the fastest? Informatica or Oracle?

Informatica is one of the leading data integration tools in today’s world. More than 4,000 enterprises worldwide rely on Informatica to access, integrate and trust their information assets with it. On the other hand, Oracle database is arguably the most successful and powerful RDBMS system that is trusted from 1980s in all sorts of business domain and across all major platforms. Both of these systems are bests in the technologies that they support. But when it comes to the application development, developers often face challenge to strike the right balance of operational load sharing between these systems.

Think about a typical ETL operation often used in enterprise level data integration. A lot of data processing can be either redirected to the database or to the ETL tool. In general, both the database and the ETL tool are reasonably capable of doing such operations with almost same efficiency and capability. But in order to achieve the optimized performance, a developer must carefully consider and decide which system s/he should be trusting with for each individual processing task.

In this article, we will take a basic database operation – Sorting, and we will put these two systems to test in order to determine which does it faster than the other, if at all.

Which sorts data faster? Oracle or Informatica?

As an application developer, you have the choice of either using ORDER BY in database level to sort your data or using SORTER TRANSFORMATION in Informatica to achieve the same outcome. The question is – which system performs this faster?

Test Preparation

We will perform the same test with different data points (data volumes) and log the results. We will start with 1 million records and we will be doubling the volume for each next data points. Here are the details of the setup we will use,

  1. Oracle 10g database as relational source and target
  2. Informatica PowerCentre 8.5 as ETL tool
  3. Database and Informatica setup on different physical servers using HP UNIX
  4. Source database table has no constraint, no index, no database statistics and no partition
  5. Source database table is not available in Oracle shared pool before the same is read
  6. There is no session level partition in Informatica PowerCentre
  7. There is no parallel hint provided in extraction SQL query
  8. The source table has 10 columns and first 8 columns will be used for sorting
  9. Informatica sorter has enough cache size

We have used two sets of Informatica PowerCentre mappings created in Informatica PowerCentre designer. The first mapping m_db_side_sort will use an ORDER BY clause in the source qualifier to sort data in database level. Second mapping m_Infa_side_sort will use an Informatica sorter to sort data in informatica level. We have executed these mappings with different data points and logged the result.

Result

The following graph shows the performance of Informatica and Database in terms of time taken by each system to sort data. The time is plotted along vertical axis and data volume is plotted along horizontal axis.

Performance Graph

Verdict

The above experiment demonstrates that Oracle database is faster in SORT operation than Informatica by an average factor of 14%.

Assumption

  1. Average server load remains same during all the experiments
  2. Average network speed remains same during all the experiments

1. This data can only be used for performance comparison but cannot be used for performance benchmarking.
2. This data is only indicative and may vary in different testing conditions.

To know the Informatica and Oracle performance comparison for JOIN operation, please click here


Have a question on this subject?

Ask questions to our expert community members and clear your doubts. Asking question or engaging in technical discussion is both easy and rewarding.

Are you on Twitter?

Start following us. This way we will always keep you updated with what's happening in Data Analytics community. We won't spam you. Promise.

  • Generate Surrogate Key without using Sequence Generator

    It is possible to generate sequential surrogate key in the target table without the use of an Informatica Sequence Generator transformation. Using this option, one can avoid any gap in the sequence numbers of the surrogate key.

  • Calling C executable from Java Transform

    We are going to do is, to call C++ Executable from Informatica, using Passive Java Transform and capture the output of the C++ using Java and write the result to corresponding target column.

  • Implementing Informatica Partitions

    Identification and elimination of performance bottlenecks will obviously optimize session performance. After tuning all the mapping bottlenecks, we can further optimize session performance by increasing the number of pipeline partitions in the...

  • How to Tune Performance of Informatica Joiner Transformation

    Joiner transformation allows you to join two heterogeneous sources in the Informatica mapping. You can use this transformation to perform INNER and OUTER joins between two input streams. For performance reasons, I recommend you ONLY use JOINER...

  • Loading Flatfiles delimited by comma and double quotes

    In this article let us take up a very trivial but an important aspect that we as DW developers usual face. This is related to loading flat file sources. Whenever we have flat file sources we usual ask source systems for a specific type of field...

  • APEAR - A tool for automating Informatica Performance Tuning

    DWBIConcepts is launching APEAR – an Automated Performance Evaluation and Reporting tool for Informatica. As the name suggests, this tool will help you tune the performance of Informatica sessions fully automatically. Now don't waste your precious...

  • Informatica Java Transformation

    Feel the Power of Java programming language to transform data in PowerCenter Informatica. Java Transformation in Informatica can be used either in Active or Passive Mode.

  • Informatica Excel Source

    This article is a guide on how to Unload data from EXCEL file system to target relational database using Informatica.

  • How to get Folders and Mapping names from Informatica Metadata Query

    We can use OPB_MAPPING and OPB_SUBJECT tables residing under informatica Repository to obtain information about all the mappings under each Informatica Folder. Following SQL query shows you how to do it.

  • How to Tune Performance of Informatica Lookup Transformation

    To me, look-up is the single most important (and difficult) transformation that we need to consider while tuning performance of Informatica jobs. The choice and use of correct type of Look-Up can dramatically vary the session performance in...