Fools Guide to Big data - What is Big Data

Sure enough, you have heard the term, "Big Data" many times before. There is no dearth of information in the Internet and printed medium about this. But guess what, this term still remains vaguely defined and poorly understood. This essay is our effort to describe big data in simple technical language, stripping-off all the marketing lingo and sales jargons. Shall we begin?

Continue Reading...

Implementing Rapidly changing dimension

Handling rapidly changing dimensions are tricky due to various performance implications. This article attempts to provide some methodologies on handling rapidly changing dimensions in a data warehouse.
Published 7 months ago under Dimensional Model

Why do we need Staging Area during ETL Load

"We have a simple data warehouse that takes data from a few RDBMS source systems and load the data in dimension and fact tables of the warehouse. I wonder why we have a staging layer in between. Why can’t we process everything on the fly and push them in the data warehouse?"
Published 9 months ago under ETL Concepts

Top 20 SQL Interview Questions with Answers

SQL is a language for accessing and manipulating database standardized by ANSI. To be successful with database-centric applications (which includes most of the applications Data Warehousing domain), one must be strong enough in SQL. In this article, we will learn more about SQL by breaking the subject in the form of several question-answer sessions commonly asked in Interviews.
Published 1 year ago under SQL

Performance Considerations for Dimensional Modeling

Performance of a data warehouse is as important as the correctness of data in the data warehouse because unacceptable performance may render the data warehouse as useless. There is this increasing awareness about the fact that it’s much effective to build the performance from the beginning rather than to tune the performance at the end. In this article we have a few points that you may consider for optimally building the data model of a data warehouse. We will only consider performance considerations for dimensional modeling.
Published 1 year ago under Dimensional Model

Comparing Performance of SORT operation (Order By) in Informatica and Oracle

In this "DWBI Concepts' Original article", we put Oracle database and Informatica PowerCentre to lock horns to prove which one of them handles data SORTing operation faster. This article gives a crucial insight to application developer in order to take informed decision regarding performance tuning.
Published 1 year ago under Informatica

Informatica Performance Tuning - Complete Guide

This article is a comprehensive guide to the techniques and methodologies available for tuning the performance of Informatica PowerCentre ETL tool. It's a one stop performance tuning manual for Informatica.
Published 1 year ago under Informatica

How to Tune Performance of Informatica Lookup Transformation

To me, look-up is the single most important (and difficult) transformation that we need to consider while tuning performance of Informatica jobs. The choice and use of correct type of Look-Up can dramatically vary the session performance in Informatica. So let’s delve deeper into this.
Published 1 year ago under Informatica

How to Tune Performance of Informatica Joiner Transformation

Joiner transformation allows you to join two heterogeneous sources in the Informatica mapping. You can use this transformation to perform INNER and OUTER joins between two input streams. For performance reasons, I recommend you ONLY use JOINER transformation if any of the following condition is true –
Published 1 year ago under Informatica

How to Tune Performance of Informatica Aggregator Transformation

Similar to what we discussed regarding the Performance Tuning of Joiner Transformation, the basic rule for tuning aggregator is to avoid aggregator transformation altogether unless...
Published 1 year ago under Informatica