This morning one of my colleagues showed me one Online article that is supposedly "teaching" people when should we use Informatica Stored Procedure transformation. The article is located here and following are the points that the author makes in support of using Stored Proc transformation in Informatica:
Warning: The points below are incorrect/incomplete
- To perform calculation: There will be many well tested calculations which we implement using expression.Instead of using expression we can use stored procedure to store these calculations and then use them by using connected or unconnected stored procedure transformation
- Dropping and recreating indexes: Whenever we have huge number of record to be loaded to target it's better to drop the existing indexes and recreate it.For dropping and recreation of indexes we can make use of connected or unconnected stored procedure transformation
- Check the status of a target table before loading data into it.
- To check the space left in Database
Now all four of the above points are incorrect/incomplete and if anybody gives me these answers in an Interview, I would probably not waste my time asking him a second question. So let's look into the answers and find out what's wrong in them.
There will be many well tested calculations which we implement using expression
The first point is probably the author's closest attempt to correctness. Yes, we do use stored procedure transformation for complex calculations. But can you read the line once again? He says - "There will be many well tested calculations which we implement using expression.Instead of using expression we can use stored procedure to store these calculations". Who would want to use stored procedure transformation again if the calculations are already done and well tested using expressions? I mean, they can. But why will they do that if they have already taken the pain of writing the whole logic in expression and tested it thoroughly? The correct answer is the opposite of this, "There will be many well tested calculations which are already implemented using database stored procedures"
Dropping and recreating indexes... Next he says, we use stored procedure transformation for "Dropping and recreating indexes". A connected SP transformation will be called as many times as is the number of records in the source table. So if you really have a big source table with millions of records - your table indexes will be dropped and recreated million times. I am sure author actually meant of using unconnected stored procedure transformation (which you can configure to call in the beginning or end of data load). But then again, instead of using a Stored Procedure transformation for index dropping, we could use pre/post SQL options in session level. So this is not a specific use-case for SP transformation.
status of target table... Then he says, "check the status of target table". For what? (Are you talking about MySQL? That's the only thing where table STATUS means something - but who uses Informatica to load data to a database which is primarily used in web/OLTP environment?)
space left in database... Finally, "to check space left in database". Well, you can do that. But what purpose will it serve? Can you save a soon-to-fail-load-due-to-space-crunch using this? Or you want to check space before inserting each individual row? Well, again you can. But guess what, no one will do that due to severe "performance" concerns.
So if you are wondering what could be some specific use-cases of Stored procedure transform, I have also 4 points. See below.
Some Valid use cases or scenarios where we would want to use stored procedure transformation in Informatica are:
If we have a complex calculation that needs to be done for each record repeatedly we may use stored procedure transformation. Complex statistical calculations where a certain measure is being calculated for each of the record is a good example where one may want to use this.
There are cases where you may want to use this to improve your performance. Let’s say one calculation requires accessing or looking up multiple records or values from several different tables. If you were to perform this calculation in Informatica, you will have to perform multiple lookup operations in many tables. Depending upon the types of your lookup, this may involve caching the entire data or at least making multiple network level calls to transport these data over the network layer. Instead, if we use a stored procedure transformation, we can perform the entire calculation within the database, thereby, eliminating any need to transport data over the network.
Although rare but sometimes client may have a proprietary algorithm implemented in database. Client does not want to let others know about the actual algorithm in use, however, wishes that we take advantage of this algorithm. If the algorithm is implemented in database stored procedure and proper access restrictions are placed, then we will not be able to read the SP to see the algorithm, but still can call this from Informatica to take the advantage of the algorithm. Example, proprietary trading algorithms in banks and brokerage farms.
Reusability is a reason why you may want to use a stored procedure transformation. Suppose you have already written a complex code in database programming language, tested the code exhaustively and you are pleased with the performance of this piece of code. Now that your Organization is implementing Informatica, you may not wish to rewrite this entire piece of code in Informatica again. You can reuse your existing code through stored procedure transformation.
Let us know if you have any other good use-case