In this article series we will cover the common scenarios related to flatfiles in Informatica.

Let us look into the first requirement

Requirement 1:
Generate a separate flatfile containing employee information for each department. That is, the flatfiles must be generated dynamically from our employee database table, without knowing beforehand how many departments exist.

  1. Open the Source Analyzer and import the relational database table EMP.
  2. Next open the Target Designer. Select the EMP table from the Sources of your Informatica folder in the Repository Navigator, and drag and drop it onto the Target Designer. Then edit the target definition as follows.
    • Rename the table to FF_EMP_DEPTNO.
    • Change the Database type to Flat File.
    • Set the desired Flat File Information fields. In this case we set the type to Delimited.
    • Click on Advanced Flat File Information and set the Column Delimiter to a comma (,).
    • Next select the Columns tab. In the right-hand corner click the icon Add FileName column to this table. A new port, or column, FileName with string datatype is created.

    Add FileName column target flatfile

    • Apply the above target definition changes.

    Target Instance Flatfile Definition

    So our source table EMP and target flatfile instance FF_EMP_DEPTNO are now ready, and we are good to go to set up the mapping logic to generate dynamic flatfiles per department.

  3. Find the mapping screenshot below:

    Mapping using Transaction Control Tx

  4. First of all, sort the employee records by deptno in the Source Qualifier.

    Source Qualifier SQL Query
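    The override behind that screenshot is essentially a sort by department; a minimal version (assuming the standard Oracle EMP columns) would be:

```sql
-- Sketch of the Source Qualifier SQL override; the column list assumes
-- the standard EMP table. ORDER BY DEPTNO groups each department's rows
-- together so the downstream change-detection flag works correctly.
SELECT EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO
FROM   EMP
ORDER  BY DEPTNO
```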

  5. Next in the expression transformation we need to flag records based on the change in deptno. In the expression transformation add the following ports:
    • FileName as Output Port
    • COMP_DEPTNO as Variable Port
    • FLAG as Output Port
    • PREV_DEPTNO as Variable Port

    Expression Transformation- Change in dept flag

    The FileName output port is derived with the following expression:

    'FF_DEPT_' || TO_CHAR(DEPTNO) || '.csv'
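    The remaining ports implement a standard change-detection pattern. The exact expressions are in the screenshot; a typical reconstruction (ours, not verbatim) looks like this, remembering that Informatica evaluates variable ports in order before the output ports:

```
-- Variable ports (evaluated top to bottom, before the output ports):
COMP_DEPTNO = IIF(DEPTNO = PREV_DEPTNO, 0, 1)  -- 1 when deptno changes
PREV_DEPTNO = DEPTNO                           -- carry deptno to the next row

-- Output port:
FLAG = COMP_DEPTNO
```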
  6. Next set the Transaction Control condition, based on which a commit is issued to close the file for one dept and a new file is opened for the next.

    Transaction Control Transformation

    So when the deptno changes, a commit is issued to close the current file, and a new file is opened to write the employee information for the new deptno.
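    Given the FLAG port defined earlier, a typical condition (our sketch; the screenshot holds the exact version) is:

```
IIF(FLAG = 1, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)
```

    TC_COMMIT_BEFORE commits the rows received so far before the current row is written, so each department's first record starts a fresh file; on the very first row it merely commits an empty transaction.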

    Also note that with the introduction of the Transaction Control transformation, the session Commit Type is set to User Defined.

    Session Commit Type- User Defined

We are done with our first requirement.

Requirement 2: Now we try the reverse scenario, i.e. load the database employee table with the information received from source systems in the form of flatfiles. Each flatfile holds the employee information of one department, and the number of files or departments is not known in advance. Note that part of the flatfile name bears the department information, which also needs to be mapped to the deptno column of the employee table.

Here we go:

  1. Open the Source Analyzer to import the flatfile definition.
    • Select the option Import from File.
    • Next select the file FF_DEPT_10.csv in TgtFiles directory, considering that the file we generated from our previous example will be the source definition in this case.
    • Next select Flat File Type as Delimited.
    • Enter FF_EMP_DEPTNO as the name of this source definition.
    • Select start import at row 1, since we don't have any header record.
    • Next select the delimiter as Comma.
    • Next modify the field names and corresponding datatypes accordingly.
    • Next in the Properties tab check the option Add Currently Processed Flat File Name Port. When this option is selected, a port specifying currently processed file name gets added to the source definition.

    Add Currently Processed Flat File Name Port

    • Apply the above source definition changes.

    Source Flatfile Definition

    Our source flatfile definition is ready. There are also other easy ways to create this source definition, such as exporting and modifying the EMP table definition, or using the Create Flat File option.

  2. Next import the EMP table in the Target Designer. Alternatively you can drag and drop the source definition of EMP imported in the earlier example.

    So our source flatfile definition FF_EMP_DEPTNO and target table instance EMP are now ready, and we are good to go to set up the mapping logic to extract employee information from the department-specific flatfiles.

  3. Find the mapping screenshot below:

    Mapping using Currently Processed FlatFile Name and Indirect Load

  4. Next, in the expression transformation we extract the deptno information contained in the filename. Also note that we treated HIREDATE as varchar2(10) in the source flatfile definition, hence we need to convert it to a date before loading into the EMP target table. In the expression transformation add the following ports:
    • OUT_HIREDATE as Date/Time Output Port
    • DEPTNO as Output Port

    Expression Transformation- Extract DEPTNO

    The DEPTNO port locates the department token within the currently processed file name using INSTR:

    INSTR(CurrentlyProcessedFileName, 'FF_DEPT_', 1)
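    Building on that INSTR call, complete expressions might look like the following sketch. The '.csv' suffix comes from the files we generated in Requirement 1; the MM/DD/YYYY date mask is an assumption and must match the actual contents of your files:

```
-- DEPTNO: the digits between 'FF_DEPT_' and '.csv'. INSTR is used because
-- CurrentlyProcessedFileName may carry the full directory path.
TO_INTEGER(
    SUBSTR(CurrentlyProcessedFileName,
           INSTR(CurrentlyProcessedFileName, 'FF_DEPT_', 1) + 8,
           INSTR(CurrentlyProcessedFileName, '.csv', 1)
             - INSTR(CurrentlyProcessedFileName, 'FF_DEPT_', 1) - 8))

-- OUT_HIREDATE: convert the string column to Date/Time.
TO_DATE(HIREDATE, 'MM/DD/YYYY')
```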
  5. Next let us look at the Mapping tab of the session properties.
    • Set Source filetype to Indirect.
    • Set Source file directory to $PMTargetFileDir\, as this is currently where our files reside.
    • Set Source filename to FF_EMP_DEPTNO.txt.

    File Reader Properties

    Next comes the question of populating the file list in the indirect file, from which the Integration Service will read the source files. We can use a pre-session command to write the list of files present in the directory into the indirect file.
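    An indirect file is simply a text file listing one source file per line; with three departments, for example, it would contain:

```
FF_DEPT_10.csv
FF_DEPT_20.csv
FF_DEPT_30.csv
```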

  6. Use a Pre-Session Command to generate the Indirect file with the list of files at runtime.

PreSession Command to create indirect file

Below is a sample Windows command to do the same:

dir /b /on $PMTargetFileDir\FF_DEPT*>$PMTargetFileDir\FF_EMP_DEPTNO.txt
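If the Integration Service runs on Unix/Linux, an equivalent pre-session command would be along these lines. This is a sketch: in a real session $PMTargetFileDir is expanded by the Integration Service, so here a scratch directory with sample files stands in for it to make the command demonstrable:

```shell
# Demo stand-in for $PMTargetFileDir (expanded by the Integration
# Service in a real session): a scratch directory with sample files.
PMTargetFileDir=$(mktemp -d)
touch "$PMTargetFileDir/FF_DEPT_10.csv" "$PMTargetFileDir/FF_DEPT_20.csv"

# The pre-session command itself: write the bare, name-sorted list of
# department files into the indirect file (mirrors dir /b /on on Windows).
cd "$PMTargetFileDir" && ls FF_DEPT_* > FF_EMP_DEPTNO.txt
```

Only the last line belongs in the session's pre-session command; the setup above merely simulates the target file directory.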

Hence our second requirement is also done.
