In this article let us take up a seemingly trivial but important aspect that we as DW developers usually face: loading flat file sources. Whenever we have flat file sources, we usually ask the source systems for a specific type of field delimiter.

Now suppose we have requested our source system for a comma-separated flat file that will hold all employee information of an organization. Say we ask for a very simple file with five columns: Empno, Ename, Job, Sal and Address. Let us name the file Emp_Src.txt. Below is the sample data in the file:

EMPNO,ENAME,JOB,SAL,ADDRESS
7900,JAMES,CLERK,950,CHOA CHU KANG
8001,SANJAY,ANALYST,33000,BUKIT BATOK
7654,MARTIN,SALESMAN,1375,RUSSEL ST
7566,JONES,MANAGER,1050,YEW TEE
7844,TURNER,SALESMAN,1650,BISHAN
7698,BLAKE,Manager,3740,JURONG
7788,SC,ANALYST,3300,LAKESIDE
7370,SMITH,CLERK,2500,COMMONWEALTH
7402,ADAM,SERVICE,5500,ADAM ST
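Outside of Informatica, the clean comma-delimited parse can be sketched in a few lines of Python. This is only an illustration of what any delimiter-based reader does with this file; the sample rows are inlined here rather than read from Emp_Src.txt:

```python
import csv
import io

# Inline sample of Emp_Src.txt (a plain comma-delimited file)
emp_src = """EMPNO,ENAME,JOB,SAL,ADDRESS
7900,JAMES,CLERK,950,CHOA CHU KANG
7566,JONES,MANAGER,1050,YEW TEE
"""

reader = csv.reader(io.StringIO(emp_src), delimiter=',')
header = next(reader)   # ['EMPNO', 'ENAME', 'JOB', 'SAL', 'ADDRESS']
rows = list(reader)

print(header)
print(rows[1])          # ['7566', 'JONES', 'MANAGER', '1050', 'YEW TEE']
```

Every record splits into exactly five fields, so each value lands in the column it belongs to.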

Very simple, right? Yes, it is. Still, we will once again look into the nitty-gritty of importing a flat file in Informatica. The diagram below shows the different steps of importing a flat file, in our case a simple comma-separated file.

On successful execution of the Simple File to Table mapping, below is the dataset in the DB table:

Now suppose the same flat file contains an address that has a comma in its text. What happens then? Let us execute the same mapping with the data below.

The address for EmpNo 7566 now contains a comma: Block 35, Yew Tee.

EMPNO,ENAME,JOB,SAL,ADDRESS
7900,JAMES,CLERK,950,CHOA CHU KANG
8001,SANJAY,ANALYST,33000,BUKIT BATOK
7654,MARTIN,SALESMAN,1375,RUSSEL ST
7566,JONES,MANAGER,1050,BLOCK 35,YEW TEE
7844,TURNER,SALESMAN,1650,BISHAN
7698,BLAKE,Manager,3740,JURONG
7788,SC,ANALYST,3300,LAKESIDE
7370,SMITH,CLERK,2500,COMMONWEALTH
7402,ADAM,SERVICE,5500,ADAM ST
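To see why this record breaks the load, consider what a plain comma split does to the row for EmpNo 7566. This is a minimal Python sketch of the general problem; Informatica's behaviour without a text qualifier is analogous:

```python
# The problem record: the address itself contains the delimiter
line = '7566,JONES,MANAGER,1050,BLOCK 35,YEW TEE'

fields = line.split(',')
print(len(fields))   # 6 fields, although the file defines only 5 columns

# A loader expecting 5 columns keeps only the first 5 values,
# so the address is cut down to 'BLOCK 35' and 'YEW TEE' is lost
empno, ename, job, sal, address = fields[:5]
print(address)       # BLOCK 35
```

The split produces one field too many, and the trailing part of the address has nowhere to go.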

On successful execution, let us check the data in the table.

Flatfile Data Loaded in Database

Oops, the address got truncated for EmpNo 7566. How do we handle such a case? In such scenarios we have to request the source system to enclose any text that contains the delimiter within some other identifier. Suppose in our case we request the source to send such text within double quotes. The file would now look like:

EMPNO,ENAME,JOB,SAL,ADDRESS
7900,JAMES,CLERK,950,CHOA CHU KANG
8001,SANJAY,ANALYST,33000,BUKIT BATOK
7654,MARTIN,SALESMAN,1375,RUSSEL ST
7566,JONES,MANAGER,1050,"BLOCK 35,YEW TEE"
7844,TURNER,SALESMAN,1650,BISHAN
7698,BLAKE,Manager,3740,JURONG
7788,SC,ANALYST,3300,LAKESIDE
7370,SMITH,CLERK,2500,COMMONWEALTH
7402,ADAM,SERVICE,5500,ADAM ST
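With the double-quote qualifier in place, a quote-aware parser keeps the embedded comma inside a single field. The same behaviour can be sketched with Python's csv module; the quotechar='"' argument mirrors the Double option we set in Informatica:

```python
import csv
import io

quoted = '7566,JONES,MANAGER,1050,"BLOCK 35,YEW TEE"'

# quotechar='"' tells the parser that commas inside double quotes
# are data, not delimiters
row = next(csv.reader(io.StringIO(quoted), delimiter=',', quotechar='"'))

print(len(row))   # 5 columns, as expected
print(row[4])     # BLOCK 35,YEW TEE
```

The record now splits into exactly five fields, with the full address intact.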

Now we have to make a small change in the source definition. Double-click the flat file Source Definition to edit it, and click the Advanced button.

Advanced Option for Flatfile Loading

Then select the type of optional quotes that our source system has agreed to send. In our case we need to check the Double checkbox, as shown below:

Double Quote option

Save and validate the corresponding mapping and session, and then re-execute. Now let us check the result:

Double Quote showing correct data

We can also make the same change at the very beginning, while importing the source definition. The specification is given as the Text Qualifier in Step 2 of the import wizard, as shown below:

Text Qualifier option


