SSIS- Merge Join Transform
This is a Fast Forward tutorial on Merge Join Transformation in Microsoft SQL Server Integration Services.
Merge Join
The Merge Join transformation provides an output that is generated by joining two sorted datasets using a FULL, LEFT, or INNER join. This is similar to database join operation. This transformation has two inputs and one output. It does not support an error output.
Implementation
In this scenario we want to populate the employee and department information coming from two source tables into a single target staging table. Here we want to perform a inner join on emp and dept table based on deptno.
Here we have two OLEDB Source tables EMP and DEPT. Next we use two Sort transform to perform a sort based on DEPTNO. Next we use a Merge Join transform to perform INNER JOIN. Finally we have a OLEDB Destination table.
We select the DEPTNO Input Column to perform an ascending Sort Type.
Similarly we set sort order on DEPTNO column. In Sort transformation we can set the Sort Order if we want to get output dataset sorted based on more than one columns. Also we can use Sort transform to remove rows having duplicate sort values.
NOTE: In a SORT transformation at least one column must be selected for sorting.
Let's take a look at the Advanced Editor of Sort transformation.
Below Sort Transform Advanced Editor- Input Columns.
Below Sort Transform Advanced Editor- Input & Output Properties
Next we go to the Merge Join transform. Double-click to open the Editor. Next we define the join columns. EMP.DEPTNO = DEPT.DEPTNO. Select the Join type from Inner join, Left outer join and Full outer join. Here in this case we use Inner join. Next we select the desired Input columns and rename as Alias if required. When we connect the Sort output to Join transform we select Merge Join Left Input to mark the input dataset as the Left table.
The Merge Join transformation requires that the joined columns have matching metadata. If the data has a string data type, the length of the column in the second input must be less than or equal to the length of the column in the first input with which it is merged.
NULL Handling
We can specify whether the transformation handles null values as equal to other nulls. Set the property TreatNullsAsEqual to True. If null values are not treated as equal values, the transformation handles null values like any ANSI database does.