Conditional split task in SSIS

Hello friends, today we will learn how to implement conditional split task in SSIS. Before moving to the implementation part, we should know where to use conditional split task. Lets have a look at the below scenario.

Scenario: - I have a table “Emp_detail” as a data source and I want to split the data coming from the data source into smaller data sets. For example, I want to split the source data based on condition if salary is less than 20000 then it should go to in a different destination table and salary greater than 20000 should go to another destination table.

To implement this scenario lets create three tables, Emp_detail, Emp_LessSalary, Emp_GreaterSalary.

CREATE TABLE Emp_Detail](
      [Eid] [int] NULL,
      [Name] [varchar](50) NULL,
      [Dept] [varchar](20) NULL,
      [salary] [int] NULL
) ON [PRIMARY]

INSERT INTO Emp_Detail(Eid,Name,Dept,salary)VALUES(1,'neeraj','IT',10000)
INSERT INTO Emp_Detail(Eid,Name,Dept,salary VALUES(2,'akshay','Acct',20000)
INSERT INTO Emp_Detail(Eid,Name,Dept,salary)VALUES(3,'Mohan','HR',50000)
INSERT INTO Emp_Detail(Eid,Name,Dept,salary)VALUES(4,'priya','Payroll',5000)
INSERT INTO Emp_Detail(Eid,Name,Dept,salary)VALUES(5,'Ritu','Finance',8000)
INSERT INTO Emp_Detail(Eid,Name,Dept,salary)VALUES(6,'Harshit','Engg',40000)
INSERT INTO Emp_Detail(Eid,Name,Dept,salary)VALUES(7,'Amit','Planning',30000)

We will create “Emp_detail” table using above script and insert few rows into it, now we will create two more tables using below scripts.

select * into Emp_LessSalary from Emp_detail where 1=0

select * into Emp_GreaterSalary from Emp_detail where 1=0

Now its time to create the SSIS package to implement the conditional split task.We will follow below steps.First we will drag and drop one data flow task into the control flow region as shown in the below screen shot.


Double click on the "data flow" task that will open data flow region. Now we will add “OLEDB Source” into the data flow region(as shown in the below screenshot).


We will configure the connection manager for “OLEDB Souce” by double clicking on it(you can refer my earlier articles to see how to configure connection manager for “OLEDB Source”.
I have configured the connection manager for the “OLEDB Source” and selected the “Emp_Detail” table as a data source as shown in the below screenshot.


Now we will add “Conditional Split” task to the data flow region as shown in the below screenshot.


We will right click on the “Conditional Split” task and click on “Edit” menu that will bring a new screen to us(refer the below screenshot for reference).


In the above screenshot, we have three sections, first section contain the variables and columns that we are receiving from the “OLEDB Source”. Second section contains all the inbuilt fuctions those are available to apply in the “Conditional Split” task. In the third section, we define the conditions.I have defined two conditions and given “Output Name”(LessSalary and GreaterSalary).In the condition column, I have put the condition,we will click on “OK” button to close the screen.

Now we will add “OLEDB Destination” task to our package as shown in the below screenshot.


We will connect the green connector of the “Conditional Split” task with the “OLEDB Destination” task. As soon as we connect the connector with the “OLEDB Destination” task, a new pop up window will come up(as shown in the below screenshot).


In the pop up window we have three output shown in the “Output” combo box.
We need to understand these three output options.

GreaterSalary:- This will contain the rows those will qualify the condition(Salary >20000)
LessSalary:- This will contain the rows those will qualify the condition(Salary <20000)
Conditional Split Default Output:- Suppose I have 10 rows in the source data set and 4 rows qualify the condition(Salary>20000) and 4 rows qualify the condition(Salary <20000) then rest 2 rows from the source data set will come under Conditional Split Default Output.

Now we will select “GreaterSalary” in the output combo box and click on “OK” button.


Now we will take one more “OLEDB Destination” task and connect the “Conditional Split” task with the second “OLEDB Destination" task(as shown in the below screenshot).


In the above screenshot we can see that our both “OLEDB Destination” tasks are showing error sign(red cross).The reason behind these errors is that we haven't specified the destination table yet.To remove these errors, we will right click on the any of the “OLEDB Destination” task and click on edit menu, that will bring a new screen to us(as shown in the below screenshot).


You can see in the above screenshot, I have specified the “Emp_GreaterSalary” table, in which I want to import the rows which will qualify the condition(Salary>20000).In the mappings tab, we can map the source and destination columns.
We will perform the same activity for the second “OLEDB Destination” task to remove the error but this time we will select the different table “Emp_LessSalary”.


We can see in the above screenshot that all the errors have been removed.Now we are ready to run our SSIS package.


In the above screenshot,we can see that 3 rows has been imported to each “OLEDB Destination” tasks but we had 7 rows in the source table. So 1 row is missing, if we want to get that row we have to a use the “Conditional Split Default Output” as output(I have already explained this above) and take one more “OLEDB Destination” task and connect with the “Conditional Split” task(as shown in the below screenshot).


I hope you will find this article useful for you. Please feel free to ask any question related to this article.

No comments: