Aggregate task in SSIS

Hello friends, today we are going to learn about aggregate task in SSIS. This task is equivalent to aggregation in T-SQL which we does using group by and other functions(Min, Max,Count,Sum,Average).

Lets take an example to implement the aggregate task. For that purpose we will create two tables using below SQL statements.

Create table Sales_detail(SaleId int,ItemName varchar(50),Quantity int,Amount int)

insert into Sales_detail values(101,'Item A',10,200)
insert into Sales_detail values(101,'Item B',10,100)
insert into Sales_detail values(102,'Item C',5,75)
insert into Sales_detail values(103,'Item A',8,60)
insert into Sales_detail values(103,'Item B',8,80)
insert into Sales_detail values(103,'Item C',8,100)
insert into Sales_detail values(104,'Item A',15,400)
insert into Sales_detail values(104,'Item B',10,150)
insert into Sales_detail values(104,'Item C',20,250)
insert into Sales_detail values(104,'Item D',5,140)

Create table Aggregated_Sales(Sale_Id int,Amount int)

We have ”Sales_Detail” table where we are storing data for each sale, now we want to get the total sale amount per sales. For that purpose we need aggregation.After applying aggregation, we will store that data into “Aggregated_Sales” table.

First of all we will take one “Data Flow” task into the “Control Flow” region(as shown in the below screenshot).


We will double click on the “Data flow” task that will take us to “Data Flow” region. In “Data Flow” region, we will take one “OLEDB Source” and configure the connection manager for “OLEDB Source”.
While configuring the connection manager for “OLEDB Source”, I have selected the table “Sales_Detail” for source Data.


Now we will click on “OK” button that will close the above screen. We will add the “Aggregate” task to the “Data Flow” region (as shown in the below screenshot).


To configure the “Aggregate” task, we will right click on it and click on the “Edit” menu that will bring a new screen to us( as shown in the below screenshot).


In the above screenshot, there are two regions, in the first region we can see the column names from the source table “Sales_Detail”, I have selected only two columns “SaleId” and “Amount” as we need only two columns to get the total sale amount against each sale. In the second section, there is column “Operation” where I have selected “Group by” operator for “SaleId” column and “Sum” Operator for “Amount” Column(Similarly we do in T-SQL). In the Operation Column we can select different aggregate functions(Min, Max, Count,Sum,Average,Count Distinct and Group by) based on our requirement.
We will click on “OK” button to close the current screen. Now we will add “OLEDB Destination” to our SSIS package that will hold our aggregated result set received by “Aggregator” task(please refer the below screenshot for your reference).


To configure the “OLEDB Destination” task, we will right click on it and click on the “Edit” menu that will bring a new screen to us to configure the connection manager( Refer below screenshot for your reference).


In the above screenshot , we can see that I have selected “Aggregated_Sales” table to hold the result set received from the “Aggregator” task. We will click on “OK” button to close the current window.
Note:-We can map the column of source and destination by selecting the “Mapping” tab(encircled by red color).

Now we are ready to run our SSIS package, press “F5” key to run the package and lets see the result.


Above screenshot with all green tasks shows that our package ran successfully. Lets check “Aggregated_Sales” table to see result.


I hope you will find this article useful, please feel free to ask any question related to this article.

No comments: