Hello friends, today we are going to learn about aggregate
task in SSIS. This task is equivalent to aggregation in T-SQL which we does
using group by and other functions(Min, Max,Count,Sum,Average).
Lets take an example to implement the aggregate task. For that
purpose we will create two tables using below SQL statements.
Create table Sales_detail(SaleId
int,ItemName varchar(50),Quantity int,Amount int)
insert into Sales_detail values(101,'Item A',10,200)
insert into Sales_detail values(101,'Item B',10,100)
insert into Sales_detail values(102,'Item C',5,75)
insert into Sales_detail values(103,'Item A',8,60)
insert into Sales_detail values(103,'Item B',8,80)
insert into Sales_detail values(103,'Item C',8,100)
insert into Sales_detail values(104,'Item A',15,400)
insert into Sales_detail values(104,'Item B',10,150)
insert into Sales_detail values(104,'Item C',20,250)
insert into Sales_detail values(104,'Item D',5,140)
Create table Aggregated_Sales(Sale_Id
int,Amount int)
We have ”Sales_Detail” table where we are storing data for each sale, now we want to get the total sale amount per sales. For that
purpose we need aggregation.After applying aggregation, we will store that data
into “Aggregated_Sales” table.
First of all we will take one “Data Flow” task into the “Control
Flow” region(as shown in the below screenshot).
We will double click on the “Data flow” task that will take
us to “Data Flow” region. In “Data Flow” region, we will take one “OLEDB Source”
and configure the connection manager for “OLEDB Source”.
While configuring the connection manager for “OLEDB Source”,
I have selected the table “Sales_Detail” for source Data.
Now we will click on “OK” button that will close the above
screen. We will add the “Aggregate” task to the “Data Flow” region (as shown in
the below screenshot).
To configure the “Aggregate” task, we will right click on it
and click on the “Edit” menu that will bring a new screen to us( as shown in
the below screenshot).
In the above screenshot, there are two regions, in the first
region we can see the column names from the source table “Sales_Detail”, I have
selected only two columns “SaleId” and “Amount” as we need only two columns to
get the total sale amount against each sale. In the second section, there is column “Operation”
where I have selected “Group by” operator for “SaleId” column and “Sum”
Operator for “Amount” Column(Similarly we do in T-SQL). In the Operation Column
we can select different aggregate functions(Min, Max, Count,Sum,Average,Count
Distinct and Group by) based on our requirement.
We will click on “OK” button to close the current screen.
Now we will add “OLEDB Destination” to our SSIS package that will hold our
aggregated result set received by “Aggregator” task(please refer the below
screenshot for your reference).
To configure the “OLEDB Destination” task, we will right
click on it and click on the “Edit” menu that will bring a new screen to us to
configure the connection manager( Refer below screenshot for your reference).
In the above screenshot , we can see that I have selected “Aggregated_Sales”
table to hold the result set received from the “Aggregator” task. We will click
on “OK” button to close the current window.
Note:-We can map
the column of source and destination by selecting the “Mapping” tab(encircled
by red color).
Now we are ready to run our SSIS package, press “F5” key to
run the package and lets see the result.
Above screenshot with all green tasks shows that our package
ran successfully. Lets check “Aggregated_Sales” table to see result.
I hope you will find this article useful, please feel free
to ask any question related to this article.
No comments:
Post a Comment