Aggregator Transformation in Informatica PowerCenter

Posted by Admin-Lisa Posted by Aug 25, 2023 in Informatica Powercenter Commonly Asked Questions Answers

The Aggregator transformation is an active and connected transformation.

The Aggregator transform is used to perform aggregate calculations such mean/average, First Count, Last, Maximum, Median, Minimum, Percentile, Standard Deviation, Sum, Variance.

The Integration Service performs aggregate calculations as it reads data groups and rows of data and stores them in the aggregate cache. The Aggregator transformation differs from the Expression transformation, but it is used with the Aggregator transformation to perform group calculations.

Filtering rows based on conditional clauses is more flexible than the SQL language.

Elements of the Aggregator Transformation

Aggregate cache

Aggregate expression

Group by port

Sorted input.

Aggregate cache: This Integration Service saves data in the aggregate cache until the aggregate calculation is complete. The Integration Service stores group values ​​in the index cache and row data stored in the data cache.

Aggregate expression: Enter the expression to the output port. Expressions can also include non-aggregate expressions and conditional clauses for further processing.

Group by port: Specifies how groups are created. You can set the group's input, in-out, output, or variable port. When you group data, the Aggregator transformation outputs the last row of each group unless specified otherwise.

Sorted input: This option has been chosen to improve session performance of mapping. To use sorted input, user must pass data sorted in ascending or descending order by group and port to the Aggregator transformation.

Aggregate Caches:

When the user runs a session with an Aggregator transformation, the Integration Service builds indexes and caches data in memory to process these transformations. The Integration Service stores overflow values ​​in cache files when it needs more disk space.

Users can configure the index and data cache in the aggregator transformation or session properties. Alternatively, users can configure the Integration Service to determine cache size at run time of job.

a. Index Cache:

The index cache stores group information from groups on a per-port basis. If users are using 'group by' on ports like 'EMP_ID', this index cache will store grouped port values ​​like 1, 2, 3, etc.

b. Data Cache:

In general, the size of DATA-CACHE is larger than the size of INDEX-CACHE.

Information is stored

Variable port (if created)

I/O Port (Non-Group)

Ports using aggregate functions.

Do not group by input ports used in non-aggregate output expressions.

Aggregate Expressions.

Aggregate expressions are associated with conditional clauses and non-aggregate functions. Expressions can also include aggregate functions within other aggregate functions in mapping, such as MAX (COUNT (ITEM)).

The result of the aggregated expression depends on how the ports are grouped within the transformation. If you do not define grouping by port, the Integration Service returns only one row as output. The total quantity of products sold is determined as follows:

SUM(SALARY) to calculate the salary of all the employees.

Using the same expression concept and grouping by SALARY port, the Integration Service returns the total quantity of total salary paid to employee. You can create an aggregate expression on each output port and use multiple aggregate ports in your transformation.

You can use one aggregate function within other aggregate functions. If you use any of these functions, you must use them in an expression within the Aggregator transformation.

Aggregate function of Aggregator Transformation.

Mean (Average)

Min

Max

STDDEV

Median

Percentile

Variance

Count First

Last

Sums

Normal Aggregate Functions:

MAX (marks), MIN (marks), AVG (marks), SUM (marks), Count(students)

Nested Aggregate Functions:

Various output ports of the Aggregator transformation can contain multiple single-level functions or multiple nested functions. You must not include both single-level functions and nested functions in the Aggregator transformation.

If one output port of the Aggregator transformation contains a single-level function, another port within that transformation cannot be used as nested functions. If you include single-level functions and nested functions in the same Aggregator transformation, the mapping or mapplet is invalid.

Conditional Clauses:

Use of conditional clauses in aggregate transformation to reduce the no. of rows used in the aggregation. A conditional clause can be any clause that results in TRUE or FALSE.

Null Values in Aggregate Functions:

You must configure your integration service to handle null values ​​in aggregate functions. Null values​​in aggregate functions can be treated as null or zero. By default, the Integration Service treats NULL values ​​in aggregate functions as NULL.

This transformation (Aggregator) lets you define groups for aggregation instead of performing aggregation on all input data of the mappings. The Group By port tells the Integration Service how to create groups. Example: If you want to find out the total sales for your company, you can group the total sales by region.

To define groups of aggregate expressions, go to the Ports tab, select the appropriate input, input/output, output, and variable ports, and select them as groupings for the Aggregate transformation.

The Integration Service creates a row for each group. If you do not group the values, the Integration Service returns one row for each input row. The Integration Service normally returns the last row of each group.

Using Sorted Input

Use sorted input to improve session performance. When using sorted input, the input data must be sorted in ascending or descending order by group and port before passing it to the Aggregator transformation.

Using this sorted input option tells the aggregator transform that the incoming data is already sorted by group and port.

To use this option, go to the Ports tab and select Sorted Input.

If you enable the Sorted Input option and the input data is not sorted before submission. After this the session failed. You are suggested not to use sorted input if any of the following conditions apply:

Aggregate expressions use nested aggregate functions.

If The session uses incremental aggregation.

If you use sorted input and the data is not sorted correctly, the session will fail.

Sorted input condition cannot be used if any of the below mentioned case is true.

1. If nested function is used in an aggregate expression

2. if the session uses incremental aggregation.

Related Posts :

Aug 21, 2023

What are the types of Lookup transformations in informatica MDM?

What are the types of Lookup transformations in informatica MDM?There are four different types of lo...

Posted by Support@InventModel.com

Aug 25, 2023

Aggregator Transformation in Informatica PowerCenter

The Aggregator transformation is an active and connected transformation.The Aggregator transform is...

Posted by Admin-Lisa

Sep 08, 2023

Transformation in Designer Tools in Informatica Power Center (IPC)

Transformation in Designer Tools in Informatica Power CenterOverview of Transformation:Informatica T...

Posted by Admin-Lisa

Sep 12, 2023

Parallel processing in informatica PowerCenter

Parallel processing in informatica PowerCenterIn informatica PowerCenter parallel processing further...

Posted by Admin-Lisa