
Overview: Normalizer Transformation in Informatica



Transformation type:
Active
Connected





Normalization is the process of organizing data. In database terms, this includes creating normalized tables and establishing relationships between those tables according to rules designed to both protect the data and make the database more flexible by eliminating redundancy and inconsistent dependencies. 

The Normalizer transformation normalizes records from COBOL and relational sources, allowing you to organize the data according to your own needs. A Normalizer transformation can appear anywhere in a pipeline when you normalize a relational source. Use a Normalizer transformation instead of the Source Qualifier transformation when you normalize a COBOL source. When you drag a COBOL source into the Mapping Designer workspace, the Mapping Designer creates a Normalizer transformation with input and output ports for every column in the source. 

You primarily use the Normalizer transformation with COBOL sources, which are often stored in a denormalized format. The OCCURS statement in a COBOL file nests multiple records of information in a single record. Using the Normalizer transformation, you break out repeated data within a record into separate records. For each new record it creates, the Normalizer transformation generates a unique identifier. You can use this key value to join the normalized records. 

You can also use the Normalizer transformation with relational sources to create multiple rows from a single row of data.
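
As a rough illustration of what creating multiple rows from a single row means, the Python sketch below pivots one denormalized row into one row per repeating column. It is not Informatica code: the function name `normalize_row` and the column names (`store`, `q1_sales`, and so on) are hypothetical and chosen only for this example.

```python
from typing import Any, Dict, Iterator, List


def normalize_row(
    row: Dict[str, Any], repeating: List[str], value_name: str
) -> Iterator[Dict[str, Any]]:
    """Yield one output row per repeating column, copying the other columns."""
    # Columns outside the repeating group are carried to every output row.
    fixed = {name: value for name, value in row.items() if name not in repeating}
    for column in repeating:
        out = dict(fixed)
        out[value_name] = row[column]
        yield out


# Hypothetical denormalized row: four quarterly amounts stored side by side.
source_row = {"store": "S1", "q1_sales": 100, "q2_sales": 120, "q3_sales": 90, "q4_sales": 150}
quarter_columns = ["q1_sales", "q2_sales", "q3_sales", "q4_sales"]

normalized_rows = list(normalize_row(source_row, quarter_columns, "sales"))
for normalized in normalized_rows:
    print(normalized)
```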


Normalizing Data in a Mapping 

Although the Normalizer transformation is designed to handle data read from COBOL sources, you can also use it to normalize data from any type of source in a mapping. You can add a Normalizer transformation to any data flow within a mapping to normalize components of a single record that contains denormalized data. 

If you have denormalized data for which the Normalizer transformation has created key values, connect the ports representing the repeated data and the output port for the generated keys to a different pipeline branch in the mapping. Ultimately, you may want to write these values to different targets. 
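
The idea of sending the repeated data, together with its generated key, down a separate branch can be sketched in Python. This is an illustration only, not how PowerCenter is implemented: `split_branches`, the field names `customer` and `amounts`, and the two output lists are hypothetical, and the sketch assumes one key per source record, numbered from 1.

```python
from typing import Any, Dict, List, Tuple


def split_branches(
    records: List[Dict[str, Any]], first_key: int = 1
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Split each record into a header row and detail rows that share a key."""
    header_rows = []
    detail_rows = []
    for key, record in enumerate(records, start=first_key):
        # One header row per source record.
        header_rows.append({"GK": key, "customer": record["customer"]})
        # One detail row per repeated value, tagged with the same key.
        for amount in record["amounts"]:
            detail_rows.append({"GK": key, "amount": amount})
    return header_rows, detail_rows


# Hypothetical source records with a repeating group of amounts.
records = [
    {"customer": "A", "amounts": [10, 20]},
    {"customer": "B", "amounts": [5, 15]},
]
headers, details = split_branches(records)
print(headers)
print(details)
```

Because both lists carry the same GK value, the detail rows can later be joined back to their header row, which is the role the generated key plays when the two branches load different targets.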

You can use a single Normalizer transformation to handle multiple levels of denormalization in the same record. For example, a single record might contain two different detail record sets. Rather than using two Normalizer transformations to handle the two different detail record sets, you handle both normalizations in the same transformation. 

Normalizer Ports 

When you create a Normalizer for a COBOL source, or in the mapping pipeline, the Designer identifies the OCCURS and REDEFINES statements and generates the following columns: 
Generated key. One port for each REDEFINES clause. For more information, see Generated Key. 
Generated Column ID. One port for each OCCURS clause. For more information, see Generated Column ID. 

You can use these ports for primary and foreign key columns. The Normalizer key and column ID columns are also useful when you want to pivot input columns into rows. You cannot delete these ports. 

Generated Key 

The Designer generates a port for each REDEFINES clause to specify the generated key. You can use the generated key as a primary key column in the target table and to create a primary-foreign key relationship. The naming convention for the Normalizer generated key is: 

GK_<redefined_field_name> 

The Designer adds one generated key column for each REDEFINES in the COBOL source, for example GK_FILE_ONE and GK_HST_AMT. The Normalizer GK columns tell you the order of records in a REDEFINES clause. For example, if a COBOL file has 10 records, when you run the workflow, the PowerCenter Server numbers the first record 1, the second record 2, and so on. 

You can create approximately two billion primary or foreign key values with the Normalizer by connecting the GK port to the desired transformation or target and using the values ranging from 1 to 2147483647. At the end of each session, the PowerCenter Server updates the GK value to the last value generated for the session plus one. 

If you have multiple versions of the Normalizer transformation, the PowerCenter Server updates the GK value across all versions when it runs a session. 

If you open the mapping after you run the session, the current value displays the last value generated for the session plus one. Since the PowerCenter Server uses the GK value to determine the first value for each session, you should only edit the GK value if you want to reset the sequence. 
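
The way the GK value carries over from one session to the next can be sketched as a small counter. This is a simplified Python model, not PowerCenter code: the class `GeneratedKey`, its attribute `current_value`, and the method `run_session` are hypothetical names. The source does not say what happens once 2147483647 is passed, so raising an error at that point is an assumption made only for this sketch.

```python
from typing import List

MAX_GK = 2147483647  # upper end of the key range described above


class GeneratedKey:
    """Toy model of a generated key whose current value persists across sessions."""

    def __init__(self, current_value: int = 1) -> None:
        self.current_value = current_value

    def run_session(self, record_count: int) -> List[int]:
        """Number the records of one session, then store last value plus one."""
        keys = []
        next_value = self.current_value
        for _ in range(record_count):
            if next_value > MAX_GK:
                # Assumption: the real overflow behavior is not described in the text.
                raise OverflowError("generated key range exhausted")
            keys.append(next_value)
            next_value += 1
        # The stored value becomes the first key of the next session.
        self.current_value = next_value
        return keys


gk = GeneratedKey()
first_session = gk.run_session(10)
after_first = gk.current_value
second_session = gk.run_session(3)
print(first_session, after_first, second_session)
```

Editing `current_value` by hand corresponds to resetting the sequence: the next session starts numbering from whatever value is stored.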

If you have multiple versions of the Normalizer, and you want to reset the sequence, you must check in the mapping after you modify the GK value. 


Generated Column ID

The Designer generates a port for each OCCURS clause to specify the positional index within an OCCURS clause. You can use the generated column ID to create a primary-foreign key relationship. The naming convention for the Normalizer generated column ID is: 

GCID_<occurring_field_name> 

The Designer adds one generated column ID column for each OCCURS in the COBOL source, for example GCID_HST_MTH and GCID_HST_AMT. The Normalizer GCID columns tell you the order of records in an OCCURS clause. For example, if a record occurs two times, when you run the workflow, the PowerCenter Server numbers the first record 1 and the second record 2.
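
A small Python sketch of this numbering, under the assumption that the index simply counts occurrences from 1 in the order they appear in the record. The function `number_occurrences` and the sample amounts are hypothetical; only the GCID_HST_AMT port name comes from the text above.

```python
from typing import Dict, List


def number_occurrences(values: List[int], port_name: str) -> List[Dict[str, int]]:
    """Attach a 1-based positional index to each occurrence of a repeated field."""
    return [
        {port_name: position, "HST_AMT": value}
        for position, value in enumerate(values, start=1)
    ]


# A record whose HST_AMT field occurs two times (sample values are made up).
occurrence_rows = number_occurrences([250, 300], "GCID_HST_AMT")
print(occurrence_rows)
```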
