Skip to main content

SCD -1, SCD -2 , SCD -3 (Slowly Changing dimensional in Informatica)

Slowly Changing dimensional in Informatica with example ( SCD -1, SCD -2 , SCD -3)


Dimensions that change over time are called Slowly Changing Dimensions. For instance, a product price changes over time; People change their names for some reason; Country and State names may change over time. These are a few examples of Slowly Changing Dimensions since some changes are happening to them over a period of time.


slowly changing dimension in informatica
slowly changing dimension in informatica

Slowly Changing Dimensions are often categorized into three types namely Type1, Type2 and Type3. The following section deals with how to capture and handling these changes over time.
The "Product" table mentioned below contains a product named, Product1 with Product ID being the primary key. In the year 2004, the price of Product1 was $150 and over the time, Product1's price changes from $150 to $350. With this information, let us explain the three types of Slowly Changing 
 Dimensions.

Product Price in 2004:
Product ID(PK)
Year
Product Name
Product Price
1
2004
Product1
$150
 


Type 1: Overwriting the old values.
In the year 2005, if the price of the product changes to $250, then the old values of the columns "Year" and "Product Price" have to be updated and replaced with the new values. In this Type 1, there is no way to find out the old value of the product "Product1" in year 2004 since the table now contains only the new price and year information.

Product
Product ID(PK)
Year
Product Name
Product Price
1
2005
Product1
$250
 


Type 2: Creating an another additional record.
In this Type 2, the old values will not be replaced but a new row containing the new values will be added to the product table. So at any point of time, the difference between the old values and new values can be retrieved and easily be compared. This would be very useful for reporting purposes.

Product
Product ID(PK)
Year
Product Name
Product Price
1
2004
Product1
$150
1
2005
Product1
$250
The problem with the above mentioned data structure is "Product ID" cannot store duplicate values of "Product1" since "Product ID" is the primary key. Also, the current data structure doesn't clearly specify the effective date and expiry date of Product1 like when the change to its price happened. So, it would be better to change the current data structure to overcome the above primary key violation.

Product
Product ID(PK)
Effective
DateTime(PK)
Year
Product Name
Product Price
Expiry
DateTime
1
01-01-2004 12.00AM
2004
Product1
$150
12-31-2004 11.59PM
1
01-01-2005 12.00AM
2005
Product1
$250

In the changed Product table's Data structure, "Product ID" and "Effective DateTime" are composite primary keys. So there would be no violation of primary key constraint. Addition of new columns, "Effective DateTime" and "Expiry DateTime" provides the information about the product's effective date and expiry date which adds more clarity and enhances the scope of this table. Type2 approach may need additional space in the data base, since for every changed record, an additional row has to be stored. Since dimensions are not that big in the real world, additional space is negligible.
 


Type 3: Creating new fields.
In this Type 3, the latest update to the changed values can be seen. Example mentioned below illustrates how to add new columns and keep track of the changes. From that, we are able to see the current price and the previous price of the product, Product1.
Product
Product ID(PK)
Current
Year
Product
Name
Current
Product Price
Old Product
Price
Old Year
1
2005
Product1
$250
$150
2004
The problem with the Type 3 approach, is over years, if the product price continuously changes, then the complete history may not be stored, only the latest change will be stored. For example, in year 2006, if the product1's price changes to $350, then we
would not be able to see the complete history of 2004 prices, since the old values would have been updated with 2005 product information.
Product
Product ID(PK)
Year
Product
Name
Product
Price
Old Product
Price
Old Year
1
2006
Product1
$350
$250
2005

Comments

Popular posts from this blog

Contact Me

Do You have any queries ?                   If you are having any query or wishing to get any type of help related Datawarehouse, OBIEE, OBIA, OAC then please e-email on below. I will reply to your email within 24 hrs. If I didn’t reply to you within 24 Hrs., Please be patience, I must be busy in some work. kashif7222@gmail.com

Top 130 SQL Interview Questions And Answers

1. Display the dept information from department table.   Select   *   from   dept; 2. Display the details of all employees   Select * from emp; 3. Display the name and job for all employees    Select ename ,job from emp; 4. Display name and salary for all employees.   Select ename   , sal   from emp;   5. Display employee number and total salary   for each employee. Select empno, sal+comm from emp; 6. Display employee name and annual salary for all employees.   Select empno,empname,12*sal+nvl(comm,0) annualsal from emp; 7. Display the names of all employees who are working in department number 10   Select ename from emp where deptno=10; 8. Display the names of all employees working as   clerks and drawing a salary more than 3000   Select ename from emp where job=’clerk’and sal>3000; 9. Display employee number and names for employees who earn commission   Select empno,ename from emp where comm is not null and comm>0. 10

Informatica sample project

Informatica sample project - 1 CareFirst – Blue Cross Blue Shield, Maryland (April 2009 – Current) Senior ETL Developer/Lead Model Office DWH Implementation (April 2009 – Current) CareFirst Blue Cross Blue Shield is one of the leading health care insurance provided in Atlantic region of United States covering Maryland, Delaware and Washington DC. Model Office project was built to create data warehouse for multiple subject areas including Members, Claims, and Revenue etc. The project was to provide data into EDM and to third party vendor (Verisk) to develop cubes based on data provided into EDM. I was responsible for analyzing source systems data, designing and developing ETL mappings. I was also responsible for coordinating testing with analysts and users. Responsibilities: ·          Interacted with Data Modelers and Business Analysts to understand the requirements and the impact of the ETL on the business. ·          Understood the requirement and develope