Skip to main content

Normalization in SQL ((1NF, 2NF, 3NF, BCNF, 4NF, 5NF)

Normalization:
Normalization is step-by-step process of reducing complexity of an entity by distributing the attributes to different entities. Following six normal rules are used sequentially on each entity to normalize them and the attribute that does not adhere to the rule is taken out of the entity.

  First Normal Form (1NF)
o   Every attribute must have value. It should not be empty. In RDBMS a column without value has NULL value that can be manipulated or accesses like a normal value.
o   Every attribute must have atomic value that cannot be shorten further. For example: Name is non-atomic attribute as it can be broken into First Name, Middle Name and Surname.

  Second Normal Form (2NF)
o   An entity must be in First Normal Form
o   Entity must have a Primary Key or Composite Primary Key
o   Every attribute must be fully and functionally dependent upon Primary Key

  Third Normal Form (3NF)
o   An entity must be Second Normal Form.
o   There must not be any dependency among non-key attributes (other that Primary Key)

  Boyce-Codd Normal Form (BCNF)
o   An entity must be in Third Normal Form.
o   If the entity has a Composite Primary Key & Alternate Primary Key with one or more attribute common to both, in such case entity must be broken into three entities.

 Fourth Normal Form (4NF)
o   An entity must be in Boyce-Codd Normal Form.
o   If an attribute is based on Value List must be taken out as a separate entity.

  Fifth Normal Form (5NF)
o   An entity must be in Fourth Normal Form.
o   If an attribute is multi-valued attribute then it must be taken out as a separate entity.
 
Normalization in SQL
Normalization in SQL

Database Design Example:
This example takes you into step-by-step approach for designing a small database for Resume Management that is based on Resume Attributes.

Step-1: (Gathering Attributes for Resume Management):
            Personal:
Name, Father Name, Address, Date of Birth, Contact, Marital Status, Gender, Nationality, Religion
            Education:
Degree/Certificate Name, Pass Year, Institute Name, Marks Obtained (%), Rating

            Experience:
Employer Name, Designation, Join Date, Resign Date, Reason of Resignation, Roles
            Project:
Project Name, Client Name, Start Date, Duration (Weeks), Team Size, Role, Description

Step-2 (Defining Entities using Noun Approach):
Entities below are derived from the list of attributes using a simple approach of Noun. Under this approach look for all attributes and whatever words are Noun pick them as entity and list them.


Step-3 (Relation among entities):
            Now define the relations among entities as per the following rules:
§  One-to-One Relation (Link directly both the entities)
§  One-to-Many Relation (Link directly both the entities)
§  Many-to-Many Relation (Take a third entity and relate both the entities via third entity)



Normalization in SQL

 

Conventions in E-R Diagram:
§  Normal color rectangle boxes are Real Entities (Person, Religion, Father, etc.).
§  Inverse color rectangle boxes are Intermediate Entities to break Many-to-Many relation based Entities (PersonAddress, PersonContact, PersonDesignation, etc.).
§  Arrow indicates One-to-Many relation from Parent towards Child entity (One Religion Many Person, One Nationality Many Person, One Person Many Degree, etc.).

Rules for defining relations between Entities:
Rule #1:
If two entities are based on One-to-One relation then keep them as separate entities are merge them into single entity, preferred method is keep them as separate entity. In above E-R Diagram there are no two entities based on One-to-One relation.

Rule #2:
If two entities are based on One-to-Many relation then always keep them as separate entities and relate them from One (Parent Entity) towards Many (Child Entity). In above E-R Diagram Religion & Person, Nationality & Person are examples of One-to-Many relations.

Rule #3:
If two entities are based on Many-to-Many relation then introduce a third entity to relate both the entities. Third entity becomes Child Entity for both the Parent Entities. In above E-R Diagram Person & Degree (PersonDegree), Person & Address (PersonAddress), Person & Contact (PersonContact) are examples of Many-to-Many relations.

Rule #4:
If one entity (A) has relation with two entities (B & C) and both have their child entity (BC) then relate A with BC. In above E-R Diagram, Institute Entity has relation with Person Entity & Degree Entity and they both have their child entity PersonDegree so Institute Entity is related to PersonDegree entity.

Rule #5:
Rule #4 can be extended further If one entity (A) has relation with three Entities (B, C & D) and they have their Child Entities then we must relate A with the lowest level Child Entity. In above E-R Diagram, Project Entity has relation with Person Entity, Designation Entity & Employer Entity and they have their child entity EmployerPersonDesignation at the lowest level of relation so Project Entity is related to it. Since relation between Project and EmployerPersonDesignation is Many-to-Many so intermediate entity ProjectEmployerPersonDesignation is required.

Step-4 (Distribution of Attributes):

Tips:
§  Distribute the attributes among entities using common knowledge, if you think a particular attribute should be under an entity, keep there. At the time of Normalization process those attributes that do not belong to entity will be out from there.
§  If an entity remains with one attribute still keep entity separate as in future there might be some more attributes that may not visible right now.

Step-5 (Normalization of Entities):
First Normal Form:

 
            Since Name is non-atomic attribute so it is broken into three atomic attributes.

Second Normal Form:
            Person:                       
            Person ID (PK)          
            First Name                 
            Middle Name             
            Last Name                  
            Marital Status             
            Gender                                   
            Date Of Birth             
            Tips:
§  Check each attribute with following question:
<attribute> IS FOR?
If the answer is Entity then keep the attribute there else take it out. 

For example:


So the Person table is in Second Normal Form since all of its attributes are fully dependent upon Primary Key.

Third Normal Form:
            Person:                       
            Person ID (PK)          
            First Name                 
            Middle Name             
            Last Name                  
            Marital Status             
            Gender                                   
            Date Of Birth             
Person entity is already in Third Normal Form since none of the attributes have dependency among them.

Tips:
§  To check the dependency among attributes use the following approach:
Take one attribute and check that any other attributes’ Existence or Value is controlled by this attribute. For example: Marital Status decides whether Spouse Name will exists or not.

Boyce-Codd Normal Form:
            Person:                       
            Person ID (PK)          
            First Name                 
            Middle Name             
            Last Name                  
            Marital Status             
            Gender                                   
            Date Of Birth             
Person entity is already in Boyce-Codd Normal Form since it does not have Composite Primary Key and Alternate Primary Key.

Tips:
§  If you are following the rule of Noun based entity selection then BCNF case may never come into existence since chances of Composite PK and Alternate Composite PK are rare.

Fourth Normal Form:
            Person:                       
            Person ID (PK)          
            First Name                 
            Middle Name             
            Last Name                  
Date Of Birth             
Marital Status and Gender attributes are removed since they are based on List of Values (Married & Unmarried, Male & Female).

Tips:
§  Marital Status and Gender will be separate entities and attached to Person Entity in the form of One-to-Many relation (One gender many persons, One marital status many persons)
§  Add two more entities to earlier E-R Diagram and relate them with Person Entity.




 



Fifth Normal Form:
            Person:                       
            Person ID (PK)          
            First Name                 
            Middle Name             
            Last Name                  
Date Of Birth             

Person entity is already in Fifth Normal Form since none of the attribute keeps multiple values. For example Phone No attribute may keeps one or more phone numbers.

Comments

  1. very useful information

    ReplyDelete
  2. Thanks for providing breif explaination.. Thanks you so much. its easy to understand about normalization from your website kashif

    ReplyDelete
  3. Very nice explanation!

    I found this page to be quite helpful as well:

    SQL normalization interview questions

    ReplyDelete

Post a Comment

Popular posts from this blog

Contact Me

Do You have any queries ?                   If you are having any query or wishing to get any type of help related Datawarehouse, OBIEE, OBIA, OAC then please e-email on below. I will reply to your email within 24 hrs. If I didn’t reply to you within 24 Hrs., Please be patience, I must be busy in some work. kashif7222@gmail.com

Top 130 SQL Interview Questions And Answers

1. Display the dept information from department table.   Select   *   from   dept; 2. Display the details of all employees   Select * from emp; 3. Display the name and job for all employees    Select ename ,job from emp; 4. Display name and salary for all employees.   Select ename   , sal   from emp;   5. Display employee number and total salary   for each employee. Select empno, sal+comm from emp; 6. Display employee name and annual salary for all employees.   Select empno,empname,12*sal+nvl(comm,0) annualsal from emp; 7. Display the names of all employees who are working in department number 10   Select ename from emp where deptno=10; 8. Display the names of all employees working as   clerks and drawing a salary more than 3000   Select ename from emp where job=’clerk’and sal>3000; 9. Display employee number and names for employees who earn commissi...

Informatica sample project

Informatica sample project - 1 CareFirst – Blue Cross Blue Shield, Maryland (April 2009 – Current) Senior ETL Developer/Lead Model Office DWH Implementation (April 2009 – Current) CareFirst Blue Cross Blue Shield is one of the leading health care insurance provided in Atlantic region of United States covering Maryland, Delaware and Washington DC. Model Office project was built to create data warehouse for multiple subject areas including Members, Claims, and Revenue etc. The project was to provide data into EDM and to third party vendor (Verisk) to develop cubes based on data provided into EDM. I was responsible for analyzing source systems data, designing and developing ETL mappings. I was also responsible for coordinating testing with analysts and users. Responsibilities: ·          Interacted with Data Modelers and Business Analysts to understand the requirements and the impact of the ETL on the business. ·  ...