Normalization:
Normalization is step-by-step process of reducing complexity of an entity by distributing the attributes to different entities. Following six normal rules are used sequentially on each entity to normalize them and the attribute that does not adhere to the rule is taken out of the entity.
Normalization is step-by-step process of reducing complexity of an entity by distributing the attributes to different entities. Following six normal rules are used sequentially on each entity to normalize them and the attribute that does not adhere to the rule is taken out of the entity.
First Normal Form (1NF)
o Every
attribute must have value. It should not be empty. In RDBMS a column without
value has NULL value that can be manipulated or accesses like a normal value.
o Every
attribute must have atomic value that cannot be shorten further. For example:
Name is non-atomic attribute as it can be broken into First Name, Middle Name
and Surname.
Second Normal Form (2NF)
o An
entity must be in First Normal Form
o Entity
must have a Primary Key or Composite Primary Key
o Every
attribute must be fully and functionally dependent upon Primary Key
Third Normal Form (3NF)
o An
entity must be Second Normal Form.
o There
must not be any dependency among non-key attributes (other that Primary Key)
Boyce-Codd Normal Form (BCNF)
o An
entity must be in Third Normal Form.
o If
the entity has a Composite Primary Key & Alternate Primary Key with one or
more attribute common to both, in such case entity must be broken into three
entities.
Fourth Normal Form (4NF)
o An
entity must be in Boyce-Codd Normal Form.
o If
an attribute is based on Value List must be taken out as a separate entity.
Fifth Normal Form (5NF)
o An
entity must be in Fourth Normal Form.
o If
an attribute is multi-valued attribute then it must be taken out as a separate
entity.
Database Design Example:
This example takes you into step-by-step approach for
designing a small database for Resume Management that is based on Resume
Attributes.
Step-1: (Gathering Attributes for Resume
Management):
Personal:
Name, Father Name, Address, Date
of Birth, Contact, Marital Status, Gender, Nationality, Religion
Education:
Degree/Certificate Name, Pass
Year, Institute Name, Marks Obtained (%), Rating
Experience:
Employer Name, Designation, Join
Date, Resign Date, Reason of Resignation, Roles
Project:
Project Name, Client Name, Start
Date, Duration (Weeks), Team Size, Role, Description
Step-2 (Defining Entities using Noun Approach):
Entities below are derived from
the list of attributes using a simple approach of Noun. Under this approach
look for all attributes and whatever words are Noun pick them as entity and
list them.
Step-3 (Relation among entities):
Now
define the relations among entities as per the following rules:
§ One-to-One
Relation (Link directly both the entities)
§ One-to-Many
Relation (Link directly both the entities)
§ Many-to-Many
Relation (Take a third entity and relate both the entities via third entity)
Normalization in SQL |
Conventions in E-R Diagram:
§ Normal
color rectangle boxes are Real Entities (Person, Religion, Father, etc.).
§ Inverse
color rectangle boxes are Intermediate Entities to break Many-to-Many relation
based Entities (PersonAddress, PersonContact, PersonDesignation, etc.).
§ Arrow
indicates One-to-Many relation from Parent towards Child entity (One Religion
Many Person, One Nationality Many Person, One Person Many Degree, etc.).
Rules for defining relations between Entities:
Rule #1:
If two entities are based on
One-to-One relation then keep them as separate entities are merge them into
single entity, preferred method is keep them as separate entity. In above E-R
Diagram there are no two entities based on One-to-One relation.
Rule #2:
If two entities are based on
One-to-Many relation then always keep them as separate entities and relate them
from One (Parent Entity) towards Many (Child Entity). In above E-R Diagram
Religion & Person, Nationality & Person are examples of One-to-Many
relations.
Rule #3:
If two entities are based on
Many-to-Many relation then introduce a third entity to relate both the
entities. Third entity becomes Child Entity for both the Parent Entities. In
above E-R Diagram Person & Degree (PersonDegree), Person & Address
(PersonAddress), Person & Contact (PersonContact) are examples of
Many-to-Many relations.
Rule #4:
If one entity (A) has relation
with two entities (B & C) and both have their child entity (BC) then relate
A with BC. In above E-R Diagram, Institute Entity has relation with Person
Entity & Degree Entity and they both have their child entity PersonDegree
so Institute Entity is related to PersonDegree entity.
Rule #5:
Rule #4 can be extended further
If one entity (A) has relation with three Entities (B, C & D) and they have
their Child Entities then we must relate A with the lowest level Child Entity.
In above E-R Diagram, Project Entity has relation with Person Entity,
Designation Entity & Employer Entity and they have their child entity
EmployerPersonDesignation at the lowest level of relation so Project Entity is
related to it. Since relation between Project and EmployerPersonDesignation is
Many-to-Many so intermediate entity ProjectEmployerPersonDesignation is
required.
Step-4 (Distribution of Attributes):
Tips:
§ Distribute
the attributes among entities using common knowledge, if you think a particular
attribute should be under an entity, keep there. At the time of Normalization
process those attributes that do not belong to entity will be out from there.
§ If
an entity remains with one attribute still keep entity separate as in future
there might be some more attributes that may not visible right now.
Step-5 (Normalization of Entities):
First Normal Form:
Since Name is non-atomic
attribute so it is broken into three atomic attributes.
Second Normal Form:
Person:
Person ID (PK)
First Name
Middle Name
Last Name
Marital Status
Gender
Date Of Birth
Tips:
§
Check each attribute with following question:
<attribute>
IS FOR?
If
the answer is Entity then keep the attribute there else take it out.
For
example:
So
the Person table is in Second Normal Form since all of its attributes are fully
dependent upon Primary Key.
Third Normal Form:
Person:
Person ID (PK)
First Name
Middle Name
Last Name
Marital Status
Gender
Date Of Birth
Person
entity is already in Third Normal Form since none of the attributes have
dependency among them.
Tips:
§
To check the dependency among attributes use the
following approach:
Take
one attribute and check that any other attributes’ Existence or Value is
controlled by this attribute. For example: Marital Status decides whether
Spouse Name will exists or not.
Boyce-Codd Normal
Form:
Person:
Person ID (PK)
First Name
Middle Name
Last Name
Marital Status
Gender
Date Of Birth
Person
entity is already in Boyce-Codd Normal Form since it does not have Composite
Primary Key and Alternate Primary Key.
Tips:
§
If you are following the rule of Noun based
entity selection then BCNF case may never come into existence since chances of
Composite PK and Alternate Composite PK are rare.
Fourth Normal Form:
Person:
Person ID (PK)
First Name
Middle Name
Last Name
Date
Of Birth
Marital
Status and Gender attributes are removed since they are based on List of Values
(Married & Unmarried, Male & Female).
Tips:
§
Marital Status and Gender will be separate
entities and attached to Person Entity in the form of One-to-Many relation (One
gender many persons, One marital status many persons)
§
Add two more entities to earlier E-R Diagram and
relate them with Person Entity.
Fifth Normal Form:
Person:
Person ID (PK)
First Name
Middle Name
Last Name
Date
Of Birth
Person
entity is already in Fifth Normal Form since none of the attribute keeps
multiple values. For example Phone No attribute may keeps one or more phone
numbers.
thanks!
ReplyDeletevery useful information
ReplyDeleteThanks for providing breif explaination.. Thanks you so much. its easy to understand about normalization from your website kashif
ReplyDeleteVery nice explanation!
ReplyDeleteI found this page to be quite helpful as well:
SQL normalization interview questions
Quick and crisp!
ReplyDeleteThanks Dusturbia
Delete