Normal Forms
Forms of normalization are given below:
- FIRST NORMAL FORM (1NF)
- SECOND NORMAL FORM (2NF)
- THIRD NORMAL FORM (3NF)
- Boyce-Codd Normal Form (BCNF)
- FOURTH NORMAL FORM (4NF)
- FIFTH NORMAL FORM (5NF)
FIRST NORMAL FORM (1NF) :
"A relation schema R is in 1NF, if it does not have any
composite attributes,
multivalued atttribute or their combination."
The objective of first normal form is that the table should contain no repeating groups of data.Data is divided into logical units called entities or tables
All attributes (column) in the entity (table) must be single valued.
Repeating or multi valued attributes are moved into a separate entity (table) & a relationship is established between the two tables or entities.
Example of the first normal form :
Consider the following
Customer table.
Customer :
cid |
name |
address |
contact_no |
society |
city |
C01 |
aaa |
Amul avas,Anand |
{1234567988} |
C02 |
bbb |
near parimal garden,abad |
{123,333,4445} |
C03 |
ccc |
sardar colony , surat |
|
Here,address is a composite attribute , which is further subdivided into two column society and city.And attribute contact_no is multivalued attribute.
Problems with this relation are -
- It is not possible to store multiple values in a single field in a relation. so, if any customer has more than one contact number, it is not possible to store those numbers.
- Another problem is related to information retrieval . Suppose, here, if there is a need to find out all customers belonging to some particular city, it is very difficult to retrieve. The reason is: city name for all customers are combined with society names and stored whole as an address.
Solution for composite attribute
Insert separate attributes in a relation for each sub-attribute of a composite attribute.
In our example, insert two separate attributes for Society and city in a relation in place of single composite attributes address. Now, insert data values separately for Society and City for all tuples.
Customer :
cid |
name |
Society |
city |
contact_no |
C01 |
aaa |
Amul avas |
Anand |
{1234567988} |
C02 |
bbb |
near parimal garden |
abad |
{123,333,4445} |
C03 |
ccc |
sardar colony |
surat |
|
Solution for Multi-valued attribute
Two approaches are available to solve problem of multi-valued attribute
1. First Approach:
In a First approach, determine maximum allowable values for a multi-valued attribute.In our case, if maximum two numbers are allowed to store, insert two separate attributes attributes to store contact numbers as shown.
Customer:
cid |
name |
Society |
city |
contact_no1 |
contact_no2 |
contact_no3 |
C01 |
aaa |
Amul avas |
Anand |
1234567988 |
|
|
C02 |
bbb |
near parimal garden |
abad |
123 |
333 |
4445 |
C03 |
ccc |
sardar colony |
surat |
|
|
|
Now,if customer has only one contact number or no any contact number, then keep the related field empty for tupple of that customer. If customer has two contact numbers, store both number in related fields. If customer has more than two contact numbers, store two numbers and ignore all other numbers.
2.Second Approach:
In a second approach, remove the multi-valued attribute that violates 1NF and place it in a separate relation along with the primary key of given original relation. The primary key of new relation is the combination of multi-valued attribute and primary key of old relation. for example, in our case, remove the contact_no attribute and place it with cid in a separate relation customer_contact. Primary Key for relation Customer_contact will be combination of cid and contact_no.
customer:
cid |
name |
Society |
city |
C01 |
aaa |
Amul avas |
Anand |
C02 |
bbb |
near parimal garden |
abad |
C03 |
ccc |
sardar colony |
surat |
Customer_contact
cid |
contact_no |
C01 |
1234567988 |
C02 |
123 |
C02 |
333 |
C02 |
4445 |
First approach is simple.But, it is not always possible to put restriction on maximum allowable values.It also introduces null values for mant fields.
Second approach is superior as it does not suffer from draw backs of first approach. But, it is some what complicated one.For example, to display all information about any/all customers, two relations - Customer and Customer_contact - need to be accessed.
SECOND NORMAL FORM (2NF) :
"A relation schema R is in 2NF, if It is in First Normal Form, and every non_ prime attribute of relation is fully functionally dependent on primary key."
A relation can violate 2NF only when it has more than one attribute in combination as a primary key. if relation has only single attribute as a primary key, then, relation will definitely be in 2NF.
Example:
consider the folllowing relation Table Depositor_Account
Depositor_Account
This relation contains following functional dependencies.
FD1 : {Cid, ano} -> {access_date, balance, bname}
FD2 : ano -> {balance, bname}
In this relation schema, access_date, balance and bname are non - prime attributes. Among all these three attributes, access_date is fully dependent on primary key (cid and ano). But balance and bname are not fully dependent on primary key. tey depend on ano only.
So, this relation is not in Second normal form. Such kind of partia dependencies result in data redundancy.
Solution:
Decompose the ralation such that, resultant relations do not have any partial functional dependency. For this purpose, remove the partial dependent non-prime attributes that violates 2NF in relation. Place them in a separate new relation along with the prime attribute on which they fully depend.
In our example, balance and bname are partial dependant attribute on primary key. so, remove them and place in separate ralation called account alog with prime attribute ano. For relation Account, ano will be a Primary key.
The Depositor_ account relation will be decomped in two seperate relations, called Account_holder and Account.
Account:
Account_Holder:
THIRD NORMAL FORM (3NF) :
"A relation schema R is in 3NF, if it is in Second normal form, and no any non-prime attribute of relation is transitively dependent on primary key."
Third normal form ensures that the relation does not have any non-prime attribute transitively dependent on primary key. In other words, It ensures that all the non-prime attributes of relation directly depend on the primary key.
Example:
Consider the following relation schema Account_Branch
Account_Branch:
This relationcontains following functional dependencies.
FD1 : ano -> {balance, bname, baddress}, and
FD2 : bname -> baddress
In this relation schema, there is a functional dependency ano -> bname between ano & bname as shown in FD1. also, there is another functional dependency bname -> baddress between bname & baddress as shown in FD2. more over bname is a non-prime attribute. So, there is a transitive dependency from ano to baddress, denoted by ano -> baddress.
Such kind of transitive dependencies result in data redundancy. In this relation, branch address wil be stored repeatedly for each account of the same branch, occupying more amount of memory.
Solution:
Decompose the relation in such a way that, resultant relatons do not have any non-prime attribute transitively dependent on primary key. For this purpose, remove the transitively dependant non-prime attributes that violates 3NF from relation. Place them in a separate new relation along with the non-prime attribute due to which tansitive dependency occured. The primary key of new relation will be the non-prime atttribute.
In our example, baddress is transitively dependent on ano due to non-prime attribute bname. So, remove baddress and place it in separate relation called Branch along with the non-prime attribute bname. for relation Branch, bname will be a primary key.
The Account_Branch relation will be decomposed in two separate relations called Account and Branch.
Account:
Branch:
Ask Question