Wednesday, April 22, 2015

Normalization and Denormalization

Database normalization is the step by step process of organizing data to minimizing data redundancy (data duplication).Which in turn ensures data consistency.

There are 7 types of normalization but we use only 3

If the data is not in the normalized way then these issue can come up
1. Disk Space wastage
2. Data inconsistancy
3. DML operations become slow

 First Normalization Form


·      No repeating columns should be there and the data in the column must be in the atomic form. (Not stored in comma separate or columns groups. e.g. a,b,c,d in one column or a, b, c,d in column wise better to arrange all these in the same column in separate rows ways,
·         For example there is a person table

Create Table Person
(
Id int primary key,
Name varchar(20),
Email varchar(30),
Phone no varchar(20),
Child1 varchar(20),
Child2 varchar(20),
Child3 varchar(20),
Child4 varchar(20)
)



Insert Into
Person Values(1,'ramu','tbjraju@gmail.com','7799291291','ravi','rock','remo','')
Insert into Person Values (2,'rakesh','rakesh45@gmail.com','9985362165','','','','')
Insert
Into Person Values (3,'srikant','srikan89@gmail.com','9848022338','vikas','','','')


Select * from person

*In the result we can observe that there are repeating columns          
 so to reduce repeating columns we divide the table into two parts.

Create Table Person
(
Id int primary key,
Name varchar(20),
Email varchar(30),
Phone no varchar(20)
)

Insert into Person Values (1,'ramu','tbjraju@gmail.com','7799291291')
Insert into Person Values (2,'rakesh','rakesh45@gmail.com','9985362165')
Insert into Person Values (3,'srikant','srikan89@gmail.com','9848022338')

Select * From person

Create Table Person children
(
Id Int constraint FK_personchildren_id References Person (id),
Seqno int,
Child name varchar(20)
)
Insert Into person children values(1,1,'ravi')
Insert Into person children values(1,2,'rock')
Insert Into person children values(1,3,'remo')
Insert Into person children values(3,1,'vikas')

Select * From personchildren


Second Normalization Form

·         Must be in First Normal Form.
      Move the redundant data into the separate table.
·         All attributes depends on Primary Key and new tables must be linked/joined with Primary-Foreign keys.

Example


Create Table Personprojectdetails
(
personid int,
projectid int,
personname varchar(20),
projectcode varchar(20),
projectname varchar(20),
personphoneno varchar(20)
)

Insert into personprojectdetails Values (1, 1,'roy','project1','databaseproject','7799')
Insert into personprojectdetails Values (2, 1,'sai','project1','databaseproject','2244')
Insert into personprojectdetails Values (3, 1,'rambo','project1','databaseproject','3366')
Insert into personprojectdetails Values (4, 1,'pratap','project1','databaseproject','2288)
Insert into personprojectdetails Values (1, 2,'roy','project2','webproject','7799')
Insert into personprojectdetails Values (2, 2,'sai','project2','webproject','2244')
Insert into personprojectdetails Values (3, 2,'rambo','project2','webproject','3366')
Insert into personprojectdetails Values (4, 2,'pratap','project2','webproject','2288')

Select * from personprojectdetails

Create Table Persons
(
personid int,
personname varchar(20),
Person phone varchar(20)
)
Insert into Persons Values (1,'roy','7799')
Insert into Persons Values (2,'sai','2244')
Insert into Persons Values (3,'rambo','3366')
Insert into Persons Values (4,'pratap','2288')

Select * From Persons

Create Table Project
(
projectid int,
projectcode varchar(20),
projectname varchar(20)
)
Insert into Project Values (1,'project1','databaseproject')
Insert into Project Values (2,'project2','webproject')

Select * From Project

Create Table Personprojectid
(
personid int,
projectid int
)
Insert Into personprojectid values (1, 1)
Insert Into personprojectid values (2, 1)
Insert Into personprojectid values (3, 1)
Insert Into personprojectid values (4, 1)
Insert Into personprojectid values (1, 2)
Insert Into personprojectid values (2, 2)
Insert Into personprojectid values (3, 2)
Insert Into personprojectid values (4, 2)

Select * from personprojectid

Select personname,personphone,projectcode,projectname from persons
Join personprojectid on persons.personid=personprojectid.personid
Join project on project.projectid=personprojectid.projectid

Select * from persons join personprojectid on persons.personid=personprojectid.personid
Join project on project.projectid=personprojectid.projectid

Third Normalization Form

·         Must be in Second Normal Form
·         All attributes non transitively dependent on primary key
·         Nontransitive means directly dependent
·         No column should depend on other columns

 Create Table Emplocation
(
Empname varchar(20),
City varchar(20),
Country varchar(20)
)
Insert into Emplocation Values ('raju','chicago','usa')
Insert into Emplocation Values ('raj','chicago','usa')
Insert into Emplocation Values ('raja','chicago','usa')
Insert into Emplocation Values ('ramu','chicago','usa')
Insert into Emplocation Values ('hari','newyork','usa')
Insert into Emplocation Values ('sai','newyork','usa')
Insert into Emplocation Values ('chandra','newyork','usa')

Select * From Emplocation

Create Table Empadd
(
Empname varchar(20),
City varchar(20)
)

Insert into Empadd Values ('raju','chicago')
Insert into Empadd Values ('raj','chicago')
Insert into Empadd Values ('raja','chicago')
Insert into Empadd Values ('ramu','chicago')
Insert into Empadd Values ('hari','newyork')
Insert into Empadd Values ('sai','newyork')
Insert into Empadd Values ('chandra','newyork')

Select * From Empadd

Create Table Emploc
(
City varchar(20),
Country varchar(20)
)

Insert into Emploc Values ('chicago','usa')
Insert into Emploc Values ('newyork','usa')

Select * From Emploc

Denormalization:


*Denomalization is the process of adding redundant data to speed up complex queries involving multiple table JOINS.

*Denormalization is the process of attempting to optimize the read performance of a database by adding redundant data or by grouping dat

*we can have duplicate data to make our search factor olap faster

Example:

Create Table Customers
(
Customerid int,
Customername varchar(20)
)

Insert into Customers Values (1,'raju')
Insert into Customers Values (2,'sai')
Insert into Customers Values (3,'govind')
Insert into Customers Values (4,'ravi')
Insert Into Customers Values (5,'uv')

Select * From Customers

Create Table Orders
(
Orderid int,
Customerid int
)
Insert into Orders values (101, 1)
Insert into Orders values (102, 1)
Insert into Orders values (103, 1)
Insert into Orders values (101, 2)
Insert into Orders values (102, 2)
Insert into Orders values (105, 3)
Insert into Orders values (106, 4)
Insert into Orders values (104, 5)

Select * From Orders


Create Table Order details
(
Orderid int,
Productid int,
Unitprice int,
Quantity int
)
Insert Into Order details Values (101, 200, 10000, 12)
Insert Into Order details Values (102, 201, 30000, 15)
Insert Into Order details Values (103, 202, 5000, 6)
Insert Into Order details Values (104, 203, 7000, 8)
Insert Into Order details Values (105, 204, 9000, 9)

Select * From Order details

Select customers.customerid,customers.customername,orderdetails.unitprice,orderdetails.quantity,orderdetails.unitprice*orderdetails.quantity as total sales
From customers join orders on customers.customerid=orders.customerid
Join order details on orders. orderid=orderdetails.order