What is Normalization?



Database normalization is a technique for designing 
relational database tables to minimize duplication of 
information and, in so doing, to safeguard the database 
against certain types of logical or structural problems, 
namely data anomalies. 

For example, when multiple instances of a given piece of 
information occur in a table, the possibility exists that 
these instances will not be kept consistent when the data 
within the table is updated, leading to a loss of data 
integrity. A table that is sufficiently normalized is less 
vulnerable to problems of this kind, because its structure 
reflects the basic assumptions for when multiple instances 
of the same information should be represented by a single 
instance only.
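
The update problem described above can be sketched in a few lines of Python. The employee and department data here are made up purely for illustration; the point is only that a duplicated fact can drift apart, while a fact stored once cannot.

```python
# Hypothetical unnormalized rows: the department phone is repeated per employee.
employees = [
    {"emp_id": 1, "name": "Ann", "dept": "Sales", "dept_phone": "555-0100"},
    {"emp_id": 2, "name": "Bob", "dept": "Sales", "dept_phone": "555-0100"},
]

# An update that touches only one of the duplicated copies.
employees[0]["dept_phone"] = "555-0199"

# The table now holds two different phone numbers for the same department.
phones_for_sales = {row["dept_phone"] for row in employees if row["dept"] == "Sales"}
is_consistent = len(phones_for_sales) == 1

# Normalized layout: the phone is stored once, keyed by department.
departments = {"Sales": "555-0100"}
staff = [
    {"emp_id": 1, "name": "Ann", "dept": "Sales"},
    {"emp_id": 2, "name": "Bob", "dept": "Sales"},
]
departments["Sales"] = "555-0199"  # a single update reaches every employee
normalized_phones = {departments[row["dept"]] for row in staff}
```

In the first layout is_consistent ends up False; in the second, every employee sees the same phone number because there is only one copy to change.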
 
1. Data is normalized in OLTP systems. The normal forms
are 1NF, 2NF, 3NF, BCNF, 4NF, 5NF, DKNF and 6NF.
2. In OLAP, data warehouse and DSS systems, data is
generally denormalized (N1NF, PNF).

There is always a trade-off to consider between data
redundancy and performance.

First normal form:
· A table is in first normal form when it contains no
repeating groups.
· The repeating columns or fields in an unnormalized table
are removed from the table and put into tables of their own.
· Each such table is dependent on the parent table from
which it is derived.
· The key of this table is called a concatenated key, with the
key of the parent table forming a part of it.
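
As a rough sketch of these points, the following uses Python's built-in sqlite3 module. The orders and order_items tables and their columns are hypothetical names chosen for the example: the repeating group (item1, item2, ...) has been moved into a child table whose key combines the parent's key with a line number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Parent table: one row per order (hypothetical schema).
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL)"
)

# Child table holds what used to be the repeating group item1, item2, ...
# Its key is concatenated: the parent's key plus a line number.
conn.execute("""
    CREATE TABLE order_items (
        order_id INTEGER NOT NULL REFERENCES orders(order_id),
        line_no  INTEGER NOT NULL,
        item     TEXT NOT NULL,
        PRIMARY KEY (order_id, line_no)
    )
""")

conn.execute("INSERT INTO orders VALUES (1, 'Ann')")
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(1, 1, "pen"), (1, 2, "ink"), (1, 3, "paper")],
)

# An order can now have any number of items without changing the table layout.
items = [
    row[0]
    for row in conn.execute(
        "SELECT item FROM order_items WHERE order_id = 1 ORDER BY line_no"
    )
]
```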
 
Second normal form:
· A table is in second normal form if all its non-key fields
are fully dependent on the whole key.
· This means that each non-key field in a table must depend on the
entire key.
· Fields that depend on only part of the combination key are
moved to another table on whose key they fully depend.
· Tables in first normal form that do not have combination keys are
automatically in second normal form.
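
A minimal sketch of that move, again with sqlite3 and invented table names: product_name depends only on product_id, which is just one part of the (order_id, product_id) key, so it lives in its own products table rather than being repeated on every order line.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- product_name depends on product_id alone, so it gets its own table.
    CREATE TABLE products (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL
    );
    -- quantity depends on the whole combination key (order_id, product_id).
    CREATE TABLE order_lines (
        order_id   INTEGER NOT NULL,
        product_id INTEGER NOT NULL REFERENCES products(product_id),
        quantity   INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO products VALUES (10, 'pen'), (11, 'ink');
    INSERT INTO order_lines VALUES (1, 10, 2), (1, 11, 1), (2, 10, 5);
""")

# Renaming the product is a single-row update, however many orders use it.
conn.execute("UPDATE products SET product_name = 'gel pen' WHERE product_id = 10")

rows = conn.execute("""
    SELECT o.order_id, p.product_name, o.quantity
    FROM order_lines AS o
    JOIN products AS p ON p.product_id = o.product_id
    ORDER BY o.order_id, o.product_id
""").fetchall()
```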

Third normal form:
· A table is said to be in third normal form if it is in second
normal form and all the non-key fields of the table are independent
of all other non-key fields of the same table.
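
To illustrate, here is a small Python sketch with a hypothetical employee table in which dept_name depends on dept_id, another non-key field. The decompose helper is an illustrative function written for this example, not a standard library routine; it moves the dependent field into a lookup of its own.

```python
def decompose(rows, determinant, dependent):
    """Move `dependent` into a lookup keyed by `determinant`.

    Raises ValueError if the data contradicts determinant -> dependent.
    """
    lookup = {}
    slim_rows = []
    for row in rows:
        d, v = row[determinant], row[dependent]
        # setdefault stores v the first time d is seen, then returns the stored value.
        if lookup.setdefault(d, v) != v:
            raise ValueError(f"{determinant}={d!r} maps to two {dependent} values")
        slim_rows.append({k: val for k, val in row.items() if k != dependent})
    return slim_rows, lookup


# Hypothetical data: dept_name depends on dept_id, not directly on emp_id.
employees = [
    {"emp_id": 1, "name": "Ann", "dept_id": "D1", "dept_name": "Sales"},
    {"emp_id": 2, "name": "Bob", "dept_id": "D1", "dept_name": "Sales"},
    {"emp_id": 3, "name": "Cy", "dept_id": "D2", "dept_name": "Support"},
]

employees_3nf, departments = decompose(employees, "dept_id", "dept_name")
```

After the split, each department name is stored once in departments, and the employee rows keep only dept_id as a reference to it.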
 
 
.....................................................................
It is possible to start creating the database at this point. It's just
a question of creating a new table for every entity identified in the
diagram. We'll be using MS-Access to do that shortly. But how do you
code the relationships?

There is a formal process to do that in database modeling. It's called
normalization. It means applying a set of rules to the data so that you
group the attributes in such a way that the relationships work. It's not
really that complicated, but it is a formulaic approach. If you prefer to
use that approach, get any good book on databases, look up
"normalization" and follow the steps.

We'll do normalization using the intuitive approach - work with the data
until it "feels" OK. This could also be called prototyping - create a
working model of the database that is close to what you want and keep
improving it until it works perfectly, then put it into production.

However, whatever the approach taken, there are some basic rules that
have to be adhered to. The rules apply to any relational database and
cannot be broken. They can't even be stretched. Think of them as the
Prime Directives. The rules are:

   1. Every table must have a primary key - an attribute or combination
of attributes that uniquely identifies every occurrence in the table.

   2. The primary key can never contain an empty or Null value. That
makes sense - if you had 2 that were empty, they wouldn't be unique
anymore.

   3. Every attribute of every occurrence in the table can contain only
one value. Think of the Employee table as a grid. Every occurrence, or
line, represents one employee and every column is an attribute. So,
every employee can only have one ID and one First-name and one
Last-name, and so on.
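
The first two rules can be tried out with a short sketch. It uses Python's sqlite3 module rather than MS-Access, and the employee table and its columns are assumptions made for the example. NOT NULL is declared explicitly on the key because SQLite, unlike most databases, does not imply it for every primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical Employee table. NOT NULL is spelled out on the key because
# SQLite tolerates NULLs in most PRIMARY KEY columns unless told otherwise.
conn.execute("""
    CREATE TABLE employee (
        emp_id     TEXT NOT NULL PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name  TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO employee VALUES ('E1', 'Ann', 'Lee')")


def rejected(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Rule 1: a second row with the same key is refused.
duplicate_key_rejected = rejected("INSERT INTO employee VALUES ('E1', 'Bob', 'Ray')")
# Rule 2: a row with no key value is refused.
null_key_rejected = rejected("INSERT INTO employee VALUES (NULL, 'Cy', 'Orr')")

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The third rule is a design discipline rather than something this table definition enforces: nothing stops you from typing two names into one text column, so it is up to the designer to give each value its own column or row.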
...............................................................................

The formal classifications describing the level of database normalization in a data model are called Normal Forms (NF), and the process of bringing a data model into these forms is called normalization.
First normal form

First normal form (1NF) lays the groundwork for an organised database design:
  • Ensure that each table has a primary key: a minimal set of attributes which can uniquely identify a record.
  • Eliminate repeating groups (categories of data which are needed a different number of times on different records) by defining keyed and non-keyed attributes appropriately.
  • Atomicity: Each attribute must contain a single value, not a set of values.
  • First normal form rests on the idea of functional dependency, f(x)=y: for every value of x (the key) there is exactly one value of y (each other attribute).
Second normal form
Second normal form (2NF) requires that, if a table has a composite key, all non-key attributes be related to the whole key:
  • The database must meet all the requirements of the first normal form.
  • The relational schema should not have any partial functional dependency, i.e. no proper subset of the primary key should functionally determine a non-key attribute of the same schema. For example, consider the functional dependencies FD: {AB->C, A->D, C->D}, where AB is the primary key. Because A->D holds and A is a proper subset of the key, this relational schema is not in 2NF.
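
The example above can be checked mechanically. The sketch below computes attribute closures for the FD set {AB->C, A->D, C->D} and reports any proper subset of the key that determines a non-key attribute. The function names are illustrative, and the check assumes AB is the only candidate key, as in the example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD once its whole left side is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, fds):
    """List (subset, attribute) pairs where a proper subset of `key`
    determines an attribute outside the key."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) - set(key)):
                found.append(("".join(subset), attr))
    return found


# Each FD is written as (left side, right side), one letter per attribute.
fds = [("AB", "C"), ("A", "D"), ("C", "D")]
violations = partial_dependencies("AB", fds)
```

Here violations contains the single pair ("A", "D"), matching the reasoning in the text: A alone determines D, so the schema is not in 2NF.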


Third normal form
Third normal form (3NF) requires that data stored in a table be dependent only on the primary key, and not on any other field in the table.
  • The database must meet all the requirements of the first and second normal form.
  • All non-key fields must be directly dependent on the primary key. Any field which is dependent on a non-key field which is in turn dependent on the primary key (i.e. a transitive dependency) is moved out to a separate database table.


Boyce-Codd normal form
Boyce-Codd normal form (or BCNF) requires that the determinant (left-hand side) of every non-trivial functional dependency be a superkey, that is, a superset of a candidate key.
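
As a rough sketch of this test, the code below reuses the FD set from the 2NF example, {AB->C, A->D, C->D} over attributes A, B, C and D, and lists every dependency whose left side is not a superkey. The helper names are made up for illustration.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the FDs that are non-trivial and whose left side is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        # A superkey is any attribute set whose closure covers the whole table.
        is_superkey = closure(lhs, fds) == set(all_attrs)
        if not trivial and not is_superkey:
            bad.append((lhs, rhs))
    return bad


fds = [("AB", "C"), ("A", "D"), ("C", "D")]
bad = bcnf_violations("ABCD", fds)
```

AB->C passes because AB determines every attribute, while A->D and C->D are reported: neither A nor C is a superkey.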

     
  
        
                