| Database normalization is a technique for designing
relational database tables to minimize duplication of
information and, in so doing, to safeguard the database
against certain types of logical or structural problems,
namely data anomalies.
For example, when multiple instances of a given piece of
information occur in a table, the possibility exists that
these instances will not be kept consistent when the data
within the table is updated, leading to a loss of data
integrity. A table that is sufficiently normalized is less
vulnerable to problems of this kind, because its structure
reflects the basic assumptions for when multiple instances
of the same information should be represented by a single
instance only.
1. Data is generally normalized in OLTP systems. The normal forms
are 1NF, 2NF, 3NF, BCNF, 4NF, 5NF, DKNF and 6NF.
2. In OLAP, data warehouse and DSS systems, data is
generally denormalized (N1NF, PNF).
There is always a trade-off to consider between data
redundancy and performance.
First normal form:
· A table is in first normal form when it contains no
repeating groups.
· The repeating columns or fields in an unnormalized table
are removed from the table and put into tables of their own.
· Such a table becomes dependent on the parent table from
which it is derived.
· The key of this new table is called a concatenated key, and the
key of the parent table forms part of it.
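
The steps above can be sketched in a few lines of Python. This is only an illustration: the order data, the field names and the helper `to_first_normal_form` are hypothetical and do not come from the original text.

```python
# Hypothetical unnormalized rows: each order carries a repeating group of items.
unnormalized = [
    {"order_id": 1, "customer": "Ann", "items": ["pen", "ink"]},
    {"order_id": 2, "customer": "Bob", "items": ["paper"]},
]


def to_first_normal_form(rows):
    """Move the repeating group into a child table of its own.

    The child table is keyed by the concatenated key (order_id, line_no),
    so the parent key forms part of the child key.
    """
    orders, order_items = [], []
    for row in rows:
        orders.append({"order_id": row["order_id"], "customer": row["customer"]})
        for line_no, item in enumerate(row["items"], start=1):
            order_items.append(
                {"order_id": row["order_id"], "line_no": line_no, "item": item}
            )
    return orders, order_items


orders, order_items = to_first_normal_form(unnormalized)
```

Each row of `order_items` now holds a single item, and the pair (order_id, line_no) identifies it uniquely.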
Second normal form:
· A table is in second normal form if all its non-key fields
are fully dependent on the whole key.
· This means that each field in a table must depend on the
entire key.
· Fields that do not depend on the whole combination key are
moved to another table on whose key they do depend.
· Structures which do not contain combination keys are
automatically in second normal form.
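
As a rough sketch of that move, assume a hypothetical enrolment table keyed by the combination (student_id, course_id), in which course_title depends on course_id alone. The data and the helper `to_second_normal_form` are made up for illustration.

```python
# Hypothetical rows keyed by (student_id, course_id).
# course_title depends only on course_id, i.e. on part of the key.
enrolments = [
    {"student_id": 1, "course_id": "DB1", "course_title": "Databases", "grade": "A"},
    {"student_id": 2, "course_id": "DB1", "course_title": "Databases", "grade": "B"},
    {"student_id": 1, "course_id": "OS1", "course_title": "Operating Systems", "grade": "C"},
]


def to_second_normal_form(rows):
    """Move the partially dependent field into a table keyed by course_id."""
    courses = {}  # course_id -> course_title, stored once per course
    grades = []   # fields that depend on the whole key stay here
    for row in rows:
        courses[row["course_id"]] = row["course_title"]
        grades.append({k: row[k] for k in ("student_id", "course_id", "grade")})
    return courses, grades


courses, grades = to_second_normal_form(enrolments)
```

The title "Databases" is now stored once instead of once per enrolled student.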
Third normal form:
· A table is said to be in third normal form if all the
non-key fields of the table are independent of all other
non-key fields of the same table.

It is possible to start creating
the database at this point. It's just a question of creating a new table
for every entity identified in the diagram. We'll be using MS-Access to
do that shortly. But how do you code the relationships?

There is a formal process to do that in database modeling. It's called
normalization. It means applying a set of rules to the data so that you
group the attributes in such a way that the relationships work. It's not
really that complicated, but it is a formulaic approach. If you prefer to
use that approach, get any good book on databases, look up
"normalization" and follow the steps.

We'll do normalization using the intuitive approach: work with the data
until it "feels" OK. This could also be called prototyping: create a
working model of the database that is close to what you want and keep
improving it until it works perfectly, then put it into production.

However, whatever the approach taken, there are some basic rules that
have to be adhered to. The rules apply to any relational database and
cannot be broken. They can't even be stretched. Think of them as the
Prime directives. The rules are:
1. Every table must have a primary key: an
attribute or combination of attributes that uniquely identifies every
occurrence in the table.
2. The primary key can never contain
an empty or Null value. That makes sense: if you had 2 that were empty,
they wouldn't be unique anymore.
3. Every attribute of every
occurrence in the table can contain only one value. Think of the Employee
table as a grid. Every occurrence, or line, represents one employee and
every column is an attribute. So, every employee can only have one ID
and one First-name and one Last-name, and so on.

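
To see the first two rules enforced by a database engine, here is a minimal sketch using Python's built-in `sqlite3` module in place of MS-Access. The table layout, the sample names and the helper `try_insert` are assumptions made for this example. NOT NULL is spelled out because SQLite does not imply it for a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " employee_id TEXT NOT NULL PRIMARY KEY,"  # rules 1 and 2
    " first_name TEXT,"
    " last_name TEXT)"
)
conn.execute("INSERT INTO employee VALUES ('E1', 'Ada', 'Lovelace')")


def try_insert(row):
    """Return True if the row is accepted, False if a key rule rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert(("E1", "Alan", "Turing"))  # key already in use
null_ok = try_insert((None, "Grace", "Hopper"))      # Null key
new_ok = try_insert(("E2", "Edgar", "Codd"))         # fresh, non-null key
```

Only the last insert succeeds; the engine rejects the duplicate key and the Null key.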
The formal classifications describing the level of database normalization in a data model are called Normal Forms (NF), and the process of bringing a data model into these forms is called Normalization.
First normal form
- First normal form (1NF) lays the groundwork for an organised database design:
- Ensure that each table has a primary key: a minimal set of attributes that uniquely identifies a record.
- Eliminate repeating groups (categories of data that are needed a different number of times on different records) by defining keyed and non-keyed attributes appropriately.
- Atomicity: Each attribute must contain a single value, not a set of values.
- The normal forms build on the idea of a functional dependency, which can be read as f(x)=y: every value of x determines exactly one value of y.
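
The f(x)=y reading can be checked against sample data with a small helper. This is a sketch only: the function `fd_holds` and the employee rows are hypothetical. Sample data can disprove a dependency but cannot prove that it holds for every future row.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in rows, equal lhs values always come with equal rhs values."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


rows = [
    {"emp_id": 1, "dept": "Sales", "city": "Oslo"},
    {"emp_id": 2, "dept": "Sales", "city": "Oslo"},
    {"emp_id": 3, "dept": "IT", "city": "Oslo"},
]

id_determines_dept = fd_holds(rows, ["emp_id"], ["dept"])
city_determines_dept = fd_holds(rows, ["city"], ["dept"])
```

Here emp_id determines dept, while city does not, because "Oslo" appears with two different departments.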
Second normal form
- Second normal form (2NF): if a table has a composite key, all non-key attributes must depend on the whole key:
- The database must meet all the requirements of the first normal form.
- The relational schema should not have any partial functional dependency, i.e. no proper subset of the primary key should functionally determine a non-key attribute. For example, consider the functional dependencies FD: {AB->C, A->D, C->D}, where AB is the primary key. Because A->D holds and A is a proper subset of the key, this relational schema is not in 2NF.
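
The AB->C, A->D, C->D example can be checked mechanically. The sketch below computes the attribute closure of each proper subset of the key and reports any non-key attribute it reaches. The helper names `closure` and `partial_dependencies` are chosen for this illustration, and AB is assumed to be the only candidate key.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            dependents = closure(subset, fds) - set(key)
            if dependents:
                found["".join(subset)] = sorted(dependents)
    return found


# Each dependency is a (left side, right side) pair of attribute names.
fds = [("AB", "C"), ("A", "D"), ("C", "D")]
violations = partial_dependencies("AB", fds)
```

The result flags A as determining D on its own, which is the partial dependency that keeps the schema out of 2NF.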
Third normal form
- Third normal form (3NF) requires that data stored in a table be dependent only on the primary key, and not on any other field in the table.
- The database must meet all the requirements of the first and second normal forms.
- All fields must be directly dependent on the primary key field. Any field that depends on a non-key field, which in turn depends on the primary key (i.e. a transitive dependency), is moved out to a separate database table.
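
A sketch of the result of that move, using Python's built-in `sqlite3` module: the department name depends on the department, not directly on the employee, so it lives in its own table. The table names, column names and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- dept_name depends on dept_id, so it is kept out of the employee table.
    CREATE TABLE department (
        dept_id   TEXT NOT NULL PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  TEXT NOT NULL PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id TEXT NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES ('D1', 'Sales');
    INSERT INTO employee VALUES ('E1', 'Ann', 'D1'), ('E2', 'Bob', 'D1');
""")

# Renaming the department touches exactly one row, however many employees it has.
cur = conn.execute("UPDATE department SET dept_name = 'Retail' WHERE dept_id = 'D1'")
rows_changed = cur.rowcount

# A join rebuilds the combined view whenever it is needed.
report = conn.execute(
    "SELECT e.name, d.dept_name FROM employee e"
    " JOIN department d ON d.dept_id = e.dept_id ORDER BY e.emp_id"
).fetchall()
```

Had dept_name been stored on every employee row, the same rename would have required updating each of those rows consistently.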
Boyce-Codd normal form
- Boyce-Codd normal form (BCNF) requires that every non-trivial functional dependency have a superkey (a superset of a candidate key) as its determinant.
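
The BCNF condition can be tested by checking, for each dependency, whether its left side determines the whole schema. The sketch below uses a textbook-style schema ABC with AB->C and C->B. The schema and the helper names `closure` and `bcnf_violations` are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the non-trivial dependencies whose left side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        is_superkey = closure(lhs, fds) >= set(schema)
        if not trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


# Each dependency is a (left side, right side) pair of attribute names.
fds = [("AB", "C"), ("C", "B")]
violations = bcnf_violations("ABC", fds)
```

AB is a superkey, so AB->C passes. C is not, so C->B is reported as the violation.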
|