There are three sources of modification anomalies in SQL These are defined as first, second, and third normal forms (1NF, 2NF, 3NF). These normal forms act as remedies to modification anomalies.
First normal form
DATABASE DESIGN: NORMALIZATION NOTE & EXERCISES (Up to 3NF) Tables that contain redundant data can suffer from update anomalies, which can introduce inconsistencies into a database. The rules associated with the most commonly used normal forms, namely first (1NF), second (2NF), and third (3NF). Third-Normal Form (3NF) In our example above, we have unitCode as our primary key, we also have a courseName that is dependent on courseCode and courseCode, dependent on unitCode. Though couseName could be dependent on unitCode it more dependent on courseCode, therefore it is transitively dependent on unitCode.
To be in first normal form (1NF), a table must have the following qualities:
The table is two-dimensional with rows and columns.
Each row contains data that pertains to some thing or portion of a thing.
Each column contains data for a single attribute of the thing it’s describing.
Each cell (intersection of a row and a column) of the table must have only a single value.
Entries in any column must all be of the same kind. If, for example, the entry in one row of a column contains an employee name, all the other rows must contain employee names in that column, too.
Each column must have a unique name.
No two rows may be identical (that is, each row must be unique).
The order of the columns and the order of the rows are not significant.
A table (relation) in first normal form is immune to some kinds of modification anomalies but is still subject to others. The SALES table is in first normal form, and the table is subject to deletion and insertion anomalies. First normal form may prove useful in some applications but unreliable in others.
Second normal form
To appreciate second normal form, you must understand the idea of functional dependency. A functional dependency is a relationship between or among attributes. One attribute is functionally dependent on another if the value of the second attribute determines the value of the first attribute. If you know the value of the second attribute, you can determine the value of the first attribute.
Suppose, for example, that a table has attributes (columns) StandardCharge, NumberOfTests, and TotalCharge that relate through the following equation:
TotalCharge is functionally dependent on both StandardCharge and NumberOfTests. If you know the values of StandardCharge and NumberOfTests, you can determine the value of TotalCharge.
Every table in first normal form must have a unique primary key. That key may consist of one or more than one column. A key consisting of more than one column is called a composite key. To be in second normal form (2NF), all non-key attributes must depend on the entire key. Thus, every relation that is in 1NF with a single attribute key is automatically in second normal form.
If a relation has a composite key, all non-key attributes must depend on all components of the key. If you have a table where some non-key attributes don’t depend on all components of the key, break the table up into two or more tables so that — in each of the new tables — all non-key attributes depend on all components of the primary key.
Sound confusing? Look at an example to clarify matters. Consider the SALES table. Instead of recording only a single purchase for each customer, you add a row every time a customer buys an item for the first time. An additional difference is that charter customers (those with Customer_ID values of 1001 to 1007) get a discount off the normal price.
Customer_ID does not uniquely identify a row. In two rows, Customer_ID is 1001. In two other rows, Customer_ID is 1010. The combination of the Customer_ID column and the Product column uniquely identifies a row. These two columns together are a composite key.
If not for the fact that some customers qualify for a discount and others don’t, the table wouldn’t be in second normal form, because Price (a non-key attribute) would depend only on part of the key (Product). Because some customers do qualify for a discount, Price depends on both CustomerID and Product, and the table is in second normal form.
Third normal form
Tables in second normal form are especially vulnerable to some types of modification anomalies — in particular, those that come from transitive dependencies.
A transitive dependency occurs when one attribute depends on a second attribute, which depends on a third attribute. Deletions in a table with such a dependency can cause unwanted information loss. A relation in third normal form is a relation in second normal form with no transitive dependencies.
Look again at the SALES table, which you know is in first normal form. As long as you constrain entries to permit only one row for each Customer_ID, you have a single-attribute primary key, and the table is in second normal form. However, the table is still subject to anomalies. What if customer 1010 is unhappy with the chlorine bleach, for example, and returns the item for a refund?
You want to remove the third row from the table, which records the fact that customer 1010 bought chlorine bleach. You have a problem: If you remove that row, you also lose the fact that chlorine bleach has a price of $4. This situation is an example of a transitive dependency. Price depends on Product, which, in turn, depends on the primary key Customer_ID.
Breaking the SALES table into two tables solves the transitive dependency problem. The two tables make up a database that’s in third normal form.
1NF vs 2NF vs 3NF
Normalization is a process that is carried out to minimize the redundancies that are present in data in relational databases. This process will mainly divide large tables in to smaller tables with fewer redundancies. These smaller tables will be related to each other through well defined relationships. In a well normalized database, any alteration or modification in data will requires modifying only a single table. First normal form (1NF), Second normal form (2NF) and the Third Normal Form (3NF) was introduced by Edgar F. Codd, who is also the inventor of the relational model and the concept of normalization.
What is 1NF?
1NF is the First normal form, which provides the minimum set of requirements for normalizing a relational database. A table that complies with 1NF assures that it actually represents a relation (i.e. it does not contain any records that are repeating), but there is no universally accepted definition for 1NF. One important property is that a table that comply with 1NF could not contain any attributes that are relational valued (i.e. all the attributes should have atomic values).
What is 2NF?
2NF is the Second normal form used in relational databases. For a table to comply with 2NF, it should be complied with 1NF and any attribute that is not a part of any candidate key (i.e. non-prime attributes) should fully depend on any of the candidate keys in the table.
What is 3NF?
3NF is the Third normal form used in relational database normalization. According to the Codd’s definition, a table is said to be in 3NF, if and only if ,that table is in the second normal form (2NF), and every attribute in the table that do not belong to a candidate key, should directly depend on every candidate key of that table. In 1982 Carlo Zaniolo produced a differently expressed definition for 3NF. Tables that comply with the 3NF generally do not contain anomalies that occur when inserting, deleting or updating records in the table.
What is the difference between 1NF and 2NF and 3NF?
1NF, 2NF and 3NF are normal forms that are used in relational databases to minimize redundancies in tables. 3NF is considered as a stronger normal form than the 2NF, and it is considered as a stronger normal form than 1NF. Therefore in general, obtaining a table that complies with the 3NF form will require decomposing a table that is in the 2NF. Similarly, obtaining a table that complies with the 2NF will require decomposing a table that is in the 1NF. However, if a table that complies with 1NF contains candidate keys that are only made up of a single attribute (i.e. non-composite candidate keys), such a table would automatically comply with 2NF. Decomposition of tables will result in additional join operations (or Cartesian products) when executing queries. This will increase the computational time. On the other hand, the tables that comply with stronger normal forms would have fewer redundancies than tables that only comply with weaker normal forms.