Normalization
1NF scopes to no relationship, 2NF & 3NF to one-to-many relationship, and 4NF &
5NF to many-to-many relationship.
Every column must depend on the Key (1NF), the whole Key (2NF) and nothing but
the key (3NF)
Functional Dependency: X is said to be functionality dependent on Y if the value
of X is always the same for a given value of Y. For example, employee name is functionality
dependent on employee ID.
Transistive Dependency: A fact about a non-key field. For example, if the
employee table include employee name, extension and office and and building
table includes extension and office, then we could get extension from
employee.office => building.extension as well as through employee.extension (i.e
we'd have to update the phone number in two places). In this example,
employee.extension is trasistive because its actually a fact associated about
employee.office not employee.name.
- First Form (1NF) : PKs and no CSV
- All columns are scalar (i.e. no column contains multiple values ('phone1, phone2,
phone3') and no columns are repeated (phone1, phone2, phone3))
- Unquie Primary Key per record (can be composite, e.g. customer number + order number)
- Example: {CustomerID, OrderId, CustomerAddress, OrderDate}
- Second Form (2NF): Add FKs
As above + "No functional dependencies on a non-key attribute" (i.e.
every non-key column must be a fact about the entity identified by the whole of
a multi-column primary key. If you can get the fact using individual columns of
the PK rather than the whole thing. its not 2NF). Obviously this rule only
applies to tables with composite primary keys. For example, if you employee
table has cubical number and office phone number in it, 2NF doesn't require a
seperate table because the information isn't duplicated (provided you don't have
employees sharing cubes and phones).
- Third Form (3NF): Eliminate transistive dependencies
As above + no column is functionaly dependent on anything but the primary key.
For example, while an employee record with both office number and office phone
number in it passes the 2NF test because both items are functionally dependend
on the employee ID. [Note that we could have two employees working in the same
office and sharing the same phone and it would still be 2NF.].However office
phone number is also functionally dependent on the office number. Moving office
phone out to a new table called office solves this.
- Boyce-Codd Normal Format (BCNF))
As above + no overlapping candidate keys (overlapping = share at least field).
For example, if both SupplierID and SupplierName are candidate keys, supplier
name would need to be moved out to its own table.
- Fourth Form (4NF))
As 3NF above + no multi-value dependencies. For example, if
students attend courses and courses are taught by teachers but there is no
connection between student and teacher other than the student attends one of
their courses the 4NF form would be {StudentId, ClassId} + {ClassId, TeacherId}.
It would not be {StudentId, CourseId, TeacherId} is not 4NF as
that implies a relation between StudentId and TeacherId that isn't there. The
- Fifth Form (5NF)
Fifth normal form occurs when a table can not be deconstructed into a series of
smaller tables without repeating a key. For example:
Sales Person
|
Washing Machine
|
Energy Start Rating
|
Manufacturer
|
Dave
|
FastSpin 2000
|
4
|
FastSpin Inc
|
Dave
|
FastSpin 4000
|
5
|
FastSpin Inc
|
John
|
FastSpin 2000
|
4
|
FastSpin Inc
|
John
|
McWasher
|
3
|
DJB Inc
|
John
|
Tumbler J
|
1
|
DJB Incc
|
can be broken down into
Washing Machine
|
Energy Start Rating
|
Manufacturer
|
FastSpin 2000
|
4
|
FastSpin Inc
|
FastSpin 4000
|
5
|
FastSpin Inc
|
McWasher
|
3
|
DJB Inc
|
Tumbler J
|
1
|
DJB Inc
|
+
Sales Person
|
Manufacturer
|
Dave
|
FastSpin Inc
|
John
|
FastSpin
|
John
|
DJB Incc
|
+
Sales Person
|
Washing Machine
|
Dave
|
FastSpin 2000
|
Dave
|
FastSpin 4000
|
John
|
FastSpin 2000
|
John
|
McWasher
|
John
|
Tumbler J
|
- Note that the requirement is a series of smaller tables, not that the overall size
be smaller
- Fifth form normalization ensures that facts are not recorded more than once. For
example, in the above example the original structure recorded that Dave sells items
from FastSpin Inc more than once.
- Note this breakdown would be incorrect if there was an additional restrictions (such
as washing machines can only be sold by a specific manufactor, or a sales persons
can sell any machine from a manufacturer they are associated with)
- 5th Normal Form is considered academic because the excessive number of joins required
make it unsuitable for real-world use.
- Sixth Form (6NF)
- Academic
- Something to do with temporal?