# Databases: Functional Dependencies and Candidate Keys

### Theory

A Functional Dependency is a constraint between two sets of attributes in a relation

\[X \rightarrow Y

\]

such that the value of Y is always determined by the value of X. For instance,

author_id -> author_name // correct author_name -> author_id // wrong

The axioms for FDs are:

- Reflexivity: If Y is a subset of X, then X → Y
- Augmentation: If X → Y, then XZ → YZ
- Transitivity: If X → Y and Y → Z, then X → Z

which can be used to prove the following rules:

- Union: If X → Y and X → Z, then X → YZ
- Decomposition: If X → YZ, then X → Y and X → Z
- Pseudotransitivity: If X → Y and WY → Z, then WX → Z

Now some definitions:

**Super key**: a set of attributes X in a relation R that determines*all the attributes*of the relation**Minimal key**: a super key such that if you remove even one attribute from it, it is not a key anymore- The
**closure of a set of attributes**is the set of all the attributes that can be inferred from a first attribute - The
**closure of a set of FDs**is the set of all the FDs that can be inferred from a first FD.*Two sets of FDs are equivalent if their closures are equivalent* - A
**Minimal Cover**of a set of FDs is the*smallest set of FDs needed to determine all the FDs*in the relation - A
**Prime Attribute**is an attribute which is*part of a super key*.

### Computing the Minimal Cover

Given this set of FDs for a relation **R**(A, B, C, D, E, F, G, H)

AB -> DEH, BEF -> A, FGH -> C, D -> EG, EG -> BF, F -> BH

*1) Rewrite all the FDs with only one attribute on the RHS*

AB -> D, AB -> E, AB -> H, BEF -> A, FGH -> C, D -> E, D -> G, EG -> B, EG -> F, F -> B, F -> H

*2) Consider if any aggregate on the LHS can be derived from something else*

Since F -> B, BEF -> A becomes EF -> A Since F -> H, FGH -> C becomes FG -> C

The new set is therefore

AB -> D, AB -> E, AB -> H, EF -> A, FG -> C, D -> E, D -> G, EG -> B, EG -> F, F -> B, F -> H

*3) Apply transitifity and simplify.*

My favourite way to do so is: for every equal LHS, pick one, develop the full tree, see if in the tree you hit the RHS of another FD in the same group; if so, the latter FD is redundant. Repeat for every FD in the group.

AB -> D, D -> EG, EG -> BF, F -> BH thus AB -> E and AB -> H are both redundant; D -> E (end) D -> G (end) EG -> B (end) EG -> F, F -> BH thus EG -> B is redundant F -> B (end) F -> H (end)

The **minimal cover** is therefore

AB -> D, EF -> A, FG -> C, D -> E, D -> G, EG -> F, F -> B, F -> H

### Minimal Candiate Keys

An important fact to remember is that

- If an attributes appears
**only**on the**RHS**, it is**non-prime** - If an attributes appears
**only**on the**LHS**, it is**prime**and**part of every key**

Which is kinda obvious: if X can be only determined by something else, it cannot be part of a candidate key; and if X is not determined by anything else then it must be at least part of a key. So the first thing to do is figure out which are the non-prime attributes. Then we fiddle with the rest of the attributes until we find a candidate key for *everyone of them*. If one is left out it means we haven’t found all the candidate keys.

Let’s the fiddle start.

We spot immediately that *C, H* are the non-prime attributes.

Then, we compute the closure for any given single attribute on the LHS of the FDs and see if we got a key.

D+ = ABCDEFGH (candidate key) F+ = BFH (not)

Every attribute that leads to a candidate key is also a key

Since AB -> D, AB is a candidate key

Finally, every attributes that leads to a prime attribute might also be a key

// looking for A or B on the RHS to recreate AB F -> B, therefore AF is a ck EF -> A, F -> B therefore EF is a ck // looking for E or F on the RHS to recreate EF EG -> F, therefore EG is a ck // looking for G on the RHS to recrete EG // only D leads to G, D already a key. Search is over.