In the relational model, a candidate key of a relation is a set of attributes such that (1) every attribute of the relation is functionally dependent on this set and (2) no proper subset of the set satisfies (1). Because a superkey is defined as a set of attributes for which (1) holds, a candidate key is often defined equivalently as a minimal superkey.
Candidate keys are important because they tell us how individual tuples in a relation can be identified. As such, they are one of the most important kinds of database constraint to specify when designing a database schema. Since a relation is a set and therefore contains no duplicate tuples, every relation has at least one candidate key. In some RDBMSs, however, tables may also represent multisets (which strictly means that these DBMSs are not relational), so it is an important design rule to declare at least one candidate key explicitly for each relation. For practical reasons, RDBMSs usually require that one of the candidate keys of each relation be declared as the primary key, meaning that it is the preferred way to identify individual tuples. Foreign keys, for example, are usually required to reference such a primary key rather than any of the other candidate keys.
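As a minimal sketch of declaring candidate keys explicitly, the following Python snippet uses the standard `sqlite3` module. The `employee` table, its columns and the helper `try_insert` are hypothetical names chosen for illustration; the sketch assumes that both `emp_id` and `email` identify an employee uniquely, so one is declared as the primary key and the other as a `UNIQUE` constraint.

```python
import sqlite3

# Hypothetical schema: emp_id and email are both assumed to be candidate keys.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER NOT NULL,
        email  TEXT    NOT NULL,
        name   TEXT    NOT NULL,
        PRIMARY KEY (emp_id),  -- candidate key chosen as the primary key
        UNIQUE (email)         -- the other candidate key
    )
    """
)
conn.execute("INSERT INTO employee VALUES (1, 'ann@example.org', 'Ann')")


def try_insert(row):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


ok_new = try_insert((2, "bob@example.org", "Bob"))           # new key values
ok_dup_id = try_insert((1, "carl@example.org", "Carl"))      # repeats emp_id 1
ok_dup_email = try_insert((3, "ann@example.org", "Ann B."))  # repeats Ann's email
```

With both keys declared, the table cannot hold two rows that agree on `emp_id` or on `email`, so in this sketch only the first of the three attempted inserts is accepted.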
Consider the following relation:
| A | B | C | D |
|---|---|---|---|
| a1 | b1 | c1 | d1 |
| a1 | b2 | c2 | d1 |
| a2 | b1 | c2 | d1 |
For this relation we find the following 8 superkeys: {A,B}, {A,C}, {B,C}, {A,B,C}, {A,B,D}, {A,C,D}, {B,C,D}, {A,B,C,D}. No single attribute is a superkey, because each of A, B, C and D takes the same value in at least two tuples. Because {A,B,C}, {A,B,D}, {A,C,D}, {B,C,D} and {A,B,C,D} each have a proper subset that is also a superkey, only the sets {A,B}, {A,C} and {B,C} are candidate keys.
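The enumeration above can be checked mechanically. The following Python sketch treats the three tuples shown as the only relation value under consideration (an assumption, since a key is properly a constraint on every legal value of the relation) and tests each non-empty attribute set by projecting the tuples onto it. The names `is_superkey` and `all_subsets` are illustrative.

```python
from itertools import combinations

attributes = ("A", "B", "C", "D")
rows = [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b2", "c2", "d1"),
    ("a2", "b1", "c2", "d1"),
]


def is_superkey(attrs):
    """True if projecting the rows onto attrs yields no duplicate tuples."""
    idx = [attributes.index(a) for a in attrs]
    projected = {tuple(row[i] for i in idx) for row in rows}
    return len(projected) == len(rows)


def all_subsets(attrs):
    """Yield every non-empty subset of attrs as a frozenset."""
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            yield frozenset(combo)


superkeys = [s for s in all_subsets(attributes) if is_superkey(s)]

# A candidate key is a superkey with no proper subset that is also a superkey.
candidate_keys = [s for s in superkeys if not any(t < s for t in superkeys)]

for key in candidate_keys:
    print(sorted(key))
```

This brute-force check examines all 15 non-empty subsets, which is fine for four attributes but grows exponentially with the number of attributes.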