In the relational data model a superkey is any set of attributes that uniquely identifies each tuple of a relation.[1][2] Because superkey values are unique, tuples with the same superkey value must also have the same non-key attribute values. That is, non-key attributes are functionally dependent on the superkey.

The set of all attributes is always a superkey (the trivial superkey). Tuples in a relation are by definition unique, with duplicates removed after each operation, so the set of all attributes is always uniquely valued for every tuple. A candidate key (or minimal superkey) is a superkey that can't be reduced to a simpler superkey by removing an attribute.[3]

For example, in an employee schema with attributes employeeID, name, job, and departmentID, if employeeID values are unique then employeeID combined with any or all of the other attributes can uniquely identify tuples in the table. Each combination, {employeeID}, {employeeID, name}, {employeeID, name, job}, and so on is a superkey. {employeeID} is a candidate key, since no subset of its attributes is also a superkey. {employeeID, name, job, departmentID} is the trivial superkey.

If attribute set K is a superkey of relation R, then at all times it is the case that the projection of R over K has the same cardinality as R itself.

Example

English Monarchs
Monarch Name Monarch Number Royal House
Edward II Plantagenet
Edward III Plantagenet
Richard III Plantagenet
Henry IV Lancaster

First, list out all the sets of attributes:

• {}
• {Monarch Name}  
• {Monarch Number}  
• {Royal House}
• {Monarch Name, Monarch Number}
• {Monarch Name, Royal House}
• {Monarch Number, Royal House}
• {Monarch Name, Monarch Number, Royal House}

Second, eliminate all the sets which do not meet superkey's requirement. For example, {Monarch Name, Royal House} cannot be a superkey because for the same attribute values (Edward, Plantagenet), there are two distinct tuples:

Finally, after elimination, the remaining sets of attributes are the only possible superkeys in this example:

In reality, superkeys cannot be determined simply by examining one set of tuples in a relation. A superkey defines a functional dependency constraint of a relation schema which must hold for all possible instance relations of that relation schema.

See also

References

  1. ^ Date, Christopher (2015). "Codd's First Relational Papers: A Critical Analysis" (PDF). warwick.ac.uk. Retrieved 2020-01-04. Note that the extract allows a "relation" to have any number of primary keys, and moreover that such keys are allowed to be "redundant" (better: reducible). In other words, what the paper calls a primary key is what later (and better) became known as a superkey, and what the paper calls a nonredundant (better: irreducible) primary key is what later became known as a candidate key or (better) just a "key".
  2. ^ Introduction to Database Management Systems. Tata McGraw-Hill. 2005. p. 77. ISBN 9780070591196. no two tuples in any legal relation
  3. ^ Saiedian, H. (1996-02-01). "An Efficient Algorithm to Compute the Candidate Keys of a Relational Database Schema". The Computer Journal. 39 (2): 124–132. doi:10.1093/comjnl/39.2.124. ISSN 0010-4620.

Further reading