Definitions of Key, Primary Key, Unique Key, and Foreign Key
In this article, we will discuss the concept of data blending, which involves various examples of data analysis and visualization processes, primarily in Tableau, R, and Looker Studio. Before we dive into these topics, however, I would like to address the concept of a key.
A Key in a database can be defined as one or more fields that qualify a row. It is used to uniquely identify a row or to ensure that a row is only defined once. A key can be defined as a special type, such as a Unique Key, Primary Key, or Foreign Key, each serving a different purpose.
Other key types that may be encountered include Super Key, Candidate Key, Alternate Key, and Composite/Compound Key. However, it may be more appropriate to discuss these types in another article. Note that in examples and explanations, abbreviations such as PRI (Primary Key), UNI (Unique Key), ALT (Alternate Key) may be used for keys, and additional information may be included in parentheses.
Let's move on to the descriptions of keys and their characteristics.
Unique Key (UNI)
A Unique Key is a key that can only have one value in a table. The value cannot be repeated, and the key serves a similar function to a Primary Key. However, the main difference is that a Unique Key can have a
Primary Key (PRI)
A Primary Key is a key that serves as an identifier for each row in a table. It is a constraint that ensures that values are not duplicated and are unique. To ensure that the values of each row do not mix, the information in this field should not be repeated. Its primary function is to determine the order in which data is arranged in the table or to determine where the data will be updated.
It is usually used as a single field (e.g., id, user_id, email, username, national_identification_number, etc.), but it can also be created by combining multiple fields. Generally, numbers are selected as primary keys, but this is not mandatory. A primary key value cannot be left blank and cannot be assigned a
NULL value. In relational database management systems, a table must have a primary key. Note that while every primary key is unique, not every unique key is a primary key. As mentioned under the Unique Key heading,
NULL can be used in Unique Keys. Let's clarify the differences under this heading:
- More than one field can be defined as a single Primary Key in a table. However, the primary key structure can only be used once in each table. There is no such restriction for Unique Keys.
NULLis not allowed in Primary Keys, but it is allowed in Unique Keys.
- An INDEX definition is created on the Primary Key in the data table, and a unique definition is made for each record. In the case of Unique Keys, it is determined whether the values are unique. Of course, unique keys can also be defined as INDEX, but this will only be a definition.
Foreign Key (FK)
Foreign Key is also referred to as a secondary key. It is used to link the values that can be entered in one table with the fields in another table. In short, it is the presence of the primary key of another table in a table. It is mostly used to link a parent table with a child table. This helps to prevent possible data duplicates and ensures that related data is updated everywhere in case of future updates.
To make the above explanation clearer, let's create two example tables and examine their relationship1.
|PRIMARY KEY||UNIQUE KEY||FOREIGN KEY|
Let's name the above table as Table 1. It contains our unique keys in the ID column, and each unique ID value identifies its corresponding row. Every time a new row is added to Table 1, the ID column receives a new value (usually one more than the previous value; auto increment). Let's name the table below as Table 2.
In Table 2, our unique IDs are located in the
DeptID column, and each new row added to Table 2 is identified by a unique new value in the
DeptID column. When we look at Table 1 and Table 2 together, we can see that
DeptID appears in both tables. The
DeptID in Table 1, however, is not unique and can take the same values. Its purpose is to establish the relationship with Table 2. Similarly, in another table (e.g. Table 3), we can use the ID value of Table 1 (along with other potential fields such as
EnrollNo) in a similar way. In such uses as the
DeptID example, the
DeptID is referred to as a unique primary key within its table, and as a foreign key within the table it is related to. Of course, these column names do not have to be the same, but for readability, it is generally preferred that related columns have similar names (e.g.
[related table name]_id).
Let's also take a look at the
wp_postmeta table as an example of related usage2. In this table,
meta_id is the unique identifier for each row, while
post_id is a foreign key that refers to the ID column in the
wp_posts table. This relationship allows us to associate metadata with a specific post in the
wp_posts table. Since there can be multiple rows in
wp_postmeta that correspond to a single row in
post_id column is not unique in
wp_postmeta. However, it is unique within the context of the relationship between
wp_posts. This allows us to link the metadata in
wp_postmeta to the correct post in
In the above table, you can see that
meta_id (and all related
*_id definitions on that page) are defined as PRI (see the KEY column). This allows us to mark the rows related to the reference ID values and perform operations (update, delete, etc.) on them.
In summary, when working with different data tables, we establish various relationships between them. These relationships can be defined directly (during the table creation phase) or later. In such cases, we use unique fields to link the tables and merge the data to analyze and visualize more comprehensive data within the tables. One simple example of such cases is creating calculated fields from different data sources and displaying them on a single chart. We will also demonstrate a few examples in Looker reports to reinforce the topic.