FOREIGN KEY constraint: rules and methods
The FOREIGN KEY constraint creates and enforces a reference link between 2 columns. It ensures that a referencing column (the child) can only contain values already present in its referenced column (the parent).
A FOREIGN KEY is defined within the child table, its REFERENCES keyword points to the parent table and its specific columns. Multiple child rows can reference the same parent column value.
//The child column must match the parent data type
//A PRIMARY KEY parent column avoids any duplicate column value and NULL for the child
CREATE TABLE products (
chiave integer PRIMARY KEY, name text, price numeric
);
insert into products(chiave, name, price) values (15, 'brio',550), (10, 'malbo',200);
//Any existing constraint still applies within the child table
//If the parent column is omitted, it will use its own column name for the reference.
CREATE TABLE orders (
order integer, //UNIQUE can limit the child columns values
chiavetta integer REFERENCES products (chiave) //Will use chiavetta if omitted
);
//The FOREIGN KEY only allows values form the specified column, not the entire table.
insert into orders(order, chiavetta) values
(3, 15),
(10, 10), //The parent reference value can be repeated within the row
(3, 15), //Allowed if teh order child columns isn't UNIQUE
(3, 200); //Error, it violates the foreign key constraint
A child FOREIGN KEY must reference UNIQUE or PRIMARY KEY parent columns. The JOIN operation combines table columns data using the FOREIGN KEY relations.
Self-referencing tables and the FOREING KEY relationships types.
A column that references the PRIMARY KEY of a row within the same table is called self-referencing. It creates a hierarchical structure where rows are organized into branches based on the PRIMARY KEY they reference, with rows containing NULL foreign key values serving as the root nodes that originate the structure. It enabled the query to retrieve and order data based on the relations between the table rows.
A query involving self-referencing columns uses multiple instances of the same table within its FROM clause. The first instance (t1) represents the current row being processed, while the second instance (t2) is used in the matching operation. This comparison links the foreign key value from the first instance to its referenced parent primary key, found using the second instance.
//It references the PRIMARY KEY by default.
CREATE TABLE tree (
node_id integer PRIMARY KEY, name TEXT,
parent_id integer REFERENCES tree -- Self-referencing foreign key
);
INSERT INTO tree (node_id, parent_id, name) values
(1, NULL, 'base camp'), (2, NULL, 'aviation'), (101, 1, 'tank'),
(102, 1, 'jep'), (201, 101, 'soldier'), (301, 2, 'plane'),
(302, 302, 'suplies'), -- self referring column
(305, 90, 'dog'); -- ERROR, no PRIMARY KEY to reference
//The INNER JOIN clause only returns matching row values in the output.
select
t1.node_id, t1.name, t2.name as parent_name
from
tree as t1
left join
tree as t2 on t1.parent_id = t2.node_id
order by t1.node_id
node_id|name |parent_name|
-------+---------+-----------|101|tank |base camp |
1|base camp| |102|jep |base camp |301|plane |aviation |
2|aviation | |201|soldier |tank |302|suplies |suplies |
The LEFT JOIN clause includes all rows from the current table instance (the left operand) in the query's output. It appends the result columns from the matching operation, assigning NULL to unmatched rows. It effectively flattens the output, presenting both child and parent information within the same row.
There are multipe types of tables relationships, based on their cardinality:
It requires an extra linking table that references the parent primary keys in its child columns. The composite primary key combines the foreign key columns, with each column referencing a different table.
//Both parent tables are referenced using their PRIMARY KEY
//A FOREIGN KEY keyword can't define it, as it can only reference one parent.
create table meal(
porta INT primary key, name TEXT
)
insert into meal(porta, name) values (5, 'steak'), (12, 'frittata');
create table dessert(
vaso INT primary key, name TEXT
)
insert into dessert(vaso, name) values (2, 'torta'), (7, 'bigne');
//The sets of the linking table's composite primary key
CREATE TABLE cena(
meal_id integer REFERENCES meal, vaso_id integer REFERENCES dessert,
cost integer, PRIMARY KEY (meal_id, vaso_id)
);
insert into cena(meal_id, vaso_id, cost) values
(5, 7, 40),
(12, 7, 40), //Repeated vaso_id primary key value allowed
(5, 7, 120); //Error, repeated primary key set
Parent column values are defined once but can be referenced many times in the linking table. Any update will only affect the parent table it's declared in, with no need to change its corresponding reference in the linking table.
Referential Actions and Event Specifiers for Referential Integrity
The database restricts any change made to parent columns currently being referenced. The FOREIGN KEYS must follow the referencial integrity rule, which states that all non-NULL child values must references an existing parent column.
The event specifiers set the parent row actions that will trigger a child's column response. The referential actions then define how the foreign key values will be modified to maintain the referential integrity after parent changes.
The CASCADE option, with the ON DELETE event specifier, ensures that deleting a parent row also deletes all its related child rows, preventing any unassigned non-NULL child values.
//The action specifier REQUIRES a referencial action option to work
create table base(
primo INT primary key, name TEXT
)
insert into base(primo, name) values (12, 'moto'), (8, 'car'), (99, 'lambo');
create table casca(
secondo INT references base on delete CASCADE, name TEXT
)
insert into casca(secondo, name) values (12, 'mirai'), (99 ,'goldie');
//The entire child row gets deleted, not just the referencing columns
delete from base where primo = 12;
select * from casca;
secondo|cognome|
-------+-------+
99|goldie |
The NO ACTION option prevents parent operations, symilar to RESTRICT. It's the only referential action that can be combined with the DEFERRABLE clause, which enables it to delay the referential integrity check until the end of its transaction. It allows subsequent operations within the same transaction to resolve any potential integrity violations before the final commit.
The DEFERRABLE clause, by itself, results in an INITIALLY IMMEDIATE integrity check. We need to explicitly specify the INITIALLY DEFERRED clause to enable the delayed referential check.
Many database editors group multiple operations into a single transaction, allowing a DEFERRED clause to place its referential integrity check at the transaction's final commit.
//NO ACTION and RESTRICT are exactly the same in some database systems
create table base2(
primo INT primary key, name TEXT
)
insert into base2(primo, name) values (12, 'uno'), (5, 'due'), (42, 'tre');
create table azione(
catena INT references base2 on delete NO action DEFERRABLE INITIALLY DEFERRED,
modo TEXT
)
insert into azione values (12, 'primo'), (5, 'due'), (42, 'terzo');
//We delete both parent and childs in the same transaction to keep the integrity
delete from base2 where primo = 12; //By itself: Error.
delete from azione where catena = 12; //Executed together: Correct.
select * from azione;
We can change the FOREIGN KEY's current DEFERRABLE setting using the SET CONSTRAINTS command.
The SET NULL and SET DEFAULT options maintain referential integrity by changing the foreign key column values within the child rows.
The SET NULL option, applied ON DELETE, sets the foreign key columns in the child row to NULL when its referenced parent row gets deleted. The NULL value, following the referential integrity rule, implicitly removes the parent-child reference in its FOREIGN KEY column.
//It can be aplied only to NULLABLE columns
create table base3(
base INT primary key, name TEXT
)
insert into base3(base, name) values (9, 'scala'), (12, 'piano'), (33, 'molo');
create table resetta(
foglio INT references base3 on delete set null, name TEXT
)
insert into resetta(foglio, name) values (9, 'piede'),(12, 'sala'),(33, 'nave');
delete from base3 where base = 9;
select * from resetta;
foglio|name |
------+-----+
12|sala |
33|nave |
|piede|
The event specifiers must still satisfy any pre-existing constraints on their foreign key child columns.
Identifying Foreign Keys used as a child Primary Key
Both event specifiers and referential actions can't be directly applied to PRIMARY KEY columns.
We call identifying foreign keys the foreign keys columns that are used as PRIMARY KEYS values. They will be implicitly set as NOT NULL and UNIQUE, while also needing to reference existing parent column values. This dual role limits the types of referential actions that can be applied to them.
//It is mostly used in many-to-many relations
create table student1(
stud_id INT primary key, name TEXT
)
insert into student1(stud_id, name) values (12, 'volan'), (1, 'prima'), (7, 'emmy');
create table corsi1(
cors_id INT primary key, name TEXT
)
insert into corsi1(cors_id, name) values (5, 'bio'), (12, 'volan'), (17, 'ruota');
create table classe1(
student_id INTEGER REFERENCES student1, course_id INT references corsi1,
PRIMARY KEY (student_id, course_id)
)
insert into classe1(student_id, course_id) values (12, 5);
insert into classe1(student_id, course_id) values (7, 12);
insert into classe1(student_id, course_id) values (12, 12);
select * from classe1;
We can add a SET NULL option to an identifying FOREING KEY, but the database won't execute it. If a referenced parent row is deleted, the foreign key columns cannot be updated to NULL because of their implicit NOT NULL constraint when used as primary keys.
//The parent delete will return error
delete from student1 where stud_id=12;
create table classe1(
student_id INTEGER REFERENCES student1 ON DELETE set NULL,
...
)
Foreign keys ON UPDATE actions and the MATCH FULL option.
The ON UPDATE event specifier implements its referential actions differently from ON DELETE. Because the parent row still exists after the update, the database must either maintain the reference to the updated parent row or delete the foreign key's reference without deleting the entire child row.
The CASCADE option maintains the referential integrity by updating the child column values to match the new parent column.
//It can update a single column from a composite foreign key.
create table base6(
uno int, due TEXT,
primary KEY(uno, due)
)
insert into base6(uno, due) values (1, 'copiato'), (2, 'secondo'), (3, 'terzo');
update base6 set uno=11 where uno=1 returning *;
create table cambio(
base INT, primo INT, secondo TEXT,
foreign KEY(primo, secondo) references base6 on update CASCADE
)
insert into cambio(base, primo, secondo) values (12, 1, 'copiato');
insert into cambio(base, primo, secondo) values (12, 2, 'secondo');
insert into cambio(base, primo, secondo) values (3, 1, 'copiato');
select * from cambio;
base|primo|secondo|
----+-----+-------+
12| 2|secondo|
12| 11|copiato|
3| 11|copiato|
The MATCH SIMPLE option, set by default, considers a foreign key not part of any reference if any of its columns are NULL. The MATCH FULL option requires all foreign key columns to be NULL for it to be considered unreferenced. It blocks any operation that might result in a partially NULL foreign key, such as SET NULL with specified columns, or SET DEFAULT where some default values are NULL.
//MATCH FULL requires all foreign keys to be either null or not-null.
//The option goes before the event specifier.
create table base7(
primo INT, secondo INT, name TEXT,
primary KEY(primo, secondo)
)
insert into base7(primo, secondo, name) values
(1, 12, 'vaso'), (3, 22, 'sedia'), (5, 8, 'root');
delete from base7 where primo = 3; //ERROR, partial set null.
create table tutti(
uno INT, due INT, name TEXT,
foreign key (uno, due) references base7 match full on delete set NULL(uno)
)
insert into tutti(uno, due, name) values
(1, 12, 'altro'), (3, 22, 'basso'), (5, 8, 'lato');
The values of the PRIMARY KEY and UNIQUE parent columns are implicitly stored in an index structure. It validates the child columns that reference the parent's values, but it doesn't store any data about the relation.
Referential actions like CASCADE, SET NULL, and SET DEFAULT involve locating child column values after parent row operations. We use an additional index on the foreign key columns to optimize the retrieval.
//It can optimize JOIN operations that use the foreign key columns.
create table tutti(
uno INT, due INT, name TEXT,
foreign key (uno, due) references base7 match full on delete set NULL
)
CREATE INDEX tutti_index ON tutti (uno, due);
The foreign key constraint specifies the rules for column values, but it is not a column definition that can directly include an index statement.
Last updated
Was this helpful?