SQL Scalar table Data Types
Data types in PostgreSQL define the data format within table column.
//Data types within column definition
TEXT and CHAR(n), VARCHAR(n): text based types,
DATE, TIME, TIMESTAMP, and INTERVAL: include date and time types
JSON: for semi structured data,
INT, REAL: for numeric types and
ENUM, ARRAY: specialized types
The TEXT, CHAR(n), and VARCHAR(n) data types store string values in table columns. TEXT allows strings of any length. While both CHAR(n) and VARCHAR(n) set a maximum n length. The CHAR(n) is also a fixed-length type, it will pad the end of the string with trailing spaces in order to mantain its specified length.
//CHAR and VARCHAR without (n) will be treated as CHAR(1) and TEXT respectively.
CREATE TABLE resorce (
primo CHAR(20), secondo VARCHAR(20), terzo TEXT
);
insert into resorce(primo, secondo, terzo)
values (' accidenti ', ' accidenti ', ' accidenti ');
//(" accidenti "," accidenti "," accidenti ")
The || operator concatenates string values, with the return string converted to TEXT regardless of the data types being used. The substring() function returns a segment of a string, starting at the FROM index and continuing for the FOR length. The replace() function searches each string in a column for a specified substring and replaces all occurrences with a given new string.
//We can concatenate different data types.
select (text 'prima ' || ' e ' || 211 ); //TEXT "prima e 211"
//We can concatenate string values from columns.
SELECT nome || ' ' || titolo as new_sting FROM parole;
//The strings follow the 1-index rule, like the arrays
select substring('parola'::text from 1 for 3); //par
SELECT substring(nome FROM 1 FOR 5) FROM parole; //It selects from table columns
//It can either be a specific column or a stirng value, it replaces the substring
select replace('mottura'::text, 'ttu', 'lla' ); //mollara
SELECT replace(nome, 'pre', 'post') FROM parole where id=2;
The INTERVAL data type represents a time duration, which allows adding and subtracting intervals from date and time values. The standard format for an interval value is years-months days hours:minutes:seconds, with the hyphen(-) specifically separating years and months.
//It is not affected by time zones
//We can use either numerical and string values, singular or plural.
create table inte(
id serial primary key, numero INTERVAL
);
INSERT INTO inte (numero)
VALUES ('47'); //single values seconds by default
VALUES ('12:50'); //Hours and Minutes
VALUES ('00:21:47'); //necessary 00 for Minutes and Seconds
VALUES ('21 minutes 47'); //String interval integration
VALUES ('2 12:12:50'); //Days value
VALUES ('1 2 12:12:50'); //error, Year required for month int value
VALUES ('0-1 2 12:12:50'); //correct only month interval value
VALUES ('1 months 12:12:50'); //stirng intervals can skip element
JSON and REAL
The JSON and JSONB data types store and manipulate JSON data in the table rows. JSON mantains its original plain text format, while JSONB store data in a decomposed binary format, which enhances query performance and storage efficiency.
PostgreSQL validates any JSON data inserted into a JSONB column, converting it to an internal binary format. Upon retrieval, this format is automatically converted back to a JSON literal; the binary representation is for internal use only.
//JSONB supports GIN indexing
//JSONB also removes any unnecessary whitespace and orders the keys
create table basico(
id SERIAL primary key,
parte JSONB
);
insert into basico(parte) values
('{"name": "Alice", "age": 30, "city": "New York"}'),
('{"products": [{"id": 1, "name": "Laptop"}, {"id": 2, "name": "Mouse"}]}');
The '->' operator retrieves the JSON value associated with the selected key, preserving its original type.
//-> returns a jsonb value, regardless of its property value in the jsonb column
//->> will return the string-converted value
//For nested JSON we repeat -> for each layer, using the index to access arrays values
SELECT parte -> 'numero' FROM basico where id = 1; //30 ,jsonb
SELECT parte -> 'products' -> 1 ->> 'name' FROM basico where id = 2; //Mouse ,text
//The #> operator extracts JSON values based on a specified path.
//And #>> returns the value as string.
select parte #> '{numero}' from basico where id = 1; //30 jsonb
select parte #>> '{products,1,numero}' from basico where id = 2; //'67' text

The REAL data type stores floating-point and scientific notation values with approximate precision.
//REAL stores data in 4 bytes instead of the 8 of FLOAT
//Can be used fo exponential values or pi()
create table basico(
id SERIAL primary key,
numa REAL
);
insert into basico(numa) values (12.999);
insert into basico(numa) values (1.0e6);
insert into basico(numa) values (1.6E+19);
insert into basico(numa) values ( pi() );
ENUM and ARRAY
The ENUM custom data type defines a set of allowed values for a table column. It cannot be defined directly in a column definition; we must CREATE TYPE it outside the table.
PostgreSQL stores ENUM values internally as small integers. The pg_enum system catalog maps the integers to theirs ENUM values during queries.
//The ENUM order will affect its values sorting and comparison operations
CREATE TYPE status AS ENUM ('primo', 'secondo', 'terzo', 'quarto');
DROP TYPE status; //It won't remove automatically on table delete
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username TEXT NOT NULL,
account_status status
);
//We can only add column values from the ENUM
INSERT INTO users (username, account_status) VALUES
('Alice', 'primo'), //secondo, ...
('yanico', 'sesto'); //Error: non ENUM value being added
The ARRAY data type stores an ordered collection of elements of the same data type within a single column.
We define arrays columns using their data type and dimention level, we INSERT array values as {} strings or using the ARRAY constructor.
In the current PostgreSQL implementation, the database will not strictly enforce the declared array dimension at runtime; it's mainly for user documentation.
//All arrays will be considered as one-dimention internally
//Nested array elements must contain the same number of elements
CREATE TABLE resorce (
id SERIAL PRIMARY KEY,
lista INT[2], testo VARCHAR(5)[]
);
//Error, different nested array elements and above 5-digit string
insert into resorce(lista, TESTO) values ('{{12, 12}, {45}}', '{"ridotto"}');
//Null can be used to mantain the same number of elements in nested arrays
insert into resorce(lista, testo) values
( '{{12, 12}, {34, 67}}', ARRAY['List', 'minna'] );
( '{12, null, 99}', ARRAY['basic'] ); -- {12,,99}
( ARRAY[[33, 33], [null, 43] ], '{pent, ball, prin}' ); --[, 43]
//The PostgreSQL arrays use 1-based indexing for retrieving data.
select testo from resorce where id = 1; --{List,minna}
select testo[1] from resorce where id = 1; --List
select lista[1][1] from resorce where id = 1; --12
During the INSERT operation, the PostgreSQL parser infers{}
array values and converts them to the target column's array data type.
Arrays literals defined outside a column definition require explicit type casting to be recognized at their specific data type, array operations only allow arrays with identical data types.
//The ARRAY[] constructor default data type is TEXT
//String literals will be considered incompatible based on their data types
select array['pear', 'banjo'] && ARRAY['mango', 'banjo']; -- true
select array['pear', 'banjo'] && ARRAY['mango', 'banjo']::VARCHAR(5)[]; -- ERROR
//The ARRAY[] constructor doesn't get inferred on INSERT
insert into resorce(lista) values
(array[[44, null], ['44', '122']]), -- Error
('{ {44, null}, {"44", "122"} }'); --{{44, },{44,122}}
The [ : ] syntax can slice an array section, with both ends being inclusive.
//On multi-dimentional arrays, it will only slice the outmost array elements
//Empty array bounds are by default infinite, [:] will return the entire array
INSERT INTO resorce (id, testo, lista) VALUES
(3, '{551,"corto", 12, 62, "cinque"}', '{{"12", 5}, {8, 20}}');
select testo[4:] from intero where id = 3; -- {62,cinque}
SELECT (ARRAY[1, 2, 3, 45, 4])[3:]; -- {3,45,4}
SELECT ('{12, 34, 56, 67}'::int[])[:2]; -- {12,34}
select ('{{{12, 34}, {56, 67}, {null, 99}}}'::int[])[:2]; --{{{12,34},{56,67},{,99}}}
PostgreSQL applies the immutable principle to its functions and operators. PostgreSQL's built-in array functions return new array values, requiring the UPDATE command to modify the corresponding table column data.
The array_append() function appends a specified element to the end of the array.
//It doesn't support multi dimentional arrays.
//Will append to each array in the rows if not specified.
select array_append(testo, 'ultimo') from resorce;
//{corto,min1,ultimo},{...,ultimo}
update resorce set testo = array_append(testo, 'speci') where id = 1;
The array data type, symilar to interval, has access to the @>
and &&
operators.
The containment operator @> checks if the first array contains all elements of the second array. It ignores duplicates, meaning ARRAY[1] and ARRAY[1, 1] are considered equivalent.
//We add [] to specify the array datatype
insert into resorce(lista, TESTO) values ( '{{12, 12}}', '{"min"}' );
insert into resorce(lista, TESTO) values ( '{{92, 12}}', '{"min1"}' );
SELECT * from resorce where testo @> ARRAY['min1', 'min1']::VARCHAR(5)[];
//{out,corto,min1}
SELECT * from resorce where lista @> ARRAY[12, 12]; //{{12,12},.}, {{99,12},.}
SELECT * from resorce where lista @> ARRAY[12]; //{{12,12},.}, {{99,12},.}
The ANY keyword functions as a quantifier, modifying a comparison operation to allow a single value to be compared to each element within an array.
//it returns true if ANY element satisfies the comparison operation.
create table doppio(
id serial primary key,
serie INT[], numeri INT
)
//Symilar to how WHERE compares to single values
INSERT into doppio (serie) VALUES (array[101, 12]), (array[12]);
insert into doppio (numeri) values (101), (12);
//It returns the entire array
select * from doppio where 30 > any (serie); //{12}, {101, 12}
select * from doppio where 30 > numeri; //12
select 4 = any (array[1, 2, 3, 4, 5]); //true, an element within the array is = 4
The ALL keyword is a quantifier that compares a value to every element of an array, returning true only if all comparisons are true. It works on subqueries.
//Not suited for multi-dimantional arrays unless UNNEST
//ALL needs to be set in the right side, like ANY.
create table doppio(
serie INT[];
)
//If 50 is > than all elements in the table array columns
INSERT into doppio (serie) VALUES (array[12, 45]), (array[56, 12]);
select * from doppio where 50 > ALL(serie); //{12,45}
//10 is not equal to than ALL array elements
SELECT 10 != ALL (ARRAY[5, 8, 2, 5]); //true
The EXISTS operator is a logical operator that requires a subquery as its operand. It is used in the WHERE clause to specify which rows the main query will process, and returns true if its subquery returns at least one row.
We correlate the outer query with the subquery by using a variable. This allows the EXISTS condition to evaluate each row of the outer query against related data in the subquery.
//The subquery correlates to the current outher query column for each row it executes.
CREATE TABLE departments ( depname VARCHAR(100) );
CREATE TABLE employees ( empname VARCHAR(100), empjob TEXT );
insert into departments(depname) values ('human'), ('info'), ('market');
insert into employees(empname, empjob) values
('lowrence', 'market'), ('mike', 'bus'), ('dory', 'walk');
//The correlated emp.empjob column is updated for each row of teh employees table
SELECT empname FROM employees AS emp
WHERE EXISTS (
SELECT 1
FROM departments AS dept
WHERE emp.empjob = dept.depname
);
//lowrence
Both IN and JOIN can perform similar operations to EXISTS, but they require the complete evaluation of the subquery or a table join. While EXISTS stops evaluation at the first match, avoiding full table scans.
The EXISTS predicate queries the specific columns from the main table based on the related table data, without performing a full join or processing the subquery on the entire main table.
//The main table from where we will process query
CREATE TABLE large_customers (
customer_id SERIAL PRIMARY KEY,
customer_name VARCHAR(100)
);
//We insert 1000 rows of unique values
INSERT INTO large_customers (customer_name)
select uuid_generate_v4() FROM generate_series(1, 1000);
//The second table will store specific id values which may be in the main table
CREATE TABLE customers_with_orders (
customer_id INT
);
//We insert 100 values which might be shared with main table id
INSERT INTO customers_with_orders (customer_id)
select floor(random() * 2000)::integer FROM generate_series(1, 100);
//We return the rows form the customer_id values that match in the main table
//The EXIST predicate returns only the second table ids present in the main table
SELECT customer_id ,customer_name
FROM large_customers c
WHERE EXISTS (
SELECT 1
FROM customers_with_orders o
WHERE o.customer_id = c.customer_id
);
//43/100 rows are returned from 100 in the second table for example.
Last updated
Was this helpful?