We can't find the internet
Attempting to reconnect
Something went wrong!
Hang in there while we get back on track
Advanced String Patterns
Beyond the basic string functions, PostgreSQL offers powerful tools for splitting strings apart, padding them to fixed widths, performing regex-based replacements, and aggregating text across rows. These functions are essential for parsing structured text, generating formatted reports, and handling complex data transformations.
Splitting Strings with SPLIT_PART
SPLIT_PART divides a string by a delimiter and returns the part at the specified position.
SPLIT_PART(string, delimiter, position)
SELECT SPLIT_PART('one-two-three', '-', 1);
-- Returns: 'one'
SELECT SPLIT_PART('one-two-three', '-', 2);
-- Returns: 'two'
SELECT SPLIT_PART('one-two-three', '-', 3);
-- Returns: 'three'
If you request a position that does not exist, PostgreSQL returns an empty string:
SELECT SPLIT_PART('one-two', '-', 5);
-- Returns: ''
A practical application is parsing email addresses:
SELECT
email,
SPLIT_PART(email, '@', 1) AS username,
SPLIT_PART(email, '@', 2) AS domain
FROM employees;
You can also split domain names further:
SELECT
email,
SPLIT_PART(SPLIT_PART(email, '@', 2), '.', 1) AS domain_name
FROM employees;
Extracting with LEFT and RIGHT
LEFT returns the first n characters from the start of a string. RIGHT returns the last n characters from the end.
SELECT LEFT('PostgreSQL', 4);
-- Returns: 'Post'
SELECT RIGHT('PostgreSQL', 3);
-- Returns: 'SQL'
These are simpler alternatives to SUBSTRING when you just need characters from either end:
-- First initial and last 3 characters of each employee's name
SELECT
name,
LEFT(name, 1) AS first_initial,
RIGHT(name, 3) AS last_three
FROM employees;
Using negative values with LEFT and RIGHT removes characters from the opposite end:
-- All characters except the last 3
SELECT LEFT('PostgreSQL', -3);
-- Returns: 'Postgre'
-- All characters except the first 4
SELECT RIGHT('PostgreSQL', -4);
-- Returns: 'greSQL'
Padding with LPAD and RPAD
LPAD pads a string on the left to reach a target length. RPAD pads on the right.
LPAD(string, target_length, fill_character)
RPAD(string, target_length, fill_character)
SELECT LPAD('42', 5, '0');
-- Returns: '00042'
SELECT RPAD('Hello', 10, '.');
-- Returns: 'Hello.....'
If the fill character is omitted, spaces are used by default:
SELECT LPAD('42', 5);
-- Returns: ' 42'
Common uses include generating formatted IDs or aligning output:
-- Create zero-padded employee IDs
SELECT
name,
'EMP-' || LPAD(CAST(salary AS TEXT), 6, '0') AS formatted_salary
FROM employees;
Building a simple text-based table:
SELECT
RPAD(name, 20, ' ') || LPAD(CAST(salary AS TEXT), 10, ' ') AS formatted_row
FROM employees;
If the string is already longer than the target length, LPAD and RPAD truncate it:
SELECT LPAD('Hello World', 5, '*');
-- Returns: 'Hello'
Repeating and Reversing
REPEAT duplicates a string a specified number of times:
SELECT REPEAT('ab', 3);
-- Returns: 'ababab'
SELECT REPEAT('-', 20);
-- Returns: '--------------------'
REVERSE flips the order of characters in a string:
SELECT REVERSE('PostgreSQL');
-- Returns: 'LQSergtsoP'
A fun trick -- checking if a string is a palindrome:
SELECT
'racecar' AS word,
REVERSE('racecar') AS reversed,
'racecar' = REVERSE('racecar') AS is_palindrome;
-- is_palindrome: true
Pattern Replacement with REGEXP_REPLACE
While REPLACE handles simple text substitution, REGEXP_REPLACE supports full regular expression patterns.
REGEXP_REPLACE(string, pattern, replacement [, flags])
-- Remove all digits from a string
SELECT REGEXP_REPLACE('abc123def456', '[0-9]', '', 'g');
-- Returns: 'abcdef'
The 'g' flag means "global" -- replace all occurrences. Without it, only the first match is replaced:
SELECT REGEXP_REPLACE('abc123def456', '[0-9]', '');
-- Returns: 'abc23def456' (only the first digit removed)
SELECT REGEXP_REPLACE('abc123def456', '[0-9]', '', 'g');
-- Returns: 'abcdef' (all digits removed)
More practical examples:
-- Normalize whitespace (replace multiple spaces with a single space)
SELECT REGEXP_REPLACE('hello world', '\s+', ' ', 'g');
-- Returns: 'hello world'
-- Remove non-alphabetic characters
SELECT REGEXP_REPLACE('Hello, World! 123', '[^a-zA-Z ]', '', 'g');
-- Returns: 'Hello World '
-- Mask email addresses
SELECT REGEXP_REPLACE(email, '(.).*(@.*)', '\1***\2') AS masked_email
FROM employees;
-- Turns 'alice@example.com' into 'a***@example.com'
Common flags:
'g'-- global, replace all matches'i'-- case-insensitive matching'gi'-- both global and case-insensitive
SELECT REGEXP_REPLACE('Hello HELLO hello', 'hello', 'hi', 'gi');
-- Returns: 'hi hi hi'
Aggregating Strings with STRING_AGG
STRING_AGG is an aggregate function that concatenates values from multiple rows into a single string, separated by a delimiter.
STRING_AGG(expression, delimiter ORDER BY ...)
-- List all departments as a comma-separated string
SELECT STRING_AGG(DISTINCT department, ', ') AS all_departments
FROM employees;
-- Returns something like: 'Engineering, Marketing, Sales'
You can add an ORDER BY inside the function to control the order of concatenation:
SELECT STRING_AGG(DISTINCT department, ', ' ORDER BY department) AS sorted_departments
FROM employees;
STRING_AGG is especially powerful in GROUP BY queries:
-- List employee names grouped by department
SELECT
department,
STRING_AGG(name, ', ' ORDER BY name) AS team_members
FROM employees
GROUP BY department;
Another example -- listing customer cities by country:
SELECT
country,
STRING_AGG(DISTINCT city, ', ' ORDER BY city) AS cities
FROM customers
GROUP BY country;
Combining Advanced Functions
Here is a more complex example that combines several of these functions:
SELECT
department,
STRING_AGG(
LPAD(LEFT(name, 1), 2, ' ') || '. ' || SPLIT_PART(name, ' ', 2),
'; '
ORDER BY name
) AS abbreviated_names
FROM employees
WHERE name LIKE '% %'
GROUP BY department;
Parsing and reformatting structured data:
SELECT
email,
UPPER(LEFT(SPLIT_PART(email, '@', 1), 1)) ||
LOWER(SUBSTRING(SPLIT_PART(email, '@', 1) FROM 2)) AS formatted_username,
UPPER(SPLIT_PART(SPLIT_PART(email, '@', 2), '.', 1)) AS domain_upper
FROM employees;
Try It Yourself
-
Write a query that splits each employee's email at the
@sign and returns the username and domain as separate columns usingSPLIT_PART. -
Create zero-padded 6-digit representations of each employee's salary using
LPAD. Display the name and padded salary. -
Use
STRING_AGGto list all employee names in each department, separated by " | ", ordered alphabetically. -
Use
REGEXP_REPLACEto remove all digits from customer email addresses and display the result alongside the original email. -
For each employee, display their name reversed and check whether any name is a palindrome.