PostgreSQL
Introduction to PostgreSQL
PostgreSQL, often simply called Postgres, is a powerful, open-source object-relational database management system (ORDBMS) with a strong reputation for reliability, feature robustness, and performance. It originated in 1986 at the University of California, Berkeley, as the POSTGRES project, and has since evolved into one of the most advanced and widely used database systems, backed by an active community. PostgreSQL runs on all major operating systems, including Linux, macOS, and Windows.
Key Features of PostgreSQL
PostgreSQL offers a wide range of features that make it a popular choice for many applications:
- Extensive data types: PostgreSQL supports a large variety of built-in data types and allows users to define their own custom data types. It can handle complex data types such as arrays, JSON, and geometric types.
- ACID compliance: PostgreSQL adheres to the ACID principles (Atomicity, Consistency, Isolation, Durability), ensuring reliable and trustworthy transactions.
- Concurrency control: PostgreSQL uses multi-version concurrency control (MVCC) so that readers do not block writers and writers do not block readers, allowing multiple transactions to work with the same data at the same time.
- Advanced querying capabilities: PostgreSQL supports complex SQL queries, subqueries, common table expressions (CTEs), recursive queries, and window functions. It also allows users to define their own functions, triggers, and stored procedures in various programming languages.
- Full-text search: PostgreSQL provides powerful full-text search capabilities, including stemming, ranking, and phrase-searching support. It uses GIN and GiST indexes to speed up text searches.
- Replication and high availability: PostgreSQL supports various replication strategies, such as asynchronous streaming, logical, and synchronous replication, providing data redundancy, fault tolerance, and high availability.
- Security and authentication: PostgreSQL offers robust security features, including SSL/TLS encryption, username/password authentication, LDAP authentication, Kerberos authentication, role-based access control (RBAC), and row-level security (RLS).
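As a rough illustration of the querying features listed above, the sketch below combines a CTE with a window function to list the best-paid employees of each department. It assumes an employees table shaped like the one created later in this article and an already-open asyncpg connection; the names `TOP_EARNERS_SQL` and `top_earners` are hypothetical.

```python
# A CTE (WITH ...) feeding a window function (RANK() OVER ...).
TOP_EARNERS_SQL = """
WITH ranked AS (
    SELECT name,
           departmentid,
           salary,
           RANK() OVER (PARTITION BY departmentid ORDER BY salary DESC) AS salary_rank
    FROM employees
)
SELECT name, departmentid, salary
FROM ranked
WHERE salary_rank <= $1
ORDER BY departmentid, salary DESC;
"""


async def top_earners(conn, per_department=1):
    """Return the highest-paid employees of each department.

    `conn` is assumed to be an asyncpg connection; any object with an
    awaitable fetch(query, *args) method behaves the same way here.
    """
    return await conn.fetch(TOP_EARNERS_SQL, per_department)
```

With a live connection, `await top_earners(conn, 3)` would return up to three rows per department, plus any ties on the last rank.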
Setting Up PostgreSQL
To get PostgreSQL running on your local machine, you will need to have the following tools installed:
- PostgreSQL Server: You can follow the step-by-step instructions provided on the official website. Once the installation is complete, you can start the server by opening the application.
- PostgreSQL Query Tools: Once the PostgreSQL server is installed, you can install a tool to manage and interact with it. There are several choices; all of them support the basic functionality and each adds its own features. Popular ones include pgAdmin and DBeaver, or you can use a terminal tool such as psql.
Hint
Installation on Mac
PostgreSQL can be installed on macOS using Homebrew. Run the command brew install postgresql. For more details and options, follow the official website.
Learning the Basics
Practice makes perfect, so let's learn PostgreSQL through sample code. Below are some snippets in increasing order of complexity, designed to help you understand various aspects of PostgreSQL.
Hint
Before we begin, please note that you interact with the database using PostgreSQL's dialect of SQL. If you are using a terminal, you can start the interactive psql shell by running psql. Once inside, you can connect to a database by running the following command:
-- Connecting to a PostgreSQL database from inside psql
-- Use a client or terminal with appropriate access credentials
-- (\c is a psql meta-command, so it needs no trailing semicolon)
\c my_database
1. Creating a Database
-- Creating a database. Replace `my_database` with your database name
CREATE DATABASE my_database;
2. Creating a Table
-- Creating a simple table. Replace `employees` with your table name
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
name VARCHAR(50),
position VARCHAR(50),
departmentid INT,
salary DECIMAL
);
Hint
Here is a detailed list of all supported data types in PostgreSQL. Note, you can also create custom data types.
3. Inserting Data
-- Inserting data into the table
INSERT INTO employees (name, position, salary)
VALUES ('John Doe', 'Software Engineer', 70000);
4. Basic Data Retrieval
-- Retrieving all data from a table
SELECT * FROM employees;
-- Limiting the number of rows returned
SELECT * FROM employees LIMIT 10;
-- Retrieving specific columns
SELECT name, position FROM employees;
-- Retrieving data in descending order
SELECT * FROM employees ORDER BY salary DESC;
5. Data Retrieval with Conditions
-- Retrieving specific data with a condition
SELECT name, position FROM employees WHERE salary > 50000;
-- Filtering on string columns
SELECT * FROM employees WHERE name LIKE '%Doe%';
-- Filtering on datetime columns
SELECT * FROM orders WHERE order_date BETWEEN '2022-01-01' AND '2022-02-01';
-- Filtering on datetime columns with an interval (same range as above)
SELECT * FROM orders WHERE order_date BETWEEN '2022-01-01' AND '2022-01-01'::date + interval '1 month';
-- Filtering on multiple conditions and values
SELECT * FROM employees WHERE name LIKE '%Doe%' AND salary > 50000
AND position IN ('Software Engineer', 'Data Scientist');
6. Updating Data
-- Updating data in the table
UPDATE employees SET salary = 75000 WHERE name = 'John Doe';
7. Deleting Data
-- Deleting data from the table
DELETE FROM employees WHERE id = 1;
-- Deleting all data from the table
DELETE FROM employees;
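-- Dropping the table itself (skip this if you plan to run the later examples)
DROP TABLE employees;
-- Dropping multiple tables at once (IF EXISTS avoids an error when a table is missing)
DROP TABLE IF EXISTS employees, departments;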
8. Joining Tables
-- Creating another table
CREATE TABLE departments (
id SERIAL PRIMARY KEY,
name VARCHAR(50)
);
-- Inserting data into the new table
INSERT INTO departments (name) VALUES ('Engineering');
-- Joining two tables
SELECT employees.name, departments.name AS department_name
FROM employees
JOIN departments ON employees.departmentid = departments.id;
9. Using Aggregate Functions
-- Using an aggregate function to get the average salary
SELECT AVG(salary) FROM employees;
-- Group by a column (ex: getting the average salary by department)
SELECT departments.name AS department_name, AVG(employees.salary) AS avg_salary
FROM employees
JOIN departments ON employees.departmentid = departments.id
GROUP BY departments.name;
10. Complex Query with Subquery and Grouping
-- Finding the highest salary in each department
SELECT department_name, MAX(salary) AS max_salary
FROM (
SELECT employees.name, employees.salary, departments.name AS department_name
FROM employees
JOIN departments ON employees.departmentid = departments.id
) AS department_salaries
GROUP BY department_name;
These examples cover a range of basic to more complex tasks you can perform with PostgreSQL, from establishing a connection to executing advanced queries. As you become more comfortable with these operations, you'll be able to tackle more complex scenarios and optimize your database interactions.
Python Sample Code
There are multiple Python packages available for PostgreSQL, such as psycopg2 and asyncpg. For this section, we will use the asyncpg package, which supports asynchronous programming.
Sample code to connect to the PostgreSQL server and fetch results is shown below:
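A minimal sketch of such a connection, assuming a local server and the employees table from the earlier sections. The credentials, host, and database name are placeholders, and `fetch_employees` is a hypothetical helper name.

```python
import asyncio


async def fetch_employees(conn):
    """Fetch the name and position of every employee."""
    return await conn.fetch("SELECT name, position FROM employees")


async def main():
    # asyncpg is a third-party package (pip install asyncpg); it is imported
    # here so that the helper above can be used without a server at hand.
    import asyncpg

    # Placeholder credentials: replace them with your own.
    conn = await asyncpg.connect(
        user="postgres",
        password="password",
        database="my_database",
        host="127.0.0.1",
        port=5432,
    )
    try:
        rows = await fetch_employees(conn)
        for row in rows:
            # Each row is an asyncpg Record; columns are accessible by name.
            print(row["name"], row["position"])
    finally:
        # Always release the connection, even if the query fails.
        await conn.close()
```

To run it against a live server, call `asyncio.run(main())`.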
Creating dynamic queries based on user input can be easily done by passing the variables to the fetch function. Below is the modification you need to make. Notice that the query has two variables, for id and limit, denoted by $1 and $2 respectively. The corresponding values are passed to the fetch function. The rest of the code remains the same.
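A sketch of that modification, assuming `conn` is an open asyncpg connection. The exact query is an assumption; only the use of $1 and $2 for id and limit comes from the description above.

```python
async def fetch_employees_from(conn, min_id, limit):
    """Fetch at most `limit` employees whose id is at least `min_id`."""
    # $1 and $2 are positional placeholders; the values travel separately
    # from the SQL text, so user input is never spliced into the query.
    query = "SELECT * FROM employees WHERE id >= $1 ORDER BY id LIMIT $2"
    return await conn.fetch(query, min_id, limit)
```

Because the values are bound as parameters instead of being formatted into the string, this pattern also guards against SQL injection.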
You can use conn.execute to run a query without fetching the result. Below is the modification needed.
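A sketch of such a call, assuming `conn` is an open asyncpg connection; `set_salary` is a hypothetical helper name. The salary is typed as Decimal to match the DECIMAL column.

```python
from decimal import Decimal


async def set_salary(conn, employee_id: int, new_salary: Decimal) -> str:
    """Update one employee's salary and return the command status."""
    # execute() runs the statement and returns a status string such as
    # "UPDATE 1" instead of rows.
    return await conn.execute(
        "UPDATE employees SET salary = $1 WHERE id = $2",
        new_salary,
        employee_id,
    )
```

For example, `await set_salary(conn, 1, Decimal("75000"))` would update the row with id 1.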
If you want to run the same statement for multiple rows, you can use conn.executemany instead of conn.execute. Below is the modification to the code shown above.
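A sketch of the bulk variant, again assuming an open asyncpg connection. `NEW_EMPLOYEES` is made-up sample data and `insert_employees` is a hypothetical helper name.

```python
from decimal import Decimal

# Made-up sample rows: (name, position, salary).
NEW_EMPLOYEES = [
    ("Jane Roe", "Data Scientist", Decimal("82000")),
    ("Sam Poe", "Software Engineer", Decimal("76000")),
]


async def insert_employees(conn, employees):
    """Insert several employees with a single executemany() call."""
    # executemany() runs the same statement once per argument tuple.
    await conn.executemany(
        "INSERT INTO employees (name, position, salary) VALUES ($1, $2, $3)",
        employees,
    )
```

Calling `await insert_employees(conn, NEW_EMPLOYEES)` would insert both rows.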
You might want to create a generic function that executes queries and retries in case of failure. Here is how you can do it using the tenacity library. The code below makes up to 3 attempts, with exponential backoff between them, if the query fails.
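One way to sketch this, assuming tenacity is installed and `conn` is an open asyncpg connection. The helper names `run_query` and `with_retry` are hypothetical, and wrapping via a small factory function is a design choice made here for illustration.

```python
async def run_query(conn, query, *args):
    """Run a single query and return its rows (no retry on its own)."""
    return await conn.fetch(query, *args)


def with_retry(func, attempts=3, max_wait=10):
    """Wrap an async function so failed calls are retried with backoff."""
    # tenacity is a third-party package (pip install tenacity); importing it
    # here keeps run_query usable even where tenacity is not installed.
    from tenacity import retry, stop_after_attempt, wait_exponential

    return retry(
        stop=stop_after_attempt(attempts),  # give up after `attempts` tries
        wait=wait_exponential(multiplier=1, max=max_wait),  # growing waits, capped
        reraise=True,  # surface the original exception after the last try
    )(func)
```

Typical use would be `fetch_with_retry = with_retry(run_query)` followed by `rows = await fetch_with_retry(conn, "SELECT * FROM employees")`. This sketch retries on any exception; in practice you would probably narrow that, for example with tenacity's `retry_if_exception_type`, so that only transient connection errors are retried.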
As you may have noticed, each of the above examples runs its query on its own, and asyncpg commits it immediately (auto-commit). If you want to execute multiple queries in one transaction, so that they either all succeed or all roll back, you can do so as shown below:
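A sketch of a multi-statement transaction, assuming an open asyncpg connection and the employees and departments tables from the earlier sections. The scenario and the name `move_to_new_department` are made up for illustration.

```python
async def move_to_new_department(conn, employee_id, department_name):
    """Create a department and assign an employee to it atomically."""
    # Everything inside the block is committed together when it exits
    # normally, and rolled back if any statement raises.
    async with conn.transaction():
        department_id = await conn.fetchval(
            "INSERT INTO departments (name) VALUES ($1) RETURNING id",
            department_name,
        )
        await conn.execute(
            "UPDATE employees SET departmentid = $1 WHERE id = $2",
            department_id,
            employee_id,
        )
    return department_id
```

If the UPDATE fails, the INSERT is rolled back as well, so no department is left behind without its employee.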
Snippets
Real-world problems will require much more than what we covered in the above sections. Let's cover some important queries in this section.
Casting a column to a different data type
-- Casting a column to a different data type
SELECT CAST(salary AS VARCHAR) FROM employees;
Using JSONB column
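-- Extracting data from a JSONB column
-- Suppose the data column contains {"name": "John", "address": {"city": "New York", "state": "NY"}}
-- jsonb_extract_path returns jsonb ("New York", with quotes); use jsonb_extract_path_text to get plain text
SELECT name, jsonb_extract_path(data, 'address', 'city') AS city FROM employees;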
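To make the path lookup concrete, here is a plain-Python analogy of what the query does with each row's JSON document. It only walks nested objects (the real function also accepts array indexes), and `extract_path` is a hypothetical name, not a PostgreSQL or asyncpg API.

```python
import json


def extract_path(document, *keys):
    """Follow `keys` through nested dicts; return None if the path is missing."""
    current = document
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return None  # mirrors the SQL function returning NULL
        current = current[key]
    return current


data = json.loads(
    '{"name": "John", "address": {"city": "New York", "state": "NY"}}'
)
print(extract_path(data, "address", "city"))  # New York
print(extract_path(data, "address", "zip"))   # None
```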
Extracting components from a DateTime column
-- Truncating a date/timestamp column to the start of its month
-- Suppose in a tbl, order_date col contains info like 2022-01-15; it is truncated to 2022-01-01
SELECT DATE_TRUNC('month', order_date) AS month, COUNT(*) AS order_count
FROM orders
GROUP BY month
ORDER BY month;
-- Other precisions work the same way: 'year', 'quarter', 'week', 'day',
-- and, for timestamp columns, 'hour', 'minute', 'second'
-- To pull out a single component as a number instead, use EXTRACT:
-- EXTRACT(MONTH FROM order_date) returns 1 for 2022-01-15
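The difference between truncating a timestamp and extracting one of its components can be shown with a small plain-Python analogy. `trunc_to_month` is a hypothetical helper that mimics DATE_TRUNC('month', ...) and is not part of any PostgreSQL client library.

```python
from datetime import datetime


def trunc_to_month(ts: datetime) -> datetime:
    """Reset everything below the month, like DATE_TRUNC('month', ts)."""
    return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


order_ts = datetime(2022, 1, 15, 13, 45)

# Truncation keeps a full timestamp, which makes it suitable for grouping.
print(trunc_to_month(order_ts))  # 2022-01-01 00:00:00

# Extraction yields only the component, like EXTRACT(MONTH FROM ts).
print(order_ts.month)  # 1
```

Grouping by the truncated value keeps January 2022 and January 2023 apart, whereas grouping by the extracted month number would merge them.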
Conclusion
PostgreSQL's combination of features, performance, and reliability makes it a popular choice for a wide range of applications, from small projects to large-scale enterprise systems. Its open-source nature, strong community support, and continuous development ensure that PostgreSQL will remain a leading database management system for years to come. I hope this article helped you understand the basics of PostgreSQL and piqued your interest in learning more.