PostgreSQL CREATE TABLE AS

PostgreSQL, a powerful open-source relational database management system, offers various functionalities for efficient data handling and manipulation. Among these features, the CREATE TABLE AS (CTAS) statement stands out as a versatile tool for creating new tables based on existing data sets. 

This article delves into the intricacies of the PostgreSQL CREATE TABLE AS statement, its syntax, applications, and best practices.

Introduction to PostgreSQL CREATE TABLE AS Statement

The CREATE TABLE AS statement in PostgreSQL creates a new table and fills it with the result of a SELECT query, which may read from one or more existing tables. The new table takes its column names and data types from the query's output. This capability streamlines table creation and data transformation, facilitating tasks such as data aggregation, summarization, and denormalization.

Syntax and Usage

The syntax of the CREATE TABLE AS statement is straightforward:

CREATE TABLE new_table_name AS
SELECT column1, column2, ...
FROM existing_table_name
[WHERE condition];

This statement creates a new table named new_table_name with columns specified in the SELECT clause, populated with data retrieved from existing_table_name based on optional filtering conditions specified in the WHERE clause.
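Two optional parts of the statement are worth knowing about, sketched below: a column-name list after the table name, and IF NOT EXISTS. The table and column names are placeholders in the same spirit as the syntax above (new_column1 and new_column2 are made up for illustration), not objects defined in this article.

```sql
-- Optional column list: renames the query's output columns in the new table.
-- IF NOT EXISTS: skips creation (with a notice) when the table already exists.
CREATE TABLE IF NOT EXISTS new_table_name (new_column1, new_column2) AS
SELECT column1, column2
FROM existing_table_name
WHERE column1 IS NOT NULL;
```

Without the column list, the new table simply reuses the names produced by the SELECT clause, including any AS aliases.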

Create a New Table from an Existing Table

The simplest use of CREATE TABLE AS copies every column and row of an existing table into a new one. This is useful for tasks such as creating ad hoc backups, generating summary tables, or transforming data for a specific purpose without altering the original data.

CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department VARCHAR(50),
    salary NUMERIC(10, 2)
);

INSERT INTO employees (first_name, last_name, department, salary) VALUES
('John', 'Doe', 'Engineering', 60000),
('Jane', 'Smith', 'Marketing', 55000),
('Alice', 'Johnson', 'HR', 50000),
('Bob', 'Brown', 'Engineering', 62000);

-- Copy every column and row into a new table
CREATE TABLE employees_copy AS
SELECT *
FROM employees;

-- Inspect the copy
SELECT * FROM employees_copy;

This will produce a result set like the following:

employee_id | first_name | last_name | department   | salary
------------+------------+-----------+--------------+--------
1           | John       | Doe       | Engineering  | 60000.00
2           | Jane       | Smith     | Marketing    | 55000.00
3           | Alice      | Johnson   | HR           | 50000.00
4           | Bob        | Brown     | Engineering  | 62000.00

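The employees_copy table has the same column names, data types, and rows as the employees table. CREATE TABLE AS does not copy constraints, indexes, or column defaults, so employees_copy has no primary key and its employee_id column is a plain integer with no sequence attached.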
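CREATE TABLE AS carries over column names, types, and rows, but not constraints, indexes, or defaults. The sketch below shows two ways to get those back, assuming the employees and employees_copy tables from the example above; employees_clone is a hypothetical name used only here.

```sql
-- Option 1: add the primary key to the copy after the fact.
ALTER TABLE employees_copy
    ADD PRIMARY KEY (employee_id);

-- Option 2: copy the full table definition first, then the rows.
-- LIKE ... INCLUDING ALL copies defaults, constraints, and indexes.
CREATE TABLE employees_clone (LIKE employees INCLUDING ALL);

INSERT INTO employees_clone
SELECT * FROM employees;
```

With the second option, the cloned employee_id default points at the same sequence as the original table, which may or may not be what you want.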

Create a New Table with Selected Columns

Creating a new table with selected columns using the CREATE TABLE AS statement involves specifying the columns you want to include in the new table, along with the source table from which you're selecting those columns.

Here's an example of creating a new table named employee_names with two selected columns (first_name and last_name) from an existing table named employees. Let's assume we have an employees table with the following structure and data:

-- Existing table
CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department VARCHAR(50),
    salary NUMERIC(10, 2)
);

-- Sample data insertion
INSERT INTO employees (first_name, last_name, department, salary) VALUES
('John', 'Doe', 'Engineering', 60000),
('Jane', 'Smith', 'Marketing', 55000),
('Alice', 'Johnson', 'HR', 50000),
('Bob', 'Brown', 'Engineering', 62000);

-- Creating a new table with selected columns
CREATE TABLE employee_names AS
SELECT first_name, last_name
FROM employees;

SELECT * FROM employee_names;

This will produce a result set like the following:

first_name | last_name
-----------+----------
John       | Doe
Jane       | Smith
Alice      | Johnson
Bob        | Brown

This new table employee_names contains only the first_name and last_name columns copied from the employees table.

Create a New Table with Data Filtered by a Condition

Creating a new table with data filtered by a condition involves using the CREATE TABLE AS SELECT statement along with a WHERE clause to specify the condition for filtering the data. 

Let's consider an example where we have an existing table named 'sales' with the following structure:

CREATE TABLE sales (
    transaction_id SERIAL PRIMARY KEY,
    product_name VARCHAR(100),
    quantity INTEGER,
    amount NUMERIC(10, 2)
);

-- Sample data insertion
INSERT INTO sales (product_name, quantity, amount) VALUES
('Product A', 10, 500.00),
('Product B', 5, 250.00),
('Product C', 20, 800.00),
('Product D', 15, 300.00);

-- Keep only the rows with a quantity above 10
CREATE TABLE high_quantity_sales AS
SELECT *
FROM sales
WHERE quantity > 10;

SELECT * FROM high_quantity_sales;

This will produce a result set like the following:

transaction_id | product_name | quantity | amount
---------------+--------------+----------+--------
3              | Product C    | 20       | 800.00
4              | Product D    | 15       | 300.00

This new table high_quantity_sales contains only the rows from the sales table where the quantity is greater than 10, as per the filtering condition specified in the WHERE clause.

Create a New Table with Aggregated Data

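In PostgreSQL, you can create a new table with aggregated data by combining CREATE TABLE AS with aggregate functions such as SUM, AVG, and COUNT. The new table summarizes data from an existing table, typically grouped by one or more columns. Note that several examples in this article reuse table names such as sales, employees, and orders with different columns, so drop the earlier table (or use a fresh schema) before running a later example.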

CREATE TABLE sales (
    product_id INT,
    quantity INT,
    price NUMERIC(10, 2)
);

INSERT INTO sales (product_id, quantity, price) VALUES
(1, 10, 20.00),
(1, 5, 25.00),
(2, 8, 15.00),
(2, 12, 18.00),
(3, 15, 10.00),
(3, 20, 12.00);

-- Summarize quantity and revenue per product
CREATE TABLE product_summary AS
SELECT product_id,
       SUM(quantity) AS total_quantity,
       SUM(quantity * price) AS total_revenue
FROM sales
GROUP BY product_id;

SELECT * FROM product_summary;

This will produce a result set like the following:

 product_id | total_quantity | total_revenue 
------------+----------------+---------------
          3 |             35 |        390.00
          2 |             20 |        336.00
          1 |             15 |        325.00

In this example, the sales table holds a product_id, quantity, and price for each sale. The query groups the rows by product_id and computes the total quantity and total revenue per product, and CREATE TABLE AS stores that result in product_summary. Because the final SELECT has no ORDER BY clause, the rows may come back in a different order than shown.
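Other aggregate functions work the same way. The following sketch assumes the three-column sales table (product_id, quantity, price) from this section; product_stats and its column aliases are hypothetical names.

```sql
-- Count the sales rows and average the price per product.
CREATE TABLE product_stats AS
SELECT product_id,
       COUNT(*)             AS sale_count,
       ROUND(AVG(price), 2) AS avg_price
FROM sales
GROUP BY product_id;

-- Add ORDER BY when reading if a stable row order matters.
SELECT * FROM product_stats
ORDER BY product_id;
```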

Create a New Table with Joined Data

Creating a new table with joined data means combining rows from multiple tables on a shared key or condition and storing the result in a new table. This is often done to consolidate data from different sources or to denormalize data for faster reads. The query passed to CREATE TABLE AS can contain a JOIN clause, which specifies how the tables relate, typically through a common column or a defined condition.

CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    department_id INT
);

CREATE TABLE departments (
    department_id SERIAL PRIMARY KEY,
    department_name VARCHAR(100)
);

INSERT INTO employees (name, department_id) VALUES
('John Doe', 1),
('Jane Smith', 2),
('Alice Johnson', 1);

INSERT INTO departments (department_name) VALUES
('Sales'),
('Marketing');

-- Store each employee together with the name of their department
CREATE TABLE employee_department AS
SELECT e.employee_id, e.name AS employee_name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id;

SELECT * FROM employee_department;

This will produce a result set like the following:

 employee_id | employee_name | department_name 
-------------+---------------+-----------------
           1 | John Doe      | Sales
           2 | Jane Smith    | Marketing
           3 | Alice Johnson | Sales

In this example, we joined the employees table with the departments table based on the department_id column. The CREATE TABLE AS statement creates a new table employee_department containing the employee ID, name, and corresponding department name. The output shows the contents of the newly created table.
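The inner JOIN above drops any employee whose department_id has no match in departments. If those rows should be kept, a LEFT JOIN is one option. This is a sketch assuming the employees and departments tables from this section; employee_department_all and the 'Unassigned' label are made up for illustration.

```sql
-- Keep every employee, even when no matching department exists.
CREATE TABLE employee_department_all AS
SELECT e.employee_id,
       e.name AS employee_name,
       COALESCE(d.department_name, 'Unassigned') AS department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id;
```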

Create a New Table with Calculated Columns

Creating a new table with calculated columns involves deriving new values from expressions or operations on existing columns in one or more source tables. Calculated columns are useful for storing pre-computed data, performing data transformations, or simplifying complex queries. To define them, use expressions or functions in the SELECT list of the CREATE TABLE AS statement. These expressions can involve arithmetic operations, string manipulations, date calculations, or any other operation that SQL supports.

CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    unit_price NUMERIC(10, 2),
    quantity INT
);

INSERT INTO orders (unit_price, quantity) VALUES
(10.00, 5),
(15.50, 3),
(20.75, 2);

-- Add a total_price column computed from unit_price and quantity
CREATE TABLE order_totals AS
SELECT order_id,
       unit_price,
       quantity,
       unit_price * quantity AS total_price
FROM orders;

SELECT * FROM order_totals;

This will produce a result set like the following:

order_id | unit_price | quantity | total_price
---------+------------+----------+------------
1        | 10.00      | 5        | 50.00
2        | 15.50      | 3        | 46.50
3        | 20.75      | 2        | 41.50

In this example, we created a new table order_totals using the CREATE TABLE AS statement, where the total_price column is calculated by multiplying the unit_price by the quantity. The output shows the contents of the newly created table with the calculated columns.
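Calculated columns are not limited to arithmetic. The sketch below assumes the orders table from this section (order_id, unit_price, quantity); the table name order_labels, the 'ORD-' prefix, the 10% tax rate, and the bulk threshold of 5 are all assumptions chosen for illustration.

```sql
CREATE TABLE order_labels AS
SELECT order_id,
       -- String manipulation: build a readable order code
       'ORD-' || order_id::text AS order_code,
       -- Arithmetic with rounding: total including an assumed 10% tax
       ROUND(unit_price * quantity * 1.10, 2) AS total_with_tax,
       -- Conditional expression: classify the order by quantity
       CASE WHEN quantity >= 5 THEN 'bulk' ELSE 'standard' END AS order_size
FROM orders;
```

Each expression needs an alias (or an explicit column list after the table name) so that the new column gets a meaningful name.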

Create a New Table with Data Sorted

Creating a new table with data sorted in PostgreSQL involves inserting data from an existing table into a new table while specifying the order in which the data should appear. This can be achieved using the CREATE TABLE AS statement along with the ORDER BY clause to sort the data before it is inserted into the new table.

CREATE TABLE employees (
    employee_id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    salary NUMERIC(10, 2)
);

INSERT INTO employees (name, salary) VALUES
('John Doe', 50000.00),
('Jane Smith', 60000.00),
('Alice Johnson', 45000.00);

-- Write the rows to the new table in descending salary order
CREATE TABLE employees_sorted AS
SELECT *
FROM employees
ORDER BY salary DESC;

SELECT * FROM employees_sorted;

This will produce a result set like the following:

employee_id | name          | salary
------------+---------------+----------
2           | Jane Smith    | 60000.00
1           | John Doe      | 50000.00
3           | Alice Johnson | 45000.00

In this example, we created a new table employees_sorted using the CREATE TABLE AS statement, where the rows from the employees table are sorted by the salary column in descending order before they are written. Keep in mind that a table has no inherent row order: to be sure of getting the rows back sorted, include ORDER BY in the query that reads employees_sorted as well.
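A query without ORDER BY returns rows in an unspecified order, even when the table was populated from a sorted query. A minimal sketch of a dependable read, assuming the employees_sorted table created above:

```sql
-- Sort explicitly when reading; do not rely on insertion order.
SELECT *
FROM employees_sorted
ORDER BY salary DESC;
```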

Create a New Table with Limited Rows

Creating a new table with limited rows in PostgreSQL involves selecting a subset of rows from an existing table and inserting them into a new table. This can be achieved using the CREATE TABLE AS statement along with the LIMIT clause to specify the maximum number of rows to be copied.

CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    customer_id INT,
    order_date DATE
);

INSERT INTO orders (customer_id, order_date) VALUES
(1, '2023-01-15'),
(2, '2023-02-10'),
(1, '2023-03-20'),
(3, '2023-04-05'),
(2, '2023-05-12'),
(1, '2023-06-30');

-- Keep only the five most recent orders
CREATE TABLE recent_orders AS
SELECT *
FROM orders
ORDER BY order_date DESC
LIMIT 5;

SELECT * FROM recent_orders;

This will produce a result set like the following:

order_id  | customer_id | order_date
----------+-------------+------------
6         | 1           | 2023-06-30
5         | 2           | 2023-05-12
4         | 3           | 2023-04-05
3         | 1           | 2023-03-20
2         | 2           | 2023-02-10

In this example, we created a new table recent_orders using the CREATE TABLE AS statement, where the data from the orders table is copied and sorted by the order_date column in descending order. The LIMIT 5 clause ensures that only the most recent 5 orders are included in the new table. The output shows the contents of the newly created table with the limited rows.
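When several rows share the same order_date, ORDER BY order_date alone leaves it open which of them make the cut. One way to make the selection repeatable is to add a tie-breaker column. This sketch assumes the orders table from this section; latest_orders is a hypothetical name.

```sql
CREATE TABLE latest_orders AS
SELECT *
FROM orders
ORDER BY order_date DESC, order_id DESC  -- order_id breaks ties on order_date
LIMIT 5;
```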

Applications and Benefits

  1. Data Aggregation and Summarization: CTAS enables users to aggregate and summarize data from existing tables into new tables, facilitating analytical tasks such as generating reports, creating data marts, or building summary tables for performance optimization.
  2. Data Transformation and Cleansing: By selecting specific columns and applying transformations during the CTAS operation, users can cleanse and transform data according to their requirements, ensuring data integrity and consistency.
  3. Temporary Tables for Complex Queries: CTAS is often used to create temporary tables that store intermediate results of complex queries, improving query performance and simplifying query logic.
  4. Schema Management: CTAS can aid in schema management by allowing users to create new tables with specific schemas or structures based on existing tables, ensuring consistency and standardization across the database.
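Items 3 and 4 above map onto two variations of the statement, sketched here under the assumption that the orders table with customer_id and order_date from the LIMIT example exists; orders_per_customer and orders_template are hypothetical names.

```sql
-- Item 3: a temporary table holding an intermediate result.
-- It is dropped automatically at the end of the session.
CREATE TEMPORARY TABLE orders_per_customer AS
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;

-- Item 4: copy only the column names and types, without any rows.
CREATE TABLE orders_template AS
SELECT *
FROM orders
WITH NO DATA;
```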

Best Practices

  1. Optimize SELECT Queries: Before executing the CTAS statement, optimize the SELECT query to retrieve only the necessary columns and rows, minimizing resource consumption and improving performance.
  2. Consider Indexing: CTAS does not copy indexes from the source tables, so the new table starts with none. Evaluate the need for indexes based on query and access patterns; proper indexing can enhance query performance but adds write and storage overhead.
  3. Transaction Management: In PostgreSQL, CREATE TABLE AS is transactional, so a table created inside a transaction that is rolled back does not persist. The query sees data according to the transaction's isolation level, so choose that level with data consistency in mind.
  4. Security Considerations: Grant appropriate permissions on the newly created table to ensure data security and integrity. Limit access to authorized users or roles based on the principle of least privilege.
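The practices above might be combined roughly as follows. This is a sketch, not a prescribed workflow: it assumes the employees table with employee_id, name, and salary from the sorting example, and the names high_salary_employees, high_salary_employees_salary_idx, and reporting_role are hypothetical (the role must already exist for the GRANT to succeed).

```sql
BEGIN;

-- Select only the columns and rows that are needed.
CREATE TABLE high_salary_employees AS
SELECT employee_id, name, salary
FROM employees
WHERE salary >= 50000;

-- The new table has no indexes; add one if queries filter on salary.
CREATE INDEX high_salary_employees_salary_idx
    ON high_salary_employees (salary);

-- Refresh planner statistics for the new table.
ANALYZE high_salary_employees;

-- Grant read-only access to a specific role.
GRANT SELECT ON high_salary_employees TO reporting_role;

COMMIT;
```

If any statement fails, issuing ROLLBACK instead of COMMIT leaves the database without the partially set up table.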

Conclusion

The PostgreSQL CREATE TABLE AS statement offers a concise way to create new tables from existing data. By understanding its syntax, applications, and best practices, users can apply CTAS to copy, filter, aggregate, join, and transform data, and to build summary tables that speed up later queries.