SQL Joins: A Comprehensive Guide

Introduction

In the realm of relational databases, understanding how to combine data from multiple tables is a fundamental skill. SQL Joins provide a powerful mechanism for fetching and consolidating information spread across different tables. This comprehensive guide aims to demystify SQL Joins, offering a step-by-step exploration from the basics to advanced techniques.

JOIN, INNER JOIN, RIGHT JOIN, FULL JOIN, CROSS JOIN operations:

  • A JOIN clause is used to combine rows from two or more tables, based on a related column between them.
  • Different types of Joins:
    • INNER JOIN: Returns records that have matching values in both tables. In a CROSS JOIN, there is no specific condition for joining; it simply forms a combination of all possible pairs of rows from the involved tables.
    • LEFT JOIN: Returns all records from the left table, and the matched records from the right table
    • RIGHT JOIN: Returns all records from the right table, and the matched records from the left table
    • FULL JOIN: Returns all records when there is a match in either left or right table
    • CROSS JOIN: The CROSS JOIN keyword matches ALL records from the "left" table with EACH record from the "right" table.
    • FULL OUTER JOIN EXCLUDING INNER JOIN: A FULL OUTER JOIN returns all rows from both tables and includes matching rows. The "EXCLUDING INNER JOIN" part suggests that you want to exclude rows that have a match (i.e., rows that would be part of an INNER JOIN) from the result set.
    • UNION: The UNION operator is used to combine the result-set of two or more queries. The queries in the union must follow these rules:
      • They must have the same number of columns
      • The columns must have the same data types
      • The columns must be in the same order
    • UNION ALL: With the UNION operator, if some rows in the two queries returns the exact same result, only one row will be listed, because UNION selects only distinct values. Use UNION ALL to return duplicate values.
    • FULL JOIN:

                SELECT product_id, product_name, category_name
                FROM products
                INNER JOIN categories ON products.category_id = categories.category_id;
                -- Join products to categories using the category_id column --

                SELECT testproduct_id, product_name, category_name
                FROM testproducts
                INNER JOIN categories ON testproducts.category_id = categories.category_id;
                -- Join testproducts to categories using the category_id column --

                SELECT testproduct_id, product_name, category_name
                FROM testproducts
                FULL JOIN categories ON testproducts.category_id = categories.category_id;

                SELECT testproduct_id, product_name, category_name
                FROM testproducts
                CROSS JOIN categories;
                -- Join testproducts to categories using the CROSS JOIN keyword --

                SELECT product_id, product_name
                FROM products
                UNION
                SELECT testproduct_id, product_name
                FROM testproducts
                ORDER BY product_id;
                -- Combine products and testproducts using the UNION operator --

                SELECT product_id
                FROM products
                UNION
                SELECT testproduct_id
                FROM testproducts
                ORDER BY product_id;
                -- 
            
  • GROUP BY: The GROUP BY clause groups rows that have the same values into summary rows.

    
                    SELECT COUNT(customer_id), country
                    FROM customers
                    GROUP BY country;
                    -- Lists the number of customers in each country -- 
    
                    SELECT customers.customer_name, COUNT(orders.order_id)
                    FROM orders
                    LEFT JOIN customers ON orders.customer_id = customers.customer_id
                    GROUP BY customer_name;
                    -- The following SQL statement lists the number of orders made by each customer --
                
  • HAVING: The HAVING clause was added to SQL because the WHERE clause cannot be used with aggregate functions

    
                    SELECT COUNT(customer_id), country
                    FROM customers
                    GROUP BY country
                    HAVING COUNT(customer_id) > 5;
                    -- List only countries that are represented more than 5 times --
                    
                    SELECT order_details.order_id, SUM(products.price)
                    FROM order_details
                    LEFT JOIN products ON order_details.product_id = products.product_id
                    GROUP BY order_id
                    HAVING SUM(products.price) > 400.00;
                    -- lists only orders with a total price of 400$ or more --
                
  • EXISTS: The EXISTS operator is used to test for the existence of any record in a sub query.

    
                    SELECT customers.customer_name
                    FROM customers
                    WHERE EXISTS (
                      SELECT order_id
                      FROM orders
                      WHERE customer_id = customers.customer_id
                    );
                    -- Return all customers that is represented in the orders table --
                
  • NOT EXISTS: To check which customers that do not have any orders, we can use the NOT operator together with the EXISTS operator

    
                    SELECT customers.customer_name
                    FROM customers
                    WHERE NOT EXISTS (
                      SELECT order_id
                      FROM orders
                      WHERE customer_id = customers.customer_id
                    );
                    -- Return all customers that is NOT represented in the orders table --
                
  • ANY: The ANY operator allows you to perform a comparison between a single column value and a range of other values.

    
                    SELECT product_name
                    FROM products
                    WHERE product_id = ANY (
                    SELECT product_id
                    FROM order_details
                    WHERE quantity > 120
                    );
                    -- List products that have ANY records in the order_details table with a quantity larger than 120 --
                
  • ALL: The ALL operator:

    • returns a Boolean value as a result
    • returns TRUE if ALL of the sub query values meet the condition
    • is used with SELECT, WHERE and HAVING statements

    
                    SELECT product_name
                    FROM products
                    WHERE product_id = ALL (
                      SELECT product_id
                      FROM order_details
                      WHERE quantity > 10
                    );
                    -- List the products if ALL the records in the order_details with quantity larger than 10. --
                
  • CASE: The CASE expression goes through conditions and returns a value when the first condition is met (like an if-then-else statement). Once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the ELSE clause. If there is no ELSE part and no conditions are true, it returns NULL.

    
                    SELECT product_name,
                    CASE
                      WHEN price < 10 THEN 'Low price product'
                      WHEN price > 50 THEN 'High price product'
                    ELSE
                      'Normal product'
                    END
                    FROM products;
                    -- Return specific values if the price meets a specific condition --
    
                    SELECT product_name,
                    CASE
                        WHEN price < 10 THEN 'Low price product'
                        WHEN price > 50 THEN 'High price product'
                    ELSE
                        'Normal product'
                    END AS "price category"
                    FROM products;
                    -- Same example, but with an alias for the case column --
                
  • References


    Some other interesting things to know: