Practice Topics
- Advanced Joins
- Subqueries
- Set Operations
Advanced Joins
Overview
In this Week 2 post, you were introduced to a basic join. In that post, an INNER JOIN
was used. In this post, you will learn about the difference between INNER JOIN
and LEFT JOIN
. These are the only two types of joins that you will need to understand.
INNER JOIN: An INNER JOIN
will successfully display rows and fields from two tables when they each contain matching values found in the field(s) used to connect both tables together. Tables are connected by specifying common fields in the join condition within the FROM
statement.
Let’s revisit the example where we display customer FirstName
and LastName
from the AdventureWorks2019.Person.Person
table alongside the CustomerID
and TotalDue
from the AdventureWorks2019.Sales.SalesOrderHeader
table. Reminder, we have to leverage the AdventureWorks2019.Sales.Customer
table as a bridge between the Person
and SalesOrderHeader
tables because there isn’t a common field that connects the SalesOrderHeader
and Person
tables.
In the event that a CustomerID
found in the SalesOrderHeader
table is missing in the Customer
table, the entire record spanning the SalesOrderHeader
, Customer
, and Person
tables would be eliminated from the query results. This is due to the fact that INNER JOINs
require matching values to be present across all connected tables.
LEFT JOIN: A LEFT JOIN
will always display rows and fields from the first table listed in the FROM
statement (ie. the “LEFT” table). If the second table listed in the FROM
statement (ie. the “RIGHT” table) contains a matching value in the common field(s) specified in the join, then rows and fields from the “RIGHT” table will also be successfully displayed. Similar to an INNER JOIN
, the common field(s) connecting two tables together are specified in the join condition within the FROM
statement.
Let’s again revisit the example where we display customer FirstName
and LastName
from the AdventureWorks2019.Person.Person
table alongside the CustomerID
and TotalDue
from the AdventureWorks2019.Sales.SalesOrderHeader
table. Reminder, we have to leverage the AdventureWorks2019.Sales.Customer
table as a bridge between the Person
and SalesOrderHeader
tables because there isn’t a common field that connects the SalesOrderHeader
and Person
tables.
If a CustomerID
found in the SalesOrderHeader
table is missing in the Customer
table, we will still be able to display the record from the SalesOrderHeader
table. However, we will not be able to display the corresponding record from the Customer
table. We will see the rows and fields in the “LEFT” table because LEFT JOINs
do not require matching values to be present in both connected tables.
Example: List the ProductID
and ProductName
fields from the AdventureWorks2019.Production.Product
table along with the Rating
and Comments
fields from the AdventureWorks2019.Production.ProductReview
table. Please only display products that have a review.
SELECT a.ProductID, a.Name, b.Rating, b.Comments FROM AdventureWorks2019.Production.Product a INNER JOIN AdventureWorks2019.Production.ProductReview b ON a.ProductID = b.ProductID;
Example: List the ProductID
and ProductName
fields from the AdventureWorks2019.Production.Product
table along with the Rating
and Comments
fields from the AdventureWorks2019.Production.ProductReview
table. Please display all products, even if they don’t have a review.
SELECT a.ProductID, a.Name, b.Rating, b.Comments FROM AdventureWorks2019.Production.Product a LEFT JOIN AdventureWorks2019.Production.ProductReview b ON a.ProductID = b.ProductID;
Example: List the ProductID
and ProductName
fields from the AdventureWorks2019.Production.Product
table that do not have a review in the AdventureWorks2019.Production.ProductReview
table. Hint: Use a LEFT JOIN
, but add a condition in the WHERE
statement that only retains records when the Rating
field IS NULL
. I call this a negative join
because you are using a LEFT JOIN
to identify records from your “LEFT” table that do not have a corresponding record in the “RIGHT” table.
SELECT a.ProductID, a.Name, b.Rating, b.Comments FROM AdventureWorks2019.Production.Product a LEFT JOIN AdventureWorks2019.Production.ProductReview b ON a.ProductID = b.ProductID WHERE B.Rating IS NULL;
Self Practice
- Write a query to perform an
INNER JOIN
betweenAdventureWorks2019.Sales.SalesOrderHeader
andAdventureWorks2019.Sales.Customer
onCustomerID
and selectSalesOrderID
,OrderDate
, andCustomerID
. - Write a query to perform a
LEFT JOIN
betweenAdventureWorks2019.Production.Product
andAdventureWorks2019.Production.ProductReview
onProductID
and select all columns fromProduct
andComments
fromProductReview
.
SELECT soh.SalesOrderID, soh.OrderDate, soh.CustomerID FROM AdventureWorks2019.Sales.SalesOrderHeader soh INNER JOIN AdventureWorks2019.Sales.Customer c ON soh.CustomerID = c.CustomerID;
SELECT p.*, pr.Comments FROM AdventureWorks2019.Production.Product p LEFT JOIN AdventureWorks2019.Production.ProductReview pr ON p.ProductID = pr.ProductID;
Subqueries
Overview
Sometimes, you need to embed a query within a query. This is called a subquery
. There are several reasons for this:
- Occasionally, you need to create dynamic lookups based on live data within your database.
- Sometimes, you need to return the latest record within a defined group. For example, maybe you want to identify what a customer ordered on his/her most recent purchase.
- When creating a metric summary, you may need to mix data at different levels of granularity. For example, you may need to calculate the distribution of sales for each product.
- More complex queries often require logic to be processed sequentially. In other words, you may need to preprocess data before aggregating it.
subquery
can be used in the SELECT
, FROM
, and WHERE
statements. Most of the time, it is used in the FROM
statement and the output of a subquery
is treated exactly like a table. This means that you can SELECT
fields from the results of a subquery
. You can even join
other tables to the results of a subquery. In the “Self-Practice” section, you will practice using subqueries
in the WHERE
statement.
Example: Using the AdventureWorks2019.Sales.SalesOrderDetail
table, calculate the % of Total Sales
for each ProductID
per SalesOrderID
. Total sales can be determined by using aggregation within a subquery
. In the output, include the Name
from the AdventureWorks2019.Production.Product
table and give it an alias name of ProductName
. You haven’t yet learned how to do math in SQL. To determine the % of Total Sales
, you can divide the LineTotal
by the TotalSales
that you will derive in your subquery
. Division works exactly the same as it does in Excel; use /
.
SELECT a.SalesOrderID, a.ProductID, b.Name as ProductName, c.TotalSales, a.LineTotal / c.TotalSales as "Percent of Total Sales" FROM AdventureWorks2019.Sales.SalesOrderDetail a INNER JOIN AdventureWorks2019.Production.Product b ON a.ProductID = b.ProductID INNER JOIN ( SELECT SalesOrderID, SUM(LineTotal) as TotalSales FROM AdventureWorks2019.Sales.SalesOrderDetail GROUP BY SalesOrderID ) c ON a.SalesOrderID = c.SalesOrderID;
Self Practice
- Write a query to select
FirstName
andLastName
from theAdventureWorks2019.Person.Person
table whereBusinessEntityID
is in a subquery that selectsBusinessEntityID
from theAdventureWorks2019.HumanResources.Employee
table. - Write a query to select all products from the
AdventureWorks2019.Production.Product
table whereListPrice
is greater than the averageListPrice
of all products. - Write a query that returns every product
Name
purchased for each customer’s most recent order. Use theMAX(SalesOrderID)
for eachCustomerID
to identify each customer’s most recent order. Display theSalesOrderID
,CustomerID
, and productName
SELECT FirstName, LastName FROM AdventureWorks2019.Person.Person WHERE BusinessEntityID IN ( SELECT BusinessEntityID FROM AdventureWorks2019.HumanResources.Employee );
SELECT * FROM AdventureWorks2019.Production.Product WHERE ListPrice > ( SELECT AVG(ListPrice) FROM AdventureWorks2019.Production.Product );
SELECT a.SalesOrderID, c.CustomerID, b.Name as ProductName FROM AdventureWorks2019.Sales.SalesOrderDetail a INNER JOIN AdventureWorks2019.Production.Product b ON a.ProductID = b.ProductID INNER JOIN ( SELECT CustomerID, MAX(SalesOrderID) as LastOrderID FROM AdventureWorks2019.Sales.SalesOrderHeader GROUP BY CustomerID ) c ON a.SalesOrderID = c.LastOrderID;
Set Operations
Overview
The UNION
operator in SQL is used to combine the results of two or more SELECT
queries into a single result set. The key points about UNION
are:
- Distinct Results:
UNION
removes duplicate rows from the combined result set. If you want to include duplicates, you can useUNION ALL
. - Same Number of Columns: Each
SELECT
query within theUNION
must have the same number of columns in the result sets. - Compatible Data Types: The columns must have compatible data types in each
SELECT
query.
Example: Create a BusinessEntityID
list of people with contact information stored in either the AdventureWorks2019.Person.Address
or AdventureWorks2019.Person.EmailAddress
tables. Ensure that the list is unique.
SELECT BusinessEntityID FROM AdventureWorks2019.Person.BusinessEntityAddress UNION SELECT BusinessEntityID FROM AdventureWorks2019.Person.EmailAddress;
The INTERSECT
operator in SQL is used to return only the rows that are common to the result sets of two SELECT
queries. The key points about INTERSECT
are:
- Distinct Results:
INTERSECT
automatically removes duplicate rows from the result set. - Same Number of Columns: Each
SELECT
query within theINTERSECT
must have the same number of columns in the result sets. - Compatible Data Types: The columns must have compatible data types in each
SELECT
query.
Example: Find common CustomerID
values from AdventureWorks2019.Sales.SalesOrderHeader
and BusinessEntityID
values from AdventureWorks2019.Person.BusinessEntityAddress
to see which active customers have an address on file.
SELECT CustomerID FROM AdventureWorks2019.Sales.SalesOrderHeader INTERSECT SELECT BusinessEntityID AS CustomerID FROM AdventureWorks2019.Person.BusinessEntityAddress;
Self Practice
- Write a query to find all
BusinessEntityID
that are either in theAdventureWorks2019.HumanResources.Employee
table or theAdventureWorks2019.Sales.Customer
table. - Write a query to find all
ProductID
that are in both theAdventureWorks2019.Production.Product
table and theAdventureWorks2019.Sales.SalesOrderDetail
table.
SELECT BusinessEntityID FROM AdventureWorks2019.HumanResources.Employee UNION SELECT CustomerID AS BusinessEntityID FROM AdventureWorks2019.Sales.Customer;
SELECT ProductID FROM AdventureWorks2019.Production.Product INTERSECT SELECT ProductID FROM AdventureWorks2019.Sales.SalesOrderDetail;
Conclusion
As we wrap up this week’s SQL homework assignments, we’ve delved into even more advanced aspects of SQL, including complex joining
, subqueries
, and set operations
. These exercises are designed to build your confidence and proficiency in SQL, equipping you with the skills necessary for more advanced queries in the future.
Remember, consistent practice is key to mastering SQL. If you find yourself needing extra help or wanting to deepen your understanding, don’t hesitate to sign up for our private or group training sessions. These sessions offer personalized guidance and a collaborative learning environment to further enhance your SQL skills. Stay tuned for next week’s assignments, and keep pushing your boundaries as you continue on your journey to becoming an SQL expert!