Practice Topics
- Advanced Joins
- Subqueries
- Set Operations
Advanced Joins
Overview
In this Week 2 post, you were introduced to a basic join. In that post, an INNER JOIN was used. In this post, you will learn about the difference between INNER JOIN and LEFT JOIN. These are the only two types of joins that you will need to understand.
INNER JOIN: An INNER JOIN will successfully display rows and fields from two tables when they each contain matching values found in the field(s) used to connect both tables together. Tables are connected by specifying common fields in the join condition within the FROM statement.
Let’s revisit the example where we display customer FirstName and LastName from the AdventureWorks2019.Person.Person table alongside the CustomerID and TotalDue from the AdventureWorks2019.Sales.SalesOrderHeader table. Reminder, we have to leverage the AdventureWorks2019.Sales.Customer table as a bridge between the Person and SalesOrderHeader tables because there isn’t a common field that connects the SalesOrderHeader and Person tables.
In the event that a CustomerID found in the SalesOrderHeader table is missing in the Customer table, the entire record spanning the SalesOrderHeader, Customer, and Person tables would be eliminated from the query results. This is due to the fact that INNER JOINs require matching values to be present across all connected tables.
LEFT JOIN: A LEFT JOIN will always display rows and fields from the first table listed in the FROM statement (ie. the “LEFT” table). If the second table listed in the FROM statement (ie. the “RIGHT” table) contains a matching value in the common field(s) specified in the join, then rows and fields from the “RIGHT” table will also be successfully displayed. Similar to an INNER JOIN, the common field(s) connecting two tables together are specified in the join condition within the FROM statement.
Let’s again revisit the example where we display customer FirstName and LastName from the AdventureWorks2019.Person.Person table alongside the CustomerID and TotalDue from the AdventureWorks2019.Sales.SalesOrderHeader table. Reminder, we have to leverage the AdventureWorks2019.Sales.Customer table as a bridge between the Person and SalesOrderHeader tables because there isn’t a common field that connects the SalesOrderHeader and Person tables.
If a CustomerID found in the SalesOrderHeader table is missing in the Customer table, we will still be able to display the record from the SalesOrderHeader table. However, we will not be able to display the corresponding record from the Customer table. We will see the rows and fields in the “LEFT” table because LEFT JOINs do not require matching values to be present in both connected tables.
Example: List the ProductID and ProductName fields from the AdventureWorks2019.Production.Product table along with the Rating and Comments fields from the AdventureWorks2019.Production.ProductReview table. Please only display products that have a review.
SELECT a.ProductID, a.Name, b.Rating, b.Comments FROM AdventureWorks2019.Production.Product a INNER JOIN AdventureWorks2019.Production.ProductReview b ON a.ProductID = b.ProductID;
Example: List the ProductID and ProductName fields from the AdventureWorks2019.Production.Product table along with the Rating and Comments fields from the AdventureWorks2019.Production.ProductReview table. Please display all products, even if they don’t have a review.
SELECT a.ProductID, a.Name, b.Rating, b.Comments FROM AdventureWorks2019.Production.Product a LEFT JOIN AdventureWorks2019.Production.ProductReview b ON a.ProductID = b.ProductID;
Example: List the ProductID and ProductName fields from the AdventureWorks2019.Production.Product table that do not have a review in the AdventureWorks2019.Production.ProductReview table. Hint: Use a LEFT JOIN, but add a condition in the WHERE statement that only retains records when the Rating field IS NULL. I call this a negative join because you are using a LEFT JOIN to identify records from your “LEFT” table that do not have a corresponding record in the “RIGHT” table.
SELECT a.ProductID, a.Name, b.Rating, b.Comments FROM AdventureWorks2019.Production.Product a LEFT JOIN AdventureWorks2019.Production.ProductReview b ON a.ProductID = b.ProductID WHERE B.Rating IS NULL;
Self Practice
- Write a query to perform an
INNER JOINbetweenAdventureWorks2019.Sales.SalesOrderHeaderandAdventureWorks2019.Sales.CustomeronCustomerIDand selectSalesOrderID,OrderDate, andCustomerID. - Write a query to perform a
LEFT JOINbetweenAdventureWorks2019.Production.ProductandAdventureWorks2019.Production.ProductReviewonProductIDand select all columns fromProductandCommentsfromProductReview.
SELECT soh.SalesOrderID, soh.OrderDate, soh.CustomerID FROM AdventureWorks2019.Sales.SalesOrderHeader soh INNER JOIN AdventureWorks2019.Sales.Customer c ON soh.CustomerID = c.CustomerID;
SELECT p.*, pr.Comments FROM AdventureWorks2019.Production.Product p LEFT JOIN AdventureWorks2019.Production.ProductReview pr ON p.ProductID = pr.ProductID;
Subqueries
Overview
Sometimes, you need to embed a query within a query. This is called a subquery. There are several reasons for this:
- Occasionally, you need to create dynamic lookups based on live data within your database.
- Sometimes, you need to return the latest record within a defined group. For example, maybe you want to identify what a customer ordered on his/her most recent purchase.
- When creating a metric summary, you may need to mix data at different levels of granularity. For example, you may need to calculate the distribution of sales for each product.
- More complex queries often require logic to be processed sequentially. In other words, you may need to preprocess data before aggregating it.
subquery can be used in the SELECT, FROM, and WHERE statements. Most of the time, it is used in the FROM statement and the output of a subquery is treated exactly like a table. This means that you can SELECT fields from the results of a subquery. You can even join other tables to the results of a subquery. In the “Self-Practice” section, you will practice using subqueries in the WHERE statement.
Example: Using the AdventureWorks2019.Sales.SalesOrderDetail table, calculate the % of Total Sales for each ProductID per SalesOrderID. Total sales can be determined by using aggregation within a subquery. In the output, include the Name from the AdventureWorks2019.Production.Product table and give it an alias name of ProductName. You haven’t yet learned how to do math in SQL. To determine the % of Total Sales, you can divide the LineTotal by the TotalSales that you will derive in your subquery. Division works exactly the same as it does in Excel; use /.
SELECT a.SalesOrderID, a.ProductID, b.Name as ProductName, c.TotalSales, a.LineTotal / c.TotalSales as "Percent of Total Sales" FROM AdventureWorks2019.Sales.SalesOrderDetail a INNER JOIN AdventureWorks2019.Production.Product b ON a.ProductID = b.ProductID INNER JOIN ( SELECT SalesOrderID, SUM(LineTotal) as TotalSales FROM AdventureWorks2019.Sales.SalesOrderDetail GROUP BY SalesOrderID ) c ON a.SalesOrderID = c.SalesOrderID;
Self Practice
- Write a query to select
FirstNameandLastNamefrom theAdventureWorks2019.Person.Persontable whereBusinessEntityIDis in a subquery that selectsBusinessEntityIDfrom theAdventureWorks2019.HumanResources.Employeetable. - Write a query to select all products from the
AdventureWorks2019.Production.Producttable whereListPriceis greater than the averageListPriceof all products. - Write a query that returns every product
Namepurchased for each customer’s most recent order. Use theMAX(SalesOrderID)for eachCustomerIDto identify each customer’s most recent order. Display theSalesOrderID,CustomerID, and productName
SELECT FirstName, LastName FROM AdventureWorks2019.Person.Person WHERE BusinessEntityID IN ( SELECT BusinessEntityID FROM AdventureWorks2019.HumanResources.Employee );
SELECT * FROM AdventureWorks2019.Production.Product WHERE ListPrice > ( SELECT AVG(ListPrice) FROM AdventureWorks2019.Production.Product );
SELECT a.SalesOrderID, c.CustomerID, b.Name as ProductName FROM AdventureWorks2019.Sales.SalesOrderDetail a INNER JOIN AdventureWorks2019.Production.Product b ON a.ProductID = b.ProductID INNER JOIN ( SELECT CustomerID, MAX(SalesOrderID) as LastOrderID FROM AdventureWorks2019.Sales.SalesOrderHeader GROUP BY CustomerID ) c ON a.SalesOrderID = c.LastOrderID;
Set Operations
Overview
The UNION operator in SQL is used to combine the results of two or more SELECT queries into a single result set. The key points about UNION are:
- Distinct Results:
UNIONremoves duplicate rows from the combined result set. If you want to include duplicates, you can useUNION ALL. - Same Number of Columns: Each
SELECTquery within theUNIONmust have the same number of columns in the result sets. - Compatible Data Types: The columns must have compatible data types in each
SELECTquery.
Example: Create a BusinessEntityID list of people with contact information stored in either the AdventureWorks2019.Person.Address or AdventureWorks2019.Person.EmailAddress tables. Ensure that the list is unique.
SELECT BusinessEntityID FROM AdventureWorks2019.Person.BusinessEntityAddress UNION SELECT BusinessEntityID FROM AdventureWorks2019.Person.EmailAddress;
The INTERSECT operator in SQL is used to return only the rows that are common to the result sets of two SELECT queries. The key points about INTERSECT are:
- Distinct Results:
INTERSECTautomatically removes duplicate rows from the result set. - Same Number of Columns: Each
SELECTquery within theINTERSECTmust have the same number of columns in the result sets. - Compatible Data Types: The columns must have compatible data types in each
SELECTquery.
Example: Find common CustomerID values from AdventureWorks2019.Sales.SalesOrderHeader and BusinessEntityID values from AdventureWorks2019.Person.BusinessEntityAddress to see which active customers have an address on file.
SELECT CustomerID FROM AdventureWorks2019.Sales.SalesOrderHeader INTERSECT SELECT BusinessEntityID AS CustomerID FROM AdventureWorks2019.Person.BusinessEntityAddress;
Self Practice
- Write a query to find all
BusinessEntityIDthat are either in theAdventureWorks2019.HumanResources.Employeetable or theAdventureWorks2019.Sales.Customertable. - Write a query to find all
ProductIDthat are in both theAdventureWorks2019.Production.Producttable and theAdventureWorks2019.Sales.SalesOrderDetailtable.
SELECT BusinessEntityID FROM AdventureWorks2019.HumanResources.Employee UNION SELECT CustomerID AS BusinessEntityID FROM AdventureWorks2019.Sales.Customer;
SELECT ProductID FROM AdventureWorks2019.Production.Product INTERSECT SELECT ProductID FROM AdventureWorks2019.Sales.SalesOrderDetail;
Conclusion
As we wrap up this week’s SQL homework assignments, we’ve delved into even more advanced aspects of SQL, including complex joining, subqueries, and set operations. These exercises are designed to build your confidence and proficiency in SQL, equipping you with the skills necessary for more advanced queries in the future.
Remember, consistent practice is key to mastering SQL. If you find yourself needing extra help or wanting to deepen your understanding, don’t hesitate to sign up for our private or group training sessions. These sessions offer personalized guidance and a collaborative learning environment to further enhance your SQL skills. Stay tuned for next week’s assignments, and keep pushing your boundaries as you continue on your journey to becoming an SQL expert!