Performance tuning - Nested and Merge Loops with Execution Plans

SQLShack


April 2, 2018 by Thomas LeBlanc

In this article, we will explore Nested and Merge SQL Loops in the SQL execution plan from a performance tuning view. Even though reading execution plans is technical, it is more an art than a science. The main iterator used when joining tables is a Loop. Nested and Merge loops are two of the most common. A plan can even have a loop without joining tables, when a Seek needs a Lookup to find additional columns.

This art of reading execution plan loops can help with performance tuning and debugging T-SQL. Once over the hump of reading a plan, going from beginner to intermediate is simple.

The first loop to look at is the Nested SQL Loop. Figure 1 is a Nested Loop from the INNER JOIN of tables SalesOrderHeader and Customer in the AdventureWorks database.

Figure 1: Execution plan in ApexSQL Plan

The actual T-SQL is in the Code 1 example below. This example shows a Clustered Index Seek finding one row in the Customer table from the WHERE clause CustomerID = 11091.

    SELECT cust.CustomerID, soh.SalesOrderID
    FROM Sales.Customer cust
    INNER JOIN Sales.SalesOrderHeader soh
        ON soh.CustomerID = cust.CustomerID
    WHERE cust.CustomerID = 11091

Code 1: T-SQL for joining 2 tables

The WHERE clause in the T-SQL filters on the key of the Customer table. The Clustered Index Seek in Figure 1 returns one row for the customer from the PK_Customer_CustomerID index. This is the Primary Key (and clustered index) of the Sales.Customer table. Figure 2 shows the detailed properties of the Seek iterator. These properties include Cost, Rows, descriptions and many others that are helpful. The cost is separated into I/O, CPU, Subtree and Operator.

Figure 2: Clustered index seek for primary key of customer table

The one row from the Customer seek is then passed to the Nested SQL Loop to find the data in the joined table SalesOrderHeader. The outer part of the loop is where the data from the Clustered Index Seek is processed.

In this case, there is only one row to traverse the outer loop. The inner loop takes each value from the outer SQL loop and processes more information. The SalesOrderHeader table is joined and supplies the SalesOrderID column requested in the SELECT statement. The inner iterator in Figure 1 is an Index Seek on the index IX_SalesOrderHeader_CustomerID of the SalesOrderHeader table. This is a non-clustered index seek to find the rows related to that one customer. It can use the CustomerID index because SalesOrderID is also stored in that index, and it is stored there because SalesOrderID is the Clustered Index (Primary Key) of the SalesOrderHeader table. The script for both indexes is below in Code 2.

    ALTER TABLE [Sales].[SalesOrderHeader]
    ADD CONSTRAINT [PK_SalesOrderHeader_SalesOrderID] PRIMARY KEY CLUSTERED
    (
        [SalesOrderID] ASC
    );
    GO

    CREATE NONCLUSTERED INDEX [IX_SalesOrderHeader_CustomerID]
    ON [Sales].[SalesOrderHeader]
    (
        [CustomerID] ASC
    );

Code 2: SalesOrderHeader Primary Key and Non-Clustered Index IX_SalesOrderHeader_CustomerID

Initially, some people are confused that the SalesOrderID column is not visible in the CREATE statement of the non-clustered index IX_SalesOrderHeader_CustomerID. An understanding of clustered indexes is needed. When a non-clustered index is created on a table that has a clustered index, SQL Server stores the clustered index key column(s) in the non-clustered index so that it can look up the rest of the row. It makes a lot of sense once this is understood. If there is no clustered index on the table (a heap), SQL Server instead stores a row identifier (RID), an internal id that uniquely identifies a row in a data page, and uses a RID Lookup.

If other columns are added to the T-SQL SELECT, such as AccountNumber and OrderDate, the plan changes because the non-clustered index used in Figure 1 does not contain these values.
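
One way to confirm which columns an index declares is to query the catalog views. The following is a minimal sketch, assuming the AdventureWorks sample database and the index names from Code 2; sys.indexes, sys.index_columns and sys.columns are standard SQL Server catalog views, and the column aliases are only illustrative.

```sql
-- Assumes the AdventureWorks sample database used throughout this article.
-- Lists the declared key and included columns of the two indexes from Code 2.
SELECT i.name      AS index_name,
       i.type_desc AS index_type,
       c.name      AS column_name,
       ic.key_ordinal,
       ic.is_included_column
FROM sys.indexes AS i
INNER JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id
   AND ic.index_id  = i.index_id
INNER JOIN sys.columns AS c
    ON c.object_id = ic.object_id
   AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'Sales.SalesOrderHeader')
  AND i.name IN (N'PK_SalesOrderHeader_SalesOrderID',
                 N'IX_SalesOrderHeader_CustomerID')
ORDER BY i.name, ic.key_ordinal;
```

Given the definitions in Code 2, this should show CustomerID as the only declared column of IX_SalesOrderHeader_CustomerID. The clustered key that SQL Server carries along in the non-clustered index is generally not listed by these views, which is part of why its presence surprises people.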

    SELECT cust.CustomerID, soh.SalesOrderID, soh.AccountNumber, soh.OrderDate
    FROM Sales.Customer cust
    INNER JOIN Sales.SalesOrderHeader soh
        ON soh.CustomerID = cust.CustomerID
    WHERE cust.CustomerID = 11091

Code 3: Additional columns for SalesOrderHeader

Figure 3 shows the new plan with an additional Nested Loop to get the new columns from a lookup on the Clustered Index of table SalesOrderHeader. What was a single Index Seek is now an Index Seek plus a Nested Loop that fetches the additional columns through a Key Lookup on the clustered index.

Figure 3: New execution plan

Figure 3 shows that for each of the 28 rows found in index IX_SalesOrderHeader_CustomerID, a row is retrieved from the Clustered Index PK_SalesOrderHeader with a Key Lookup iterator. If the Key Lookup on the Clustered Index costs too much, a covering index could be created to improve performance. Code 4 shows a covering index that would help this query.

    CREATE NONCLUSTERED INDEX [IX_SalesOrderHeader_CustomerID_IncludeAcctNumOrderDate]
    ON [Sales].[SalesOrderHeader]
    ( [CustomerID] ASC )
    INCLUDE (AccountNumber, OrderDate)

Code 4: Covering index for T-SQL query

This new index now 'covers' the query with the additional columns. Figure 4 shows the new plan, which no longer has the second Nested SQL Loop and uses the index created in Code 4 to get the additional information.

Figure 4: Execution plan using Covering Index

NOTE: If this query changes, for example with more columns in the SELECT, the plan might change back to the one in Figure 3. Always monitor the usage of indexes on a database.

The Merge Loop is simpler than a Nested Loop. The data being merged together must be in the same order. The merge works like a zipper. Both input streams must be sorted on the join column, and as matching values in the 2 streams meet, the rows are joined together.
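
To see both join operators against the same pair of tables, the physical join can be requested with a query hint. This is only a sketch for experimentation, assuming the AdventureWorks sample database; OPTION (LOOP JOIN) and OPTION (MERGE JOIN) are standard T-SQL query hints, but they override the optimizer and are generally not something to leave in production code.

```sql
-- Assumes the AdventureWorks sample database used throughout this article.
-- Run each statement with the actual execution plan enabled and compare the join operators.

-- Request nested loops for the join.
SELECT cust.CustomerID, soh.SalesOrderID
FROM Sales.Customer AS cust
INNER JOIN Sales.SalesOrderHeader AS soh
    ON soh.CustomerID = cust.CustomerID
OPTION (LOOP JOIN);

-- Request a merge join for the same query.
SELECT cust.CustomerID, soh.SalesOrderID
FROM Sales.Customer AS cust
INNER JOIN Sales.SalesOrderHeader AS soh
    ON soh.CustomerID = cust.CustomerID
OPTION (MERGE JOIN);
```

In the merge version, it is worth checking whether both inputs already arrive ordered by CustomerID from an index or whether the plan had to add a Sort to put them in the same order.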

    SELECT P.Name, total_qty = SUM(I.Quantity)
    FROM Production.Product P
    JOIN Production.ProductInventory I ON I.ProductID = P.ProductID
    GROUP BY P.Name

Code 5: T-SQL using a merge SQL loop

The T-SQL in Code 5 joins the Product table with the ProductInventory table. Since the join column is ProductID, each table has either a Clustered or a Non-Clustered Index that can be used to retrieve the data. ProductID is the key of those indexes and is the first column in the main part of each index.

Figure 5 shows a Merge Join in the execution plan.

Figure 5: Merge loop (or merge join) as displayed

The outer part of the SQL loop goes through the ProductID of the rows from ProductInventory and joins those rows with data from the Product table. ProductID is unique in Product because it is the primary key of that table. The T-SQL needs the Product table to satisfy the GROUP BY on the product name. The ProductInventory table does not have the product Name column, but it is needed for the SUM of Quantity. The join between the 2 tables is where the plan matches rows on ProductID.

Be aware when there is a Sort iterator in this kind of execution plan. Sorts are costly, so it is worth checking whether a GROUP BY or DISTINCT that forces a Sort is really needed.

Figure 6: Results from the GROUP BY T-SQL

    -- Add some columns
    SELECT P.Name, I.LocationID, total_qty = SUM(I.Quantity)
    FROM Production.Product P
    JOIN Production.ProductInventory I ON I.ProductID = P.ProductID
    GROUP BY P.Name, I.LocationID

Code 6: Adding additional columns to MERGE loop

The T-SQL in Code 6 adds the column LocationID to the SELECT and GROUP BY. The changes to the Actual Execution Plan are shown in Figure 7.

Figure 7: New execution plan for LocationID addition

The new plan has a Sort added for the product Name and LocationID combination, but we still have the Merge Join. The cost in Figure 7 shows the Sort is 47% of the query.

The Stream Aggregate has moved to the left of the Merge Join because of the addition of the LocationID column. If the Sort is too costly, returning to the original T-SQL will produce the first plan, as in Figure 8.

Figure 8: No sort and stream aggregate inside the merge SQL loop

Seemingly simple additions can change a plan. Sometimes the change is to the SQL loop, and sometimes it is to another part of the plan, like the new Sort iterator added for the new column. Knowing how to read a plan can help diagnose when a covering index can help or when adding a column can change a plan with a costly iterator. Even when a costly iterator is added, the cost might not be high enough to be a concern. Usually the end user will notify IT if something is running too slow.
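
To judge whether an added iterator such as the Sort is worth worrying about, the plan percentages can be complemented with measured I/O and time. The following is a minimal sketch, assuming the AdventureWorks sample database and reusing the queries from Code 5 and Code 6; SET STATISTICS IO and SET STATISTICS TIME are standard T-SQL session options.

```sql
-- Assumes the AdventureWorks sample database used throughout this article.
-- Report logical reads per table and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Query from Code 5: grouped by product name only.
SELECT P.Name, total_qty = SUM(I.Quantity)
FROM Production.Product AS P
JOIN Production.ProductInventory AS I
    ON I.ProductID = P.ProductID
GROUP BY P.Name;

-- Query from Code 6: LocationID added to the SELECT and GROUP BY.
SELECT P.Name, I.LocationID, total_qty = SUM(I.Quantity)
FROM Production.Product AS P
JOIN Production.ProductInventory AS I
    ON I.ProductID = P.ProductID
GROUP BY P.Name, I.LocationID;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The figures appear in the Messages output rather than in the result grid. On tables this small the difference between the two queries may be negligible, which is consistent with the point that a costly-looking iterator is not always a real concern.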

Thomas LeBlanc is a Data Warehouse Architect in Baton Rouge, LA. Today, he works with designing Dimensional Models in the financial area while using Integration Services (SSIS) and Analysis Services (SSAS) for development and SSRS and Power BI for reporting.

Starting as a developer in COBOL while at LSU, he has been a developer, tester, project manager and team lead, as well as a software trainer writing documentation. His involvement in the SQL Server community includes speaking at SQLPASS.org Summits and SQLSaturday events since 2011, and he has been a speaker at IT/Dev Connections and Live! 360.

Currently, he is the Chair of the PASS Excel Business Intelligence Virtual Chapter and worked on the Nomination Committee for PASS Board of Directors for 2016.
