Batch Sort and Nested Loops in SQL Server

Batch Sort and Nested Loops in SQL Server

Batch Sort and Nested Loops in SQL Server

SQLShack

SQL Server training Español

Batch Sort and Nested Loop in SQL Server

April 18, 2018 by Dmitry Piliugin Continuing my blog post series after 24HOP Russia “Query Processor Internals – Joins”. In this (and the next one) blog post, we will talk about the Nested Loop Post Optimization Rewrite optimizations. Some of you may know that a Nested Loop join algorithm preserves order of the outer table. Here is Craig’s Freedman (from the Query Optimizer Team, Microsoft) post: Let’s first setup some test data. 12345678910111213141516171819 create database nldb;gouse nldb;gocreate table dbo.SalesOrder(SalesOrderID int identity primary key, CustomerID int not null, SomeData char(30) not null);gowith nums as ( select top(1000000) rn = row_number() over(order by (select null)) from master..spt_values v1, master..spt_values v2, master..spt_values v3)insert dbo.SalesOrder(CustomerID, SomeData)select rn%500000, str(rn,30) from nums;gocreate index ix_CustomerID on dbo.Salesorder(CustomerID);go Now we will issue two queries. I use an index hint for demonstration purposes only, there might be no hint needed in real life for such a situation. 12 select * from dbo.SalesOrder with(index(ix_CustomerID)) where CustomerID < 500;select * from dbo.SalesOrder with(index(ix_CustomerID)) where CustomerID < 1000; The plans are almost the same: But the results: Obviously, the sort order is different. That is one of the examples showing, that you should never rely on the “default” sort order, as there is no such thing. The only way to get a desired sort order is to add an explicit outer order by in your query. Ok, but still, why the sort order is different.

Implicit Batch Sort and Nested Loop operator

Let’s examine the Nested Loop operator details. You may notice the Optimized property set to true. That is not very informative, what does it mean – “optimized”. Luckily, we have an explanation in the Craig Freedman’s blog for Nested Loop: What I love in Craig’s posts, that every insignificant (from the first glance) word has a solid meaning. The catch, even for an experienced users, is that both plans has an “Optimized=true” keyword in the Nested Loop Join plan operator, but only in the second query, the reordering is done. Why do the reordering? The purpose is to minimize random access impact. If we perform an Index Seek (with a partial scan, probably) we read the entries in the index order, in our case, in the order of CustomerID, which is clearly seen on the first result set. The index on CustomerID does not cover our query, so we have to ask the clustered index for the column SomeData, and actually, we perform one another seek, seeking by the SalesOrderID column. This is a random seek, so what if, before searching by the SalesOrderID we will sort by that key, and then issue an ordered sequence of Index Seeks, turning the random acces into the sequential one, wouldn’t it be more effective? Yes, it would in some cases, and that is what “optimized” property tells us about. However, we remember, that it is not necessarily leads to the real reordering. As for comparing the real impact, I will refer you to the actual Craig’s post or leave it as a homework. How it is implemented inside SQL Server? Let’s issue the query with some extra trace flags, showing the output physical tree (the tree of the physical operators) and the converted tree (the tree of the physical operators converted to a plan tree and ready to be compiled into the executable plan by the Query Executor component). 1234567 select * from dbo.SalesOrder with(index(ix_CustomerID)) where CustomerID < 1000 option( recompile , querytraceon 3604 , querytraceon 8607 , querytraceon 7352);
The first one (1) is the output tree of the physical operators, the result of the optimization. The second one (2) is the converted tree ready to become a plan, for the execution compilation. Notice the in the first one we have no mysterious nodes, like “??? ???”. But we have two of them in the second one. That is the result of Post Optimization Rewrites phase introducing some extra optimizations for the Nested Loops. The node (a) is for the optimization – called “Prefetching” and displayed as “WithUnorderedPrefetch = true” in the plan properties. This is not the topic of this post, but you may refer to the useful links later in this post to read more about it. The second one (b) is for the Batch Sort – our case. You may be interested to know if there is a special iterator for that mysterious node in the executable plan, and there is one. It is called CQScanBatchSortNew. Here how it looks like in the WinDbg call stack. The marked area represents a call to the iterator CQScanBatchSortNew which is responsible for the “OPTIMIZED” property and sorting optimization. Below you may also see the Prefetch iterator (CQScanRangePrefetchDelay), mentioned above. Not pretending to the any drawing skills, but to be clear, the artificial plan, with those artificially drawn operators might look like this.

Explicit Batch Sort and Nested Loop

The implicit batch sort optimization is a lightweight optimization, which is done on the Post Optimization Rewrite phase; however, that kind of optimization might be done before, by explicitly introducing the Sort operator. The decision to do one or another is a cost based decision. First, let’s fool the optimizer to pretend there is much more rows (just to not create a huge test database): 1 update statistics dbo.SalesOrder with rowcount = 10000000, pagecount = 400000 Now consider this example: 1234567 select * from dbo.SalesOrder with(index(ix_CustomerID)) where CustomerID < 10000 option( recompile , maxdop 1 , querytraceon 3604 , querytraceon 8607); I’m using the hints again only for the easy demonstration, it is not necessary in the real life. The plan is now: The sort operator properties clearly show that the order is a clustered index order. Unlikely the OPTIMZED property and implicit Batch Sort, this kind of decision is done before the Post Optimization Rewrite, and we may see the explicit sort operator in the output physical tree. Though they are doing the similar function and serve similar goals, they are slightly different The implicit sort is done before the Post Optimization Rewrite The implicit batch sort may not spill to disk and is net compromise between the explicit sort and not sorted requests Different TF to manage them =)

Magical TFs

There are three of them. All are well known but probably needs some clarification. TF 2340 – Disable Nested Loop Implicit Batch Sort on the Post Optimization Rewrite Phase TF 8744 – Disable Nested Loop Prefetching on the Post Optimization Rewrite Phase TF 9115 – Disable both, and not only on the Post Optimization but the explicit Sort also All of them, except 2340 are undocumented, so not should be used in production, be careful. That is all for that post. There are some other batch sort optimization applications, but not to do the re-write, I’d rather provide the links to further reading and some additional relevant and interesting material. Previous article in this series: Hash Join Execution Internals Author Recent Posts Dmitry PiliuginDmitry is a SQL Server enthusiast from Russia, Moscow. He started his journey to the world of SQL Server more than ten years ago. Most of the time he was involved as a developer of corporate information systems based on the SQL Server data platform.

Currently he works as a database developer lead, responsible for the development of production databases in a media research company. He is also an occasional speaker at various community events and tech conferences. His favorite topic to present is about the Query Processor and anything related to it. Dmitry is a Microsoft MVP for Data Platform since 2014.

View all posts by Dmitry Piliugin Latest posts by Dmitry Piliugin (see all) SQL Server 2017: Adaptive Join Internals - April 30, 2018 SQL Server 2017: How to Get a Parallel Plan - April 28, 2018 SQL Server 2017: Statistics to Compile a Query Plan - April 28, 2018

Related posts

Nested Loop Joins in SQL Server – Batch Sort and Implicit Sort SQL Server – Yet another X-Ray for the QP Parallel Nested Loop Joins – the inner side of Nested Loop Joins and Residual Predicates Performance tuning – Nested and Merge SQL Loop with Execution Plans Introduction to Nested Loop Joins in SQL Server 2,458 Views

Follow us

Popular

SQL Convert Date functions and formats SQL Variables: Basics and usage SQL PARTITION BY Clause overview Different ways to SQL delete duplicate rows from a SQL Table How to UPDATE from a SELECT statement in SQL Server SQL Server functions for converting a String to a Date SELECT INTO TEMP TABLE statement in SQL Server SQL WHILE loop with simple examples How to backup and restore MySQL databases using the mysqldump command CASE statement in SQL Overview of SQL RANK functions Understanding the SQL MERGE statement INSERT INTO SELECT statement overview and examples SQL multiple joins for beginners with examples Understanding the SQL Decimal data type DELETE CASCADE and UPDATE CASCADE in SQL Server foreign key SQL Not Equal Operator introduction and examples SQL CROSS JOIN with examples The Table Variable in SQL Server SQL Server table hints – WITH (NOLOCK) best practices

Trending

SQL Server Transaction Log Backup, Truncate and Shrink Operations Six different methods to copy tables between databases in SQL Server How to implement error handling in SQL Server Working with the SQL Server command line (sqlcmd) Methods to avoid the SQL divide by zero error Query optimization techniques in SQL Server: tips and tricks How to create and configure a linked server in SQL Server Management Studio SQL replace: How to replace ASCII special characters in SQL Server How to identify slow running queries in SQL Server SQL varchar data type deep dive How to implement array-like functionality in SQL Server All about locking in SQL Server SQL Server stored procedures for beginners Database table partitioning in SQL Server How to drop temp tables in SQL Server How to determine free space and file size for SQL Server databases Using PowerShell to split a string into an array KILL SPID command in SQL Server How to install SQL Server Express edition SQL Union overview, usage and examples

Solutions

Read a SQL Server transaction logSQL Server database auditing techniquesHow to recover SQL Server data from accidental UPDATE and DELETE operationsHow to quickly search for SQL database data and objectsSynchronize SQL Server databases in different remote sourcesRecover SQL data from a dropped table without backupsHow to restore specific table(s) from a SQL Server database backupRecover deleted SQL data from transaction logsHow to recover SQL Server data from accidental updates without backupsAutomatically compare and synchronize SQL Server dataOpen LDF file and view LDF file contentQuickly convert SQL code to language-specific client codeHow to recover a single table from a SQL Server database backupRecover data lost due to a TRUNCATE operation without backupsHow to recover SQL Server data from accidental DELETE, TRUNCATE and DROP operationsReverting your SQL Server database back to a specific point in timeHow to create SSIS package documentationMigrate a SQL Server database to a newer version of SQL ServerHow to restore a SQL Server database backup to an older version of SQL Server

Categories and tips

►Auditing and compliance (50) Auditing (40) Data classification (1) Data masking (9) Azure (295) Azure Data Studio (46) Backup and restore (108) ►Business Intelligence (482) Analysis Services (SSAS) (47) Biml (10) Data Mining (14) Data Quality Services (4) Data Tools (SSDT) (13) Data Warehouse (16) Excel (20) General (39) Integration Services (SSIS) (125) Master Data Services (6) OLAP cube (15) PowerBI (95) Reporting Services (SSRS) (67) Data science (21) ►Database design (233) Clustering (16) Common Table Expressions (CTE) (11) Concurrency (1) Constraints (8) Data types (11) FILESTREAM (22) General database design (104) Partitioning (13) Relationships and dependencies (12) Temporal tables (12) Views (16) ►Database development (418) Comparison (4) Continuous delivery (CD) (5) Continuous integration (CI) (11) Development (146) Functions (106) Hyper-V (1) Search (10) Source Control (15) SQL unit testing (23) Stored procedures (34) String Concatenation (2) Synonyms (1) Team Explorer (2) Testing (35) Visual Studio (14) DBAtools (35) DevOps (23) DevSecOps (2) Documentation (22) ETL (76) ►Features (213) Adaptive query processing (11) Bulk insert (16) Database mail (10) DBCC (7) Experimentation Assistant (DEA) (3) High Availability (36) Query store (10) Replication (40) Transaction log (59) Transparent Data Encryption (TDE) (21) Importing, exporting (51) Installation, setup and configuration (121) Jobs (42) ►Languages and coding (686) Cursors (9) DDL (9) DML (6) JSON (17) PowerShell (77) Python (37) R (16) SQL commands (196) SQLCMD (7) String functions (21) T-SQL (275) XML (15) Lists (12) Machine learning (37) Maintenance (99) Migration (50) Miscellaneous (1) ▼Performance tuning (869) Alerting (8) Always On Availability Groups (82) Buffer Pool Extension (BPE) (9) Columnstore index (9) Deadlocks (16) Execution plans (125) In-Memory OLTP (22) Indexes (79) Latches (5) Locking (10) Monitoring (100) Performance (196) Performance counters (28) Performance Testing (9) Query analysis (121) Reports (20) SSAS monitoring (3) SSIS monitoring (10) SSRS monitoring (4) Wait types (11) ►Professional development (68) Professional development (27) Project management (9) SQL interview questions (32) Recovery (33) Security (84) Server management (24) SQL Azure (271) SQL Server Management Studio (SSMS) (90) SQL Server on Linux (21) ►SQL Server versions (177) SQL Server 2012 (6) SQL Server 2016 (63) SQL Server 2017 (49) SQL Server 2019 (57) SQL Server 2022 (2) ►Technologies (334) AWS (45) AWS RDS (56) Azure Cosmos DB (28) Containers (12) Docker (9) Graph database (13) Kerberos (2) Kubernetes (1) Linux (44) LocalDB (2) MySQL (49) Oracle (10) PolyBase (10) PostgreSQL (36) SharePoint (4) Ubuntu (13) Uncategorized (4) Utilities (21) Helpers and best practices BI performance counters SQL code smells rules SQL Server wait types © 2022 Quest Software Inc. ALL RIGHTS RESERVED. GDPR Terms of Use Privacy
Share:
0 comments

Comments (0)

Leave a Comment

Minimum 10 characters required

* All fields are required. Comments are moderated before appearing.

No comments yet. Be the first to comment!

Batch Sort and Nested Loops in SQL Server | Trend Now | Trend Now