Recursive CTEs and Foreign Key References in SQL Server

May 16, 2018 by Gerald Britton

Introduction

Foreign key constraints are a powerful mechanism for preserving referential integrity in a database. They can also present a challenge when doing bulk table loads, since you need to find a “base” table to start with, that is, a table that has no foreign key constraints defined. Let’s label tables like this as level 0, or ground level if you like. Once that is loaded, you can begin to load other tables that have foreign key references to the base table. We can label those tables level 1, and so on. If you start with table data that already has referential integrity and load tables by their level numbers (level 0, level 1, level 2 and so on), the load should proceed without problems. Let’s look at a simple example:

    CREATE TABLE base
    (
        id int IDENTITY(1,1) PRIMARY KEY,
        b float
    );
    INSERT INTO base(b) VALUES
        (42), (3.14159), (2010401);

    CREATE TABLE facts
    (
        id int IDENTITY(11,1) PRIMARY KEY,
        base_id int FOREIGN KEY REFERENCES base(id),
        c varchar(50)
    );
    INSERT INTO facts(base_id, c) VALUES
        (1, 'The Answer'),
        (2, 'pi'),
        (3, 'April Fools Day 2018');

    CREATE TABLE morefacts
    (
        id int IDENTITY(21,1) PRIMARY KEY,
        facts_id int FOREIGN KEY REFERENCES facts(id),
        d varchar(50)
    );
    INSERT INTO morefacts(facts_id, d) VALUES
        (11, 'to the question'),
        (12, 'transcendental number'),
        (13, 'the jokes on you!');

These three tables are at levels 0, 1, and 2, respectively, since “base” has no FK references, “facts” refers to “base” and “morefacts” has an FK referring to “facts”.

Now, imagine that you have new data for all three tables, in the form of INSERT statements:

    INSERT INTO morefacts(facts_id, d) VALUES
        (14, 'golden ratio'),
        (15, 'limit of (1 + 1/n)^n');

    INSERT INTO facts(base_id, c) VALUES
        (4, 'phi'),
        (5, 'Euler''s number');

    INSERT INTO base(b) VALUES
        (1.618), (2.718);

You know that you can’t insert them in that order, or you’ll get an error message like this one:

    The INSERT statement conflicted with the FOREIGN KEY constraint “FK__morefacts__facts”

You need to run these in reverse order to preserve referential integrity. This is easy with this little example since we are in total control. Now, imagine that you were asked to load up a database with lots of foreign key relationships, but you didn’t know the levels of any of the tables. How would you proceed? There are a few different ways to tackle the problem, and in this article I’m going to leverage the power of recursive Common Table Expressions, or CTEs, to do it.
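
The effect of load order can be tried without a SQL Server instance. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. It is an assumption of this example, not part of the original setup: SQLite has no IDENTITY, so the ids are supplied explicitly, the `load` helper is hypothetical, and the error text differs from the SQL Server message quoted above.

```python
import sqlite3

# The article's new rows, with explicit ids because SQLite has no IDENTITY.
BASE = "INSERT INTO base(id, b) VALUES (4, 1.618), (5, 2.718)"
FACTS = ("INSERT INTO facts(id, base_id, c) "
         "VALUES (14, 4, 'phi'), (15, 5, 'Euler''s number')")
MOREFACTS = ("INSERT INTO morefacts(facts_id, d) "
             "VALUES (14, 'golden ratio'), (15, 'limit of (1 + 1/n)^n')")


def load(statements):
    """Run INSERT statements in order against a fresh in-memory database.

    Returns "ok" if every statement succeeds, otherwise the text of the
    first integrity error (hypothetical helper for illustration).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
    conn.executescript("""
        CREATE TABLE base(id INTEGER PRIMARY KEY, b REAL);
        CREATE TABLE facts(id INTEGER PRIMARY KEY,
                           base_id INTEGER REFERENCES base(id), c TEXT);
        CREATE TABLE morefacts(id INTEGER PRIMARY KEY,
                               facts_id INTEGER REFERENCES facts(id), d TEXT);
    """)
    try:
        for sql in statements:
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()


print(load([MOREFACTS, FACTS, BASE]))  # level 2 first: rejected
print(load([BASE, FACTS, MOREFACTS]))  # level 0, 1, 2: accepted
```

Loading level 2 first fails on the first statement, while loading by ascending level succeeds, which is the behaviour the level numbering is meant to guarantee.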

The System Catalog View sys.foreign_keys

SQL Server now provides almost 300 system catalog views that are useful for all sorts of metadata operations. If I include the dynamic management views, the total is almost 500! sys.foreign_keys, as the name implies, shows you the foreign key relationships for a database. I can combine this view with a recursive CTE to dig out the foreign key relationships in the database. A full discussion of recursive CTEs is outside the scope of this article. For now, just keep in mind that a recursive CTE has two parts, just like a mathematical recurrence:

- A base case
- A recursive case that builds on the base case

For our example, a query to get the base case would look like this:

    SELECT DISTINCT
        fk.object_id AS FK,
        fk.schema_id AS SchemaId,
        fk.parent_object_id AS TableId,
        t.schema_id AS ReferencedSchema,
        fk.referenced_object_id AS ReferencedTable
    FROM sys.foreign_keys AS fk
    JOIN sys.tables AS t
        ON fk.referenced_object_id = t.object_id
    WHERE fk.type = 'F'

Run this on any database you have access to and observe the results. On my test database I see:

[Screenshot: base case query results]

The recursive case builds on this by finding the tables referenced, in turn, by the tables that the base case found:

    -- base_case is a placeholder for the result of the previous query
    SELECT
        fk.object_id,
        fk.schema_id,
        fk.parent_object_id,
        t.schema_id,
        fk.referenced_object_id
    FROM sys.foreign_keys fk
    JOIN sys.tables t
        ON fk.referenced_object_id = t.object_id
    JOIN base_case
        ON fk.parent_object_id = base_case.referenced_object_id
    WHERE fk.type = 'F'

This query is almost the same as the one above, except that it joins with the base case, matching the parent object id (the table containing the FK reference) to the base case’s referenced object id. In this way we pick up the foreign keys defined on the tables the base case refers to, and, since this is recursive, we continue down the chain until there are no more.

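
To make the two-part structure concrete, here is a small sketch in plain Python that imitates how a base case and a recursive case cooperate. Everything in it is an assumption for illustration: the `FOREIGN_KEYS` list stands in for sys.foreign_keys, `walk_references` is a hypothetical helper, and the `hops` counter is simply the number of passes taken. It shows the idea, not how SQL Server evaluates a CTE internally.

```python
# Hypothetical foreign key list: (referring_table, referenced_table).
FOREIGN_KEYS = [
    ("morefacts", "facts"),
    ("facts", "base"),
]


def walk_references(start_table, foreign_keys):
    """Follow foreign keys outward from start_table, one level per pass.

    The base case seeds the result; the recursive case joins the rows
    from the previous pass back to the key list until a pass adds
    nothing. There is no cycle protection in this sketch.
    """
    # Base case: keys whose referring table is the starting table.
    current = [(parent, ref, 0)
               for parent, ref in foreign_keys
               if parent == start_table]
    result = []
    while current:
        result.extend(current)
        # Recursive case: keys whose referring table was referenced
        # by a row found in the previous pass.
        current = [(parent, ref, hops + 1)
                   for _, prev_ref, hops in current
                   for parent, ref in foreign_keys
                   if parent == prev_ref]
    return result


for row in walk_references("morefacts", FOREIGN_KEYS):
    print(row)
```

Starting from "morefacts", the first pass finds the key to "facts" and the second pass finds the key from "facts" to "base"; the third pass finds nothing, so the loop ends.
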
Putting the two queries (the base case and the recursive case) together in a recursive CTE yields this query:

    WITH cte AS
    (
        SELECT DISTINCT
            fk.object_id,
            fk.schema_id,
            fk.parent_object_id,
            t.schema_id AS referenced_schema_id,
            fk.referenced_object_id,
            0 AS Depth
        FROM sys.foreign_keys AS fk
        JOIN sys.tables AS t
            ON fk.referenced_object_id = t.object_id
        WHERE fk.type = 'F'
            --AND fk.parent_object_id = OBJECT_ID(N'morefacts', N'U')

        UNION ALL

        SELECT
            fk.object_id,
            fk.schema_id,
            fk.parent_object_id,
            t.schema_id,
            fk.referenced_object_id,
            cte.Depth - 1
        FROM sys.foreign_keys AS fk
        JOIN sys.tables AS t
            ON fk.referenced_object_id = t.object_id
        JOIN cte
            ON fk.parent_object_id = cte.referenced_object_id
        WHERE fk.type = 'F'
            --AND fk.parent_object_id <> fk.referenced_object_id
    )
    SELECT
        OBJECT_NAME(cte.object_id) AS ReferringKey,
        SCHEMA_NAME(cte.schema_id) AS ReferringSchema,
        OBJECT_NAME(cte.parent_object_id) AS ReferringTable,
        SCHEMA_NAME(cte.referenced_schema_id) AS ReferencedSchema,
        OBJECT_NAME(cte.referenced_object_id) AS ReferencedTable,
        cte.referenced_object_id,
        cte.Depth
    FROM cte
    ORDER BY Depth DESC, ReferencedSchema, ReferencedTable, ReferringKey;

There are two commented lines that I’ll come back to in a moment. Running this on my test database I get:

[Screenshot: query results with the Depth column]

Here, the “Depth” column gives the level of a referenced table relative to the table referring to it. In this case, the table “morefacts” refers to the table “facts”. So “morefacts” is at ground level (Depth = 0) and “facts” is in the first basement (Depth = -1). With such a report I know I need to load the deepest levels first, then those above them and so on until I reach ground level.

Now, let’s look at the two commented lines. The first one, if uncommented, lets me look at just the references from a specific table:

    AND fk.parent_object_id = OBJECT_ID(N'morefacts', N'U')

Running that produces a smaller report:

[Screenshot: results restricted to references from morefacts]

No surprise there.

The second commented line is trickier. On my SQL Server instance I also have the sample database WideWorldImporters installed. Let’s try the query on that database with both lines still commented out. I get an error:

    Msg 530, Level 16, State 1, Line 1
    The statement terminated. The maximum recursion 100 has been exhausted before statement completion.

The problem is that this database contains a table that is self-referential:

[Screenshot: self-referencing foreign key on the People table]

This comes from the fact that the People table contains a hierarchy. Hierarchies can be used to show people in a reporting structure, where an employee points to their manager in the same table. Here it actually looks like a mistake! The foreign key says, “Be sure that a person in the table really is also in the table”. Of course that is always true! For our discussion, though, it means that we are chasing our own tail during the recursive part of the query. If I uncomment the second commented line:

    AND fk.parent_object_id <> fk.referenced_object_id

the recursive step no longer follows a foreign key whose referring and referenced tables are the same, and the problem disappears:

[Screenshot: partial query results for WideWorldImporters]

The output shown is only part of the result. It is actually quite a bit longer, which you can verify for yourself. Note that this guard only covers a table that references itself; a cycle that spans several tables would need a different check.
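
The runaway recursion and the guard can be reproduced on a toy scale. This is a hedged sketch, again using Python's sqlite3 module rather than SQL Server: the `fk` table with `parent` and `referenced` columns is a hypothetical stand-in for sys.foreign_keys, `top_down` is an illustrative helper, and the `LIMIT 50` cap plays the role that the 100-level recursion limit plays in the error above (SQLite would otherwise keep recursing). The guard used here compares the two ends of the key itself, which is one way to skip self-references.

```python
import sqlite3


def top_down(skip_self_references):
    """Walk a toy foreign key list top-down, as the Depth query does."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fk(parent TEXT, referenced TEXT)")
    conn.executemany("INSERT INTO fk VALUES (?, ?)", [
        ("morefacts", "facts"),
        ("facts", "base"),
        ("people", "people"),  # self-referencing key
    ])
    guard = "WHERE fk.parent <> fk.referenced" if skip_self_references else ""
    rows = conn.execute(f"""
        WITH RECURSIVE cte(parent, referenced, depth) AS (
            SELECT parent, referenced, 0 FROM fk
            UNION ALL
            SELECT fk.parent, fk.referenced, cte.depth - 1
            FROM fk JOIN cte ON fk.parent = cte.referenced
            {guard}
            LIMIT 50  -- safety cap so the unguarded query terminates
        )
        SELECT parent, referenced, depth FROM cte
        ORDER BY depth DESC, referenced, parent
    """).fetchall()
    conn.close()
    return rows


print(len(top_down(skip_self_references=False)))  # runs until the cap is hit
for row in top_down(skip_self_references=True):
    print(row)
```

Without the guard, the self-referencing key keeps producing one more row per pass until the cap stops it. With the guard, the walk ends on its own after the chain from "morefacts" through "facts" to "base" has been followed.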

Bottom up

The query we’ve been using takes a top-down approach. The problem we had with the People table suggests another approach. Can we find tables that refer to it? We can! We’ll use a bottom-up approach. Actually, the query changes very little:

    WITH cte AS
    (
        SELECT DISTINCT
            fk.object_id,
            fk.schema_id,
            fk.parent_object_id,
            t.schema_id AS referenced_schema_id,
            fk.referenced_object_id,
            1 AS Level
        FROM sys.foreign_keys AS fk
        JOIN sys.tables AS t
            ON fk.referenced_object_id = t.object_id
        WHERE fk.type = 'F'
            AND fk.referenced_object_id = OBJECT_ID(N'Application.People', N'U')

        UNION ALL

        SELECT
            fk.object_id,
            fk.schema_id,
            fk.parent_object_id,
            t.schema_id,
            fk.referenced_object_id,
            cte.Level + 1
        FROM sys.foreign_keys AS fk
        JOIN sys.tables AS t
            ON fk.referenced_object_id = t.object_id
        JOIN cte
            ON fk.referenced_object_id = cte.parent_object_id
        WHERE fk.type = 'F'
            AND fk.parent_object_id <> cte.referenced_object_id
    )
    SELECT DISTINCT
        OBJECT_NAME(cte.object_id) AS ReferringKey,
        SCHEMA_NAME(cte.schema_id) AS ReferringSchema,
        OBJECT_NAME(cte.parent_object_id) AS ReferringTable,
        SCHEMA_NAME(cte.referenced_schema_id) AS ReferencedSchema,
        OBJECT_NAME(cte.referenced_object_id) AS ReferencedTable,
        Level
    FROM cte
    ORDER BY Level, ReferencedSchema, ReferencedTable, ReferringKey;
    RETURN;

The changes are the filter on the referenced table in the base case, the reversed join condition in the recursive case, and a Level column that counts up from 1 instead of a Depth that counts down from 0. Basically, we start with tables that refer to Application.People and work up from there. This query yields the desired result for the WideWorldImporters database, though the results are too big to post here (325 lines). The Level goes all the way up to 10, indicating a little of the complexity of the data model used here.
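
The bottom-up direction can be sketched on the three tables from the introduction. As before, this is an illustration under stated assumptions rather than the query above: it uses Python's sqlite3 module, a hypothetical `fk` table with `parent` and `referenced` columns in place of sys.foreign_keys, and an illustrative `referrers` helper.

```python
import sqlite3


def referrers(start_table):
    """List the keys that lead, directly or indirectly, to start_table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fk(parent TEXT, referenced TEXT)")
    conn.executemany("INSERT INTO fk VALUES (?, ?)",
                     [("facts", "base"), ("morefacts", "facts")])
    rows = conn.execute("""
        WITH RECURSIVE cte(parent, referenced, level) AS (
            -- Base case: keys that point straight at the starting table.
            SELECT parent, referenced, 1 FROM fk WHERE referenced = ?
            UNION ALL
            -- Recursive case: keys that point at a table found so far.
            SELECT fk.parent, fk.referenced, cte.level + 1
            FROM fk JOIN cte ON fk.referenced = cte.parent
            WHERE fk.parent <> cte.referenced
        )
        SELECT DISTINCT parent, referenced, level FROM cte
        ORDER BY level, referenced, parent
    """, (start_table,)).fetchall()
    conn.close()
    return rows


for row in referrers("base"):
    print(row)
```

Starting from "base", level 1 holds the key from "facts" and level 2 the key from "morefacts". Read in ascending order of level, that matches the load order from the introduction: "base" first, then "facts", then "morefacts".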

Summary

This brief excursion into recursive CTEs applied to system views shows how easy it can be to tease out the relationships between objects in a SQL Server database. If you’re new to common table expressions, especially the recursive variant, this gives you a simple example to understand how they work. Don’t use them for everything, however! Sometimes developers are tempted to use recursive CTEs in place of cursors or while loops, thinking that there will be some performance advantage. Usually, those hopes are dashed! Internally, recursive CTEs are processed “Row By Agonizing Row”, or RBAR, a term coined by Jeff Moden, a veritable super-DBA in the Microsoft SQL Server space. If you’re new to system catalog views, let this serve as the briefest of introductions to a large topic!

Gerald Britton is a Senior SQL Server Solution Designer, Author, Software Developer, Teacher and a Microsoft Data Platform MVP. He has many years of experience in the IT industry in various roles.

Gerald specializes in solving SQL Server query performance problems especially as they relate to Business Intelligence solutions. He is also a co-author of the eBook "Getting Started With Python" and an avid Python developer, Teacher, and Pluralsight author.

You can find him on LinkedIn, on Twitter at twitter.com/GeraldBritton or @GeraldBritton, and on Pluralsight.
