T-SQL window functions make creating a lot of queries easier, and they typically outperform previous approaches. Using the LAG function, for example, is far superior to doing a self-join. To improve overall performance, you must first comprehend the notion of framing and how window functions rely on sorting to provide results.
T-SQL Window functions, first introduced in 2005 and improved in 2012, are fantastic improvements to the T-SQL language. We will outline everything you need to know to obtain good results while utilizing these functions in this post.
The OVER clause and sorting
PARTITION BY and ORDER BY are two choices in the OVER clause that might induce sorting. All window functions allow PARTITION BY. However, it's not required. For the majority of the functions, ORDER BY is necessary. The data will be sorted based on the OVER clause, which might be the query's speed bottleneck, depending on what you attempt to do.The OVER clause's Sequence BY option is necessary so that the database engine can "align up" the rows so that the function may be applied in the right order. Let's imagine you wish to use the ROW NUMBER function in the order of SalesOrderID.
If you want the function to be applied in decreasing the order of TotalDue, the results will be different.
Let's see what the T-SQL query would look like in this scenario :
USE AdventureWorks<version>; GO SELECT SalesOrderID, TotalDue, ROW_NUMBER() OVER(ORDER BY SalesOrderID) AS RowNum FROM Sales.SalesOrderHeader; SELECT SalesOrderID, TotalDue, ROW_NUMBER() OVER(ORDER BY TotalDue DESC) AS RowNum FROM Sales.SalesOrderHeader; The results from the above queries would look like the below : No sorting is required because the first query uses the cluster key as the ORDER BY option. The second query includes a time-consuming sort procedure. The ORDER BY phrase in the OVER clause is unrelated to the ORDER BY clause in the overall query, which may be quite different. The
only method to avoid sorting's speed hit is constructing a dedicated
index for the OVER clause. Itzik's book Microsoft SQL Server 2012
High-Performance T-SQL Using Window Functions explains how to use window
functions in high-performance T-SQL. The POC index is recommended by
Ben-Gan. (P)PARTITION BY, (O)ORDER BY, and (c)covering are the acronyms
for (P)PARTITION BY, (O)ORDER BY, and (c)covering. He
suggests placing any filtering columns in the key before the PARTITION BY and ORDER BY columns. Then, as included columns, add any extra
columns required to construct a covering index. You'll need to test how such an index affects your query and total workload, just like
everything else. Of
course, you won't be able to add an index to every query you create,
but if the efficiency of a query that employs a window function is
critical, you may follow this suggestion. Here's a list of indexes that will help you enhance your previous query: CREATE NONCLUSTERED INDEX test ON Sales.SalesOrderHeader (CustomerID, SalesOrderID) INCLUDE (TotalDue); When you restart the query, you'll see that the sort operation is no longer included in the execution plan: FRAMINGFraming
is, in my opinion, the hardest topic to grasp while studying T-SQL
window methods. See the introduction to T-SQL window functions for
further information on the syntax. The following items require framing: ORDER BY window aggregates, such as FIRST VALUE LAST VALUE, are used for running totals or moving averages. Fortunately, framing isn't necessary most of the time, but it's all too simple to
skip the frame and go with the default. RANGE BETWEEN UNBOUNDED
PRECEDING AND CURRENT ROW is always the default frame. Speed will suffer while you will obtain the right results as long as the ORDER BY option contains a unique column or collection of columns. Here's an example of a comparison between the default and proper frames: SET STATISTICS IO ON; GO SELECT CustomerID, SalesOrderID, TotalDue, SUM(TotalDue) OVER(PARTITION BY CustomerID ORDER BY SalesOrderID) AS RunningTotal FROM Sales.SalesOrderHeader; SELECT CustomerID, SalesOrderID, TotalDue, SUM(TotalDue) OVER(PARTITION BY CustomerID ORDER BY SalesOrderID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal FROM Sales.SalesOrderHeader; Do let me know the output of the above query. You guys can compare that in the comment section and let me know. The
outcomes are the same, but the execution is vastly different.
Unfortunately, the implementation plan does not give you the truth in this scenario. Each query consumed 50% of the resources, according to
the report. The I/O operations on the first query will be in thousands, while in the second query, they would be 0! Yes, you heard that right! ConclusionT-SQL
window functions have been touted as high-performance options. They
make creating inquiries easier in my perspective, but you must
understand them well to achieve high performance. Although indexing can
help, you can't generate an index for every query you write. Framing is a
difficult concept to grasp, but it is critical if you need to scale up
to enormous tables. Other SQL and Database Articles you may like:
Thanks for reading this article so far. IF you like it then please share and if you have any questions, |
No comments:
Post a Comment