다중 명령문 테이블 값 함수 대 인라인 테이블 값 함수
다음과 같은 경우를 보여줄 몇 가지 예 :
인라인 테이블 평가
CREATE FUNCTION MyNS.GetUnshippedOrders()
RETURNS TABLE
AS
RETURN SELECT a.SaleId, a.CustomerID, b.Qty
FROM Sales.Sales a INNER JOIN Sales.SaleDetail b
ON a.SaleId = b.SaleId
INNER JOIN Production.Product c ON b.ProductID = c.ProductID
WHERE a.ShipDate IS NULL
GO
다중 명세서 테이블 평가
CREATE FUNCTION MyNS.GetLastShipped(@CustomerID INT)
RETURNS @CustomerOrder TABLE
(SaleOrderID INT NOT NULL,
CustomerID INT NOT NULL,
OrderDate DATETIME NOT NULL,
OrderQty INT NOT NULL)
AS
BEGIN
DECLARE @MaxDate DATETIME
SELECT @MaxDate = MAX(OrderDate)
FROM Sales.SalesOrderHeader
WHERE CustomerID = @CustomerID
INSERT @CustomerOrder
SELECT a.SalesOrderID, a.CustomerID, a.OrderDate, b.OrderQty
FROM Sales.SalesOrderHeader a INNER JOIN Sales.SalesOrderHeader b
ON a.SalesOrderID = b.SalesOrderID
INNER JOIN Production.Product c ON b.ProductID = c.ProductID
WHERE a.OrderDate = @MaxDate
AND a.CustomerID = @CustomerID
RETURN
END
GO
한 유형 (인라인 또는 다중 명령문)을 다른 유형보다 사용하는 것이 유리합니까? 하나가 다른 것보다 낫거나 차이가 순전히 구문적인 시나리오가 있습니까? 두 가지 예제 쿼리가 다른 작업을 수행한다는 것을 알고 있지만 그런 식으로 작성 해야하는 이유가 있습니까?
그들에 대한 독서와 장점 / 차이는 실제로 설명되지 않았습니다.
Matt의 의견을 조사하면서 본인의 원래 진술을 수정했습니다. 그는 맞습니다. 둘 다 SELECT 문을 단순히 실행하더라도 인라인 테이블 값 함수 (ITVF)와 다중 명령문 테이블 값 함수 (MSTVF)간에 성능에 차이가있을 것입니다. SQL Server는 ITVF를VIEW
문제의 테이블에 대한 최신 통계를 사용하여 실행 계획을 계산한다는 점에서 MSTVF는 SELECT 문의 전체 내용을 테이블 변수에 채우고 조인하는 것과 같습니다. 따라서 컴파일러는 MSTVF의 테이블에서 테이블 통계를 사용할 수 없습니다. 따라서 ITVF는 MSTVF보다 성능이 우수 할 것입니다. 내 테스트에서 완료 시간의 성능 차이는 무시할 수 있었지만 통계 측면에서 볼 때 눈에 띄었습니다.
귀하의 경우 두 기능은 기능적으로 동일하지 않습니다. MSTV 함수는 호출 될 때마다 추가 쿼리를 수행하고 가장 중요한 것은 고객 ID를 필터링합니다. 큰 쿼리에서 옵티마이 저는 전달 된 각 customerId에 대해 함수를 호출해야하므로 다른 유형의 조인을 이용할 수 없습니다. 그러나 MSTV 기능을 다음과 같이 다시 쓴 경우 :
CREATE FUNCTION MyNS.GetLastShipped()
RETURNS @CustomerOrder TABLE
(
SaleOrderID INT NOT NULL,
CustomerID INT NOT NULL,
OrderDate DATETIME NOT NULL,
OrderQty INT NOT NULL
)
AS
BEGIN
INSERT @CustomerOrder
SELECT a.SalesOrderID, a.CustomerID, a.OrderDate, b.OrderQty
FROM Sales.SalesOrderHeader a
INNER JOIN Sales.SalesOrderHeader b
ON a.SalesOrderID = b.SalesOrderID
INNER JOIN Production.Product c
ON b.ProductID = c.ProductID
WHERE a.OrderDate = (
Select Max(SH1.OrderDate)
FROM Sales.SalesOrderHeader As SH1
WHERE SH1.CustomerID = A.CustomerId
)
RETURN
END
GO
쿼리에서 옵티마이 저는 해당 기능을 한 번 호출하여 더 나은 실행 계획을 작성할 수 있지만 매개 변수가없는 동등한 ITVS 또는 a보다 나은 것은 아닙니다 VIEW
.
ITVFs should be preferred over a MSTVFs when feasible because the datatypes, nullability and collation from the columns in the table whereas you declare those properties in a multi-statement table valued function and, importantly, you will get better execution plans from the ITVF. In my experience, I have not found many circumstances where an ITVF was a better option than a VIEW but mileage may vary.
Thanks to Matt.
Addition
Since I saw this come up recently, here is an excellent analysis done by Wayne Sheffield comparing the performance difference between Inline Table Valued functions and Multi-Statement functions.
Internally, SQL Server treats an inline table valued function much like it would a view and treats a multi-statement table valued function similar to how it would a stored procedure.
When an inline table-valued function is used as part of an outer query, the query processor expands the UDF definition and generates an execution plan that accesses the underlying objects, using the indexes on these objects.
For a multi-statement table valued function, an execution plan is created for the function itself and stored in the execution plan cache (once the function has been executed the first time). If multi-statement table valued functions are used as part of larger queries then the optimiser does not know what the function returns, and so makes some standard assumptions - in effect it assumes that the function will return a single row, and that the returns of the function will be accessed by using a table scan against a table with a single row.
Where multi-statement table valued functions can perform poorly is when they return a large number of rows and are joined against in outer queries. The performance issues are primarily down to the fact that the optimiser will produce a plan assuming that a single row is returned, which will not necessarily be the most appropriate plan.
As a general rule of thumb we have found that where possible inline table valued functions should be used in preference to multi-statement ones (when the UDF will be used as part of an outer query) due to these potential performance issues.
There is another difference. An inline table-valued function can be inserted into, updated, and deleted from - just like a view. Similar restrictions apply - can't update functions using aggregates, can't update calculated columns, and so on.
Your examples, I think, answer the question very well. The first function can be done as a single select, and is a good reason to use the inline style. The second could probably be done as a single statement (using a sub-query to get the max date), but some coders may find it easier to read or more natural to do it in multiple statements as you have done. Some functions just plain can't get done in one statement, and so require the multi-statement version.
I suggest using the simplest (inline) whenever possible, and using multi-statements when necessary (obviously) or when personal preference/readability makes it wirth the extra typing.
look at Comparing Inline and Multi-Statement Table-Valued Functions you can find good descriptions and performance benchmarks
I have not tested this, but a multi statement function caches the result set. There may be cases where there is too much going on for the optimizer to inline the function. For example suppose you have a function that returns a result from different databases depending on what you pass as a "Company Number". Normally, you could create a view with a union all then filter by company number but I found that sometimes sql server pulls back the entire union and is not smart enough to call the one select. A table function can have logic to choose the source.
Another case to use a multi line function would be to circumvent sql server from pushing down the where clause.
For example, I have a table with a table names and some table names are formatted like C05_2019 and C12_2018 and and all tables formatted that way have the same schema. I wanted to merge all that data into one table and parse out 05 and 12 to a CompNo column and 2018,2019 into a year column. However, there are other tables like ACA_StupidTable which I cannot extract CompNo and CompYr and would get a conversion error if I tried. So, my query was in two part, an inner query that returned only tables formatted like 'C_______' then the outer query did a sub-string and int conversion. ie Cast(Substring(2, 2) as int) as CompNo. All looks good except that sql server decided to put my Cast function before the results were filtered and so I get a mind scrambling conversion error. A multi statement table function may prevent that from happening, since it is basically a "new" table.
if you are going to do a query you can join in your Inline Table Valued function like:
SELECT
a.*,b.*
FROM AAAA a
INNER JOIN MyNS.GetUnshippedOrders() b ON a.z=b.z
it will incur little overhead and run fine.
if you try to use your the Multi Statement Table Valued in a similar query, you will have performance issues:
SELECT
x.a,x.b,x.c,(SELECT OrderQty FROM MyNS.GetLastShipped(x.CustomerID)) AS Qty
FROM xxxx x
because you will execute the function 1 time for each row returned, as the result set gets large, it will run slower and slower.
'Programming' 카테고리의 다른 글
Django Admin-헤더 'Django Administration'텍스트 변경 (0) | 2020.05.15 |
---|---|
기계적 인조 인간. (0) | 2020.05.15 |
backbone.js에 기반한 많은 프레임 워크의 실제 강점과 약점은 무엇입니까? (0) | 2020.05.15 |
git checkout --track origin / branch와 git checkout -b branch origin / branch의 차이점 (0) | 2020.05.15 |
하나의 JavaScript로 작성된 함수를 다른 JS 파일로 호출 할 수 있습니까? (0) | 2020.05.15 |