Home Uncategorized Scalar functions, inlining, and performance: An entertaining title for a boring post

    Scalar functions, inlining, and performance: An entertaining title for a boring post

    1290
    40

    Scalar.  Function.

    Wow.

    Could any other combination of words evoke the same feeling of encapsulation, information hiding, and simplification of client code?  After years spent developing software in the procedural and OO worlds, it can be difficult–perhaps, even impossible–to migrate over to working with SQL Server and not consider how to architect your data access logic using some of the same techniques you’d use in the application tier.

    In short: Why would you ever write the same piece of logic more than once?  Answer: You wouldn’t (damn it!).  And so Microsoft bestowed upon the SQL Server community, in SQL Server 2000, the ability to write scalar user-defined functions.  And they could have been such beautiful things…

    But alas, reality can be painful, and as developers tried these new tools they were struck with a strange feeling of sadness as their applications buckled under the weight of what otherwise would have been a wonderful idea. As it turned out, putting all but the simplest of logic into these scalar functions was a recipe for disaster. Why?  Because they’re essentially cursors waiting to happen (but they don’t look like cursors, so you may not know… until it’s too late.)

    The central problem is that when you wrap logic in a multistatement UDF, the query optimizer just can’t unwrap it too easily. And so there’s really only one way to evaluate a scalar UDF: call it once per row. And that is really nothing more than a cursor.

    Seeing this behavior in action is easy enough; consider the following scalar function that some poor sap DBA working for AdventureWorks might be compelled to create:

    [sql]CREATE FUNCTION GetMaxProductQty_Scalar
    (
    @ProductId INT
    )
    RETURNS INT
    AS
    BEGIN
    DECLARE @maxQty INT

    SELECT @maxQty = MAX(sod.OrderQty)
    FROM Sales.SalesOrderDetail sod
    WHERE sod.ProductId = @ProductId

    RETURN (@maxQty)
    END
    [/sql]
    Simple enough, right?  Let’s pretend that AdventureWorks has a bunch of reports, each of which requires maximum quantity sold per product.  So the DBA, thinking he can save himself some time and keep everything centralized (and that is a good idea), puts all of the logic into a scalar UDF.  Now, when he needs this logic, he can just call the UDF.  And if the logic has a bug, or needs to be changed, he can change it in exactly one place.  And so life is great… Or is it?

    Let’s take a look at a sample query:

    [sql]SELECT
    ProductId,
    dbo.GetMaxProductQty_Scalar(ProductId)
    FROM Production.Product
    ORDER BY ProductId
    [/sql]
    This query does nothing more than get the max quantity sold for each product in the Productin.Product table. And a look at the execution plan or the STATISTICS IO output might indicate that there’s nothing too interesting going on here: The execution plan shows an index scan (to be expected, with no WHERE clause), followed by a compute scalar operation (the call to the UDF). And STATISTICS IO shows a mere 16 reads.

    So why is this query so problematic? Because the real issue is hiding just beneath the surface.  The execution plan and STATISTICS IO didn’t consider any of the code evaluated within the UDF! To see what’s really going on, fire up SQL Server Profiler, turn on the SQL:BatchCompleted event, and make sure you’re showing the Reads column. Now run the query again and you’ll see that this seemingly-innocent block of T-SQL is, in fact, using 365,247 logical reads. Quite a difference!

    Each of those “compute scalar” operations is really a call to the UDF, and each of the calls to the UDF is really a new query.  And all of those queries (all 504 of them — the number of products in the Product table) add up to massive I/O.  Clearly not a good idea in a production environment.

    But luckily, we’re not done here yet (or this would be a very boring post). Because while the performance penalty is a major turnoff, I really do love the encapsulation afforded by scalar UDFs.  I want them (or a similar tool) in my toolbox… And so I got to thinking.

    The answer to my dilemma, as it turns out, is to not use scalar UDFs at all, but rather to use inline table-valued UDFs and treat them like scalars. This means that queries get slightly more complex than with scalar UDFs, but because the funtions are inlined (treated like macros) they’re optimized along with the rest of the query. Which means, no more under-the-cover cursors.

    Following is a modified version of the scalar UDF posted above:

    [sql]CREATE FUNCTION GetMaxProductQty_Inline
    (
    @ProductId INT
    )
    RETURNS TABLE
    AS
    RETURN
    (
    SELECT MAX(sod.OrderQty) AS maxqty
    FROM Sales.SalesOrderDetail sod
    WHERE sod.ProductId = @ProductId
    )
    [/sql]
    This function is no longer actually scalar–in fact, it now returns a table. It just so happens that the table has exactly one column and exactly one row, and uses the same logic as the scalar UDF shown above. So it’s still scalar enough for my purposes.

    The query shown above, used to retrieve the maximum quantity sold for each product, will not quite work with this UDF as-is. Trying to substitute in the new UDF will result in nothing more than a variant on an “object not found” error.  Instead, you need actually treat this function like  it returns a table (due to the fact that it does).  And that means, in this case, a subquery:

    [sql]SELECT
    ProductId,
    (
    SELECT MaxQty
    FROM dbo.GetMaxProductQty_Inline(ProductId)
    ) MaxQty
    FROM Production.Product
    ORDER BY ProductId
    [/sql]
    So there it is. We’re now treating the table-valued UDF more or less just like a scalar UDF.  And the difference in I/O results is really quite astounding: 1267 logical reads in this case. Meaning that the scalar UDF solution is around 288 times more I/O intensive!

    The question being, of course, was it worth it? The whole thing could have been written as one query, without the need for any UDFs at all. And the final query in this case is quite a bit more complex than the previous version, in addition to the fact that the encapsulation breaks down to some degree by forcing the caller to have some knowledge of how the UDF actually works. But I do feel that this sacrifice is warranted in some cases. Although the “greatest quantity sold” example shown here is simplistic, imagine other situations in which the same code fragments or logic are used over and over, due to lack of a good way of standardizing and centralizing them.  I know I’ve seen that a lot in my work, and some examples I can think of have included complex logic that might very well have been easier to maintain in a UDF.

    This technique may not be perfect for every case, and it certainly has its tradeoffs. But it may be a useful trick to keep in the back of your mind for a rainy day in the data center when someone’s scalar UDF solution starts breaking down and you need a fix that doesn’t require a massive code rewrite.

    40 COMMENTS

    1. Hey Adam – nicely written post

      Another issue that complicates measuring IO from UDFs is that the Profiler Reads output contains Proc Cache lookups – which are also 8K IOs, whereas statistics io doesn’t. In SQL 2000 UDFs, the Proc Cache was looked up per-invocation of the UDF (per row), potentially skewing the values returned by Reads depending on the size of the cache. I’ve seen scenarios where the same statement (involving a UDF) recorded signficantly variant Reads during peak processing periods (when the cache is bloated) vs during average processing periods (when the cache has reduced to a smaller size).

      In SQL 2005, this has been sensibly scaled back so that the proc cache is only looked up for the first invocation in a rowset, improving UDF performance somewhat by eliminating the repetitive cache lookups performed in SQL 2000..

      Cheers
      Greg

    2. Nice one.

      When I started with SQL 2000 I was fascinated with its UDF feature, I used to do this way but my boss told to change it due to performance reason. As much as possible I do avoid performing a query inside the function, but in cases where it can’t be avoided the best way is to optimized the query inside the function.

    3. Great Post Adam. Fascinating details of UDF. I always wondered how a Scalar UDF be converted to a Inline table valued UDF and this post actually shows that.

    4. great article, confirming what I’ve been seeing…really a surprise to me.  Good suggestions for handling as table functions…will try that.

    5. Thanks for this post Adam; I’m experiencing this problem at the moment and the article gives me great pointers on how to solve it to everyone’s satisfaction.

    6. Adam –
      Since you’re using a TVF, you could "improve readability" with a CROSS APPLY statement. Since it’s a single value, the perf penalties that come in from CROSS APPLY don’t seem to pop up…
      So it would look like
      SELECT    ProductId,
               MaxQty
      FROM Production.Product
      CROSS APPLY dbo.GetMaxProductQty_Inline(ProductId) MaxQT
      ORDER BY ProductId

    7. Adam, this is a great post. I only want to add one practical detail to your post which support it.
      Most of the time ( or in my experience of 15yr – all the time) programmers are not trained and do not care about performance tunning. However, application usually should be maintained for 3-5 years of it’s life span.
      This is usually overlooked. Finally, dba – person who did not develop the application should start tunning it and here is a big problem having UDF’s and views.
      When you say that UDF may bring benefits, did you take in count the fact that code which uses UDF’s and views is almoust impossible to analize for performance improvements?
      Try to traverse a code which use UDF which goes aganist view, which uses another 2 views or so!
      This is a real price for ‘sleek’ UDF use. That’s the real difference between experienced seasoned programmer who had met and learned this by his own experience and talanted 2-3 yr. programmer who measures his work by complexity of his queries and use of funcy new features like UDF.
      I bet real good code is measured by it’s ability to be troubleshouted easily too. And UDF’s and views are sure means not to achieve that.

    8. Thank you!  I have succesfully used this optimalisation to speed up our ETL process.  We were doing quite a lot of date calculations via functions: having a date format yyyymmdd but needing date calculations like "substract 6 hours" made scalar UDFs a logical choice.  It is now implemented using ‘inline’ table-valued functions.  Speeds up the processing!
      Thanks again, excellent post.

    9. We just went thru this issue and when I went to do the research for a possible fis, I found your post.
      Invaluable!
      Thanks very much.

    10. I’m on a project migrating from Oracle to SQL Server. It turns out a lot of our reports call scalar UDFs for most of the projected columns. In Oracle, a typical query takes seconds. In SQL Server, 30 minutes.
      Clearly, it’s not that SQL Server can’t solve the perf problem they have with scalar UDFs, but that they haven’t prioritized it over other features.
      So now I’ve found your great article (and a number of other similar articles on Cross Apply), that’s what we will have to do.

    11. Great post. I would like to say though that Scalar functions can still be very useful, AS LONG AS you aren’t using them in an SQL expression with loads of rows.
      For example I have some scalar functions such as fn_Daysearch, which returns a date for a day previous or future, like last Wednesday or next Monday. I typically only call these functions once at the start of a stored proc, usually passing in a proc parameter and using the return value in the WHERE clause of my main query. As a rule if it’s only getting called once or a few times, then I’ll go scalar.
      Also Matts CROSS APPLY above is a perfectly good option, although I’d prefer using MAX(sod.OrderQty) OVER (PARTITION BY sod.ProductId) myself.

    12. Dave, agreed. But make sure that you assign the results of your scalar UDF to a variable, and then use that in the WHERE clause. Otherwise, you A) don’t know that it will only be called once, and B) may encounter some other issues (T-SQL scalar UDFs will inhibit parallelism).

    13. Great information!.  I used this to streamline our ETL process.  I am now saving an average of 10 hours process time per day.  Unbelievable.  Thanks for the post.

    14. Adam, it’s amazing that you wrote this over 6 years ago and it’s still relevant (amazingly good for you or bad for MS?).
      We recently had an issue with a scalar UDF.  Being aware that scalar functions could cause problems we were careful to test before rolling to production.  Total run time on test machines was 5 minutes.  Production was a faster box;  we expected faster results.  Wrong. Production ran for 45 minutes before we had to kill it.  
      Subsequent "tests" in production showed nearly 2000 reads in production for a single run of the UDF and 3 in test (not kidding).  Same configuration, same settings, same db, same everything.  Almost.  
      In production we had two server traces running that were capturing statement-level events.  While both traces were filtering out most events, the server still collected them before discarding.  
      We stopped the traces, ran the UDF and it now shows 3 reads in production now…
      This URL showed us the light:
      http://sqlblog.com/blogs/linchi_shea/archive/2009/06/15/performance-impact-sql-trace-and-user-defined-scalar-functions.aspx

    15. What I noticed is that an application that we develop, when using UDFs, increases a lot the RowCounts compared to when it’s not using. I modified a query to check it and the query had to return 1250 rows, the RowCount using UDF was 5022 and, without it, the RowCount dropped to 1255. The reads count didn’t differ so much between using and not using UDF, using it it was about 30 more reads than without it.

    16. Had an interesting experience yesterday looking at the index usage across a new-ish database.  The major tables showed usage of about 3,000,000.  But a couple of small control tables showed usage of about 240,000,000.  For a modest little app, 240m is a big number.  So, what is going on?
      Well, the index usage stats basically count plans that touch the index, not the number of rows touched. So, 240m plans???  Well yes – when you use a UDF in the where clause!  Each invocation is a *plan*, not just a statement, not just a fncall.
      So it’s MUCH worse than a cursor.
      And I will soon be a hero at work when we eliminate those UDFs, which are really there more to be pretty than because of need.  And they are pretty, dammit.  Has anyone at Microsoft ever talked about changing their scalar UDF implementation to be more like the TVFs, in that the optimizer can look inside?  Microsoft basically botched the whole UDF project in SQL 2000, they got confused about stochastic and non-stochastic and never did figure out what they were really doing, … at least that’s my reading on it.

    17. Adam, i’ve been following you post for quite a long and find them of great help.
      But in my production environment we have a great UDF usage(which are multi statement table values function). They have great bussiness logic in them but they are IO HOGS. To improve in IO i mostly removed the code into the calling procedure or created an inline table function. So i suggest the developer that avoid using multi statement table valued function as much as possible.
      Your commetnts on that and reason of them hogging the IO
      Regards
      Asit

    18. @Asit: Multistatement UDFs use involve temp tables, so all of that I/O is almost certainly due to tempdb activity.
      –Adam

    19. Rewrite the original scalar UDF to the code below. It will make a difference.
      CREATE FUNCTION GetMaxProductQty_Scalar
      (
         @ProductId INT
      )
      RETURNS INT
      AS
      BEGIN
      RETURN (
      SELECT TOP 1
      MAX(sod.OrderQty)
      FROM
      Sales.SalesOrderDetail sod
      WHERE
      sod.ProductId = @ProductId
      );
      END;

    20. Adrian:
      Unfortunately doing that does not, in fact, make any difference whatsoever. The query plans will still be identical, performance implications will still be identical, and the query optimizer will still be unable to make use of various features, including parallelism.
      –Adam

    21. Very interesting and to the point, but not complete for my case. So, is there someone who can help me to higher the performance when you have to deal with a function that aggregates the results of a select in 1 line. Eg
      CREATE FUNCTION [dbo].[Get_PaymentLines]
      (@client nvarchar(255)
      ,@language nvarchar(255)
      )
      RETURNS nvarchar(500)
      AS
      BEGIN
       Declare
         @dummy nvarchar(500),
         @Line nvarchar(255)
         SET @Dummy = ”
         DECLARE Payment_cursor CURSOR
         FOR SELECT convert(nvarchar(2), line) + ‘ ‘ +  
                    convert(nvarchar(10), convert(money,amount))
             FROM payments p
      WHERE  (p.client = @client and p.language = @language)
         OPEN Payment_Cursor
         FETCH NEXT FROM Payment_Cursor INTO @Line
         WHILE @@FETCH_STATUS = 0
         BEGIN
           IF @dummy = ” SET @dummy = @Line ELSE SET @dummy = @dummy + ‘ ‘ + @Line
           FETCH NEXT FROM Payment_Cursor INTO @Line
         END
         CLOSE  Payment_Cursor
         DEALLOCATE  Payment_Cursor
         RETURN @Dummy
      END
      Because applying this kind of functions for a couple of hundred lines is very slow on SQL server.
      thanks

    22. @MarcVB:
      The "aggregation" in your case is string concatenation. And although SQL Server does not include a string concat aggregate, we can fake it very easily by creating an XML document that lacks tags:
      SELECT
      x + ‘ ‘ AS [text()]
      FROM
      (
      SELECT ‘a’
      UNION  ALL
      SELECT ‘b’
      ) AS y (x)
      FOR XML PATH(”)
      In conjunction with STUFF and a few other tricks you can better control the output. Google around for more information.

    23. @Adam:
      I was able to improve the performance of simply selecting the value returned from my scalar udf by changing it to a table udf. But what if I need to then join to another table based on the value returned from my udf? This appears to still cause a major performance hit if I try to do more than simply select the value from the udf, and try to actually use it in my table join criteria. Does your work around only help with selecting the data and if I need to use it in the table join criteria I will still face the performance woe’s of udf’s?
      Thanks for your ariticle.  Very helpful.

    24. Adam,
      Correction, I am seeing a performance improvement even when using the udf table function in my join criteria.  I just realized the same query went from 3:41 to 1:31 by changing the function from scalar to table. Didn’t realize this earlier since it it still taking over a minute and had not actually timed it yet.

    25. Very interesting post. I’m using MS SQL 2008. I’ve been researching UDFs for a while trying to optimise my queries and discovered that everybody call UDFs evil. What you and other authors say about UDF is true and helped me to understand how it all works, but still in my particular case use of UDFs made my query run significantly faster.
      In short, I have a relatively large table with logs – around 10 million rows. These logs are events that happen at approximately 2000 computers. Essentially, TableLogs has these columns:
      ID int not null (primary key, identity)
      ComputerID int not null (foreign key to the table of computers)
      EventDateTime datetime not null (when it happened)
      EventDurationMSec int not null (event duration in milliseconds)
      Usually event duration is 10 – 20 seconds, but when things go wrong it is less than 5 seconds. Events normally should happen non-stop. If there are no events for a while for a given computer – there is a problem with the computer.
      So, I have a monitoring screen that shows 2000 rows – one row for each computer. For each computer I show the timestamp of the last event that was longer than 5 seconds. Result is ordered by the timestamp desc and at the top of the list are computers that have problems.
      I could use this simple query:
      select
       ComputerID
       , max(EventDateTime) as LastTimeUp
      from
       TableLogs
      where
       ComputerID = @ParamComputerID
       EventDurationMSec >= 5000
      group by ComputerID
      order by LastTimeUp desc;
      But, no matter what indexes I created it was still scanning the whole table (or index, but still it was a scan with plenty of reads).
      Then I created scalar UDF with one parameter GetLastTimeUp(@ParamComputerID):
      select @Result = max(EventDateTime)
      from TableLogs
      where ComputerID = @ParamComputerID AND EventDurationMSec >= 5000;
      I created filtered index
      CREATE NONCLUSTERED INDEX [IX_ComputerID_EventDurationMSec] ON [dbo].[TableLogs]
      (
      [ComputerID] ASC,
      [EventDateTime] ASC
      )
      WHERE ([EventDurationMSec]>=(5000))
      ;
      And changed query to:
      select
      ComputerID
      ,  dbo.GetLastTimeUp(ComputerID) as LastTimeUp
      from TableComputers
      order by LastTimeUp desc;
      Each call to UDF now is not a scan of an index, it is an instant seek of a filtered index.
      Yes, there are 2000 calls and 2000 seeks, but it is way better than scanning through 10 million of all old events, while I need only the latest event.
      I would appreciate it if you could comment this. Thank you.

    26. Hi Vladimir,
      There is no reason you can’t get that same seek behavior with a join, subquery, or derived table expression. Did you try a FORCESEEK hint? I wonder if your indexing scheme isn’t quite in line with the way the query processor works? (do you have, e.g., one filtered index per computer ID or something like that? I guess you can’t have 2000 filtered indexes — what are you filtering on?)
      –Adam

    27. Sorry, Vladimir, I should have read your post more carefully. Now I see your filtered index.
      I think a FORCESEEK hint would work fine here. You might also try replacing the MAX() with TOP(1) … ORDER BY EventDateTime DESC. That optimizes slightly differently.
      I’m certain that a UDF is not needed here, and I’m certain that you’ll get even BETTER performance without using one.

    28. Hi Adam,
      Thank you for reply. I didn’t know about FORCESEEK hint – apparently it was added in 2008. I’ll check it out.
      At first I tried to make queries without the hint. I created test tables and UDF in tempdb and filled them with some data:
      —————————————————————————
      USE [tempdb]
      GO
      SET ANSI_NULLS ON
      GO
      SET QUOTED_IDENTIFIER ON
      GO
      CREATE TABLE [dbo].[TableComputers](
      [ID] [int] IDENTITY(1,1) NOT NULL,
      [Name] [nvarchar](255) NOT NULL,
      CONSTRAINT [PK_TableComputers] PRIMARY KEY CLUSTERED
      (
      [ID] ASC
      )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
      ) ON [PRIMARY]
      GO
      USE [tempdb]
      GO
      /****** Object:  Table [dbo].[TableLogs]    Script Date: 07/13/2013 19:26:37 ******/
      SET ANSI_NULLS ON
      GO
      SET QUOTED_IDENTIFIER ON
      GO
      CREATE TABLE [dbo].[TableLogs](
      [ID] [int] IDENTITY(1,1) NOT NULL,
      [ComputerID] [int] NOT NULL,
      [EventDateTime] [datetime] NOT NULL,
      [EventDurationMSec] [int] NOT NULL,
      CONSTRAINT [PK_TableLogs] PRIMARY KEY CLUSTERED
      (
      [ID] ASC
      )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
      ) ON [PRIMARY]
      GO
      ALTER TABLE [dbo].[TableLogs]  WITH CHECK ADD  CONSTRAINT [FK_TableLogs_TableComputers] FOREIGN KEY([ComputerID])
      REFERENCES [dbo].[TableComputers] ([ID])
      GO
      ALTER TABLE [dbo].[TableLogs] CHECK CONSTRAINT [FK_TableLogs_TableComputers]
      GO
      USE [tempdb]
      GO
      /****** Object:  Index [IX_ComputerID]    Script Date: 07/13/2013 19:28:16 ******/
      CREATE NONCLUSTERED INDEX [IX_ComputerID] ON [dbo].[TableLogs]
      (
      [ComputerID] ASC
      )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
      GO
      USE [tempdb]
      GO
      /****** Object:  Index [IX_ComputerID_Events]    Script Date: 07/13/2013 19:28:27 ******/
      CREATE NONCLUSTERED INDEX [IX_ComputerID_Events] ON [dbo].[TableLogs]
      (
      [ComputerID] ASC,
      [EventDateTime] ASC
      )
      WHERE ([EventDurationMSec]>=(5000))
      WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
      GO
      — generate computers
      — 8 rows
      INSERT INTO [tempdb].[dbo].[TableComputers]
      ([Name])
      VALUES
      (‘ComputerName’)
      ,(‘ComputerName’)
      ,(‘ComputerName’)
      ,(‘ComputerName’)
      ,(‘ComputerName’)
      ,(‘ComputerName’)
      ,(‘ComputerName’)
      ,(‘ComputerName’)
      ;
      — generate 2048 rows
      DECLARE @VarCount int = 1;
      WHILE @VarCount < 9
      BEGIN
      INSERT INTO [tempdb].[dbo].[TableComputers]
      ([Name])
      SELECT
      [Name]
      FROM [tempdb].[dbo].[TableComputers]
      ;
      SET @VarCount = @VarCount + 1;
      END;
      — generate events
      — 16 rows
      INSERT INTO [tempdb].[dbo].[TableLogs]
      ([ComputerID]
      ,[EventDateTime]
      ,[EventDurationMSec])
      VALUES
      (1,’2000-01-01′,2000)
      ,(1,’2000-01-02′,11000)
      ,(1,’2000-01-03′,12000)
      ,(1,’2000-01-04′,13000)
      ,(1,’2000-01-05′,14000)
      ,(1,’2000-01-06′,15000)
      ,(1,’2000-01-07′,16000)
      ,(1,’2000-01-08′,17000)
      ,(1,’2000-01-09′,10000)
      ,(1,’2000-01-10′,11000)
      ,(1,’2000-01-11′,12000)
      ,(1,’2000-01-12′,13000)
      ,(1,’2000-01-13′,14000)
      ,(1,’2000-01-14′,15000)
      ,(1,’2000-01-15′,16000)
      ,(1,’2000-01-16′,17000)
      ;
      — generate 8192 rows
      DECLARE @VarCount int = 1;
      WHILE @VarCount < 10
      BEGIN
      INSERT INTO [tempdb].[dbo].[TableLogs]
      ([ComputerID]
      ,[EventDateTime]
      ,[EventDurationMSec])
      SELECT
      [ComputerID]
      ,DATEADD(minute, ROW_NUMBER() OVER(ORDER BY ID), [EventDateTime])
      ,[EventDurationMSec] + RAND()*1000
      FROM [tempdb].[dbo].[TableLogs]
      ;
      SET @VarCount = @VarCount + 1;
      END;
      — generate 16769024 rows
      INSERT INTO [tempdb].[dbo].[TableLogs]
      ([ComputerID]
      ,[EventDateTime]
      ,[EventDurationMSec])
      SELECT
      [dbo].[TableComputers].[ID]
      ,[dbo].[TableLogs].[EventDateTime]
      ,[dbo].[TableLogs].[EventDurationMSec]
      FROM
      [tempdb].[dbo].[TableComputers]
      CROSS JOIN [tempdb].[dbo].[TableLogs]
      WHERE [dbo].[TableComputers].[ID] > 1
      ;
      USE [tempdb]
      GO
      /****** Object:  UserDefinedFunction [dbo].[GetLastTimeUp]    Script Date: 07/13/2013 19:30:17 ******/
      SET ANSI_NULLS ON
      GO
      SET QUOTED_IDENTIFIER ON
      GO
      CREATE FUNCTION [dbo].[GetLastTimeUp]
      (
      — Add the parameters for the function here
      @ParamComputerID int
      )
      RETURNS datetime
      AS
      BEGIN
      — Declare the return variable here
      DECLARE @Result datetime;
      — Add the T-SQL statements to compute the return value here
      SELECT
      @Result = MAX(EventDateTime)
      FROM    
      dbo.TableLogs
      WHERE  
      (ComputerID = @ParamComputerID)
      AND (EventDurationMSec >= 5000)
      ;
      — Return the result of the function
      RETURN @Result;
      END
      GO
      —————————————————————————
      After I ran the scripts I rebuilt all indexes on the tables to update statistics and defragment indexes.
      And I was comparing the following queries. I used profiler to get the number of reads (SQL:BatchCompleted):
      (1)
      SELECT
         ComputerID
         , MAX(EventDateTime) as LastTimeUp
      FROM
         TableLogs
      WHERE
         EventDurationMSec >= 5000
      GROUP BY ComputerID
      ;
      GO
      (2)
      SELECT
         ID
         , dbo.GetLastTimeUp(ID) as LastTimeUp
      FROM
         dbo.TableComputers
      ;
      GO
      Results (reads from profiler):
      (1): 37,724
      (2):  6,159
      The plan for the first query shows that there is a scan of IX_ComputerID_Events.
      The plan for UDF shows that there is a seek on IX_ComputerID_Events.
      Index IX_ComputerID_Events has 37139 pages and all of them were read in (1) + some overhead.
      Each call of UDF requires 3 reads, so 3*2048 = 6144 + some overhead.
      Strictly speaking, I wasn’t comparing equivalent queries.
      The first query reads only from big TableLogs, which has 16 million rows. Result set is a list of all ComputerIDs that have events. As far as I understand, the only way to get all ComputerIDs is to scan the index. It has to read all index pages.
      In the second query I use the TableComputers, which has only 2048 rows.
      It is a kind of cheating (or optimisation), because there is no need to build a list of ComputerIDs, they are already in the small table. Also, results are different. In the second case I get a row for each ComputerID, even if it doesn’t have events (which is what I want for real data).
      So, equivalent without UDF would be something like this:
      (3)
      SELECT
         TableComputers.ID
         , (
         SELECT
             MAX(EventDateTime)
         FROM
             TableLogs
         WHERE
             EventDurationMSec >= 5000
             AND TableLogs.ComputerID = TableComputers.ID
         ) as LastTimeUp
      FROM TableComputers
      ;
      GO
      Profiler results: 37,764
      How would you rewrite the query (3) to get performance (reads) comparable to UDFs (2)?
      I don’t know how to make use of TOP(1) instead of MAX in this case (3). I can understand how to do it in UDF, but not in subquery.
      When I tried the FORCESEEK hint:
      (4)
      SELECT
         TableComputers.ID
         , (
         SELECT
             MAX(EventDateTime)
         FROM
             TableLogs WITH (FORCESEEK)
         WHERE
             EventDurationMSec >= 5000
             AND TableLogs.ComputerID = TableComputers.ID
         ) as LastTimeUp
      FROM TableComputers
      ;
      GO
      The plan didn’t use index IX_ComputerID_Events at all. It did seek on IX_ComputerID and lookup with predicate EventDurationMSec >= 5000. Obviously it is terrible.
      Profiler results: 53,508,543 reads
      Thank you for the comments.
      I would really love to find a solution better than UDF.

    29. Hi Adam,
      I tried a hint to force use the IX_ComputerID_Events together with FORCESEEK, but still it was far from UDF and worse than simple (3).
      Then I thought about your suggestion to use TOP() instead of MAX. At first I didn’t realise that it is possible to use it in subquery. So, I changed (3) into:
      (5)
      SELECT
      TableComputers.ID
      , (
      SELECT TOP(1)
      EventDateTime
      FROM
      TableLogs
      WHERE
      EventDurationMSec >= 5000
      AND TableLogs.ComputerID = TableComputers.ID
      ORDER BY EventDateTime DESC
      ) as LastTimeUp
      FROM TableComputers
      ;
      And this query has the plan that I was looking for – it is a Nested loop left join between TableComputers and seek&top of IX_ComputerID_Events index. Actual plan shows that actual number of rows and number of executions of Seek is 2048 – one for each computer ID.
      Profiler revealed that number of reads is much better than (3) with MAX:
      (3) 37,762
      (5)  6,797
      Variant (2) with UDF has 6,157 reads. Still slightly better.
      I can only guess why.
      With UDF (2) there is scan of TableComputers (11 pages), then 2048 calls of UDF, 3 reads each:
      2048*3+11 = 6155 (almost 6157 that profiler tells me)
      With (5) there are 2048 seeks, 3 reads each = 6144, so 6797-6144 = 653 reads must be these nested loops that join TableComputers and results of index seek. With UDF there is no join.
      Thank you for your suggestions and comments. You really helped me to better understand internals of SQL server.

    30. Hi Vladimir,
      I’m glad the suggestion worked for you! Sounds like you’re on the right track.
      Two comments:
       A) Don’t believe everything Profiler tells you about the UDF reads. I can’t recall the specifics offhand but I’ve seen mis-reporting issues in the past. Google around and you’ll find numerous threads on the topic on various forums.
       B) Don’t forget to consider CPU time as well.
      –Adam

    31. I know this is an ages-old post, but holy cow did it help me with something I’ve been working on.  Thank you for the walkthrough of the issue and solution!

    32. Microsoft didn’t fix this in 2016 yet.
      go sign the feedback for them to.
      https://connect.microsoft.com/SQLServer/Feedback/Details/273443
      I am wanting to use UDF’s for Dimensional loading of some 50 lookups into a procedure based upon a code for example on a single table(attributes).
      there is a lot of overhead with Scalars, Inline didn’t save me much time at all(25%)
      In memory only saves time on the actual query in the function so no gains there since I have a index that works.
      With 50 UDF lookups on a single select the inline is not pretty to look at… So I wish I could use Scalar and I may just do that and suck it up for readability since my table will never get big(over 50k) rows.
      I will limit them also by an update date so I can probably take it way down.
      There is a lot to be said for readability and simplicity.  Its nice when I can send a query to my boss for a dimension and he can read it and make sense of it.

    33. Thank you so much for this post. Using the scalar function took 30 minutes to run. After converted to table function, it took less than 7 seconds.

    LEAVE A REPLY

    Please enter your comment!
    Please enter your name here