Dec 27, 2014

Aggregate values Group BY max sum query performance | Optimization | Tuning SQL SERVER



Query performance tuning and optimization of GROUP BY, MAX, and SUM etc. Aggregate values from same set of tables SQL Server


We often come across a scenario where we need multiple aggregate values from a single table or from multiple tables using join query. For example:

DECLARE @ntPostCount5 AS INT
DECLARE @ntPostCount5_10 AS INT
DECLARE @ntPostCount10 AS INT

SELECT @ntPostCount5 = COUNT(*)
FROM tblPost 
WHERE ntPostType < 5

SELECT @ntPostCount5_10 =COUNT(*)
FROM tblPost 
WHERE ntPostType BETWEEN 5 AND 10

SELECT @ntPostCount10 = COUNT(*)
FROM tblPost 
WHERE ntPostType > 10

SELECT
     @ntPostCount5 ntPostCount5,
     @ntPostCount5_10 ntPostCount5_10,
     @ntPostCount10 ntPostCount10

Total logical read: 724
Some developers also write below equivalent script:

SELECT
     COUNT(CASE WHEN ntPostType < 5 THEN ntPostID END) AS ntPostCount5,
     COUNT(CASE WHEN ntPostType BETWEEN 5 AND 10 THEN ntPostID END) AS ntPostCount5_10,
     COUNT(CASE WHEN ntPostType > 5 THEN ntPostID END) AS ntPostCount10
FROM tblPost
 Total logical read: 746

From logical read point of view first script seems little better than second. But script is referencing the same table 3 times while second script only one! This difference is due to first script is able to use index efficiently (Index seek). But here I am going to recommend you follow approach two that is second script. Why? Let assume a scenario where there are not proper indexes on tables or tables have proper indexes but filter condition are not able use to index efficiently.  For example:

Approach one:
DECLARE @ntPostCount5_6 AS INT
DECLARE @ntPostCount7_8 AS INT
DECLARE @ntPostCount9_10 AS INT

SELECT @ntPostCount5_6 = COUNT(*)
FROM tblPost 
WHERE ntPostType <> 5 OR ntPostType <> 6

SELECT @ntPostCount7_8 =COUNT(*)
FROM tblPost 
WHERE ntPostType <> 7 OR ntPostType <> 8

SELECT @ntPostCount9_10 = COUNT(*)
FROM tblPost 
WHERE ntPostType <> 9 OR ntPostType <> 10

SELECT
     @ntPostCount5_6 ntPostCount5_6,
     @ntPostCount7_8 ntPostCount7_8,
     @ntPostCount9_10 ntPostCount9_10


Total logical read: 2238


Approach Two

SELECT
     COUNT(CASE WHEN ntPostType <> 5 OR ntPostType <> 6 THEN ntPostID END) AS ntPostCount5_6,
     COUNT(CASE WHEN ntPostType <> 7 OR ntPostType <> 8 THEN ntPostID END) AS ntPostCount7_8,
     COUNT(CASE WHEN ntPostType <> 9 OR ntPostType <> 10 THEN ntPostID END) AS ntPostCount9_10
FROM tblPost
Total logical read: 746

In this scenario approach two in much better than approach one.  So here conclusion is if all individual queries are able to use index efficiently then follow approach one otherwise approach two would much better. Before taking any decision must compare the query cost.

No comments:

Post a Comment