LEAD, LAG or get OVER the way

Who needs cursors?

February 10, 2018

Introduction

This past week gave me a chance to finally learn the OVER clause, something in SQL Server since 2012, but that I hadn’t known I needed until now. It has quickly become one of the favorite items in my SQL toolkit. How did I live so long without it?

Thanks to Jason Koopmans for his helpful comments on this topic.

Running Totals

This is part of the same work I wrote about last week, where I started using a technique to turn data from our transactional systems into more of a “time series.” But of course, one of the first things people want to have once you get something like a time series are running totals, for things like “velocity” charting (my boss is big on velocity - bad physics jokes elided).

Turns out running totals in SQL are easy now, and the following is lifted straight from StackOverflow. Let’s say you have a date, LastTouched, and a numeric column, in this case, Total, and you want to have a running total of Total as LastTouched increases:

Running totals
LastTouched Total RunningTotal
2018-02-01 5 5
2018-02-02 3 8
2018-02-06 4 12

Diff’ing Row Versions

In the same week, for an unrelated project, I was asked to show differences between rows in a history table. For some applications we have mirror history tables that get the entire row’s data inserted on each UPDATE via triggers (I know, how 1990s). But one of the primary uses for this data is to see precisely what changed on each update. It turns out that LAG and LEAD (both also introduced in SQL Server 2012) can be used for precisely this purpose.

To do side-by-side “diffs,” with pairs of columns showing current and previous values, you can simply do something like this:

Which yields results like:

Side-by-side diffs
Id ModifiedOnDate Customer SaleAmount PreviousSaleAmount GlassesSold PreviousGlassesSold and so on…
1 2018-02-01 Fred 5.00 0.00 5 0 and so on…
1 2018-02-02 Fred 3.00 5.00 3 5 and so on…
1 2018-02-06 Fred 4.00 3.00 4 3 and so on…


Pretty cool. But it makes a really wide result set, with 2x the number of columns as the original, and it is still hard to “eyeball” and see what specifically changed from one row to the next. So instead, we want a “version history.” One way is to show a row per changed value, grouped by the modified timestamp:

Note casting the numeric (and any DATETIME) columns to VARCHAR. This is to make all the “diff” rows play nicely in the UNIONs regardless of each column’s data type.

The above view then produces results similar to this:

Version history
Id ModifiedOnDate Customer Field CurrentValue PreviousValue
1 2018-02-01 Fred SaleAmount 5.00 0
1 2018-02-01 Fred GlassesSold 5 0
1 2018-02-02 Fred SaleAmount 3.00 5.00
1 2018-02-02 Fred GlassesSold 3 5
1 2018-02-06 Fred SaleAmount 4.00 3.00
1 2018-02-06 Fred GlassesSold 4 3


Creating all those UNION statements may be a bit onerous, and I am still thinking of a better way to do the same in SQL without resorting to a PIVOT, although ultimately that may be the way to go.

Combining Running Totals With LAG

Now back to the first example about running totals:

Running totals
LastTouched Total RunningTotal
2018-02-01 5 5
2018-02-02 3 8
2018-02-06 4 12


Note there are days where there are gaps, i.e., nothing happened on that day, so there isn’t a row for that date in the running total results. For your purposes, that may be OK. But what if you want to have a running total for each day, even if it doesn’t change? Using the “exploded dates” table created in the first post, you can do something like the following, combining using OVER for running totals plus LAG. For this example I assume there will never be a gap of more than 30 days:

WITH RunningSalesTotals
AS
(
    SELECT
        LastTouched,
        Total,
        SUM(Total) OVER(ORDER BY LastTouched ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
    FROM LemonadesSoldByDate
)
SELECT
    ED.TheDate,
    COALESCE(
        RST.RunningTotal,
        LAG(RST.RunningTotal, 1) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 2) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 3) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 4) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 5) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 6) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 7) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 8) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 9) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 10) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 11) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 12) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 13) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 14) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 15) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 16) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 17) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 18) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 19) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 20) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 21) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 22) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 23) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 24) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 25) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 26) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 26) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 28) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 29) OVER(ORDER BY ED.TheDate),
        LAG(RST.RunningTotal, 30) OVER(ORDER BY ED.TheDate),
        0) AS RunningTotalClosed -- 0 if there are NULLs at the start of the series
FROM ExplodedDates ED
LEFT OUTER JOIN RunningSalesTotals RST ON
        ED.TheDate = RST.LastTouched
WHERE ED.TheDate >= '2018-02-01'
        AND ED.TheDate <= '2018-02-07'
Running totals with no date gaps
LastTouched RunningTotal
2018-02-01 5
2018-02-02 8
2018-02-03 8
2018-02-04 8
2018-02-05 8
2018-02-06 12
2018-02-07 12