10 Queries with Dates
11 Intersecting Dates
Let’s say you have a table of subscriptions that all different start and end dates. How would you filter down a list of subscriptions to show those that were active within a time range? In other words, how do you find rows that have intersecting dates?
For example, here are all subscribers that had active subscriptions in 2023:
select *
from subscribers
where subscription_valid_from <= '2023-12-31'
and subscription_valid_to >= '2023-01-01'This query works when thinking in terms of sets. A subscription whose start date is later than our range’s end date is not in scope (i.e. date_from > ‘2023-12-31’). So we can write the inverse of this, i.e. date_from <= ‘2023-12-31’. The same goes for subscriptions that end before our range of interest.
12 Counting Active Date Ranges
If you have a SCD2 type dimension like subscriptions, a common question might be to provide the number of active subscriptions for each day, week, month or year. It’s best to use a calendar table like this:
select
date,
count(*) as number_of_subscribers
from calendar
inner join subscribers
on date >= subscription_valid_from
and date <= subscription_valid_to
group by date
You can also build a query without using a calendar table:
with d as (
select validfrom as dte, 1 as inc
from t
union all
select validto, -1
from t
)
select dte, sum(sum(inc)) over (order by dte)
from d
group by dte
order by dte;13 Calculating date ranges based on gaps
Let’s say we have subscriptions but we need to show a start date and an end date of gaps between subscriptions. For example, if I subscribed from 2023-01-01 to 2023-05-31 and then from 2023-07-01 to 2023-12-31, I would want to return a row that said I was not a subscriber from 2023-06-01 to 2023-06-30.
SELECT
seqval + 1 AS start_range,
(
SELECT
MIN(B.seqval)
FROM dbo.NumSeq AS B
WHERE B.seqval > A.seqval
) - 1 AS end_range
FROM dbo.NumSeq AS A
WHERE NOT EXISTS (
SELECT * FROM dbo.NumSeq AS B
WHERE B.seqval = A.seqval + 1)
AND seqval < (SELECT MAX(seqval) FROM dbo.NumSeq);This solution is based on subqueries. In order to understand it you should first focus on the filtering activity in the WHERE clause and then proceed to the activity in the SELECT list. The purpose of the NOT EXISTS predicate in the WHERE clause is to filter only points that are a point before a gap. You can identify a point before a gap when you see that for such a point, the value plus 1 doesn’t exist in the sequence. The purpose of the second predicate in the WHERE clause is to filter out the maximum value from the sequence because it represents the point before infinity, which does not concern us.
14 Sessionization
15 Islands Problem
SELECT
MIN(seqval) AS start_range,
MAX(seqval) AS end_range
FROM (
SELECT
seqval,
seqval - ROW_NUMBER() OVER (ORDER BY seqval) AS grp
FROM dbo.NumSeq
) AS D GROUP BY grp;