9 Joins
9.1 Counting number of subscribers
If you’re working with subscription data, one frequent question is “how many subscribtions we had each month?”. This is a surprisingly difficult question to answer because you need to count the number of subscribers at the end of each month. This is a great example of a problem that is easy to solve in SQL but cumbersome on the BI end1.
The query is a two step process:
- We join a list of months to the subscriptions table in order to “explore” the table at the level of months. Instead of having one row be a subscription, one row is a subscription in a given month. For example, a subscription that was active for three months would show up as three rows.
- Aggregation - now that the data is at the level of months, we can count the number of subscriptions in each month.
with months as (
select distinct DATE(date, 'start of month') date
from calendar
)
select date, count(*)
from months
inner join subscriptions
on date between date_from and date_to
group by date9.2 Rolling “As of” Joins
Rolling joins are essentially non-equi joins that only return the “closest” row. R users may be familliar with this technique as Rolling Joins, DuckDB and Snowflake refers to these as “ASOF” joins: https://duckdb.org/docs/stable/guides/sql_features/asof_join.html
These operations are incredibly useful when working with attribution. For example, two customer representatives performed a sale and marked it in the CRM but you need to attribute the sale to the sale that was performed first.