9  Joins

9.1 Counting number of subscribers

If you’re working with subscription data, one frequent question is “how many subscribtions we had each month?”. This is a surprisingly difficult question to answer because you need to count the number of subscribers at the end of each month. This is a great example of a problem that is easy to solve in SQL but cumbersome on the BI end1.

The query is a two step process:

  • We join a list of months to the subscriptions table in order to “explore” the table at the level of months. Instead of having one row be a subscription, one row is a subscription in a given month. For example, a subscription that was active for three months would show up as three rows.
  • Aggregation - now that the data is at the level of months, we can count the number of subscriptions in each month.

with months as (
            select distinct DATE(date, 'start of month') date
            from calendar
           )
           select date, count(*)
           from months
           inner join subscriptions
            on date between date_from and date_to
           group by date

9.2 Rolling “As of” Joins

Rolling joins are essentially non-equi joins that only return the “closest” row. R users may be familliar with this technique as Rolling Joins, DuckDB and Snowflake refers to these as “ASOF” joins: https://duckdb.org/docs/stable/guides/sql_features/asof_join.html

These operations are incredibly useful when working with attribution. For example, two customer representatives performed a sale and marked it in the CRM but you need to attribute the sale to the sale that was performed first.


  1. In Power BI, it’s a four step process.↩︎