11  Pivoting and Unpivoting

At this point it’s likely you’re using a database that supports pivoting and unpivoting but it’s good to know how to do it yourself.

12 Pivoting

The most basic way to pivot is to use a CASE statement for each column you want to pivot.


select 
  ts_id,
  sum(case when ts_year = 2020 then ts_value end) as ts_2020,
  sum(case when ts_year = 2021 then ts_value end) as ts_2021
from yearly_values_long
group by ts_id

13 Advanced Pivoting

Why I like pivoting in SQL is that I can create arbitrary case when statements to control how my data is pivoted:


select 
  ts_id,
  sum(case when ts_year = 2020 and ts_value > 0.5 then ts_value end) as ts_2020,
  sum(case when ts_year = 2021 and ts_value > 0.5 then ts_value end) as ts_2021
from yearly_values_long
group by ts_id

14 Unpivoting

I wish you don’t ever need to unpivot manually in SQL. A universal way to unpivot in SQL is to take each column of interest and do a UNION ALL.


select 
  ts_id, 
  '2020' as ts_year, 
  [2020] as ts_value
from yearly_values_wide
union all
select 
  ts_id, 
  '2021' as ts_year, 
  [2021] as ts_value
from yearly_values_wide

15 References

https://sqlperformance.com/2019/09/t-sql-queries/t-sql-pitfalls-pivoting-unpivoting

16 Pivoting and Unpivoting

Pivoting refers to a transformation where you take two columns: one containing values that will represent the names of your columns and another containing the values that you want those columns to contain. In a sense, we’re making our data wider (dplyr’s R refers to pivoting and unpivoting as pivot_wider and pivot_longer, respectively). Accordingly, unpivoting collects your column names into one column and the contents of those columns into another.

Chances are that your tooling allows you to perform these operations. Unless, of course, you are an unlucky fellow or you prefer to roll your own code.

16.1 The Data

The data we’re using looks like this:

select *
from sport_interests 
limit 5
5 records
Name Activity Number
Charlie Dancing 65.35
Charlie Basketball 38.10
Jack Basketball 36.19
Bob Hiking 53.84
Frank Running 74.29

16.2 Pivoting

Pivoting in SQL looks like this:

select 
  name,
  sum(case when Activity = 'Running' then Number end) as running,
  sum(case when Activity = 'Tennis' then Number end) as tennis
from sport_interests
group by name
Displaying records 1 - 10
Name running tennis
David NA NA
Jack NA NA
Isabel NA 60.97
Alice NA NA
Bob NA NA
Emma 29.54 86.50
Henry 96.27 NA
Charlie NA NA
Frank 101.59 NA
Grace NA NA

If you need to pivot into a lot of columns, you are going to have a bad time. First, you need to specify each column that is not being pivoted into the select statement as well as the group by clause. Second, you need to manually type out each column that appears in the pivot.

Nonetheless, I love pivoting in SQL because you can individually define how each column is pivoted. Maybe you need to average one column, but sum another? Maybe you want to add two conditional statements when you are pivoting Activity = 'Running'? I’ve done my fair share of intricate SQL pivots that would have taken me much more time in other languages.

16.3 Unpivoting

Unpivoting is a little tricky in SQL. Essentially, we’ll be creating multiple select statements for each column we would like to unpivot, with each select statement containing a different column:

select 
  name,
  'Running' as Activity,
  running as Number
from sport_interests_pivot

UNION ALL

select 
  name,
  'Tennis' as Activity,
  tennis as Number
from sport_interests_pivot
Displaying records 1 - 10
Name Activity Number
Charlie Running NA
Jack Running NA
Henry Running 96.27
David Running NA
Frank Running 101.59
Grace Running NA
Bob Running NA
Emma Running 29.54
Isabel Running NA
Alice Running NA

I don’t love unpivoting in SQL. When pivoting, a single line of code translates into a column, whereas a single column translates into a whole separate SQL statement. Please use some framework or an engine that supports unpivoting through a function (e.g. Snowflake’s UNPIVOT or dbt’s unpivot macro).