# Building a Pivot Table in SQLite

Suppose we have a `sales` table with product incomes for the years 2020-2023:

``````┌─────────┬──────┬────────┐
│ product │ year │ income │
├─────────┼──────┼────────┤
│ alpha   │ 2020 │ 100    │
│ alpha   │ 2021 │ 120    │
│ alpha   │ 2022 │ 130    │
│ alpha   │ 2023 │ 140    │
│ beta    │ 2020 │ 10     │
│ beta    │ 2021 │ 20     │
│ beta    │ 2022 │ 40     │
│ beta    │ 2023 │ 80     │
│ gamma   │ 2020 │ 80     │
│ gamma   │ 2021 │ 75     │
│ gamma   │ 2022 │ 78     │
│ gamma   │ 2023 │ 80     │
└─────────┴──────┴────────┘
``````

And we want to transform it into a so-called pivot table, where products serve as rows and years serve as columns:

``````┌─────────┬──────┬──────┬──────┬──────┐
│ product │ 2020 │ 2021 │ 2022 │ 2023 │
├─────────┼──────┼──────┼──────┼──────┤
│ alpha   │ 100  │ 120  │ 130  │ 140  │
│ beta    │ 10   │ 20   │ 40   │ 80   │
│ gamma   │ 80   │ 75   │ 78   │ 80   │
└─────────┴──────┴──────┴──────┴──────┘
``````

Some DBMS, like SQL Server, have a custom `pivot` operator to do this. SQLite does not. Still, there are multiple ways to solve the problem. Let’s examine them.

## 1. Filtered totals

Let’s manually extract each year in a separate column and calculate a filtered total income for that year:

``````select
product,
sum(income) filter (where year = 2020) as "2020",
sum(income) filter (where year = 2021) as "2021",
sum(income) filter (where year = 2022) as "2022",
sum(income) filter (where year = 2023) as "2023"
from sales
group by product
order by product;
``````

Here is our pivot table:

``````┌─────────┬──────┬──────┬──────┬──────┐
│ product │ 2020 │ 2021 │ 2022 │ 2023 │
├─────────┼──────┼──────┼──────┼──────┤
│ alpha   │ 100  │ 120  │ 130  │ 140  │
│ beta    │ 10   │ 20   │ 40   │ 80   │
│ gamma   │ 80   │ 75   │ 78   │ 80   │
└─────────┴──────┴──────┴──────┴──────┘
``````

This universal method works in every DBMS, not only SQLite. Even if your DB engine does not support `filter`, you can always resort to using `case`:

``````select
product,
sum(case when year = 2020 then income end) as "2020",
sum(case when year = 2021 then income end) as "2021",
sum(case when year = 2022 then income end) as "2022",
sum(case when year = 2023 then income end) as "2023"
from sales
group by product
order by product;
``````
explain

Human-verified explanation

This SQL query shows the total income of each product for each year.

The first line of the query selects the `product` column from the table.

The next four lines use the `sum` function to calculate the total `income` for each year (2020, 2021, 2022, and 2023) using a `case` statement to only include income values where the `year` matches the specified year. The `as` keyword is used to give each calculated sum a column alias that corresponds to the year.

The `from` clause specifies that the data is being selected from the `sales` table.

The `group by` clause groups the data by `product`, which means that the query will return one row for each unique `product` value in the `sales` table.

Finally, the `order by` clause orders the results by `product` in ascending order.

Using `filter` is probably the easiest way when we know the columns in advance. But what if we don’t?

## 2. Dynamic SQL

Let’s build our query dynamically without hardcoding year values:

1. Extract distinct year values.
2. Generate a `...filter (where year = X) as "X"` query line for each year.
``````with years as (
select distinct year as year
from sales
),
lines as (
select 'select product ' as part
union all
select ', sum(income) filter (where year = ' || year || ') as "' || year || '" '
from years
union all
select 'from sales group by product order by product;'
)
select group_concat(part, '')
from lines;
``````

This query returns the same SQL we wrote manually at the previous step (minus formatting):

``````select product , sum(income) filter (where year = 2020) as "2020" , sum(income) filter (where year = 2021) as "2021" , sum(income) filter (where year = 2022) as "2022" , sum(income) filter (where year = 2023) as "2023" from sales group by product order by product;
``````

Now we have to execute it. For that, let’s use the `eval(sql)` function available as part of the `define` extension:

``````select load_extension('./define');

with years as (
select distinct year as year
from sales
),
lines as (
select 'drop view if exists v_sales; ' as part
union all
select 'create view v_sales as '
union all
select 'select product '
union all
select ', sum(income) filter (where year = ' || year || ') as "' || year || '" '
from years
union all
select 'from sales group by product order by product;'
)
select eval(group_concat(part, ''))
from lines;
``````
explain

Human-verified explanation

This SQL query creates a view named `v_sales` that shows the total income of each product for each year. The query uses a common table expression (CTE) to generate a list of distinct years from the `sales` table. Then, it defines another CTE named `lines` that contains a series of unioned select statements to generate the SQL commands to create the view.

The first select statement in `lines` generates a `drop view if exists` command to ensure that any previous version of the view is removed before creating the new one.

The second select statement generates a `create view` command to create the `v_sales` view.

The third select statement generates a `select` command to select the product column from the `sales` table.

The fourth select statement generates a `sum` command that sums the income for each product for a given year. It uses the `filter` clause to filter the sum by year. The year value is passed in from the `years` CTE using string concatenation.

The fifth select statement specifies the `from` and `group by` clauses for the SQL statement. It groups the results by product and orders them by product.

Finally, the last select statement concatenates all the SQL commands generated in the `lines` CTE using the `group_concat` function and evaluates the resulting SQL string using the `eval` function. The resulting SQL commands create the `v_sales` view with the desired structure.

Here, we are building a `v_sales` view which executes the query we’ve constructed previously. Let’s select the data from it:

``````select * from v_sales;
``````
``````┌─────────┬──────┬──────┬──────┬──────┐
│ product │ 2020 │ 2021 │ 2022 │ 2023 │
├─────────┼──────┼──────┼──────┼──────┤
│ alpha   │ 100  │ 120  │ 130  │ 140  │
│ beta    │ 10   │ 20   │ 40   │ 80   │
│ gamma   │ 80   │ 75   │ 78   │ 80   │
└─────────┴──────┴──────┴──────┴──────┘
``````

Note: extensions do not work in the playground, so you’ll have to use your local SQLite to reproduce this step.

Nice!

## 3. Pivot extension

If dynamic SQL seems too much for you, there is a more straightforward solution — the `pivotvtab` extension.

With it, we only have to provide three selects to build a pivot:

1. Select row values.
2. Select column values.
3. Select cell values.
``````select load_extension('./pivotvtab');

create virtual table v_sales using pivot_vtab (
-- rows
(select distinct product from sales),
-- columns
(select distinct year, year from sales),
-- cells
(select sum(income) from sales where product = ?1 and year = ?2)
);
``````
explain

Human-verified explanation

This SQL query creates a virtual table called `v_sales` using the `pivot_vtab` module from the `pivotvtab` extension. A virtual table is a special type of table that does not store data on disk, but rather generates it on the fly based on a query.

The `pivot_vtab` command takes three arguments:

1. A subquery that specifies the rows of the virtual table. In this case, the subquery selects all distinct values from the `product` column of the `sales` table.
2. A subquery that specifies the columns of the virtual table. In this case, the subquery selects all distinct values from the `year` column of the `sales` table, and duplicates them. This is because the `pivot_vtab` module expects two values for each column, one for the column value and one for the column name. By duplicating the `year` value, we are effectively using it as both the value and name for the column.
3. A subquery that specifies the values of the virtual table. In this case, the subquery selects the sum of the `income` column from the `sales` table for a specific `product` and `year` combination. The `?1` and `?2` placeholders in the subquery represent parameters that will be filled in at runtime with the actual values for `product` and `year`.

Overall, this query creates a virtual table that pivots the data from the `sales` table, with the `product` values as rows, the `year` values as columns, and the sum of `income` as the cell values.

The extension does the rest:

``````select * from v_sales;
``````
``````┌─────────┬──────┬──────┬──────┬──────┐
│ product │ 2020 │ 2021 │ 2022 │ 2023 │
├─────────┼──────┼──────┼──────┼──────┤
│ alpha   │ 100  │ 120  │ 130  │ 140  │
│ beta    │ 10   │ 20   │ 40   │ 80   │
│ gamma   │ 80   │ 75   │ 78   │ 80   │
└─────────┴──────┴──────┴──────┴──────┘
``````

That’s even easier than the `pivot` operator in SQL Server!

## Summary

There are three ways to build a pivot table in SQLite:

1. Using plain SQL and `sum()` + `filter` expressions.
2. Building and executing a dynamic query with the `eval()` function.
3. Utilizing the `pivotvtab` extension.

P.S. Interested in using SQL for data analytics? Check out my book — SQL Window Functions Explained