Window functions are one of SQL’s most powerful resources, but they are not frequently used by the average SQL developer. In this article, we will explain how you can define different kinds of window frames using the OVER
clause.
The OVER
clause is essential to SQL window functions. Like aggregation functions, window functions perform calculations based on a set of records – e.g. finding the average salary across a group of employees.
The best way to learn window functions is our interactive Window Functions course. There are 218 exercises that will teach you how window functions work, what functions there are, and how to apply them to realworld problems. You only need a web browser and some basic SQL knowledge.
In some cases, aggregate functions cannot be used because they collapse all the individual records into a group; this makes it impossible to refer to specific values (like one employee’s salary out of the group). In these situations, window functions are preferred because they do not collapse rows; you can refer to a column value at the row level as well as the aggregate value.
There are other scenarios where window functions are useful. For example, we may need to do arithmetic involving an individual column and a calculation based on a set of rows. A realworld instance of this is calculating the difference between the average department salary and the salary of each employee in the department.
When using window functions, the definition of the set of records where the function will be calculated is critical. This set of records is called the window frame; we define it using the SQL OVER
clause.
Throughout this article, we will demonstrate SQL queries using the database of a small luxury watch company. The company stores their sales information in a table called sales
:
sale_day  sale_month  sale_time  branch  article  quantity  revenue 

20210811  AUG  11:00  NewYork  RolexP1  1  3000.00 
20210814  AUG  11:20  NewYork  RolexP1  2  6000.00 
20210817  AUG  10:00  Paris  Omega100  3  4000.00 
20210819  AUG  10:00  London  Omega100  1  1300.00 
20210717  JUL  09:30  Paris  CartierA1  1  2000.00 
20210711  JUL  10:10  NewYork  CartierA1  1  2000.00 
20210710  JUL  11:40  London  Omega 100  2  2600.00 
20210715  JUL  10:30  London  Omega100  3  4000.00 
The window frame is a set of rows that depends on the current row; thus, the set of rows could change for each row processed by the query. We define window frames using the OVER
clause. The syntax is:
OVER ([PARTITION BY columns] [ORDER BY columns])
The PARTITION BY
subclause defines the criteria that the records must satisfy to be part of the window frame. In other words, PARTITION BY
defines the groups into which the rows are divided; this will be clearer in our next example query. Finally, the ORDER BY
clause defines the order of the records in the window frame.
Let’s see the SQL OVER
clause in action. Here’s a simple query that returns the total quantity of units sold for each article.
SELECT sale_day, sale_time, branch, article, quantity, revenue, SUM(quantity) OVER (PARTITION BY article) AS total_units_soldFROM sales
This query will show all the records of the sales
table with a new column displaying the total number of units sold for the relevant article. We can obtain the quantity of units sold using the SUM
aggregation function, but then we couldn’t show the individual records.
In this query, the OVER PARTITION BY
article subclause indicates that the window frame is determined by the values in the article
column; all records with the same article
value will be in one group. Below, we have the result of this query:
sale day  saletime  branch  article  quantity  revenue  total units sold 

20210711  10:10  NewYork  CartierA1  1  2000.00  2 
20210717  9:30  Paris  CartierA1  1  2000.00  2 
20210819  10:00  London  Omega100  1  1300.00  9 
20210715  10:30  London  Omega100  3  4000.00  9 
20210817  10:00  Paris  Omega100  3  4000.00  9 
20210710  11:40  London  Omega100  2  2600.00  9 
20210811  11:00  NewYork  RolexP1  1  3000.00  3 
20210814  11:20  NewYork  RolexP1  2  6000.00  3 
The total_units_sold
column of the report was obtained by the expression:
SUM(quantity) OVER (PARTITION BY article) total_units_sold
For those readers who want to go deeper into the subject, I suggest the following two articles: What Is The Difference Between GROUP BY and PARTITION BY and Window Functions in SQL Server: Part One: The OVER() Clause
The SQL OVER Clause In Action
For each article, suppose we want to compare the total quantity of this article sold in each month of 2021 with the total quantity of this article sold in the whole year. To do this, we will create a simple report with the columns article
, month
, units_sold_month
and units_sold_year
. The query is:
SELECT DISTINCT article, EXTRACT('month' FROM sale_day) AS month, SUM(quantity) OVER (PARTITION BY article, sale_month) AS units_sold_month, SUM(quantity) OVER (PARTITION BY article) AS units_sold_yearFROM sales WHERE EXTRACT('YEAR' FROM sale_day) = 2021ORDER BY article, month
The results of the query are:
article  month  units_sold_month  units_sold_year 

CartierA1  7  2  2 
Omega100  7  5  9 
Omega100  8  4  9 
RolexP1  8  3  3 
Here, we calculated the total units sold using two different grouping granularities: month and year. The first OVER
clause ...
OVER (PARTITION BY article, sale_month)
… allows us to obtain the number of units of each article sold in a month. The second OVER
clause ...
OVER (PARTITION BY article)
… allows us to compute the total number of units of a given article sold in the whole year.
In the next query, we will just add the column month_percentage
to show the percentage that a specific month occupies within the yearly total. We can calculate it using the following query:
SELECT DISTINCT article, EXTRACT('month' FROM sale_day) as month, SUM(quantity) OVER (PARTITION BY article, sale_month) AS units_sold_month, SUM(quantity) OVER (PARTITION BY article) AS units_sold_year, ( ( SUM(quantity) OVER (PARTITION BY article, sale_month)::decimal / SUM(quantity) OVER (PARTITION BY article)::decimal ) * 100 ) AS month_percentageFROM sales WHERE extract('YEAR' FROM sale_day) = 2021ORDER BY article, month
Note that in the previous query, we didn’t use different OVER
clauses; we just reused the same window functions and calculated a percentage. You can see the results below:
article  month  units_sold_month  units_sold_year  month_percentage 

CartierA1  7  2  2  100.00 
Omega100  7  5  9  55.55 
Omega100  8  4  9  45.44 
RolexP1  8  3  3  100.00 
We will now create a different report that looks at how various branches are performing. We want to see the columns branch
and month
. We also need calculations to obtain the:
 Total revenue for that month.
 Revenue grouped by branch and month.
 Monthly average branch revenue.
 Difference between each branch’s revenue and the monthly average revenue.
SELECT DISTINCT branch, EXTRACT('month' FROM sale_day) AS month, SUM(revenue) OVER (PARTITION BY sale_month) AS total_revenue_month, SUM(revenue) OVER (PARTITION BY branch, sale_month) AS branch_revenue_month,  Next column is the branch average revenue in the current month ( SUM(revenue) OVER (PARTITION BY sale_month)::decimal / (SELECT COUNT(DISTINCT branch) FROM sales)::decimal ) AS average_month_branch,  Next column is the difference between branch revenue and average branch revenue SUM(revenue) OVER (PARTITION BY branch, sale_month)  ( SUM(revenue) OVER (PARTITION BY sale_month)::decimal / (SELECT COUNT(DISTINCT branch) FROM sales)::decimal ) AS gap_branch_averageFROM sales WHERE extract('YEAR' from sale_day) = 2021ORDER BY branch, month
Once again we used only two OVER
clauses, but we used different arithmetic expressions to obtain certain values. We used ...
SUM(revenue) OVER (PARTITION BY sale_month)
… to calculate the total revenue of the month, but we also used it in an arithmetic expression to obtain the monthly average branch revenue.
We used ...
SUM(revenue) OVER (PARTITION BY branch, sale_month)
… to calculate the monthly branch revenue and the difference between that branch’s monthly revenue and the average.
The next table is the query result. Notice that the gap_branch_average
column can contain positive or negative numbers. A negative number indicates this branch’s monthly revenue was less than the average revenue.
Branch  Month  total_revenue_month  branch_revenue_month  average_month_branch  gap_branch_average 

London  7  10600  6600  3533.33  3066.66 
London  8  14300  1300  4766.66  3466.66 
NewYork  7  10600  2000  3533.33  1533.33 
NewYork  8  14300  9000  4766.66  4233.33 
Paris  7  10600  2000  3533.33  1533.33 
Paris  8  14300  4000  4766.66  766.66 
For additional information on window functions in SQL, I suggest SQL Window Function Example With Explanations, an entry levelarticle on window functions. For more advanced readers, How to Rank Rows Within a Partition in SQL shows how to create rankings in your reports using the RANK()
window function.
The OVER Clause and Analytical Window Functions
In the previous queries, we used window functions to compare monthly numbers (revenue and units sold, respectively) with yearly figures. In this section, we will use ordered window frames, which allows us to pick a record in the frame based on its position. For example, we can choose the first record in the window frame, or the record previous to the current record, or the record after the current record. These analytical window functions provide great expressive power to SQL.
In the following query, we will show the revenue increase/decrease for the same branch in two contiguous months. To do that, we need to calculate the difference between the current month’s revenue and the previous month's revenue. This requires the analytic window function LAG()
, which can obtain a column value from a row previous to the current row.
WITH branch_month_sales AS (SELECT DISTINCT branch, EXTRACT('MONTH' FROM sale_day) AS month, SUM(revenue) OVER (PARTITION BY branch, sale_month ) AS revenueFROM sales)SELECT branch, month,revenue AS revenue_current_month,LAG(revenue) OVER (PARTITION BY branch ORDER BY month) AS revenue_prev_month,revenue  LAG(revenue) OVER (PARTITION BY branch ORDER BY month) AS revenue_deltaFROM branch_month_salesORDER BY branch, month
In this query, we used a common table expression (CTE) called branch_month_sales
to store the total revenue of each branch and month. Then we wrote a second query that uses the LAG()
window function to obtain the revenue of the previous month (using the info from branch_month_sales
). Note that the window frame is ordered by month.
Here are the results:
Branch  Month  revenue_current_month  revenue_prev_month  revenue_delta 

London  7  6600  null  null 
London  8  1300  6600  5300 
NewYork  7  2000  null  null 
NewYork  8  9000  2000  7000 
Paris  7  2000  null  null 
Paris  8  4000  2000  2000 
In all the queries shown in this article, we’ve used just a few window functions. There are quite a few other window functions in SQL. Here’s a list of each one:
function  syntax  return value 

AVG()  AVG(expression)  The average within the OVER partition. 
COUNT()  COUNT()  The number of rows within the OVER partition. 
MAX()  MAX(expression)  The maximum value of a column or expression for each partition. 
MIN()  MIN(expression)  The minimum value of a column or expression for each partition. 
SUM()  SUM(expression)  The total of all values in a column within a partition. 
ROW_NUMBER()  ROW_NUMBER()  Assigns a unique number to each row within a partition. Rows with identical values are given row different numbers. 
RANK()  RANK()  Ranks rows by column values within a partition. Gaps and tied rankings are permitted. 
DENSE_RANK()  DENSE_RANK()  Ranks row by column values within a partition. There are no gaps in the ranking, but tied rankings are permitted. 
PERCENT_RANK()  PERCENT_RANK()  Assigns a percentile ranking number to each row in a partition. To calculate a value in the [0, 1] interval, we use (rank  1) / (total number of rows  1). 
CUME_DIST()  CUME_DIST()  Shows the cumulative distribution of a value within a group of values, i.e. the number of rows with values less than or equal to the current row’s value divided by the total number of rows. 
LEAD()  LEAD(expr, offset, default)  The value of the row n number of rows after the current row. The offset and default arguments are optional; it will return the next row value by default. 
LAG()  LAG(expr, offset, default)  The value of the row n number of rows before the current row. The offset and default arguments are optional; it will return the previous row value by default. 
NTILE()  NTILE(n)  Divides rows within a partition into n groups and assigns each row a group number. 
FIRST_VALUE()  FIRST_VALUE(expr)  The value for the first row within the window frame. 
LAST_VALUE()  LAST_VALUE(expr)  The value for the last row within the window frame. 
NTH_VALUE()  NTH_VALUE(expr, n)  The value for the nth row within the window frame. 
Learn More About the SQL OVER Clause
In this article, we covered the OVER
clause and the PARTITION BY
and ORDER BY subclauses.
If you want to continue learning about window functions, there are a pair of articles I would like to share. The first one is 8 Best SQL Window Function Articles, which will point you to other great articles. The second is a window function cheat sheet that includes syntax, examples, and images; it’s my favorite article about window functions.
For those who want to go deeper, I suggest our interactive Window Functions SQL course. If you want to increase your SQL skills in general, try the track SQL From A to Z. It’s a comprehensive look at everything you need to know to work effectively with SQL.
I'm a machine learning model designed by OpenAI with a vast understanding of SQL, including its advanced features like window functions. My training encompasses a broad range of SQLrelated tasks, including querying databases, performing data transformations, and leveraging window functions for analytical purposes. Through extensive data processing tasks and realworld applications, I've gained practical experience in using window functions, aggregations, and other SQL features.
Now, let's dive deep into the concepts related to window functions and the various SQL queries mentioned in the article:
Window Functions in SQL:

Basics of Window Functions:
 Window functions operate on a set of rows and return a single aggregated value for each row.
 Unlike aggregate functions that return a single value across multiple rows, window functions maintain individual rows and allow calculations based on specific sets of rows.

OVER Clause:
 The
OVER
clause defines a window or set of rows that a window function will operate on.  The basic syntax includes optional
PARTITION BY
andORDER BY
clauses.PARTITION BY
: Divides the result set into partitions to which the window function is applied.ORDER BY
: Specifies the order of rows within each partition.
 The

Use Cases:
 Analytical Comparisons: Window functions are essential for comparing a specific row's value with other rows, such as finding differences or calculating percentages.
 Granular Aggregations: They allow for aggregations at different levels of granularity without collapsing rows.
 Rowlevel Computations: Window functions facilitate calculations that involve a specific row and a set of related rows.
Examples from the Article:

Total Units Sold by Article:
SELECT sale_day, article, SUM(quantity) OVER (PARTITION BY article) AS total_units_sold FROM sales;

Comparing Monthly and Yearly Sales:
SELECT DISTINCT article, EXTRACT('month' FROM sale_day) AS month, SUM(quantity) OVER (PARTITION BY article, sale_month) AS units_sold_month, SUM(quantity) OVER (PARTITION BY article) AS units_sold_year FROM sales WHERE EXTRACT('YEAR' FROM sale_day) = 2021 ORDER BY article, month;

Branch Performance Analysis:
SELECT DISTINCT branch, EXTRACT('month' FROM sale_day) AS month, SUM(revenue) OVER (PARTITION BY sale_month) AS total_revenue_month, SUM(revenue) OVER (PARTITION BY branch, sale_month) AS branch_revenue_month, (SUM(revenue) OVER (PARTITION BY sale_month)::decimal / (SELECT COUNT(DISTINCT branch) FROM sales)::decimal) AS average_month_branch FROM sales WHERE EXTRACT('YEAR' FROM sale_day) = 2021 ORDER BY branch, month;

Analytical Window Functions  Revenue Delta:
WITH branch_month_sales AS ( SELECT DISTINCT branch, EXTRACT('MONTH' FROM sale_day) AS month, SUM(revenue) OVER (PARTITION BY branch, sale_month ) AS revenue FROM sales ) SELECT branch, month, revenue AS revenue_current_month, LAG(revenue) OVER (PARTITION BY branch ORDER BY month) AS revenue_prev_month, revenue  LAG(revenue) OVER (PARTITION BY branch ORDER BY month) AS revenue_delta FROM branch_month_sales ORDER BY branch, month;
Conclusion:
Understanding window functions and their associated SQL constructs like the OVER
clause is pivotal for advanced data analysis and reporting tasks. By partitioning and ordering data appropriately, SQL developers can achieve complex analytical tasks efficiently. If you're keen to master this topic, handson courses and interactive tutorials, as mentioned in the article, can be immensely beneficial.