Saturday, March 23, 2024

How to calculate Running Total and Running Average in SQL (with PARTITION BY Example)

Hello guys, calculating running totals and running averages is a common requirement in data analysis and reporting but I have interviewed many people who have no clue on how to do this. If you have also seen this question on interview and struggle to answer then you have come to the right place. SQL provides powerful features like window functions,which can be used to compute these running aggregates or running totals. In this article, we'll explore the concepts of running total and running average in SQL and see the real world examples and explanations to help you master these calculations but before that let's find out what is running total actually means?

What is Running Total in SQL?

A running total, also known as a cumulative sum, represents the sum of a column's values up to the current row. The key to calculating a running total in SQL is leveraging the window functions, specifically the SUM() function along with the OVER() clause.


Let's see an example of calculating Running Total in a Sales Table

Consider a table named Sales with columns order_date and sales_amount. The goal is to calculate the running total of sales_amount to find out how much sale we made in particular date, here is the SQL query to do that:

SELECT

order_date,

sales_amount,

SUM(sales_amount) OVER (ORDER BY order_date) AS running_total

FROM sales;

In this query:


SUM(sales_amount) calculates the sum of sales_amount.

OVER (ORDER BY order_date) defines the window for the running total, ordered by the order_date.

The result will include the order_date, sales_amount, and the running_total for each row as shown in below output:





How to calculate Running Average in SQL? Example

Now, let's see how to calculate running average in SQL. Similar to running totals, running averages provide insights into the evolving average of a column's values as you progress through the rows. The AVG() function, combined with the OVER() clause, is used to compute running averages. If you notice the only difference is we are using AVG() function instead of SUM() function which we used to calculate running total. 

Now, let's see a real world example of computing running average in SQL


Example: Running Average in a Product Ratings Table
Let's Assume a table named product_ratings with columns date and rating. The objective is to calculate the running average of product ratings. 

Here is the SQL query to do that:


SELECT

date,

rating,

AVG(rating) OVER (ORDER BY date) AS running_average

FROM product_ratings;

In this query:


AVG(rating) computes the average of ratings.

OVER (ORDER BY date) specifies the window for the running average, ordered by the date.

The result will display the date, rating, and the running_average for each row,  as shown below:


How to calculate Running Total and Running Average in SQL (with PARTITION BY Example)




Handling Partitions

In both running total and running average calculations, you may encounter scenarios where you want to reset the computation for each group or partition. The PARTITION BY clause allows you to achieve this by specifying the column(s) that define the partitions.

Here is an example to handle partitions:


Example: Running Total within Categories
Suppose you have a table expenses with columns category and amount. To calculate the running total within each category, use the PARTITION BY clause.

Here is the exact SQL query you can use:

SELECT

category,

amount,

SUM(amount) OVER (PARTITION BY category ORDER BY date) AS running_total_category

FROM expenses;


In this query:

PARTITION BY category resets the running total for each category.

ORDER BY date maintains the order within each category.

Here is how the output will look like:




That's all about how to compute running total and running average in SQL. To be honest, calculating running totals and running averages in SQL becomes a straightforward task with the power of window functions. The SUM() and AVG() functions, coupled with the OVER() clause, enable you to perform these calculations efficiently. 

Understanding the syntax and usage of these window functions is crucial for any SQL developer or data analyst working with time-series or sequential data. Practice with real-world scenarios and datasets will enhance your proficiency in leveraging running totals and running averages to gain valuable insights from your data.

If you like this article, you may also like my other SQL tutorials


Thank you for reading this article till the end, all the best with your SQL journey, if you have any questions feel free to ask in comments. 


No comments:

Post a Comment