Definition of SQL Window Functions
- Purpose:
- Window functions are designed to perform calculations across a specific range of rows related to the current row in a dataset.
- Row Identity:
- Unlike aggregate functions, which group rows and return a single result for each group, window functions retain the individual row identities, allowing for detailed data analysis.
- Functionality:
- They enable complex calculations, such as:
- Running totals
- Rankings
- Moving averages
- Cumulative distributions
- They enable complex calculations, such as:
- Flexibility:
- SQL Window functions can be applied over a defined “window” of rows, which can be specified using the
OVER()clause, allowing for tailored analysis based on specific criteria.
- SQL Window functions can be applied over a defined “window” of rows, which can be specified using the
- Efficiency:
- By allowing calculations without losing the granularity of the data, SQL window functions enhance reporting capabilities and facilitate efficient data analysis.
- Use Cases:
- Analysts and developers use SQL window functions to analyze trends, generate insights, and make data-driven decisions, making them a vital tool in SQL data manipulation.
1. Sales Table
This table contains information about individual sales transactions.
| sale_id | salesperson_id | sale_amount | sale_date | region |
|---|---|---|---|---|
| 1 | 101 | 150 | 2024-01-01 | North |
| 2 | 102 | 200 | 2024-01-02 | South |
| 3 | 101 | 250 | 2024-01-03 | North |
| 4 | 103 | 300 | 2024-01-04 | East |
| 5 | 102 | 400 | 2024-01-05 | South |
2. Salesperson Table
This table contains details about salespeople.
| salesperson_id | salesperson_name | hire_date | region |
|---|---|---|---|
| 101 | Alice | 2020-05-10 | North |
| 102 | Bob | 2021-06-15 | South |
| 103 | Charlie | 2023-07-20 | East |
3. Product Table
This table contains information about products sold.
| product_id | product_name | category | price |
|---|---|---|---|
| 1 | Widget A | Gadgets | 50 |
| 2 | Widget B | Gadgets | 75 |
| 3 | Widget C | Appliances | 100 |
| 4 | Widget D | Appliances | 150 |
| 5 | Widget E | Gadgets | 200 |
SQL Window Function Queries:
1: Find the top 3 products for each category in terms of total sales within each quarter.
Answer:
SELECT
category_id,
product_id,
EXTRACT(QUARTER FROM sale_date) AS quarter,
SUM(total_amount) AS total_sales,
RANK() OVER (PARTITION BY category_id, EXTRACT(QUARTER FROM sale_date) ORDER BY SUM(total_amount) DESC) AS sales_rank
FROM
Sales
JOIN
Products ON Sales.product_id = Products.product_id
GROUP BY
category_id, product_id, EXTRACT(QUARTER FROM sale_date)
HAVING
RANK() OVER (PARTITION BY category_id, EXTRACT(QUARTER FROM sale_date) ORDER BY SUM(total_amount) DESC) <= 3;
2: Identify customers who spent more than the average spending of all customers in each month.
Answer:
SELECT
customer_id,
EXTRACT(MONTH FROM sale_date) AS month,
SUM(total_amount) AS customer_monthly_spending
FROM
Sales
GROUP BY
customer_id, EXTRACT(MONTH FROM sale_date)
HAVING
SUM(total_amount) > AVG(SUM(total_amount)) OVER (PARTITION BY EXTRACT(MONTH FROM sale_date));
3: Calculate each customer’s rank based on their cumulative spending over a rolling 6-month period.
Answer:
SELECT
customer_id,
sale_date,
SUM(total_amount) OVER (PARTITION BY customer_id ORDER BY sale_date RANGE BETWEEN INTERVAL '6 MONTHS' PRECEDING AND CURRENT ROW) AS rolling_6_month_spending,
RANK() OVER (ORDER BY SUM(total_amount) OVER (PARTITION BY customer_id ORDER BY sale_date RANGE BETWEEN INTERVAL '6 MONTHS' PRECEDING AND CURRENT ROW) DESC) AS spending_rank
FROM
Sales
ORDER BY
customer_id, sale_date;
4: Identify customers with a consistent increase in the quantity of products purchased over their last 5 transactions.
Answer:
WITH RankedSales AS (
SELECT
customer_id,
sale_date,
quantity,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY sale_date DESC) AS txn_rank
FROM
Sales
)
SELECT
customer_id
FROM
RankedSales
WHERE
txn_rank <= 5
GROUP BY
customer_id
HAVING
MIN(quantity) < ANY (SELECT quantity FROM RankedSales WHERE txn_rank = 5 AND customer_id = RankedSales.customer_id)
AND MAX(quantity) = ALL (SELECT quantity FROM RankedSales WHERE txn_rank = 1 AND customer_id = RankedSales.customer_id);
5: Determine which products had fluctuating sales (increasing one month and then decreasing the next, or vice versa) over a 4-month window.
Answer:
WITH MonthlyProductSales AS (
SELECT
product_id,
EXTRACT(MONTH FROM sale_date) AS month,
SUM(total_amount) AS monthly_sales
FROM
Sales
GROUP BY
product_id, EXTRACT(MONTH FROM sale_date)
),
SalesTrend AS (
SELECT
product_id,
month,
monthly_sales,
LAG(monthly_sales) OVER (PARTITION BY product_id ORDER BY month) AS prev_month_sales,
LEAD(monthly_sales) OVER (PARTITION BY product_id ORDER BY month) AS next_month_sales
FROM
MonthlyProductSales
)
SELECT
product_id
FROM
SalesTrend
WHERE
(monthly_sales > prev_month_sales AND monthly_sales > next_month_sales)
OR (monthly_sales < prev_month_sales AND monthly_sales < next_month_sales);
6: Identify salespersons whose average monthly sales are consistently in the top 10% over the past year.
Answer:
WITH MonthlySales AS (
SELECT
salesperson_id,
EXTRACT(MONTH FROM sale_date) AS month,
EXTRACT(YEAR FROM sale_date) AS year,
SUM(total_amount) AS monthly_sales
FROM
Sales
GROUP BY
salesperson_id, EXTRACT(MONTH FROM sale_date), EXTRACT(YEAR FROM sale_date)
),
Top10Percent AS (
SELECT
salesperson_id,
monthly_sales,
PERCENT_RANK() OVER (PARTITION BY year, month ORDER BY monthly_sales DESC) AS percentile_rank
FROM
MonthlySales
)
SELECT
DISTINCT salesperson_id
FROM
Top10Percent
WHERE
percentile_rank <= 0.1;
7: Find the running total of sales for each product, but only consider the last 7 days from each sale date.
Answer:
SELECT
product_id,
sale_date,
SUM(total_amount) OVER (PARTITION BY product_id ORDER BY sale_date RANGE BETWEEN INTERVAL '7 DAYS' PRECEDING AND CURRENT ROW) AS running_7_day_total
FROM
Sales
ORDER BY
product_id, sale_date;
8: Identify the top N customers contributing to at least 80% of total sales, where N can be dynamically chosen by the user.
Answer:
WITH CumulativeSales AS (
SELECT
customer_id,
SUM(total_amount) AS total_sales,
SUM(SUM(total_amount)) OVER (ORDER BY SUM(total_amount) DESC) AS cumulative_sales
FROM
Sales
GROUP BY
customer_id
)
SELECT
customer_id
FROM
CumulativeSales
WHERE
cumulative_sales <= (SELECT 0.8 * SUM(total_amount) FROM Sales);
9: For each salesperson, show their sales growth percentage compared to their average sales across the last 3 months.
Answer:
WITH Last3MonthsSales AS (
SELECT
salesperson_id,
sale_date,
SUM(total_amount) OVER (PARTITION BY salesperson_id ORDER BY sale_date RANGE BETWEEN INTERVAL '3 MONTHS' PRECEDING AND CURRENT ROW) AS total_sales_last_3_months,
AVG(total_amount) OVER (PARTITION BY salesperson_id ORDER BY sale_date RANGE BETWEEN INTERVAL '3 MONTHS' PRECEDING AND CURRENT ROW) AS avg_sales_last_3_months
FROM
Sales
)
SELECT
salesperson_id,
(total_sales_last_3_months - avg_sales_last_3_months) / avg_sales_last_3_months * 100 AS sales_growth_percentage
FROM
Last3MonthsSales;
10: Calculate the median sale amount for each product using window functions.
Answer:
SELECT
product_id,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_amount) OVER (PARTITION BY product_id) AS median_sale_amount
FROM
Sales
ORDER BY
product_id;
- Mastering Oracle SQL Joins: The Ultimate Guide to Data Retrieval and Performance
- Mastering Oracle Database Architecture: A Comprehensive Overview
- Mastering PostgreSQL Architecture: 6 Powerful Steps to Optimize Performance & Scalability
- Understanding Oracle Triggers – A Comprehensive Guide with Examples
- SQL Query Design Patterns: Writing Efficient Queries
1 thought on “Top 10 SQL Window Function Queries”