Understanding the 'HAVING' Clause
The HAVING
clause in SQL is used to filter the results of an aggregate function, such as SUM
, AVG
, COUNT
, MIN
, or MAX
. It allows you to apply additional conditions to the grouped data, which can be useful in various data analysis scenarios.
The basic syntax for using the HAVING
clause is as follows:
SELECT column1, column2, ... , aggregateFunction(column)
FROM table
GROUP BY column1, column2, ...
HAVING condition;
The HAVING
clause is typically used in conjunction with the GROUP BY
clause, which groups the data based on one or more columns. The HAVING
clause then filters the grouped data based on the specified condition.
Here's an example to illustrate the usage of the HAVING
clause:
SELECT department, COUNT(*) as num_employees
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;
In this example, the HAVING
clause is used to filter the results to only include departments with more than 10 employees.
graph LR
A[SELECT department, COUNT(*) as num_employees] --> B[FROM employees]
B --> C[GROUP BY department]
C --> D[HAVING COUNT(*) > 10]
D --> E[Result: Departments with more than 10 employees]
The HAVING
clause can be used with various aggregate functions and can also be combined with other SQL clauses, such as WHERE
, ORDER BY
, and LIMIT
, to further refine the query results.
Table 2: Comparison of WHERE
and HAVING
clauses
Clause |
Purpose |
Applies to |
WHERE |
Filters individual rows before grouping |
Individual rows |
HAVING |
Filters grouped rows after grouping |
Grouped rows |
In the next section, we will explore how to execute HAVING
queries in the context of Hadoop.