17 changes: 17 additions & 0 deletions docs/sphinx/source/reference/Functions/aggregate_functions.rst
@@ -2,11 +2,28 @@
Aggregate Functions
===================

.. _aggregate_functions:

Aggregate functions compute a single result value from a set of zero or more input rows.

List of functions (by sub-category)
###################################

Standard SQL Aggregates
------------------------

.. toctree::
    :maxdepth: 1

    aggregate_functions/count
    aggregate_functions/sum
    aggregate_functions/avg
    aggregate_functions/min
    aggregate_functions/max

Specialized Aggregates
----------------------

.. toctree::
    :maxdepth: 1

@@ -0,0 +1,6 @@
Diagram(
    Terminal('AVG'),
    Terminal('('),
    NonTerminal('expression'),
    Terminal(')'),
)
116 changes: 116 additions & 0 deletions docs/sphinx/source/reference/Functions/aggregate_functions/avg.rst
@@ -0,0 +1,116 @@
===
AVG
===

.. _avg:

Computes the average (arithmetic mean) of all non-NULL values in a group.

Syntax
======

.. raw:: html
    :file: avg.diagram.svg

Parameters
==========

``AVG(expression)``
    Calculates the average of the non-NULL values of ``expression`` in the group; NULL values are ignored.

Returns
=======

Returns a ``DOUBLE`` representing the average of all non-NULL values. If all values are NULL or the input set is empty, returns NULL.

Examples
========

Setup
-----

For these examples, assume we have a ``sales`` table:

.. code-block:: sql

    CREATE TABLE sales(
        id BIGINT,
        product STRING,
        region STRING,
        amount BIGINT,
        PRIMARY KEY(id))

    INSERT INTO sales VALUES
        (1, 'Widget', 'North', 100),
        (2, 'Widget', 'South', 150),
        (3, 'Gadget', 'North', 200),
        (4, 'Gadget', 'South', NULL),
        (5, 'Widget', 'North', 120)

AVG - Average All Values
-------------------------

Calculate the average of all amounts in the table:

.. code-block:: sql

    SELECT AVG(amount) AS average_amount FROM sales

.. list-table::
    :header-rows: 1

    * - :sql:`average_amount`
    * - :json:`142.5`

Notice that the NULL ``amount`` in row 4 is ignored, so the average is taken over the four remaining values: (100 + 150 + 200 + 120) / 4 = 570 / 4 = 142.5.
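
As a cross-check on that arithmetic, the sketch below places ``SUM`` and ``COUNT`` next to ``AVG``. It assumes that several aggregates may appear in one select list, which this page does not confirm; the column aliases are illustrative only.

.. code-block:: sql

    -- Assumption: multiple aggregates are allowed in a single SELECT.
    -- All three aggregates skip the NULL amount in row 4.
    SELECT SUM(amount) AS total_amount,
           COUNT(amount) AS non_null_count,
           AVG(amount) AS average_amount
    FROM sales

With the sample data, this would be expected to report a total of 570 over 4 non-NULL values, consistent with the average of 142.5.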

AVG with GROUP BY
------------------

Calculate average amounts per product:

.. code-block:: sql

    SELECT product, AVG(amount) AS average_amount
    FROM sales
    GROUP BY product

.. list-table::
    :header-rows: 1

    * - :sql:`product`
      - :sql:`average_amount`
    * - :json:`"Widget"`
      - :json:`123.33333333333333`
    * - :json:`"Gadget"`
      - :json:`200.0`

Calculate average amounts per region:

.. code-block:: sql

    SELECT region, AVG(amount) AS average_amount
    FROM sales
    GROUP BY region

.. list-table::
    :header-rows: 1

    * - :sql:`region`
      - :sql:`average_amount`
    * - :json:`"North"`
      - :json:`140.0`
    * - :json:`"South"`
      - :json:`150.0`

The South region average only includes the non-NULL value (150), ignoring the NULL from the Gadget sale.
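
A group whose values are all NULL can be sketched by restricting the same data to the South region before grouping by product. This assumes a ``WHERE`` clause can be combined with ``GROUP BY`` and ``AVG`` in this way:

.. code-block:: sql

    -- Only rows 2 (Widget, 150) and 4 (Gadget, NULL) are in the South region.
    -- The Gadget group therefore has no non-NULL amount to average.
    SELECT product, AVG(amount) AS average_amount
    FROM sales
    WHERE region = 'South'
    GROUP BY product

Under the rules described on this page, Widget would average to 150.0, while Gadget would yield NULL rather than ``0``.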

Important Notes
===============

* ``AVG`` ignores NULL values in the aggregation
* When used without ``GROUP BY``, ``AVG`` returns a single value for the entire table
* When used with ``GROUP BY``, ``AVG`` returns one value per group
* If all values in a group are NULL, ``AVG`` returns NULL for that group
* The return type is ``DOUBLE`` (even if the input is ``BIGINT``)
* **Index Requirement**: For optimal performance, queries with GROUP BY require an appropriate index. See :ref:`Indexes <index_definition>` for details on creating indexes that support GROUP BY operations.
@@ -0,0 +1,9 @@
Diagram(
    Terminal('COUNT'),
    Terminal('('),
    Choice(0,
        Terminal('*'),
        NonTerminal('expression'),
    ),
    Terminal(')'),
)
136 changes: 136 additions & 0 deletions docs/sphinx/source/reference/Functions/aggregate_functions/count.rst
@@ -0,0 +1,136 @@
=====
COUNT
=====

.. _count:

Counts the number of rows or non-NULL values in a group.

Syntax
======

.. raw:: html
    :file: count.diagram.svg

Parameters
==========

The function accepts two forms:

``COUNT(*)``
    Counts all rows in the group, including rows with NULL values.

``COUNT(expression)``
    Counts only the rows where ``expression`` is not NULL.

Returns
=======

Returns a ``BIGINT`` representing the count of rows or non-NULL values. If the input set is empty, returns ``0``.

Examples
========

Setup
-----

For these examples, assume we have a ``sales`` table:

.. code-block:: sql

    CREATE TABLE sales(
        id BIGINT,
        product STRING,
        region STRING,
        amount BIGINT,
        PRIMARY KEY(id))

    INSERT INTO sales VALUES
        (1, 'Widget', 'North', 100),
        (2, 'Widget', 'South', 150),
        (3, 'Gadget', 'North', 200),
        (4, 'Gadget', 'South', NULL),
        (5, 'Widget', 'North', 120)

COUNT(*) - Count All Rows
--------------------------

Count all rows in the table:

.. code-block:: sql

    SELECT COUNT(*) AS total_sales FROM sales

.. list-table::
    :header-rows: 1

    * - :sql:`total_sales`
    * - :json:`5`

COUNT(column) - Count Non-NULL Values
--------------------------------------

Count only non-NULL amounts:

.. code-block:: sql

    SELECT COUNT(amount) AS sales_with_amount FROM sales

.. list-table::
    :header-rows: 1

    * - :sql:`sales_with_amount`
    * - :json:`4`

Notice that the count is 4, not 5, because the fourth row has a NULL ``amount``.
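
The two forms can also be compared in a single query. This is a sketch that assumes both aggregates may share one select list; the aliases are illustrative.

.. code-block:: sql

    -- COUNT(*) counts every row; COUNT(amount) skips row 4, whose amount is NULL.
    SELECT COUNT(*) AS total_sales,
           COUNT(amount) AS sales_with_amount
    FROM sales

With the sample data, ``total_sales`` would be 5 and ``sales_with_amount`` would be 4, so the difference between the two columns is the number of rows with a NULL ``amount``.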

COUNT with GROUP BY
-------------------

Count sales per product:

.. code-block:: sql

    SELECT product, COUNT(*) AS sales_count
    FROM sales
    GROUP BY product

.. list-table::
    :header-rows: 1

    * - :sql:`product`
      - :sql:`sales_count`
    * - :json:`"Widget"`
      - :json:`3`
    * - :json:`"Gadget"`
      - :json:`2`

Count non-NULL amounts per region:

.. code-block:: sql

    SELECT region, COUNT(amount) AS non_null_amounts
    FROM sales
    GROUP BY region

.. list-table::
    :header-rows: 1

    * - :sql:`region`
      - :sql:`non_null_amounts`
    * - :json:`"North"`
      - :json:`3`
    * - :json:`"South"`
      - :json:`1`

The South region has 2 sales, but only 1 has a non-NULL amount.
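
To see both numbers for each region at once, the following sketch selects ``COUNT(*)`` and ``COUNT(amount)`` together. It assumes that multiple aggregates can be used in one grouped query, which this page does not confirm:

.. code-block:: sql

    -- Per region: all rows versus rows with a non-NULL amount.
    SELECT region,
           COUNT(*) AS sales_count,
           COUNT(amount) AS non_null_amounts
    FROM sales
    GROUP BY region

With the sample data, North would show 3 rows and 3 non-NULL amounts, while South would show 2 rows and 1 non-NULL amount.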

Important Notes
===============

* ``COUNT(*)`` counts all rows, including those with NULL values in any column
* ``COUNT(column)`` counts only rows where the specified column is not NULL
* When used without ``GROUP BY``, ``COUNT`` returns a single value for the entire table
* When used with ``GROUP BY``, ``COUNT`` returns one value per group
* The return type is always ``BIGINT``
* **Index Requirement**: For optimal performance, queries with GROUP BY require an appropriate index. See :ref:`Indexes <index_definition>` for details on creating indexes that support GROUP BY operations.
@@ -0,0 +1,6 @@
Diagram(
    Terminal('MAX'),
    Terminal('('),
    NonTerminal('expression'),
    Terminal(')'),
)