Add information about aggregations on zero rows #750

Merged (7 commits) on Sep 27, 2023
86 changes: 42 additions & 44 deletions modules/ROOT/pages/functions/aggregating.adoc
@@ -3,39 +3,13 @@
[[query-functions-aggregating]]
= Aggregating functions

== Introduction
An aggregating function performs a calculation over a set of values, returning a single value.
Aggregation can be computed over all the matching paths, or it can be further divided by introducing xref:functions/aggregating.adoc#grouping-keys[grouping keys].

Aggregating functions take a set of values and calculate an aggregated value over them.
Aggregation can be computed over all the matching paths, or it can be further divided by introducing grouping keys.
Grouping keys are non-aggregating expressions that are used to group the values going into the aggregating functions.

For example, given the following query containing two return expressions, `n` and `+count(*)+`:

[source, cypher, role=test-skip]
----
RETURN n, count(*)
----

The first, `n`, is not an aggregating function, so it will be the grouping key.
The latter, `count(*)`, is an aggregating function.
The matching paths will be divided into different buckets, depending on the grouping key.
The aggregating function will then be run on these buckets, calculating an aggregate value per bucket.

The input expression of an aggregating function can contain any expression, including expressions that are not grouping keys.
However, not all expressions can be composed with aggregating functions.
The example below will throw an error since `n.x`, which is not a grouping key, is combined with the aggregating function `count(*)`.
For more information, see xref:functions/aggregating.adoc#grouping-keys[Grouping keys].

[source, cypher, role=test-skip]
----
RETURN n.x + count(*)
----

To sort the result set using aggregating functions, the aggregation must be included in the `ORDER BY` sub-clause following the `RETURN` clause.

The `DISTINCT` operator works in conjunction with aggregation.
It is used to make all values unique before running them through an aggregating function.
More information about `DISTINCT` can be found in xref::syntax/operators.adoc#query-operators-aggregation[Syntax -> Aggregation operators].
[TIP]
====
To learn more about how Cypher handles aggregations performed on zero rows, refer to link:https://neo4j.com/developer/kb/understanding-aggregations-on-zero-rows//[Neo4j Knowledge Base -> Understanding aggregations on zero rows].
====
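
As a rough illustration of the zero-row case, the two sketches below assume a hypothetical label `NoSuchLabel` that matches no nodes.
Without a grouping key, the aggregation still returns a single row, in which `count(n)` is `0` and `collect(n.name)` is an empty list:

[source, cypher, role=test-skip]
----
// Assumption: no node in the graph carries the hypothetical label NoSuchLabel.
// There is no grouping key, so one row is returned even though nothing matched.
MATCH (n:NoSuchLabel)
RETURN count(n) AS total, collect(n.name) AS names
----

With a grouping key, zero matched rows means that no buckets are formed, so no rows are returned:

[source, cypher, role=test-skip]
----
// Assumption: same hypothetical label as above.
// n.name acts as the grouping key; with nothing to group, the result is empty.
MATCH (n:NoSuchLabel)
RETURN n.name AS name, count(n) AS total
----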

== Example graph

@@ -1061,29 +1035,41 @@ The sum of the two supplied Durations is returned:


[[grouping-keys]]
== Grouping keys
== Aggregating expressions and grouping keys

Aggregating expressions are expressions which contain one or more aggregating functions.
*Aggregating expressions* are expressions which contain one or more aggregating functions.
A simple aggregating expression consists of a single aggregating function.
For instance, `sum(x.a)` is an aggregating expression that only consists of the aggregating function `sum( )` with `x.a` as its argument.
Aggregating expressions are also allowed to be more complex, where the results of one or more aggregating functions are input arguments to other expressions.
For instance, `0.1 * (sum(x.a) / count(x.b))` is an aggregating expression that contains two aggregating functions, `sum( )` with `x.a` as its argument and `count( )` with `x.b` as its argument.
Both are input arguments to the division expression.

*Grouping keys* are non-aggregating expressions that are used to group the values going into the aggregating functions.
For example, given the following query containing two return expressions, `n` and `+count(*)+`:

[source, cypher, role=test-skip]
----
RETURN n, count(*)
----

For aggregating expressions to be correctly computable for the buckets formed by the grouping key(s), they have to fulfill some requirements.
Specifically, each sub-expression in an aggregating expression has to be one of the following:
The first, `n`, is not an aggregating function, so it will be the grouping key.
The latter, `count(*)`, is an aggregating function.
The matching paths will be divided into different buckets, depending on the grouping key.
The aggregating function will then be run on these buckets, calculating an aggregate value per bucket.

* an aggregating function, e.g. `sum(x.a)`,
* a constant, e.g. `0.1`,
* a parameter, e.g. `$param`,
* a grouping key, e.g. the `a` in `RETURN a, count(*)`
* a local variable, e.g. the `x` in `count(*) + size([ x IN range(1, 10) | x ])`, or
* a sub-expression, all operands of which have to be allowed in an aggregating expression.
The input expression of an aggregating function can contain any expression, including expressions that are not grouping keys.
However, not all expressions can be composed with aggregating functions.
The example below will throw an error since `n.x`, which is not a grouping key, is combined with the aggregating function `count(*)`.

[source, cypher, role=test-skip]
----
RETURN n.x + count(*)
----
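
One way to avoid this error, sketched here under the assumption that the intent is to add each bucket's row count to its `n.x` value, is to project the grouping key and the aggregate with `WITH` first and combine them afterwards.
The aliases `x` and `rowCount` are illustrative only.

[source, cypher, role=test-skip]
----
// x becomes the grouping key and rowCount the aggregate for each bucket.
MATCH (n)
WITH n.x AS x, count(*) AS rowCount
// Both operands are now plain variables, so they can be combined freely.
RETURN x + rowCount
----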

To sort the result set using aggregating functions, the aggregation must be included in the `ORDER BY` sub-clause following the `RETURN` clause.
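
A minimal sketch of sorting by an aggregate, assuming hypothetical `Person` nodes with a `name` property:

[source, cypher, role=test-skip]
----
// The aggregation is returned under an alias, which ORDER BY then refers to.
MATCH (p:Person)
RETURN p.name AS name, count(*) AS occurrences
ORDER BY occurrences DESC
----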

[[grouping-key-examples]]
=== Examples of aggregating expressions
=== Examples

.Simple aggregation without any grouping keys
======
@@ -1222,4 +1208,16 @@ RETURN groupingKey, groupingKey - max(f.age)
| +116+ | +45+
2+d|Rows: 1
|===
======
======

=== Rules for aggregating expressions

For aggregating expressions to be correctly computable for the buckets formed by the grouping key(s), they have to fulfill some requirements.
Specifically, each sub-expression in an aggregating expression has to be one of the following:

* an aggregating function, e.g. `sum(x.a)`.
* a constant, e.g. `0.1`.
* a parameter, e.g. `$param`.
* a grouping key, e.g. the `a` in `RETURN a, count(*)`.
* a local variable, e.g. the `x` in `count(*) + size([ x IN range(1, 10) | x ])`.
* a sub-expression, all operands of which have to be allowed in an aggregating expression.
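
The rules above can be sketched with a single query.
It assumes hypothetical `Item` nodes with a numeric property `a` and a property `kind`, plus a numeric parameter `$offset`; none of these names come from the example graph.

[source, cypher, role=test-skip]
----
// kind is projected first so that it can serve as the grouping key.
MATCH (x:Item)
WITH x.kind AS kind, x
// The aggregating expression combines a constant (0.1), an aggregating
// function (sum), a parameter ($offset), and a list comprehension whose
// local variable i is only visible inside the comprehension.
RETURN kind,
       0.1 * sum(x.a) + $offset + size([ i IN range(1, 10) | i ]) AS adjusted
----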