Add information about aggregations on zero rows (neo4j#750)

Trello: https://trello.com/c/BVFmLHB1/5295-improvement-understanding-aggregations-on-zero-rows-in-cypher

Also deletes duplicate content about grouping keys.
JPryce-Aklundh · Sep 27, 2023 · 1099570
1 parent 2c7700c
commit 1099570
Showing 1 changed file with 42 additions and 44 deletions.
diff --git a/modules/ROOT/pages/functions/aggregating.adoc b/modules/ROOT/pages/functions/aggregating.adoc
@@ -3,39 +3,13 @@
 [[query-functions-aggregating]]
 = Aggregating functions
 
-== Introduction 
+An aggregating function performs a calculation over a set of values, returning a single value.
+Aggregation can be computed over all the matching paths, or it can be further divided by introducing xref:functions/aggregating.adoc#grouping-keys[grouping keys].
 
-Aggregating functions take a set of values and calculate an aggregated value over them.
-Aggregation can be computed over all the matching paths, or it can be further divided by introducing grouping keys.
-Grouping keys are non-aggregating expressions that are used to group the values going into the aggregating functions.
-
-For example, given the following query containing two return expressions, `n` and `+count(*)+`:
-
-[source, cypher, role=test-skip]
-----
-RETURN n, count(*)
-----
-
-The first, `n` is not an aggregating function, so it will be the grouping key.
-The latter, `count(*)` is an aggregating function.
-The matching paths will be divided into different buckets, depending on the grouping key.
-The aggregating function will then be run on these buckets, calculating an aggregate value per bucket.
-
-The input expression of an aggregating function can contain any expression, including expressions that are not grouping keys.
-However, not all expressions can be composed with aggregating functions.
-The example below will throw an error since `n.x`, which is not a grouping key, is combined with the aggregating function `count(*)`.
-For more information, see xref:functions/aggregating.adoc#grouping-keys[Grouping keys].
-
-[source, cypher, role=test-skip]
-----
-RETURN n.x + count(*)
-----
-
-To sort the result set using aggregating functions, the aggregation must be included in the `ORDER BY` sub-clause following the `RETURN` clause.
-
-The `DISTINCT` operator works in conjunction with aggregation.
-It is used to make all values unique before running them through an aggregating function.
-More information about `DISTINCT` can be found in xref::syntax/operators.adoc#query-operators-aggregation[Syntax -> Aggregation operators].
+[TIP]
+====
+To learn more about how Cypher handles aggregations performed on zero rows, refer to link:https://neo4j.com/developer/kb/understanding-aggregations-on-zero-rows//[Neo4j Knowledge Base -> Understanding aggregations on zero rows].
+====
 
 == Example graph
 
@@ -1061,29 +1035,41 @@ The sum of the two supplied Durations is returned:
 
 
 [[grouping-keys]]
-== Grouping keys
+== Aggregating expressions and grouping keys
 
-Aggregating expressions are expressions which contain one or more aggregating functions.
+*Aggregating expressions* are expressions which contain one or more aggregating functions.
 A simple aggregating expression consists of a single aggregating function.
 For instance, `sum(x.a)` is an aggregating expression that only consists of the aggregating function `sum( )` with `x.a` as its argument.
 Aggregating expressions are also allowed to be more complex, where the result of one or more aggregating functions are input arguments to other expressions.
 For instance, `0.1 * (sum(x.a) / count(x.b))` is an aggregating expression that contains two aggregating functions, `sum( )` with `x.a` as its argument and `count( )` with `x.b` as its argument.
 Both are input arguments to the division expression.
 
+*Grouping keys* are non-aggregating expressions that are used to group the values going into the aggregating functions.
+For example, given the following query containing two return expressions, `n` and `+count(*)+`:
+
+[source, cypher, role=test-skip]
+----
+RETURN n, count(*)
+----
 
-For aggregating expressions to be correctly computable for the buckets formed by the grouping key(s), they have to fulfill some requirements.
-Specifically, each sub-expression in an aggregating expression has to be either:
+The first, `n` is not an aggregating function, so it will be the grouping key.
+The latter, `count(*)` is an aggregating function.
+The matching paths will be divided into different buckets, depending on the grouping key.
+The aggregating function will then be run on these buckets, calculating an aggregate value per bucket.
 
-* an aggregating function, e.g. `sum(x.a)`,
-* a constant, e.g. `0.1`,
-* a parameter, e.g. `$param`,
-* a grouping key, e.g. the `a` in `RETURN a, count(*)`
-* a local variable, e.g. the `x` in  `count(*) + size([ x IN range(1, 10) | x ])`, or
-* a sub-expression, all operands of which have to be allowed in an aggregating expression.
+The input expression of an aggregating function can contain any expression, including expressions that are not grouping keys.
+However, not all expressions can be composed with aggregating functions.
+The example below will throw an error since `n.x`, which is not a grouping key, is combined with the aggregating function `count(*)`.
 
+[source, cypher, role=test-skip]
+----
+RETURN n.x + count(*)
+----
+
+To sort the result set using aggregating functions, the aggregation must be included in the `ORDER BY` sub-clause following the `RETURN` clause.
 
 [[grouping-key-examples]]
-=== Examples of aggregating expressions
+=== Examples
 
 .Simple aggregation without any grouping keys
 ======
@@ -1222,4 +1208,16 @@ RETURN groupingKey, groupingKey - max(f.age)
 | +116+ | +45+
 2+d|Rows: 1
 |===
-======
+======
+
+=== Rules for aggregating expressions
+
+For aggregating expressions to be correctly computable for the buckets formed by the grouping key(s), they have to fulfill some requirements.
+Specifically, each sub-expression in an aggregating expression has to be either:
+
+* an aggregating function, e.g. `sum(x.a)`.
+* a constant, e.g. `0.1`.
+* a parameter, e.g. `$param`.
+* a grouping key, e.g. the `a` in `RETURN a, count(*)`.
+* a local variable, e.g. the `x` in  `count(*) + size([ x IN range(1, 10) | x ])`.
+* a sub-expression, all operands of which have to be allowed in an aggregating expression.
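
Note on the zero-row behaviour referenced in the TIP near the top of the page: the following is a minimal sketch, not part of this commit, of how the presence of a grouping key changes the result when `MATCH` finds nothing. The label `NoSuchLabel` and the property `name` are hypothetical and are assumed not to exist in the graph.

[source, cypher]
----
// No grouping key: the aggregation still yields exactly one row.
// count() returns 0 and collect() returns an empty list on zero input rows.
MATCH (n:NoSuchLabel)
RETURN count(n) AS total, collect(n.name) AS names;

// With a grouping key (n.name): there are no input rows, so no buckets
// are formed and the query returns zero rows.
MATCH (n:NoSuchLabel)
RETURN n.name AS name, count(n) AS total;
----

This may be worth keeping in mind when a later clause expects at least one row from an aggregation: adding a grouping key to a query that matches nothing removes that row.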