From bf15641c9714a2cd41a2a1cd43de61829fc9e5d8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jens=20Pryce-=C3=85klundh?= <112686610+JPryce-Aklundh@users.noreply.github.com>
Date: Tue, 23 Jan 2024 14:18:14 +0100
Subject: [PATCH] Update Parallel runtime example query plan (#860)

---
 .../runtimes/concepts.adoc | 73 ++++++++++---------
 1 file changed, 38 insertions(+), 35 deletions(-)

diff --git a/modules/ROOT/pages/planning-and-tuning/runtimes/concepts.adoc b/modules/ROOT/pages/planning-and-tuning/runtimes/concepts.adoc
index 813ecaf95..c40aa1dc5 100644
--- a/modules/ROOT/pages/planning-and-tuning/runtimes/concepts.adoc
+++ b/modules/ROOT/pages/planning-and-tuning/runtimes/concepts.adoc
@@ -273,44 +273,47 @@ Runtime version {neo4j-version-minor}
 
 Batch size 128
 
-+-------------------+----+------------------------------------------------------------------------+----------------+---------------------+
-| Operator          | Id | Details                                                                | Estimated Rows | Pipeline            |
-+-------------------+----+------------------------------------------------------------------------+----------------+---------------------+
-| +ProduceResults   |  0 | `count(*)`                                                             |              1 | In Pipeline 6       |
-| |                 +----+------------------------------------------------------------------------+----------------+---------------------+
-| +EagerAggregation |  1 | count(*) AS `count(*)`                                                 |              1 |                     |
-| |                 +----+------------------------------------------------------------------------+----------------+                     |
-| +Filter           |  2 | not anon_1 = anon_5 AND anon_0.name = $autostring_0 AND anon_0:Station |              0 |                     |
-| |                 +----+------------------------------------------------------------------------+----------------+                     |
-| +Expand(All)      |  3 | (d)-[anon_1:CALLS_AT]->(anon_0)                                        |              0 | Fused in Pipeline 5 |
-| |                 +----+------------------------------------------------------------------------+----------------+---------------------+
-| +Filter           |  4 | d:Stop                                                                 |              0 |                     |
-| |                 +----+------------------------------------------------------------------------+----------------+                     |
-| +NullifyMetadata  | 14 |                                                                        |              0 |                     |
-| |                 +----+------------------------------------------------------------------------+----------------+                     |
-| +Repeat(Trail)    |  5 | (a) (...){1, *} (d)                                                    |              0 | Fused in Pipeline 4 |
-| |\                +----+------------------------------------------------------------------------+----------------+---------------------+
-| | +Filter         |  6 | isRepeatTrailUnique(anon_7) AND anon_2:Stop                            |              6 |                     |
-| | |               +----+------------------------------------------------------------------------+----------------+                     |
-| | +Expand(All)    |  7 | (anon_4)<-[anon_7:NEXT]-(anon_2)                                       |              6 | Fused in Pipeline 3 |
-| | |               +----+------------------------------------------------------------------------+----------------+---------------------+
-| | +Filter         |  8 | anon_4:Stop                                                            |             11 |                     |
-| | |               +----+------------------------------------------------------------------------+----------------+                     |
-| | +Argument       |  9 | anon_4                                                                 |             13 | Fused in Pipeline 2 |
-| |                 +----+------------------------------------------------------------------------+----------------+---------------------+
-| +Filter           | 10 | a:Stop                                                                 |              0 |                     |
-| |                 +----+------------------------------------------------------------------------+----------------+                     |
-| +Expand(All)      | 11 | (anon_6)<-[anon_5:CALLS_AT]-(a)                                        |              0 | Fused in Pipeline 1 |
-| |                 +----+------------------------------------------------------------------------+----------------+---------------------+
-| +Filter           | 12 | anon_6.name = $autostring_1                                            |              1 |                     |
-| |                 +----+------------------------------------------------------------------------+----------------+                     |
-| +NodeByLabelScan  | 13 | anon_6:Station                                                         |             10 | Fused in Pipeline 0 |
-+-------------------+----+------------------------------------------------------------------------+----------------+---------------------+
++-----------------------------+----+------------------------------------------------------------------------+----------------+---------------------+
+| Operator                    | Id | Details                                                                | Estimated Rows | Pipeline            |
++-----------------------------+----+------------------------------------------------------------------------+----------------+---------------------+
+| +ProduceResults             |  0 | `count(*)`                                                             |              1 | In Pipeline 6       |
+| |                           +----+------------------------------------------------------------------------+----------------+---------------------+
+| +EagerAggregation           |  1 | count(*) AS `count(*)`                                                 |              1 |                     |
+| |                           +----+------------------------------------------------------------------------+----------------+                     |
+| +Filter                     |  2 | NOT anon_1 = anon_5 AND anon_0.name = $autostring_0 AND anon_0:Station |              0 |                     |
+| |                           +----+------------------------------------------------------------------------+----------------+                     |
+| +Expand(All)                |  3 | (d)-[anon_1:CALLS_AT]->(anon_0)                                        |              0 | Fused in Pipeline 5 |
+| |                           +----+------------------------------------------------------------------------+----------------+---------------------+
+| +Filter                     |  4 | d:Stop                                                                 |              0 |                     |
+| |                           +----+------------------------------------------------------------------------+----------------+                     |
+| +NullifyMetadata            | 14 |                                                                        |              0 |                     |
+| |                           +----+------------------------------------------------------------------------+----------------+                     |
+| +Repeat(Trail)              |  5 | (a) (...){1, *} (d)                                                    |              0 | Fused in Pipeline 4 |
+| |\                          +----+------------------------------------------------------------------------+----------------+---------------------+
+| | +Filter                   |  6 | isRepeatTrailUnique(anon_8) AND anon_7:Stop                            |              6 |                     |
+| | |                         +----+------------------------------------------------------------------------+----------------+                     |
+| | +Expand(All)              |  7 | (anon_9)<-[anon_8:NEXT]-(anon_7)                                       |              6 | Fused in Pipeline 3 |
+| | |                         +----+------------------------------------------------------------------------+----------------+---------------------+
+| | +Filter                   |  8 | anon_9:Stop                                                            |             11 |                     |
+| | |                         +----+------------------------------------------------------------------------+----------------+                     |
+| | +Argument                 |  9 | anon_9                                                                 |             13 | Fused in Pipeline 2 |
+| |                           +----+------------------------------------------------------------------------+----------------+---------------------+
+| +Filter                     | 10 | a:Stop                                                                 |              0 |                     |
+| |                           +----+------------------------------------------------------------------------+----------------+                     |
+| +Expand(All)                | 11 | (anon_6)<-[anon_5:CALLS_AT]-(a)                                        |              0 | Fused in Pipeline 1 |
+| |                           +----+------------------------------------------------------------------------+----------------+---------------------+
+| +Filter                     | 12 | anon_6.name = $autostring_1                                            |              1 |                     |
+| |                           +----+------------------------------------------------------------------------+----------------+                     |
+| +PartitionedNodeByLabelScan | 13 | anon_6:Station                                                         |             10 | Fused in Pipeline 0 |
++-----------------------------+----+------------------------------------------------------------------------+----------------+---------------------+
 ----
 
 A key difference between the physical plans produced by the parallel runtime compared to those generated by pipelined runtime is that, in general, more pipelines are produced when using the parallel runtime (in this case, seven instead of the four produced by the same query being run on pipelined runtime).
 This is because, when executing a query in the parallel runtime, it is more efficient to have more tasks that can be run in parallel, whereas when running a single-threaded execution in the pipelined runtime it is more efficient to fuse several pipelines together.
 
+Another important difference is that the parallel runtime uses partitioned operators (xref:planning-and-tuning/operators/operators-detail.adoc#query-plan-partitioned-node-by-label-scan[`PartitionedNodeByLabelScan`] in this case).
+These operators first segment the retrieved data and then operate on each segment in parallel.
+
 The parallel runtime shares the same architecture as the pipelined runtime, meaning that it will transform the logical plan into the same type of execution graph as described above.
 However, when using parallel runtime, each pipeline task can be executed in a separate thread.
 Another similarity with pipelined runtime is that queries run on the parallel runtime will begin by generating the first pipeline which eventually will produce a morsel in the input buffer of the subsequent pipeline.
@@ -331,10 +334,10 @@ Consider the execution graph below, based on the same example query:
 
 image::runtimes_execution_graph2.svg[width="900",role="middle"]
 
-The execution graph shows that execution starts at `pipeline 0`, which consists of the operator `NodeByLabelScan` and can be executed simultaneously on all available threads working on different morsels of data.
+The execution graph shows that execution starts at `pipeline 0`, which consists of the operator `PartitionedNodeByLabelScan` and can be executed simultaneously on all available threads working on different morsels of data.
 Once pipeline `0` has produced at least one full morsel of data, any thread can then start executing `pipeline 1`, while other threads may continue to execute `pipeline 0`.
 More specifically, once there is data from a pipeline, the scheduler can proceed to the next pipeline while concurrently executing earlier pipelines.
-In this case, `pipeline 5` ends with an aggregation (performed by the EagerAggregation operator), which means that the last pipeline (6) cannot start until all preceding pipelines are completely finished for all the preceding morsels of data.
+In this case, `pipeline 5` ends with an aggregation (performed by the xref:planning-and-tuning/operators/operators-detail.adoc#query-plan-eager-aggregation[`EagerAggregation`] operator), which means that the last pipeline (`6`) cannot start until all preceding pipelines are completely finished for all the preceding morsels of data.
 
 [[runtimes-parallel-runtime-considerations]]
 === Considerations
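
The scheduling behaviour described in the second hunk (partitioned input, downstream pipelines starting as soon as one morsel is ready, and an aggregation that waits for everything upstream) can be illustrated with a toy model. The sketch below is not Neo4j code and makes no claim about how the runtime is implemented. The names `partition`, `scan_pipeline`, `filter_pipeline` and `count_matching`, the even-number predicate, and the morsel size of 4 are all hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MORSEL_SIZE = 4  # hypothetical; the plan in the patch reports a batch size of 128


def partition(rows, size=MORSEL_SIZE):
    """Split the input into fixed-size morsels (the partitioning step)."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def scan_pipeline(morsel):
    """Toy stand-in for 'pipeline 0': emit the rows of one partition unchanged."""
    return list(morsel)


def filter_pipeline(morsel):
    """Toy stand-in for 'pipeline 1': keep rows passing a predicate (even numbers)."""
    return [row for row in morsel if row % 2 == 0]


def count_matching(rows, workers=4):
    """Run both pipelines morsel by morsel, then aggregate once all are done."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scans = [pool.submit(scan_pipeline, m) for m in partition(rows)]
        # A downstream task is submitted as soon as any upstream morsel is ready,
        # while the remaining scan tasks may still be running on other threads.
        filters = [pool.submit(filter_pipeline, done.result())
                   for done in as_completed(scans)]
        # The aggregation acts as a barrier: it needs every upstream morsel.
        return sum(len(f.result()) for f in filters)


if __name__ == "__main__":
    print(count_matching(list(range(10))))  # five even numbers in 0..9
```

The final `sum` plays the role that `EagerAggregation` plays in the plan: nothing can be produced until every preceding morsel has been processed, whereas the two earlier stages overlap freely.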