feat: Add performance rules for persistence layer
Fabian Baumeister committed Nov 25, 2022
1 parent 4119344 commit 96b5cf5
Showing 2 changed files with 38 additions and 0 deletions.
1 change: 1 addition & 0 deletions modules/ROOT/nav.adoc
@@ -12,6 +12,7 @@
* Persistence
** Relational Databases
*** xref:persistence/jpa.adoc[]
*** xref:persistence/performance.adoc[]
*** xref:persistence/transactional.adoc[]
* Cross cutting
37 changes: 37 additions & 0 deletions modules/ROOT/pages/persistence/performance.adoc
@@ -0,0 +1,37 @@
= Performance

When doing performance optimization, it's important to keep the following rules in mind.

== Rule 0: Measure, don't think

[quote,Donald E. Knuth]
Premature optimization is the root of all evil.

Before starting to optimize any database operation, analyze the root cause by measuring the execution times.
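As a minimal sketch of Rule 0 in plain Java (the class and method names are illustrative, not part of any library), execution time can be measured around any database call before deciding what to tune:

```java
// Illustrative sketch: measure first, optimize second.
// The Runnable stands in for a real database call; all names are hypothetical.
public class QueryTimer {

    // Runs the given action and returns its execution time in milliseconds.
    public static long measureMillis(Runnable action) {
        long start = System.nanoTime();
        action.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = measureMillis(() -> {
            // Placeholder for e.g. entityManager.createQuery(...).getResultList()
            try {
                Thread.sleep(25);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("query took " + elapsed + " ms");
    }
}
```

In practice the measurement would come from SQL logging, a profiler, or the database's own statistics rather than manual timing, but the principle is the same: collect numbers before changing anything.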

== Rule 1: The performance of a use case is proportional to the number of SQL statements it generates

```
executionTime(usecase) = executionTime(queryA) + executionTime(queryB) * N
```

* Experience shows that the problem is often the number of query executions (N), not the queries themselves
* Typical values are `5 to 30 msec` for executionTime(queryA) and `5 to 100 msec` for executionTime(queryB)
* Before optimizing a single query A or B, check, and if possible reduce, the number of times the queries are executed
* In some rare cases the executionTime of queryB is the problem, and tuning the SQL or the DB helps
* Always consider Rule 0 before applying any action
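The formula above can be sketched as a small model (all timings are assumed example values, not measurements):

```java
// Illustrative model of Rule 1:
// executionTime(usecase) = executionTime(queryA) + executionTime(queryB) * N
public class Rule1Model {

    // Estimated use-case time in msec: one execution of query A
    // plus N executions of query B.
    public static long useCaseMillis(long queryAMillis, long queryBMillis, long n) {
        return queryAMillis + queryBMillis * n;
    }

    public static void main(String[] args) {
        // With typical values, N dominates the total time.
        System.out.println(useCaseMillis(30, 10, 1));   // 40 msec
        System.out.println(useCaseMillis(30, 10, 500)); // 5030 msec
        // Making query B twice as fast still leaves most of the cost:
        System.out.println(useCaseMillis(30, 5, 500));  // 2530 msec
        // Reducing N (e.g. by fetching the data in one query) removes it:
        System.out.println(useCaseMillis(30, 10, 1));   // 40 msec
    }
}
```

The numbers make the bullet points concrete: halving queryB's time saves about 50% here, while eliminating the N-fold execution saves over 99%.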


== Rule 2: The performance of a use case is proportional to the product of the number of objects in the session and the number of flush operations

If an Object-Relational Mapper (like JPA) is used and a high number of objects is in the session, flush operations can become a problem.

By default (auto flush), JPA flushes the current session before any query to ensure that the query sees the latest pending updates.

If there are large datasets in the session, the flush operation can take a relatively long time. For example, in an analytics (read-only) application with lots of data, the flush would not be necessary because no data was changed. Nonetheless, the whole session is flushed before every query.

* If the number of objects in the session is < 1000, there is usually no issue
* Check whether that many objects in the session are really necessary; misused eager loading is a common cause
* Check if the flush mode can be adjusted, as in the analytics application example above
* Always consider Rule 0 before applying any action
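The rule can be sketched as a small model (the per-object cost is an assumption chosen for illustration, not a measurement):

```java
// Illustrative model of Rule 2:
// flush overhead grows with (objects in session) * (number of flush operations).
public class Rule2Model {

    // Assumed per-object dirty-check cost during one flush, in microseconds.
    static final long MICROS_PER_OBJECT = 2;

    // Total flush overhead in milliseconds.
    public static long flushOverheadMillis(long objectsInSession, long flushCount) {
        return objectsInSession * flushCount * MICROS_PER_OBJECT / 1_000;
    }

    public static void main(String[] args) {
        // Small session: negligible, as the first bullet above suggests.
        System.out.println(flushOverheadMillis(500, 10));      // 10 ms
        // Large read-only session, auto-flushed before each of 200 queries:
        System.out.println(flushOverheadMillis(100_000, 200)); // 40000 ms
        // Same session flushed only once, at commit:
        System.out.println(flushOverheadMillis(100_000, 1));   // 200 ms
    }
}
```

In real JPA code the corresponding knob is the flush mode, e.g. `entityManager.setFlushMode(FlushModeType.COMMIT)`, which defers flushing until transaction commit and is a good fit for the read-only analytics scenario described above.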
