Merge pull request #86 from NethServer/feat-6890

New System Logs page
NethServer · Apr 11, 2024 · 1d57cd2
2 parents 25ab115 + ea45965
commit 1d57cd2

Showing 2 changed files with 141 additions and 37 deletions.
diff --git a/cluster.rst b/cluster.rst
@@ -96,6 +96,13 @@ In any case, after leader promotion it is necessary to perform these additional
 
 See also the note in :ref:`audit-trail-section` about node promotion.
 
+.. note::
+
+  The promotion of a new leader entails a change in the System logs
+  configuration. Refer to :ref:`logs-persistence-section` for more
+  details.
+
+
 Reachable leader node
 ---------------------
 

diff --git a/log_server.rst b/log_server.rst
@@ -1,64 +1,161 @@
 .. _loki-section:
 
+.. _system-logs-section:
+
 ===========
 System logs
 ===========
 
-Almost everything is logged inside journalctl and sent to the local log server.
+Log records generated by any cluster node are collected and stored in the
+leader node. See :ref:`logs-persistence-section` for additional information.
+
+Accessing logs
+==============
+
+The System Logs page allows users to efficiently search logs based on
+date, text query, or context. The available contexts include:
+
+* **Cluster**: Displays logs from all cluster nodes.
+
+* **Node**: Shows logs from a specific node.
+
+* **Application**: Presents logs from a particular application instance.
+
+Users can select between two modes of operation:
+
+* **Dump mode**: Retrieves a limited number of log records within a
+  specified time period. The maximum number of lines can be adjusted using
+  the ``Max lines`` field.
+
+* **Follow mode**: Displays logs in real time, providing live updates.
+
+If the ``Search query`` field is set, only log records matching the given
+string are returned. Only exact substring matches are allowed.
+
+For comparative analysis of logs from two nodes or applications, follow
+these steps:
+
+- Click the :guilabel:`Add search` button.
+
+- Optionally, switch to ``Vertical layout`` from the three-dots menu for a
+  side-by-side comparison.
+
+For example, it can be useful to view the Traefik log on one side, showing
+incoming HTTP requests, and the Nextcloud log on the other side, showing
+the corresponding application activity.
+
+.. note::
+
+  By default, log searches are directed to the active Loki instance. If
+  there are :ref:`inactive Loki instances <inactive-loki-section>` within
+  the cluster, it is possible to select them to search past log entries.
+
+
+Command line interface
+======================
+
+.. highlight:: text
+
+In addition to the System Logs web page, logs can be searched from the
+command line with the ``api-server-logs`` command. The following examples
+illustrate its usage:
+
+a) **Basic invocation:** Enters "follow mode" for the entire cluster:
+
+   ::
+
+     api-server-logs logs
+
+
+b) **Follow mode for application instance:** Enables follow mode for the
+   specified application (module) instance, such as ``traefik1``. The
+   ``--entity`` flag selects the context:
+
+   ::
+
+     api-server-logs logs --entity module --name traefik1
+
+
+c) **Dump mode for specific instance in a time period:** Initiates dump
+   mode for the same instance within a specific time period. Dates must
+   adhere to the ISO 8601 format:
+
+   ::
+
+     api-server-logs logs --mode dump --entity module --name traefik1 --from 2024-04-09T16:43:22Z --to 2024-04-09T16:55:31Z
+
+
+d) **Changing output timezone:** Modifies the output timezone while
+   maintaining the same query. Refer to ``timedatectl list-timezones`` for
+   a full list of options:
+
+   ::
+
+     api-server-logs logs --timezone America/New_York --mode dump --entity module --name traefik1 --from 2024-04-09T16:43:22Z --to 2024-04-09T16:55:31Z
+
+
+.. _logs-persistence-section:
+
+Logs persistence
+================
+
+Upon cluster creation, a Loki [#loki]_ core module instance is installed
+on the leader node and designated as the active instance. The leader node,
+like every worker node, continuously streams its log data to this
+active Loki instance [#promtail]_.
 
-By default, `Grafana Loki <https://grafana.com/oss/loki/>`_ is installed on the leader node, it collects the logs
-from all cluster nodes.
-A rootful `Promtail <https://grafana.com/docs/loki/latest/clients/promtail/>`_ container runs on all nodes,
-including the leader one. It sends all logs to the Loki server.
 
-From the leader node, it is possible to query the logs of all nodes.
+Adjusting settings
+------------------
 
-Logs are accessible from the ``System logs`` page.
-You can filter logs by date, a text query or context. Available contexts are:
+Navigate to the ``Settings`` page and click on the System logs card to
+modify log retention (select ``Edit retention``) or assign a user-friendly
+name to the active Loki instance (choose the three-dots menu, then ``Edit
+label``).
 
-* ``cluster``: all logs from any source
-* ``node``: all logs from a given node
-* ``app``: logs from a given application instance, regardless where it's currently running
 
-You can see only a subset of the lines or follow the log to see what's happening in real time.
+Understanding log retention
+---------------------------
 
-Sometimes is useful to compare the logs of two applications side-by-side.
-You can do it by following these steps:
+Log retention refers to the maximum age of stored log records. Records
+older than the retention period are automatically purged. By default,
+System logs have a retention period of 365 days, but this can be
+customized to any desired duration. For compliance with common regulations
+and best practices, a retention period of 200 days or longer is recommended.
 
-- setup the filter for a refined search
-- click on :guilabel:`Add search` button
-- setup the new filter
-- select ``Vertical layout`` from the three-dots menu
 
-Logs are now shown side-by-side to easily correlate events.
+.. _inactive-loki-section:
 
-Command line
-============
+Inactive Loki instances
+-----------------------
 
-.. highlight:: bash
+When a worker node is promoted to leader, a new Loki instance is installed
+on it and becomes the active instance, while the old instance is marked as
+*inactive*.
 
-If you're familiar with the command line, recent logs are visible using the ``journalctl`` command
-and services can be inspected using the ``systemctl`` command.
-As root use ``journalctl`` to see messages from agents, rootful and rootless modules.
+- The new active instance inherits the retention setting from the old one.
 
-You can also use the ``api-server-logs`` to query directly the Loki server.
-Example to inspect the log of the `traefik1` module: ::
+- An inactive instance retains logs based on its last retention setting.
 
-  api-server-logs logs -e module -n traefik1
+- Log searches can still be performed within an inactive instance.
 
-You can also enable the automatic completion of the above command.
-First, install the ``bash-completion`` package.
+- Restoring a Loki instance from backup renders it inactive.
 
-On RHEL-like distributions: ::
+- To remove an inactive instance, select the three-dots menu and choose
+  the ``Uninstall`` action.
 
-  dnf install bash-completion -y
 
-On Debian distribution: ::
+.. rubric:: Footnotes
 
-  apt-get install bash-completion -y
+.. [#loki]
 
-Then, generate the completion script: ::
+  Grafana Loki is a specialized database designed to store, index, and
+  search system logs. For more information, see
+  https://github.com/nethserver/ns8-loki
 
-  api-server-logs completion bash > /etc/bash_completion.d/api-server-logs.sh
+.. [#promtail]
 
-Logout and login from the shell to enable the completion.
+  The ``promtail.service`` core service runs on every node. It reads the
+  system journal, forwards new records to the active Loki instance, and
+  saves the last sent journal cursor position, so that it can restart
+  without losing log records.