Merge pull request #86 from NethServer/feat-6890
New System Logs page
DavidePrincipi authored Apr 11, 2024
2 parents 25ab115 + ea45965 commit 1d57cd2
Showing 2 changed files with 141 additions and 37 deletions.
7 changes: 7 additions & 0 deletions cluster.rst
@@ -96,6 +96,13 @@ In any case, after leader promotion it is necessary to perform these additional

See also the note in :ref:`audit-trail-section` about node promotion.

.. note::

   The promotion of a new leader entails a change in the System logs
   configuration. Refer to :ref:`logs-persistence-section` for more
   details.


Reachable leader node
---------------------

171 changes: 134 additions & 37 deletions log_server.rst
@@ -1,64 +1,161 @@
.. _loki-section:

.. _system-logs-section:

===========
System logs
===========

Almost everything is logged inside journalctl and sent to the local log server.
Log records generated by any cluster node are collected and stored in the
leader node. See :ref:`logs-persistence-section` for additional information.

Accessing logs
==============

The System Logs page allows users to efficiently search logs based on
date, text query, or context. The available contexts include:

* **Cluster**: Displays logs from all cluster nodes.

* **Node**: Shows logs from a specific node.

* **Application**: Presents logs from a particular application instance.

Users can select between two modes of operation:

* **Dump mode**: Retrieves a limited number of log records within a
  specified time period. The maximum number of lines can be adjusted using
  the ``Max lines`` field.

* **Follow mode**: Displays logs in real time, providing live updates.

If the ``Search query`` field is set, only log records matching the given
string are returned. Only exact substring matches are allowed.
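
As a rough illustration of exact substring matching, the behavior is
comparable to a fixed-string search with ``grep -F``. This is only an
analogy: the sample lines and the ``sample.log`` file name are made up for
the example, and whether the page matches case-sensitively is not stated
here.

.. code-block:: shell

   # Hypothetical sample log lines, for illustration only.
   printf '%s\n' \
     'GET /index.html 200' \
     'GET /login 404' \
     'POST /login 200' > sample.log

   # -F treats the query as a literal string, not as a pattern.
   matches=$(grep -F -- '/login' sample.log)
   echo "$matches"

Only the two lines that contain the literal text ``/login`` are selected.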

For comparative analysis of logs from two nodes or applications, follow
these steps:

- Click the :guilabel:`Add search` button.

- Optionally, switch to ``Vertical layout`` from the three-dots menu for a
  side-by-side comparison.

For example, display the Traefik log on one side to see incoming HTTP
requests, and the Nextcloud log on the other side to see that
application's activity.

.. note::

   By default, log searches are directed to the active Loki instance. If
   there are :ref:`inactive Loki instances <inactive-loki-section>` within
   the cluster, it is possible to select them to search past log entries.


Command line interface
======================

.. highlight:: text

In addition to accessing logs via the System Logs web page, users can
utilize the ``api-server-logs`` command for log searching. Below are
examples illustrating its usage:

a) **Basic invocation:** Enters "follow mode" for the entire cluster:

   ::

     api-server-logs logs


b) **Follow mode for application instance:** Enables follow mode for the
   specified application (module) instance, such as ``traefik1``. The
   ``--entity`` flag selects the context:

   ::

     api-server-logs logs --entity module --name traefik1


c) **Dump mode for specific instance in a time period:** Initiates dump
   mode for the same instance within a specific time period. Dates must
   adhere to the ISO 8601 format:

   ::

     api-server-logs logs --mode dump --entity module --name traefik1 --from 2024-04-09T16:43:22Z --to 2024-04-09T16:55:31Z


d) **Changing output timezone:** Modifies the output timezone while
   maintaining the same query. Refer to ``timedatectl list-timezones`` for
   a full list of options:

   ::

     api-server-logs logs --timezone America/New_York --mode dump --entity module --name traefik1 --from 2024-04-09T16:43:22Z --to 2024-04-09T16:55:31Z

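The ``--from`` and ``--to`` values for a relative time window can be
computed instead of typed by hand. The following sketch assumes GNU
``date`` (the ``-d`` option is not available in every ``date``
implementation) and reuses only the flags shown in the examples above. The
15-minute window is an arbitrary choice, and the command is printed rather
than executed.

.. code-block:: shell

   # Current time and 15 minutes earlier, both in UTC and ISO 8601 format.
   to=$(date -u +%Y-%m-%dT%H:%M:%SZ)
   from=$(date -u -d '15 minutes ago' +%Y-%m-%dT%H:%M:%SZ)

   # Print the resulting command; remove "echo" to run it on a cluster node.
   echo api-server-logs logs --mode dump --entity module --name traefik1 \
     --from "$from" --to "$to"
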

+.. _logs-persistence-section:

+Logs persistence
+================

+Upon cluster creation, a Loki [#loki]_ core module instance is installed
+on the leader node and designated as the active instance. The leader node,
+like any other worker node, continuously streams its log data to this
+active Loki instance [#promtail]_.

-By default, `Grafana Loki <https://grafana.com/oss/loki/>`_ is installed on the leader node, it collects the logs
-from all cluster nodes.
-A rootful `Promtail <https://grafana.com/docs/loki/latest/clients/promtail/>`_ container runs on all nodes,
-including the leader one. It sends all logs to the Loki server.

-From the leader node, it is possible to query the logs of all nodes.
+Adjusting Settings
+------------------

-Logs are accessible from the ``System logs`` page.
-You can filter logs by date, a text query or context. Available contexts are:
+Navigate to the ``Settings`` page and click on the System logs card to
+modify log retention (select ``Edit retention``) or assign a user-friendly
+name to the active Loki instance (choose the three-dots menu, then ``Edit
+label``).

-* ``cluster``: all logs from any source
-* ``node``: all logs from a given node
-* ``app``: logs from a given application instance, regardless where it's currently running

-You can see only a subset of the lines or follow the log to see what's happening in real time.
+Understanding log retention
+---------------------------

-Sometimes is useful to compare the logs of two applications side-by-side.
-You can do it by following these steps:
+Log retention refers to the maximum age of stored log records. Records
+older than the retention period are automatically purged. By default,
+System logs have a retention period of 365 days, but this can be
+customized to any desired duration. For compliance with common regulations
+and best practices, a recommended retention period is 200 days or longer.

-- setup the filter for a refined search
-- click on :guilabel:`Add search` button
-- setup the new filter
-- select ``Vertical layout`` from the three-dots menu

-Logs are now shown side-by-side to easily correlate events.
+.. _inactive-loki-section:

-Command line
-============
+Inactive Loki instances
+-----------------------

-.. highlight:: bash
+When a worker node is promoted to leader, a new Loki instance is installed
+on it and becomes the active instance, while the old instance is marked as
+*inactive*.

-If you're familiar with the command line, recent logs are visible using the ``journalctl`` command
-and services can be inspected using the ``systemctl`` command.
-As root use ``journalctl`` to see messages from agents, rootful and rootless modules.
+- The new active instance inherits the retention setting from the old one.

-You can also use the ``api-server-logs`` to query directly the Loki server.
-Example to inspect the log of the `traefik1` module: ::
+- An inactive instance retains logs based on its last retention setting.

-    api-server-logs logs -e module -n traefik1
+- Log searches can still be performed within an inactive instance.

-You can also enable the automatic completion of the above command.
-First, install the ``bash-completion`` package.
+- Restoring a Loki instance from backup renders it inactive.

-On RHEL-like distributions: ::
+- To remove an inactive instance, select the three-dots menu and choose
+  the ``Uninstall`` action.

-    dnf install bash-completion -y

-On Debian distribution: ::
+.. rubric:: Footnotes

-    apt-get install bash-completion -y
+.. [#loki]

-Then, generate the completion script: ::
+   Grafana Loki is a special database designed to store, index and search
+   system logs. For more information, see
+   https://github.com/nethserver/ns8-loki

-    api-server-logs completion bash > /etc/bash_completion.d/api-server-logs.sh
+.. [#promtail]

-Logout and login from the shell to enable the completion.
+   The promtail.service core service operates on every node, reading system
+   journals, forwarding new records to the active Loki instance, and
+   preserving the last sent journal cursor position to ensure seamless
+   restarts without log loss.
