In Kubernetes environments, the approach should mirror that for Docker: either don’t provide the host.name value if SDKs don’t have a “host” resource detector (which will need an outer enricher) or be equal to the “host” FQDN hostname (or even the k8s.node.name value).
I think it would help in general if we could discuss some specific examples here to understand the various cases and validate any decision against them.
Area(s)
area:host
Is your change request related to a problem? Please describe.
The current semantic conventions' documentation for the “host” registry is defined as:
Reference: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/attributes-registry/host.md
Based on this definition, a Linux/K8s container (cgroup) is not explicitly considered a host. However, this raises an important question: isn’t a cgroup an isolated instance with its own dedicated computing resources? What truly distinguishes a virtual machine from a container?
If containers are not classified as hosts but as services running within a host, it brings up another critical question: what should the value of the host.name attribute be when retrieved by SDKs operating inside a container? Should this attribute even be populated in such cases?
Currently, the Go SDK uses the os.Hostname function from the standard library's os package to retrieve and populate the host.name resource attribute. This means that a Go SDK running in a container reports the container's hostname as host.name, not the hostname of the virtual machine it is running on. If a container is not considered a host, should SDKs running in containers report this value?
A workaround to satisfy the host semantic convention's description would be to send all container signals to a collector that overrides or sets the host.name value with the hostname of the virtual machine the container is running on. The following diagram shows an architecture that overrides every container's host.name value with the k8s.node.name value, which normally corresponds to the virtual machine's hostname:
This is one possible user interpretation of the host.name value that simplifies service correlation, ensuring that all service.name instances sharing the same host.name are identified as running on the same machine, but it comes with notable downsides:
It requires an additional processor (either on-site or on the backend) to override the host.name value. In addition, it requires the ability to obtain the host.name value from the upper virtualization layer. The latter is feasible if there is a service like the OpenTelemetry Collector that has resource detectors to do so. But what would happen for standalone agents running in a container and sending OTLP data directly to another node's OTLP endpoint or even a remote endpoint?
The hostname within a container/cgroup carries useful information: “Containers within the Pod see the system hostname as being the same as the configured name for the Pod.”
The container's hostname is used for the container's network communication, in both K8s and standalone Docker deployments.
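To make the behavior and the workaround described above concrete, here is a minimal Go sketch. It is not the Go SDK's or the Collector's actual code: overrideHostName is a hypothetical helper that imitates what a collector-side processor could do, and "node-a" is a made-up node name. Only os.Hostname is a real standard-library call.

```go
package main

import (
	"fmt"
	"os"
)

// overrideHostName imitates a collector-side override (hypothetical helper):
// it replaces host.name with k8s.node.name when the latter is present.
// It returns a copy and leaves the input map untouched.
func overrideHostName(attrs map[string]string) map[string]string {
	out := make(map[string]string, len(attrs))
	for k, v := range attrs {
		out[k] = v
	}
	if node, ok := out["k8s.node.name"]; ok && node != "" {
		out["host.name"] = node
	}
	return out
}

func main() {
	// Inside a container, os.Hostname returns the container's hostname
	// (in Kubernetes, by default the Pod name), not the node's hostname.
	name, err := os.Hostname()
	if err != nil {
		fmt.Println("hostname unavailable:", err)
		return
	}

	// "node-a" is an assumed value; a real setup would obtain it from a
	// resource detector or from the Kubernetes downward API.
	sdkAttrs := map[string]string{"host.name": name, "k8s.node.name": "node-a"}
	fmt.Println("reported by SDK:", sdkAttrs["host.name"])
	fmt.Println("after override: ", overrideHostName(sdkAttrs)["host.name"])
}
```

The override only works when something in the pipeline knows k8s.node.name, which is exactly the dependency the first downside points out.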
Describe the solution you'd like
Proposal: Include container in the host definition + new host.hostname attribute
Modify the host semantic conventions registry definition to explicitly include containers. This adjustment makes the scope of the definition explicit, reducing ambiguity for containerized environments.
The current description of host.name is highly permissive:
“Name of the host. On Unix systems, it may contain what the hostname command returns, or the fully qualified hostname, or another name specified by the user.”
This flexibility leads to a lack of determinism in the value of host.name. In some cases, it reflects the value returned by the gethostname system call, while in others, it may represent a user-defined custom value. However, in distributed systems, the actual service networking hostname plays a critical role in enabling reliable entity and signal correlation.
To address this issue and align with established conventions, such as those in Elastic ECS, a new attribute could be introduced to explicitly reference the hostname used for networking communications. This proposal differentiates between the network hostname and the general host name as follows:
host.hostname: The hostname of the host as used for networking communications. Typically, this is the value returned by the hostname command on the host machine.
host.name: The name of the host. This attribute may contain what the hostname command returns on Unix systems, the fully qualified domain name (FQDN), or a name specified by the user. The recommended value is the lowercase FQDN of the host.
Describe alternatives you've considered
Proposal: Define the value of host.name dependent on the environment
To ensure consistency across different deployment scenarios, we could define the value of host.name in a way that depends on the environment in which the service is running. Suggested guidelines for various scenarios are as follows:
Virtual Machine Deployments: In virtual machine environments, the value of host.name for any running service should be equal to the virtual machine's fully qualified domain name (FQDN).
Docker deployments: A container's host.name should correspond to the FQDN of the host machine it is running on, not the hostname of the service's container. Services (SDKs) should have a resource detector in place to retrieve the “host” hostname, similar to the OpenTelemetry Collector docker resource detector: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/resourcedetectionprocessor/internal/docker/documentation.md
Kubernetes environments: The approach should mirror that for Docker: either omit the host.name value if the SDK has no “host” resource detector (in which case an outer enricher is needed) or set it to the “host” FQDN (or even the k8s.node.name value).
This second proposal complements the first by further emphasizing the need to distinguish between the actual networking hostname of the computing instance and the value of host.name as defined by semantic conventions.
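The guidelines above, combined with the proposed host.hostname attribute, could be sketched roughly as follows. This is only an illustration under stated assumptions: the Environment type, the Inputs struct, and hostAttributes are hypothetical names, and the sketch assumes some resource detector has already gathered the values (an empty string means "not available").

```go
package main

import (
	"fmt"
	"strings"
)

// Environment is a hypothetical enum for the deployment scenarios discussed.
type Environment int

const (
	VirtualMachine Environment = iota
	Docker
	Kubernetes
)

// Inputs holds values a resource detector might provide (assumed shape).
type Inputs struct {
	LocalHostname string // hostname seen by the process (container or VM)
	HostFQDN      string // FQDN of the underlying machine, if detectable
	NodeName      string // k8s.node.name, if detectable
}

// hostAttributes applies the proposed rules:
//   - host.hostname is the networking hostname seen by the process;
//   - host.name is the lowercase FQDN of the underlying machine;
//   - in Kubernetes, k8s.node.name is used as a fallback;
//   - host.name is omitted when nothing suitable is known.
func hostAttributes(env Environment, in Inputs) map[string]string {
	attrs := map[string]string{}
	if in.LocalHostname != "" {
		attrs["host.hostname"] = in.LocalHostname
	}
	name := strings.ToLower(in.HostFQDN)
	if name == "" && env == Kubernetes {
		name = in.NodeName
	}
	if name != "" {
		attrs["host.name"] = name
	}
	return attrs
}

func main() {
	// All values below are made up for illustration.
	fmt.Println(hostAttributes(VirtualMachine, Inputs{LocalHostname: "vm1", HostFQDN: "vm1.example.com"}))
	fmt.Println(hostAttributes(Docker, Inputs{LocalHostname: "3f2a9c1b7d10"}))
	fmt.Println(hostAttributes(Kubernetes, Inputs{LocalHostname: "web-0", NodeName: "node-a"}))
}
```

In the Docker case without a host detector, host.name is left unset so that an outer enricher can fill it in later, while host.hostname still carries the container's networking hostname.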
Additional context
No response