Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROF-10045] uploader: fix segfault on failure to init datadog uploader #26

Merged
merged 1 commit into from
Jun 25, 2024

Conversation

Gandem
Copy link
Member

@Gandem Gandem commented Jun 25, 2024

when we fail to initialize the datadog symbol uploader, we set the uploader to nil, which will cause a panic later on when handling a new executable for symbol upload.

this sets the uploader to symbolication.NewNoopUploader() when the datadog uploader is set up, but fails to be initialized.

when we fail to initialize the datadog symbol uploader, we set the
uploader to nil, which will cause a panic later on when handling
a new executable for symbol upload.

this sets the uploader to symbolication.NewNoopUploader() when the
datadog uploader is set up, but fails to be initialized.
Copy link
Collaborator

@r1viollet r1viollet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Thanks for addressing this so fast

@Gandem Gandem merged commit d394e12 into nayef/local-symbols Jun 25, 2024
4 checks passed
@Gandem Gandem deleted the nayef/fix-npe-on-init-error branch June 25, 2024 08:40
Gandem added a commit that referenced this pull request Jun 25, 2024
* symbolication: support uploading local symbols remotely

* readme: document experimental local symbol upload

* symbolication: remove noisy error log

when we only filter out fs.ErrNotExist, we get frequent failures
in the local development environment with the following error:
/proc/<pid>/root// is a directory

we seem to think that /proc/<pid>/root// is a executable path,
this happens for short-lived runc processes where /proc/<pid>/maps
has "/" as the pathname for the runc executable itself.

so when we try to open the elf file, we first try to read it from
/proc/<pid>/map_files and (as the process is short lived) when
this fails, we try to open /proc/<pid>/root// which fails with
a "is a directory" error instead of fs.ErrNotExist

to fix this, we just ignore all errors returned by GetElf() and
only log them for debug purposes.

https://github.com/elastic/otel-profiling-agent/blob/0945fe6/libpf/process/process.go#L221-L239

* uploader: fix segfault on failure to init datadog uploader (#26)

when we fail to initialize the datadog symbol uploader, we set the
uploader to nil, which will cause a panic later on when handling
a new executable for symbol upload.

this sets the uploader to symbolication.NewNoopUploader() when the
datadog uploader is set up, but fails to be initialized.
Gandem added a commit that referenced this pull request Jun 26, 2024
* symbolication: support uploading local symbols remotely

* readme: document experimental local symbol upload

* symbolication: remove noisy error log

when we only filter out fs.ErrNotExist, we get frequent failures
in the local development environment with the following error:
/proc/<pid>/root// is a directory

we seem to think that /proc/<pid>/root// is a executable path,
this happens for short-lived runc processes where /proc/<pid>/maps
has "/" as the pathname for the runc executable itself.

so when we try to open the elf file, we first try to read it from
/proc/<pid>/map_files and (as the process is short lived) when
this fails, we try to open /proc/<pid>/root// which fails with
a "is a directory" error instead of fs.ErrNotExist

to fix this, we just ignore all errors returned by GetElf() and
only log them for debug purposes.

https://github.com/elastic/otel-profiling-agent/blob/0945fe6/libpf/process/process.go#L221-L239

* uploader: fix segfault on failure to init datadog uploader (#26)

when we fail to initialize the datadog symbol uploader, we set the
uploader to nil, which will cause a panic later on when handling
a new executable for symbol upload.

this sets the uploader to symbolication.NewNoopUploader() when the
datadog uploader is set up, but fails to be initialized.
Gandem added a commit that referenced this pull request Jun 26, 2024
* symbolication: support uploading local symbols remotely

* readme: document experimental local symbol upload

* symbolication: remove noisy error log

when we only filter out fs.ErrNotExist, we get frequent failures
in the local development environment with the following error:
/proc/<pid>/root// is a directory

we seem to think that /proc/<pid>/root// is a executable path,
this happens for short-lived runc processes where /proc/<pid>/maps
has "/" as the pathname for the runc executable itself.

so when we try to open the elf file, we first try to read it from
/proc/<pid>/map_files and (as the process is short lived) when
this fails, we try to open /proc/<pid>/root// which fails with
a "is a directory" error instead of fs.ErrNotExist

to fix this, we just ignore all errors returned by GetElf() and
only log them for debug purposes.

https://github.com/elastic/otel-profiling-agent/blob/0945fe6/libpf/process/process.go#L221-L239

* uploader: fix segfault on failure to init datadog uploader (#26)

when we fail to initialize the datadog symbol uploader, we set the
uploader to nil, which will cause a panic later on when handling
a new executable for symbol upload.

this sets the uploader to symbolication.NewNoopUploader() when the
datadog uploader is set up, but fails to be initialized.
Gandem added a commit that referenced this pull request Jun 27, 2024
* symbolication: support uploading local symbols remotely

* readme: document experimental local symbol upload

* symbolication: remove noisy error log

when we only filter out fs.ErrNotExist, we get frequent failures
in the local development environment with the following error:
/proc/<pid>/root// is a directory

we seem to think that /proc/<pid>/root// is a executable path,
this happens for short-lived runc processes where /proc/<pid>/maps
has "/" as the pathname for the runc executable itself.

so when we try to open the elf file, we first try to read it from
/proc/<pid>/map_files and (as the process is short lived) when
this fails, we try to open /proc/<pid>/root// which fails with
a "is a directory" error instead of fs.ErrNotExist

to fix this, we just ignore all errors returned by GetElf() and
only log them for debug purposes.

https://github.com/elastic/otel-profiling-agent/blob/0945fe6/libpf/process/process.go#L221-L239

* uploader: fix segfault on failure to init datadog uploader (#26)

when we fail to initialize the datadog symbol uploader, we set the
uploader to nil, which will cause a panic later on when handling
a new executable for symbol upload.

this sets the uploader to symbolication.NewNoopUploader() when the
datadog uploader is set up, but fails to be initialized.
Gandem added a commit that referenced this pull request Jun 28, 2024
* symbolication: support uploading local symbols remotely

* readme: document experimental local symbol upload

* symbolication: remove noisy error log

when we only filter out fs.ErrNotExist, we get frequent failures
in the local development environment with the following error:
/proc/<pid>/root// is a directory

we seem to think that /proc/<pid>/root// is a executable path,
this happens for short-lived runc processes where /proc/<pid>/maps
has "/" as the pathname for the runc executable itself.

so when we try to open the elf file, we first try to read it from
/proc/<pid>/map_files and (as the process is short lived) when
this fails, we try to open /proc/<pid>/root// which fails with
a "is a directory" error instead of fs.ErrNotExist

to fix this, we just ignore all errors returned by GetElf() and
only log them for debug purposes.

https://github.com/elastic/otel-profiling-agent/blob/0945fe6/libpf/process/process.go#L221-L239

* uploader: fix segfault on failure to init datadog uploader (#26)

when we fail to initialize the datadog symbol uploader, we set the
uploader to nil, which will cause a panic later on when handling
a new executable for symbol upload.

this sets the uploader to symbolication.NewNoopUploader() when the
datadog uploader is set up, but fails to be initialized.
Gandem added a commit that referenced this pull request Jun 28, 2024
* symbolication: support uploading local symbols remotely

* readme: document experimental local symbol upload

* symbolication: remove noisy error log

when we only filter out fs.ErrNotExist, we get frequent failures
in the local development environment with the following error:
/proc/<pid>/root// is a directory

we seem to think that /proc/<pid>/root// is a executable path,
this happens for short-lived runc processes where /proc/<pid>/maps
has "/" as the pathname for the runc executable itself.

so when we try to open the elf file, we first try to read it from
/proc/<pid>/map_files and (as the process is short lived) when
this fails, we try to open /proc/<pid>/root// which fails with
a "is a directory" error instead of fs.ErrNotExist

to fix this, we just ignore all errors returned by GetElf() and
only log them for debug purposes.

https://github.com/elastic/otel-profiling-agent/blob/0945fe6/libpf/process/process.go#L221-L239

* uploader: fix segfault on failure to init datadog uploader (#26)

when we fail to initialize the datadog symbol uploader, we set the
uploader to nil, which will cause a panic later on when handling
a new executable for symbol upload.

this sets the uploader to symbolication.NewNoopUploader() when the
datadog uploader is set up, but fails to be initialized.
Gandem added a commit that referenced this pull request Jun 28, 2024
* symbolication: support uploading local symbols remotely

* readme: document experimental local symbol upload

* symbolication: remove noisy error log

when we only filter out fs.ErrNotExist, we get frequent failures
in the local development environment with the following error:
/proc/<pid>/root// is a directory

we seem to think that /proc/<pid>/root// is a executable path,
this happens for short-lived runc processes where /proc/<pid>/maps
has "/" as the pathname for the runc executable itself.

so when we try to open the elf file, we first try to read it from
/proc/<pid>/map_files and (as the process is short lived) when
this fails, we try to open /proc/<pid>/root// which fails with
a "is a directory" error instead of fs.ErrNotExist

to fix this, we just ignore all errors returned by GetElf() and
only log them for debug purposes.

https://github.com/elastic/otel-profiling-agent/blob/0945fe6/libpf/process/process.go#L221-L239

* uploader: fix segfault on failure to init datadog uploader (#26)

when we fail to initialize the datadog symbol uploader, we set the
uploader to nil, which will cause a panic later on when handling
a new executable for symbol upload.

this sets the uploader to symbolication.NewNoopUploader() when the
datadog uploader is set up, but fails to be initialized.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants