Skip to content
This repository has been archived by the owner on Jun 21, 2022. It is now read-only.

PMM-5492 Add Pprof data to logs.zip #1117

Open
wants to merge 19 commits into
base: main
Choose a base branch
from
Open

Conversation

pkadej
Copy link
Contributor

@pkadej pkadej commented May 13, 2022

PMM-5492 Add Pprof data to logs.zip

Build: SUBMODULES-2574

utils/pprof/pprof_test.go Outdated Show resolved Hide resolved
utils/pprof/pprof_test.go Outdated Show resolved Hide resolved
utils/pprof/pprof_test.go Outdated Show resolved Hide resolved
utils/pprof/pprof_test.go Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented May 18, 2022

Codecov Report

Merging #1117 (3d5d295) into main (ec60eb4) will decrease coverage by 0.30%.
The diff coverage is 38.61%.

@@            Coverage Diff             @@
##             main    #1117      +/-   ##
==========================================
- Coverage   49.29%   48.99%   -0.31%     
==========================================
  Files         180      184       +4     
  Lines       20826    21519     +693     
==========================================
+ Hits        10267    10543     +276     
- Misses       9426     9794     +368     
- Partials     1133     1182      +49     
Flag Coverage Δ
all 48.65% <38.61%> (-0.30%) ⬇️
cover 49.15% <39.25%> (-0.26%) ⬇️
crosscover 48.99% <38.61%> (-0.30%) ⬇️
update 12.70% <18.18%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
cmd/pmm-managed-starlark/main.go 0.00% <0.00%> (ø)
main.go 0.00% <0.00%> (ø)
models/database.go 49.61% <0.00%> (ø)
models/node_helpers.go 71.08% <ø> (ø)
models/settings.go 100.00% <ø> (ø)
services/agents/handler.go 0.00% <0.00%> (ø)
services/agents/state.go 0.00% <0.00%> (ø)
services/alertmanager/alertmanager.go 53.92% <ø> (ø)
services/config/config.go 0.00% <ø> (ø)
services/grafana/auth_server.go 59.18% <ø> (ø)
... and 42 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2050762...3d5d295. Read the comment docs.

@pkadej pkadej marked this pull request as ready for review May 30, 2022 22:21
main.go Outdated
mux.HandleFunc("/logs.zip", func(rw http.ResponseWriter, req *http.Request) {
contextTimeout := 10 * time.Second
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can it be moved to config

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose to use just a constant for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @artemgavrilov, no reason to artificially increase the number of items in configs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to const defaultContextTimeout.

main.go Outdated
mux.HandleFunc("/logs.zip", func(rw http.ResponseWriter, req *http.Request) {
contextTimeout := 10 * time.Second
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose to use just a constant for that.

services/config/pmm-managed.yaml Outdated Show resolved Hide resolved
services/supervisord/logs.go Outdated Show resolved Hide resolved
utils/pprof/pprof_test.go Outdated Show resolved Hide resolved
utils/pprof/pprof_test.go Outdated Show resolved Hide resolved
utils/pprof/pprof_test.go Outdated Show resolved Hide resolved
utils/pprof/pprof_test.go Outdated Show resolved Hide resolved
main.go Outdated
mux.HandleFunc("/logs.zip", func(rw http.ResponseWriter, req *http.Request) {
contextTimeout := 10 * time.Second
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @artemgavrilov, no reason to artificially increase the number of items in configs.

main.go Outdated
mux.HandleFunc("/logs.zip", func(rw http.ResponseWriter, req *http.Request) {
contextTimeout := 10 * time.Second
// increase context timeout if pprof query parameter exist in request
pprofQueryParameter, _ := strconv.Atoi(req.FormValue("pprof"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest using strconv.ParseBool instead of strconv.Atoi.
Do not ignore err, let's at least log the message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

Comment on lines 35 to 37
pprof:
profile_duration: 30s
trace_duration: 10s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making timeouts configurable doesn't bring any value to customers, it's so rare case when we might need to update these values in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to constants

@@ -49,6 +50,7 @@ type Config struct {
Services struct {
Platform platform.Config `yaml:"platform"`
Telemetry telemetry.ServiceConfig `yaml:"telemetry"`
Pprof pprof.Config `yaml:"pprof"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove it from the config, it will simplify the code a lot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 220 to 232
traceBytes, err := pprofUtils.Trace(pprofConfig.TraceDuration)
files = append(files, fileContent{
Name: "pprof/trace.out",
Data: traceBytes,
Err: err,
})

profileBytes, err := pprofUtils.Profile(pprofConfig.ProfileDuration)
files = append(files, fileContent{
Name: "pprof/profile.pb.gz",
Data: profileBytes,
Err: err,
})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we run them in parallel to speed up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Fixed.

@pkadej pkadej requested review from BupycHuk and artemgavrilov June 11, 2022 21:09
}
var pprofConfig *pprof.Config
if pprofQueryParameter {
contextTimeout += pProfProfileDuration + pProfTraceDuration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess nginx will terminate connections longer than 10 minutes. Can you please check it? If so we should document this limitation or change nginx configuration.

@pkadej pkadej requested a review from ritbl June 14, 2022 09:32
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants