Service start timeout inside container servercore:ltsc2019 since v0.17.0 #946
Comments
Do you know which Golang release(s) contain these rewrites? I think it would be worth testing newer Golang versions to see if this is fixed. If not, checking an environment variable similar to the
I'd say why not test the latest Golang release... I found the commits some time ago, but they are probably spread across several releases.
I don't have a problem with that, but it should be tested against this issue. I'm not able to run the container with my current setup; would you be OK to build the image using the latest Golang version and test?
@breed808 I can test it, but I haven't managed to build it so far. Are there any prerequisites to install before running the makefile?
You'll need promu installed to build the executable via the makefile. Building the image will also require Docker or a substitute like Podman.
I've tried to build it with the latest go version |
To clarify, is the exporter unable to run as a service or via a CLI command?
Via CLI: I tried running it in a powershell container:lts-nanoserver-1809, down to version 0.14, and unfortunately the same error occurs in every case. Running from the CLI: .\windows_exporter.exe
Hello, sorry for the long delay...
But if I start the installed service, the process seems to crash on startup:
@hpoznanski Using a "nanoserver" is probably not a good idea, and in my experience Windows version 1809 is far from perfect.
@breed808 Not at all, I just manually start a vanilla container, then download and run win exporter in it.
Thanks for the info. We may need to document the issue with running the service in a container; I don't use Windows containers so I wouldn't be able to debug this issue.
Ah! I thought it was related to the Golang framework itself, but while working on something else I just discovered that it's part of the "x/sys" package: https://pkg.go.dev/golang.org/x/[email protected]/windows/svc I bet it will solve the issue! :)
@breed808 I can finally take some time to take a deeper look. For now my tests are not great, but at least I'm able to build and launch win exporter in a container!
OK... I thought IsWindowsService() was used, but I just discovered I was completely off track! To summarize: if you launch win_exporter inside a container from the CLI it works, but if you launch it as a service it crashes, or times out, or never starts (hard to say precisely). EDIT: my bad, your PR was never merged, but you have a commit on master... a5f22eb
Have you tried v0.20? It should include my change, which attempts to work around the issue you seem to be describing by 'starting' the Windows service as early as possible rather than waiting for all the dependencies to load. On a 'typical' Windows server this was happening due to a lack of resources (CPU) to load the dependencies within the 30s timeout for a Windows service, so you may have been hitting a similar issue in containers?
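The "start as early as possible" idea from this comment can be sketched as follows. This is only an illustration under stated assumptions, not the exporter's actual code: `State`, `startEarly`, the `report` callback, and the simulated loader are all hypothetical names, and a real Windows service would report its status to the service control manager instead of appending to a slice.

```go
package main

import "fmt"

// State is a hypothetical stand-in for the status a service reports
// to its service manager.
type State string

const (
	StartPending State = "start_pending"
	Running      State = "running"
)

// startEarly reports Running through report before calling load, so the
// acknowledgement does not depend on how long load takes.
// It returns whatever error load returns.
func startEarly(report func(State), load func() error) error {
	report(StartPending)
	report(Running) // acknowledge the start first
	return load()   // then do the slow dependency loading
}

func main() {
	var events []string
	err := startEarly(
		func(s State) { events = append(events, string(s)) },
		func() error {
			// Placeholder for slow work such as collector setup.
			events = append(events, "dependencies_loaded")
			return nil
		},
	)
	fmt.Println(events, err)
}
```

The design point is ordering only: the status report happens before any work that could exceed the 30s limit. It does not help if the process itself takes too long to reach this code, which matches the residual failures described later in the thread.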
#551 contains a lot of the background on this.
Awesome @jammiemil, I also face that issue on regular hosts! Moreover, I have the feeling we never enter the init() function of the initiate package, because I don't see any log like
Yeah, I saw that happen occasionally even with the rejigged init that tries to start ASAP. Ultimately, the workaround I put in place stops a good chunk of the startup failures, but it can still happen because under certain conditions the underlying golang subroutines can take more than 30 seconds to start. As far as I can tell, there's pretty much nothing you can do about that in any particular codebase, but I will admit my golang abilities are limited, so I'm very much open to a more robust solution. I know the guys working on Grafana Agent are trying to come up with something a little more robust than my 'fudge' to resolve the same issue in that repo.
Slight correction to my previous comment: the delay is either in the underlying golang subroutines OR in the remaining dependencies like sys.
So! Thanks a lot @jammiemil for cross-referencing the information here! I finally reverted the initiator to use
I created an issue on the x/sys lib to track that error: golang/go#56335
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.
Hi all. Any clue how I can mitigate this issue? :) Many thanks in advance.
Hello,
While testing an upgrade, I discovered that versions >= 0.17.0 can no longer start as a service inside a container.
If I launch it from the CLI it works, but as a service it fails.
I strongly suspect #863: even though IsWindowsService is the recommended practice, it seems to have been rewritten fairly recently in the Golang codebase due to bugs.
So a Golang update might solve the issue; otherwise, a workaround to "force" service mode should be considered.
There is a similar trick in the otel collector project: https://github.com/open-telemetry/opentelemetry-collector/blob/7ed3f75ef84d9e9d11b175a0859060f765faca0b/docs/troubleshooting.md#startup-failing-in-windows-docker-containers and it is used here: https://github.com/open-telemetry/opentelemetry-collector/blob/4439e9b49c4de55bdc050ee4928b5b0c79c317cb/cmd/builder/internal/builder/templates/main_windows.go.tmpl#L32
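A workaround of the kind suggested above could look roughly like this. It is a minimal sketch, not a proposal for the final implementation: the variable name `EXPORTER_FORCE_SERVICE` and the function `runAsService` are hypothetical, and the `detect` argument stands in for automatic detection (on Windows, a call such as `svc.IsWindowsService` from `golang.org/x/sys/windows/svc` would be passed there).

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// forceServiceEnv is a hypothetical variable name chosen for this sketch.
const forceServiceEnv = "EXPORTER_FORCE_SERVICE"

// runAsService decides whether to start in service mode. A set, parseable
// boolean in the environment wins; otherwise detect is consulted.
func runAsService(lookup func(string) (string, bool), detect func() (bool, error)) (bool, error) {
	if raw, ok := lookup(forceServiceEnv); ok {
		if forced, err := strconv.ParseBool(raw); err == nil {
			return forced, nil
		}
		// Unparseable values are ignored and detection is used instead.
	}
	return detect()
}

func main() {
	// Placeholder detection so the sketch runs on any platform.
	detect := func() (bool, error) { return false, nil }

	asService, err := runAsService(os.LookupEnv, detect)
	if err != nil {
		fmt.Println("service detection failed:", err)
		return
	}
	fmt.Println("run as service:", asService)
}
```

With this shape, setting the variable to `true` or `false` in the container definition would bypass detection in either direction, while leaving it unset keeps the current behavior.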