-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix loki source file windows #2282
base: main
Are you sure you want to change the base?
Conversation
💻 Deploy preview available: https://deploy-preview-alloy-2282-zb444pucvq-vp.a.run.app/docs/alloy/latest/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -143,6 +144,24 @@ func (c *Component) Run(ctx context.Context) error { | |||
c.mut.RUnlock() | |||
}() | |||
|
|||
tickerChan := make(chan struct{}) | |||
if c.args.RetryInterval > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesnt handle updating the RetryInterval after running, can we add a test for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to handle updating
|
||
The `encoding` argument must be a valid [IANA encoding][] name. If not set, it | ||
defaults to UTF-8. | ||
|
||
You can use the `tail_from_end` argument when you want to tail a large file without reading its entire content. | ||
When set to true, only new logs will be read, ignoring the existing ones. | ||
|
||
`retry_interval` is deactivated by default (`"0s"`). This should be set on Windows-based systems when rotating files with the same names because the | ||
component will no try to re-open the files and the targets won't be updated because of the cache. | ||
This is not needed on Unix-like systems because the component will always try to re-open the deleted files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drafting up a new suggestion.
We can state the specific OSes we support.
This sentence (and the overall description) isn't really clear to me. Why will loki.source_file
try to reopen deleted files? The description in the table says the arg will tell the component to try to reopen closed files. Which is it? I assume it's closed files since trying to reopen deleted files doesn't really make sense. Why won't the component try to reopen closed files on Windows? It's also not really clear to me. Windows caches the files? And the other OSes do not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a case where a user is rotating a "app.log" file everytime it reaches 5MB by deleting it and creating a new app.log file.
On Windows you must first close the file before deleting it. Closing the file will stop the tailer in the loki.source.file component. If the new file would have a different name, the local.file_match would send it to the loki.source.file component and it would work fine. But for the local.file_match component nothing changed because it only checks at a regular interval and it will just see that there is still an "app.log" file. Because of this, it won't notify the loki.source.file component to start a new tailer and the file won't be tailed (it only notifies the next component when the set of files changed).
On Unix-like systems, you can delete the file without closing it. Because the file is not closed, the tailer is not stopped, it continuously tries to read the file until the local.file_match sends an update. In the case of the user, it will restart the tailer once the new "app.log" file is created because of the names it thinks that it's the same file.
I made it updatable but the tests are flaky and I'm not sure why... Also I'm not super happy with the solution, adding an argument for an edge case on Windows does not feel great. I'm tempted to just put a Note in the doc for now and maybe we could plan a refactor of this component to use a pool of worker as you suggested and at the same time see whether we can tackle this edge case in a different way @mattdurham ? |
PR Description
On Windows, when a file that
loki.source.file
was tailing is deleted, the tailer will be stopped. On Unix-like systems, the tailer will stay alive and keep retrying to open the file.This is ok in most cases, but in the specific scenario where you have a loki.source.file receiving targets from a local.file_match component and that you rotate the file by deleting/re-creating it with the same name and within the same the sync_period, the local.file_match won't send an update because the targets are still the same. As a result, on Windows the loki.source.file will not tail the file anymore. (On Unix-like system, it would recover because it would manage to re-open the file).
This PR adds an optional argument
retry_interval
(default 0) to restart the readers on a tick when they stop. By default, there is no tick so that existing implementations are not affected.Which issue(s) this PR fixes
Notes to the Reviewer
PR Checklist