Too small batch_count may lead to excessive memory usage #5

saaj opened this issue Feb 15, 2018 · 1 comment

saaj commented Feb 15, 2018

The default configuration of logbeam that is passed to cwlogs is the following:

buffer_duration = 10000
batch_count = 10
batch_size = 1024 * 1024

AWS's default configuration is:
buffer_duration = 5000
batch_count = 1000
batch_size = 32768

logbeam sets batch_size to the maximum but sets batch_count too low, which may lead (and does in our case) to slow log queue processing and excessive memory usage. It may be beneficial to set the default batch_count to at least 1000.
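
To give a rough sense of the difference, here is a back-of-the-envelope sketch. It assumes each flush sends at most batch_count events and that batch_size is never the binding limit; the helper name and the backlog figure are hypothetical, and actual cwlogs behaviour has not been checked against this.

```python
def batches_needed(pending_events, batch_count):
    """Batches required to drain a backlog when only the count limit applies."""
    # Integer ceiling division, so it behaves the same on Python 2 and 3.
    return (pending_events + batch_count - 1) // batch_count


backlog = 50000  # hypothetical number of queued events after a logging spike

logbeam_default = batches_needed(backlog, 10)    # batch_count = 10
aws_default = batches_needed(backlog, 1000)      # batch_count = 1000

print(logbeam_default, aws_default)
```

Under these assumptions the same backlog needs 100 times as many batches with batch_count = 10, so events stay queued in memory for longer.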

saaj commented Feb 19, 2018

With a little more research into the excessive memory usage, it turns out that even a moderate average logging volume with spikes can lead to it (we've seen over 1 GiB above the normal memory profile with logbeam enabled). Memory profiling with dozer and pyrasite shows that all LogEvents were eventually submitted and garbage collected, but the memory was not released back to the underlying memory allocator (which Python is known for, e.g. see this SO question). Python 2 seems to be more affected than Python 3, which has received a few garbage collection improvements.

To alleviate excessive memory usage, it is possible to use a compatible file-based queue, like pqueue. Currently logbeam can be monkeypatched as follows:

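import tempfile
from argparse import Namespace

import logbeam
import pqueue


class FileQueue(pqueue.Queue):
    """Queue that keeps pending items on disk in a fresh temporary directory."""

    def __init__(self):
        pqueue.Queue.__init__(self, tempfile.mkdtemp())


# Swap the Queue name inside logbeam for a namespace that exposes the
# file-based queue class and the matching Empty exception.
logbeam.Queue = Namespace(Queue=FileQueue, Empty=pqueue.Empty)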

Thus it would be useful to be able to pass a queue class to logbeam. Note that logbeam and cwlogs rely on the stdlib's queue exceptions, Full and Empty. pqueue reuses the stdlib's exceptions. Other file-based queues, like the sqlite-based persist-queue, may have their own Full and Empty, but these can still be coerced in a subclass, and this doesn't undermine the need for control over the queue implementation.
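
A minimal sketch of that kind of coercion follows. BackendQueue and BackendEmpty are hypothetical stand-ins for a third-party queue and its own exception, not the API of persist-queue or any other real library; only the translation in the subclass is the point.

```python
try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2


class BackendEmpty(Exception):
    """Hypothetical exception a third-party queue raises when it has no items."""


class BackendQueue(object):
    """Hypothetical stand-in for a file-based queue with its own Empty."""

    def __init__(self):
        self._items = []

    def put(self, item):
        self._items.append(item)

    def get(self, block=True, timeout=None):
        # This stand-in never blocks; it raises as soon as it is empty.
        if not self._items:
            raise BackendEmpty()
        return self._items.pop(0)


class StdlibCompatibleQueue(BackendQueue):
    """Subclass that re-raises the backend's Empty as the stdlib's Empty."""

    def get(self, block=True, timeout=None):
        try:
            return BackendQueue.get(self, block, timeout)
        except BackendEmpty:
            raise queue.Empty()
```

Code that catches the stdlib's Empty can then consume StdlibCompatibleQueue unchanged; Full could be translated the same way in put.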
