Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory Optimizations #131

Open
wants to merge 12 commits into
base: develop
Choose a base branch
from
Open

Memory Optimizations #131

wants to merge 12 commits into from

Conversation

iyourshaw
Copy link
Collaborator

@iyourshaw iyourshaw commented Jan 3, 2025

Optimize CIMMS memory usage.

Implements several best practices for managing memory usage.

Java Memory

Updates JVM options in the Dockerfile:

  • Explicitly set the maximum Java heap space to 50% of the total container memory.
  • Adds garbage collection and metaspace options similar to the recommendation for Kafka to minimize GC pauses:
    https://kafka.apache.org/documentation.html#java

RocksDB Native Memory

Implements Confluent recommendations for managing RocksDB state store memory.

  • Use the recommended jemalloc allocator.
  • Implements BoundedMemoryRocksDBConfig to be able to configure total RocksDB memory usage to a finite value. Attempted to choose reasonable default values. Configurable with environment variables.

@iyourshaw iyourshaw changed the title Memory usage Memory Optimizations Jan 4, 2025
@iyourshaw iyourshaw marked this pull request as ready for review January 4, 2025 00:12
@iyourshaw iyourshaw requested a review from Michael7371 January 4, 2025 00:12
@@ -15,9 +15,11 @@ services:
MAVEN_GITHUB_TOKEN: ${MAVEN_GITHUB_TOKEN:?error}
MAVEN_GITHUB_ORG: ${MAVEN_GITHUB_ORG:?error}
image: jpo-conflictmonitor:latest
privileged: true # Set true to allow writing to /proc/sys/vm/drop_caches
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the rocksdb variables be passed into the conflict monitor in the docker compose?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes they should, good catch

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conflict visualizer has another docker-compose file that brings up the conflict monitor. Do we need to add this parameter to that file as well? This change would not be part of this PR, but would involve a separate PR to the conflict visualizer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "privileged" setting was mainly for testing. I changed it to false by default. It doesn't need to be included anywhere else (and shouldn't be set in production). But the new environment variables below, starting with "ROCKSDB_" should be included anywhere there is a docker compose for cimms.

Dockerfile Outdated
# Use jemalloc for RocksDB per Confluent recommendation:
# https://docs.confluent.io/platform/current/streams/developer-guide/memory-mgmt.html#rocksdb
RUN amazon-linux-extras install -y epel
RUN yum install -y jemalloc-devel
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be best to combine these in one line and remove the downloaded apt lists, relevant docker best practices

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, combined the commands, but amazon linux uses yum, not apt, so can't delete apt lists (and I can't find any info about if yum has some equivalent that should be removed)

Copy link
Collaborator

@Michael7371 Michael7371 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good just minor comments

kafka:
condition: service_healthy
required: false
# deduplicator:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to remove the deduplicator as part of this PR? Or should we remove it as a part of a separate PR? In either case, we should probably delete it entirely (including removing the source code) instead of just commenting it out.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had commented it out because, although it's not related to this PR, it was causing a conflict after pulling in the latest jpo-utils/develop branch. Now deleted entirely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants