This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Limit size of HeronTupleSet. #2253

Open. objmagic wants to merge 5 commits into base: master

Conversation

objmagic
Contributor

Fixes #2234. This limits the size of a pooled HeronTupleSet: if a message is larger than the maximum size, we release it back to the allocator instead of returning it to the memory pool.

Tested on local machine.
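
As a rough illustration of the idea described above (not the code in this PR), a pool can check an object's footprint at release time and free it rather than retain it when it exceeds a limit. `PooledMessage`, `BoundedPool`, and `max_bytes` are hypothetical names, and the string capacity merely stands in for a protobuf message's in-memory size.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a pooled protobuf message; not Heron's HeronTupleSet.
struct PooledMessage {
  std::string payload;
  // Rough in-memory footprint, playing the role of the real size check.
  std::size_t SpaceUsed() const { return sizeof(*this) + payload.capacity(); }
  // Drops the contents; the underlying allocation is typically kept.
  void Clear() { payload.clear(); }
};

// Hypothetical pool that refuses to retain objects larger than max_bytes.
class BoundedPool {
 public:
  explicit BoundedPool(std::size_t max_bytes) : max_bytes_(max_bytes) {}

  // Reuses a pooled object if one is available, otherwise allocates a new one.
  std::unique_ptr<PooledMessage> Acquire() {
    if (free_.empty()) return std::make_unique<PooledMessage>();
    auto msg = std::move(free_.back());
    free_.pop_back();
    return msg;
  }

  // Returns true if the message was kept for reuse, false if it was freed.
  bool Release(std::unique_ptr<PooledMessage> msg) {
    assert(msg != nullptr);
    // Oversized: do not pool it; the unique_ptr frees it when msg goes out of scope.
    if (msg->SpaceUsed() > max_bytes_) return false;
    msg->Clear();
    free_.push_back(std::move(msg));
    return true;
  }

  std::size_t pooled() const { return free_.size(); }

 private:
  std::size_t max_bytes_;
  std::vector<std::unique_ptr<PooledMessage>> free_;
};
```

The check happens on every release, so its cost sits on the hot path; that trade-off is what the review discussion in this thread is about.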

@objmagic objmagic requested a review from srkukarni August 28, 2017 19:00
@@ -431,7 +433,15 @@ void StMgrServer::HandleTupleSetMessage(Connection* _conn,
->incr_by(_message->control().fails_size());
}
stmgr_->HandleInstanceData(iter->second, instance_info_[iter->second]->local_spout_, _message);
__global_protobuf_pool_release__(_message);
auto message_size = _message->ByteSize();
Contributor

Should this be ByteSizeLong?

Contributor Author

I could not find the ByteSizeLong API in the generated code. Let me try again...

Contributor Author

It seems this was introduced in 3.1.0 and then deprecated in 3.4.0.

Contributor Author

Also, ByteSize calculates the size of the serialized message, which it seems will be slightly larger than the actual size in memory.

Contributor Author

@objmagic objmagic Aug 28, 2017

So if the message object occupies 60MB but only a small part of it is used to hold the incoming message (because the incoming message is small), we will not delete it, because I believe the serialized size is much smaller than 60MB. I'm not sure about...

@objmagic
Contributor Author

I ran a quick experiment:

  LOG(INFO) << _message->ByteSize();
  LOG(INFO) << _message->SpaceUsed();
  _message->Clear();
  LOG(INFO) << _message->ByteSize();
  LOG(INFO) << _message->SpaceUsed();

This gives us:

I0828 14:10:23.306396 2502742976 stmgr-server.cpp:437] 97
I0828 14:10:23.306404 2502742976 stmgr-server.cpp:438] 832
I0828 14:10:23.306416 2502742976 stmgr-server.cpp:440] 0
I0828 14:10:23.306422 2502742976 stmgr-server.cpp:441] 832
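
The same pattern (logical size drops to zero after clearing while the reserved memory stays put) can be reproduced with a standard container. The sketch below is only an analogy: `std::vector::size()` plays the role of `ByteSize()` and `capacity()` plays the role of `SpaceUsed()`. `Footprint`, `Measure`, and `FillThenClear` are hypothetical names, and none of this touches the protobuf API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Footprint {
  std::size_t logical_bytes;   // analogous to ByteSize(): the content currently held
  std::size_t reserved_bytes;  // analogous to SpaceUsed(): the memory kept allocated
};

Footprint Measure(const std::vector<char>& buf) {
  return {buf.size(), buf.capacity()};
}

// Fills a buffer with n bytes, clears it, and reports the footprint before and after.
void FillThenClear(std::size_t n, Footprint* before, Footprint* after) {
  assert(before != nullptr && after != nullptr);
  std::vector<char> buf(n, 'x');
  *before = Measure(buf);
  buf.clear();  // drops the contents but keeps the allocation
  *after = Measure(buf);
}
```

An object that once held a large payload therefore keeps its large allocation when it goes back into a pool, even though its logical size reads as zero.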

@srkukarni
Contributor

That's an interesting observation. So ByteSize is probably tied to the wire-format size, while SpaceUsed reflects the actual in-memory representation. In that case, shouldn't you use SpaceUsed? But isn't that very slow?

@objmagic
Contributor Author

@srkukarni It seems we have no choice but to use SpaceUsed here. We can run a benchmark to measure its performance.

@huijunw
Contributor

huijunw commented Aug 28, 2017

If performance is a concern, I would suggest the old method: put it in the mempool and run a garbage collection pass against the mempool every minute to remove large tuples.
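
A minimal sketch of that alternative, assuming a pool that is simply a vector of reusable buffers: objects are always returned to the pool, and a separate sweep removes oversized ones. `Buffer`, `SweepLargeBuffers`, and `max_bytes` are hypothetical names; this is not Heron's mempool, and the timer that would invoke the sweep is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pooled buffer; its capacity stands in for the in-memory footprint.
using Buffer = std::string;

// Drops every pooled buffer whose capacity exceeds max_bytes and returns how
// many were removed. Meant to be called from a periodic timer, off the hot path.
std::size_t SweepLargeBuffers(std::vector<Buffer>* pool, std::size_t max_bytes) {
  assert(pool != nullptr);
  const std::size_t before = pool->size();
  pool->erase(std::remove_if(pool->begin(), pool->end(),
                             [max_bytes](const Buffer& b) {
                               return b.capacity() > max_bytes;
                             }),
              pool->end());
  return before - pool->size();
}
```

Compared with checking the size on every release, this moves the cost to a periodic pass, at the price of oversized objects lingering in the pool until the next sweep.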

@objmagic
Contributor Author

objmagic commented Aug 28, 2017

Some benchmark results for SpaceUsed. Sizes are in bytes. It doesn't look bad, @srkukarni:

I0828 15:22:59.759600 2502742976 stmgr-server.cpp:439] size: 1105717
I0828 15:22:59.759625 2502742976 stmgr-server.cpp:440] time: 5 microseconds
I0828 15:22:59.967779 2502742976 stmgr-server.cpp:439] size: 1250892
I0828 15:22:59.967838 2502742976 stmgr-server.cpp:440] time: 10 microseconds
I0828 15:23:00.128224 2502742976 stmgr-server.cpp:439] size: 1250892
I0828 15:23:00.128247 2502742976 stmgr-server.cpp:440] time: 5 microseconds
I0828 15:23:00.290628 2502742976 stmgr-server.cpp:439] size: 1250892
I0828 15:23:00.290647 2502742976 stmgr-server.cpp:440] time: 4 microseconds
I0828 15:23:00.485448 2502742976 stmgr-server.cpp:439] size: 2150892
I0828 15:23:00.485477 2502742976 stmgr-server.cpp:440] time: 6 microseconds
I0828 15:23:00.642523 2502742976 stmgr-server.cpp:439] size: 2150892
I0828 15:23:00.642571 2502742976 stmgr-server.cpp:440] time: 5 microseconds
I0828 15:23:00.818881 2502742976 stmgr-server.cpp:439] size: 2150892
I0828 15:23:00.818902 2502742976 stmgr-server.cpp:440] time: 5 microseconds
I0828 15:23:00.970676 2502742976 stmgr-server.cpp:439] size: 3100892
I0828 15:23:00.970695 2502742976 stmgr-server.cpp:440] time: 4 microseconds
I0828 15:23:01.183621 2502742976 stmgr-server.cpp:439] size: 4000892
I0828 15:23:01.183640 2502742976 stmgr-server.cpp:440] time: 3 microseconds
I0828 15:23:01.350108 2502742976 stmgr-server.cpp:439] size: 4000892
I0828 15:23:01.350129 2502742976 stmgr-server.cpp:440] time: 4 microseconds
I0828 15:23:01.512719 2502742976 stmgr-server.cpp:439] size: 4000892
I0828 15:23:01.512740 2502742976 stmgr-server.cpp:440] time: 4 microseconds
I0828 15:23:01.721442 2502742976 stmgr-server.cpp:439] size: 4000892
I0828 15:23:01.721462 2502742976 stmgr-server.cpp:440] time: 3 microseconds
I0828 15:23:01.896715 2502742976 stmgr-server.cpp:439] size: 4000892
I0828 15:23:01.896742 2502742976 stmgr-server.cpp:440] time: 7 microseconds
I0828 15:23:02.010160 2502742976 stmgr-server.cpp:439] size: 4000892
I0828 15:23:02.010188 2502742976 stmgr-server.cpp:440] time: 6 microseconds
I0828 15:23:02.057590 2502742976 stmgr-server.cpp:439] size: 4000892
I0828 15:23:02.057623 2502742976 stmgr-server.cpp:440] time: 6 microseconds
I0828 15:23:02.272822 2502742976 stmgr-server.cpp:439] size: 4000892
I0828 15:23:02.272848 2502742976 stmgr-server.cpp:440] time: 6 microseconds
I0828 15:23:02.437445 2502742976 stmgr-server.cpp:439] size: 4000892
I0828 15:23:02.437464 2502742976 stmgr-server.cpp:440] time: 4 microseconds
I0828 15:23:02.597956 2502742976 stmgr-server.cpp:439] size: 4000892
I0828 15:23:02.598012 2502742976 stmgr-server.cpp:440] time: 8 microseconds
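
For reference, per-call timings like the ones above could be collected with a small wrapper around `std::chrono::steady_clock`. This is a sketch of one way to do it, not the instrumentation actually used for these logs; `TimeMicros` is a hypothetical helper, and the callable passed in would be whatever size computation is being measured.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>

// Times a single call to fn and returns {result of fn, elapsed microseconds}.
template <typename Fn>
auto TimeMicros(Fn&& fn) {
  const auto start = std::chrono::steady_clock::now();
  auto result = fn();
  const auto end = std::chrono::steady_clock::now();
  const auto micros =
      std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
  assert(micros >= 0);  // steady_clock is monotonic, so this should always hold
  return std::make_pair(result, micros);
}
```

A single-digit microsecond reading sits close to the resolution of this kind of measurement, so averaging over many calls would give a more reliable figure than individual samples.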

@srkukarni
Contributor

Could you also share the throughput figures? Particularly for the Exclamation topology or other topologies that are used for Heron benchmarking?

@objmagic
Contributor Author

Word count topology, parallelism=20:

Stmgr CPU user time doubled, and Data Tuples from Instances dropped by about 25%.

We need to fix #1908 first to see more metrics.

3 participants