-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[1.0.2] SHiP: Fix hang on shutdown #816
Merged
Merged
Changes from 5 commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
32c5b87
GH-815 Add a test that kills ship clients and make sure nodeos is una…
heifner c09ec2a
GH-815 Attempt to make test fail more often by increasing number of c…
heifner 6350739
GH-815 shutdown nodeos during shutdown of clients
heifner 80d5aef
GH-815 Use state_history_plugin thread for socket
heifner 49492dc
GH-815 Modify connections on the main thread. Use explicit strand for…
heifner 9736f6f
GH-815 Make the test always run in Savanna mode
heifner 2641515
GH-815 Revert usage of strand. Needs to be done in fc::create_listener
heifner 6df1c11
GH-815 Cleanup test
heifner File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,132 @@ | ||
#!/usr/bin/env python3 | ||
|
||
import time | ||
import json | ||
import os | ||
import shutil | ||
import signal | ||
import sys | ||
|
||
from TestHarness import Account, Cluster, TestHelper, Utils, WalletMgr | ||
from TestHarness.TestHelper import AppArgs | ||
|
||
############################################################### | ||
# ship_kill_client_test | ||
# | ||
# Setup a nodeos with SHiP (state_history_plugin). | ||
# Connect a number of clients and then kill the clients and shutdown nodoes. | ||
# nodeos should exit cleanly and not hang or SEGfAULT. | ||
# | ||
############################################################### | ||
|
||
Print=Utils.Print | ||
|
||
appArgs = AppArgs() | ||
extraArgs = appArgs.add(flag="--num-clients", type=int, help="How many ship_streamers should be started", default=1) | ||
args = TestHelper.parse_args({"--activate-if","--dump-error-details","--keep-logs","-v","--leave-running","--unshared"}, applicationSpecificArgs=appArgs) | ||
linh2931 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
Utils.Debug=args.v | ||
cluster=Cluster(unshared=args.unshared, keepRunning=args.leave_running, keepLogs=args.keep_logs) | ||
activateIF=args.activate_if | ||
dumpErrorDetails=args.dump_error_details | ||
walletPort=TestHelper.DEFAULT_WALLET_PORT | ||
|
||
totalProducerNodes=2 | ||
linh2931 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
totalNonProducerNodes=1 | ||
totalNodes=totalProducerNodes+totalNonProducerNodes | ||
|
||
walletMgr=WalletMgr(True, port=walletPort) | ||
testSuccessful=False | ||
|
||
WalletdName=Utils.EosWalletName | ||
shipTempDir=None | ||
|
||
try: | ||
TestHelper.printSystemInfo("BEGIN") | ||
|
||
cluster.setWalletMgr(walletMgr) | ||
Print("Stand up cluster") | ||
|
||
shipNodeNum = 2 | ||
specificExtraNodeosArgs={} | ||
specificExtraNodeosArgs[shipNodeNum]="--plugin eosio::state_history_plugin --trace-history --chain-state-history --finality-data-history --state-history-stride 200 --plugin eosio::net_api_plugin --plugin eosio::producer_api_plugin " | ||
|
||
if cluster.launch(pnodes=totalProducerNodes, loadSystemContract=False, | ||
totalNodes=totalNodes, totalProducers=totalProducerNodes, activateIF=activateIF, biosFinalizer=False, | ||
specificExtraNodeosArgs=specificExtraNodeosArgs) is False: | ||
Utils.cmdError("launcher") | ||
Utils.errorExit("Failed to stand up cluster.") | ||
|
||
# verify nodes are in sync and advancing | ||
cluster.waitOnClusterSync(blockAdvancing=5) | ||
Print("Cluster in Sync") | ||
|
||
prodNode0 = cluster.getNode(0) | ||
nonprodNode = cluster.getNode(1) | ||
linh2931 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
shipNode = cluster.getNode(shipNodeNum) | ||
|
||
# cluster.waitOnClusterSync(blockAdvancing=3) | ||
start_block_num = shipNode.getBlockNum() | ||
|
||
#verify nodes are in sync and advancing | ||
cluster.waitOnClusterSync(blockAdvancing=3) | ||
Print("Shutdown unneeded bios node") | ||
cluster.biosNode.kill(signal.SIGTERM) | ||
|
||
Print("Configure and launch txn generators") | ||
targetTpsPerGenerator = 10 | ||
testTrxGenDurationSec=60*60 | ||
numTrxGenerators=2 | ||
cluster.launchTrxGenerators(contractOwnerAcctName=cluster.eosioAccount.name, acctNamesList=[cluster.defproduceraAccount.name, cluster.defproducerbAccount.name], | ||
acctPrivKeysList=[cluster.defproduceraAccount.activePrivateKey,cluster.defproducerbAccount.activePrivateKey], nodeId=nonprodNode.nodeId, | ||
tpsPerGenerator=targetTpsPerGenerator, numGenerators=numTrxGenerators, durationSec=testTrxGenDurationSec, | ||
waitToComplete=False) | ||
|
||
status = cluster.waitForTrxGeneratorsSpinup(nodeId=nonprodNode.nodeId, numGenerators=numTrxGenerators) | ||
assert status is not None and status is not False, "ERROR: Failed to spinup Transaction Generators" | ||
|
||
nonprodNode.waitForProducer("defproducera") | ||
|
||
block_range = 100000 # we are going to kill the client, so just make this a huge number | ||
end_block_num = start_block_num + block_range | ||
|
||
shipClient = "tests/ship_streamer" | ||
cmd = f"{shipClient} --start-block-num {start_block_num} --end-block-num {end_block_num} --fetch-block --fetch-traces --fetch-deltas --fetch-finality-data" | ||
if Utils.Debug: Utils.Print(f"cmd: {cmd}") | ||
clients = [] | ||
files = [] | ||
shipTempDir = os.path.join(Utils.DataDir, "ship") | ||
os.makedirs(shipTempDir, exist_ok = True) | ||
shipClientFilePrefix = os.path.join(shipTempDir, "client") | ||
|
||
for i in range(0, args.num_clients): | ||
start = time.perf_counter() | ||
linh2931 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
outFile = open(f"{shipClientFilePrefix}{i}.out", "w") | ||
errFile = open(f"{shipClientFilePrefix}{i}.err", "w") | ||
Print(f"Start client {i}") | ||
popen=Utils.delayedCheckOutput(cmd, stdout=outFile, stderr=errFile) | ||
clients.append((popen, cmd)) | ||
files.append((outFile, errFile)) | ||
Print(f"Client {i} started, Ship node head is: {shipNode.getBlockNum()}") | ||
|
||
|
||
# allow time for all clients to connect | ||
shipNode.waitForHeadToAdvance(5) | ||
shipNode.waitForLibToAdvance() | ||
|
||
Print(f"Kill all {args.num_clients} clients and ship node") | ||
for index, (popen, _) in zip(range(len(clients)), clients): | ||
popen.kill() | ||
if index == len(clients)/2: | ||
shipNode.kill(signal.SIGTERM) | ||
assert not shipNode.verifyAlive(), "ship node did not shutdown" | ||
|
||
testSuccessful = True | ||
finally: | ||
TestHelper.shutdown(cluster, walletMgr, testSuccessful=testSuccessful, dumpErrorDetails=dumpErrorDetails) | ||
if shipTempDir is not None: | ||
if testSuccessful and not args.keep_logs: | ||
shutil.rmtree(shipTempDir, ignore_errors=True) | ||
|
||
errorCode = 0 if testSuccessful else 1 | ||
exit(errorCode) |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes here seem to make only a single strand for all connections. And then this single strand is moved here? Doesn't look right. I think what you had before was better even though it still didn't make the websocket's default executor be a strand.