[PM-1903] Pegasus ensemble manager port is out of range on big systems #2016

Open
mayani opened this issue Dec 14, 2024 · 0 comments
Assignees
Labels
affects-5.0.4 Ensemble Manager fix-master Current Trunk Version major Major loss of function. sync-from-jira Synced from Jira

Comments

mayani commented Dec 14, 2024

When the pegasus-em server starts, it tries to use a port derived from the user ID on the system. On systems with many users, the user ID can be large enough that the derived port falls outside the valid range (the maximum port number is 65535), so the service starts on a random free port (I assume).

When we try to interact with the service via the CLI (e.g., pegasus-em create runs), the client tries to connect to the port number constructed from the user ID, and the request fails.

Relevant code line: https://github.com/pegasus-isi/pegasus/blob/master/packages/pegasus-python/src/Pegasus/service/ensembles/commands.py#L18
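
To make the problem concrete, here is a minimal sketch of how a port derived from the user ID can overflow, together with one possible way to keep it in range. The names `naive_port` and `bounded_port`, the base offset of 5000, and the modulo mapping are all assumptions for illustration; they are not taken from the Pegasus code and this is not a proposed patch.

```python
MAX_PORT = 65535        # largest valid TCP port number
FIRST_USER_PORT = 1024  # ports below this are privileged on most Unix systems


def is_valid_port(port: int) -> bool:
    """Return True if the value fits in the 16-bit TCP port range."""
    return 0 < port <= MAX_PORT


def naive_port(uid: int, base: int = 5000) -> int:
    """Hypothetical derivation: a fixed base plus the user ID.

    The base value is an assumption; the point is only that a large
    UID pushes the sum past MAX_PORT.
    """
    return base + uid


def bounded_port(uid: int, base: int = 5000) -> int:
    """Hypothetical alternative: wrap the derived value into the
    unprivileged range so client and server compute the same valid port.

    Two different UIDs can map to the same port, so a real fix would
    still need to handle collisions.
    """
    span = MAX_PORT - FIRST_USER_PORT + 1
    return FIRST_USER_PORT + (base + uid) % span


if __name__ == "__main__":
    # 108751 is the port that appears in the error reported in this issue.
    print(is_valid_port(108751))
    # A large, made-up UID for illustration only.
    uid = 103751
    print(naive_port(uid), is_valid_port(naive_port(uid)))
    print(bounded_port(uid), is_valid_port(bounded_port(uid)))
```

Whatever mapping is chosen, the server and the CLI would have to compute the port the same way, otherwise the client still connects to the wrong place.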

Example error:
Traceback (most recent call last):
  File "/usr/lib64/pegasus/externals/python/requests/models.py", line 380, in prepare_url
    scheme, auth, host, port, path, query, fragment = parse_url(url)
  File "/usr/lib64/pegasus/externals/python/urllib3/util/url.py", line 392, in parse_url
    return six.raise_from(LocationParseError(source_url), None)
  File "<string>", line 3, in raise_from
urllib3.exceptions.LocationParseError: Failed to parse: http://127.0.0.1:108751/ensembles

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.6/site-packages/Pegasus/cli/pegasus-em.py", line 6, in <module>
    main()
  File "/usr/lib64/python3.6/site-packages/Pegasus/service/ensembles/commands.py", line 699, in main
    EnsembleCommand().main()
  File "/usr/lib64/python3.6/site-packages/Pegasus/command.py", line 113, in main
    cmd.main(args)
  File "/usr/lib64/python3.6/site-packages/Pegasus/command.py", line 28, in main
    self.run()
  File "/usr/lib64/python3.6/site-packages/Pegasus/service/ensembles/commands.py", line 186, in run
    response = self.post("/ensembles", data=request)
  File "/usr/lib64/python3.6/site-packages/Pegasus/service/ensembles/commands.py", line 54, in post
    return self._request("post", path, **kwargs)
  File "/usr/lib64/python3.6/site-packages/Pegasus/service/ensembles/commands.py", line 37, in _request
    response = requests.request(method, url, **defaults)
  File "/usr/lib64/pegasus/externals/python/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib64/pegasus/externals/python/requests/sessions.py", line 516, in request
    prep = self.prepare_request(req)
  File "/usr/lib64/pegasus/externals/python/requests/sessions.py", line 459, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/usr/lib64/pegasus/externals/python/requests/models.py", line 314, in prepare
    self.prepare_url(url, params)
  File "/usr/lib64/pegasus/externals/python/requests/models.py", line 382, in prepare_url
    raise InvalidURL(*e.args)
requests.exceptions.InvalidURL: Failed to parse: http://127.0.0.1:108751/ensembles
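
The URL rejection in the traceback above can be reproduced without a running server, because it happens while the URL is parsed and before any connection is attempted. The sketch below uses only the Python standard library rather than requests or urllib3; `port_error` is a hypothetical helper name, and the second URL uses port 5000 purely as an example of an in-range value.

```python
from urllib.parse import urlsplit


def port_error(url: str):
    """Return the parser's error message if the URL's port is invalid,
    otherwise None. No network access is involved."""
    try:
        # Accessing .port validates the range and raises ValueError
        # for values that do not fit in 0-65535.
        urlsplit(url).port
    except ValueError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # URL taken from the error message in this issue.
    print(port_error("http://127.0.0.1:108751/ensembles"))
    # Same URL with an in-range port, for comparison.
    print(port_error("http://127.0.0.1:5000/ensembles"))
```

A check like this in the CLI, run before the request is issued, could turn the long traceback into a short message saying that the computed port is out of range.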

Reporter: @papajim
Watchers:
@papajim
@vahi

@mayani mayani added sync-from-jira Synced from Jira Ensemble Manager affects-5.0.4 fix-master Current Trunk Version major Major loss of function. labels Dec 14, 2024
@mayani mayani changed the title PM-1903 [PM-1903] Pegasus ensemble manager port is out of range on big systems Dec 14, 2024
@mayani mayani self-assigned this Dec 14, 2024