-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python-based version of the QLever script #17
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first time I ran it, I just got this message and it kept running forever:
Follow olympics.server-log.txt until the server is ready (Ctrl-C stops following the log, but not the server)
nohup: ServerMain: No such file or directory
I think it would be good to check if the configured binary is found and if not, give a message to tell the user how to set PATH
, QLEVER_HOME
, or [server] binary
- whatever you think is the recommended method. Also, I am using the recommended directory structure of qlever-code
, qlever-control
, and qlever-indices
so it would be nice if it worked "out of the box" for people who set up that directory structure.
Index test
First, I tried PATH=../../qlever-code/build:$PATH ../qlever.py index
but it gave a strange error:
libc++abi: terminating due to uncaught exception of type std::runtime_error: U_FILE_ACCESS_ERROR
cat: stdout: Broken pipe
This could be some problem with my compiled binary of IndexBuilderMain
which is quite out of date at this point. If I get more time, I'll rebuild it and test this again.
Luckily, when I set USE_DOCKER = true
in the .ini file, it worked perfectly and built the index!
Start test
With docker disabled, I got another error message at first:
Action: "start"
nohup ServerMain -i olympics -j 8 -p 7019 -m 5 -c 2 -e 1 -k 100 -a olympics_7643543846 > olympics.server-log.txt 2>&1 &
Follow olympics.server-log.txt until the server is ready (Ctrl-C stops following the log, but not the server)
2023-08-28 17:32:58.879 - INFO: QLever Server, compiled on Fri Jun 9 09:58:04 PDT 2023 using git hash 8d0639
2023-08-28 17:32:58.883 - ERROR: U_FILE_ACCESS_ERROR
Once again, when I set USE_DOCKER = true
in the file and try again, it works perfectly.
Stop
Stop is not working, but it warned me that "stop" isn't yet working on MacOS, so that's expected. Let me know if I can help with getting it to work, maybe after this PR is merged.
qlever.py
Outdated
@@ -0,0 +1,441 @@ | |||
#!/usr/bin/python3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On MacOS, I have a newer version of python installed with Homebrew, and it gets installed at the path /opt/homebrew/bin/python3
to not conflict with the older system python3 at /usr/bin/python3
. Likewise, people may have a virtualenv active in which case they want to use the venv's python, not this system python. So changing this to the following would let everyone run this as ./qlever.py
regardless of how they have python configured, I think:
#!/usr/bin/python3 | |
#!/usr/bin/env python3 |
Qleverfile.ini
Outdated
|
||
[data] | ||
BASE_URL = https://github.com/wallscope/olympics-rdf | ||
GET_DATA_CMD = wget -q -nc -O olympics.zip ${BASE_URL}/raw/master/data/olympics-nt-nodup.zip && unzip -q -o olympics.zip && rm olympics.zip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wget
is not installed on MacOS, but curl
is. Do you think we can use curl
instead? If not, let's recommend installing wget
along with psutil
before using this.
Action: "get-data"
wget -q -nc -O olympics.zip https://github.com/wallscope/olympics-rdf/raw/master/data/olympics-nt-nodup.zip && unzip -q -o olympics.zip && rm olympics.zip
sh: wget: command not found
GET_DATA_CMD = wget -q -nc -O olympics.zip ${BASE_URL}/raw/master/data/olympics-nt-nodup.zip && unzip -q -o olympics.zip && rm olympics.zip | |
GET_DATA_CMD = curl -L ${BASE_URL}/raw/master/data/olympics-nt-nodup.zip --output olympics.zip && unzip -q -o olympics.zip && rm olympics.zip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A partial review of the first ~150 lines of the Python script. It is very readable so far and makes it easy to reason about.
qlever.py
Outdated
self.config.read("Qleverfile.ini") | ||
self.name = self.config['DEFAULT']['name'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do those give reasonable error messages when there is no, or an ill-formed Qleverfile?
qlever.py
Outdated
defaults = { | ||
"DEFAULT": { | ||
"log-level": "info", | ||
}, | ||
"server": { | ||
"binary": "ServerMain", | ||
"num_threads": "8", | ||
"cache_max_size_gb": "5", | ||
"cache_max_size_gb_single_entry": "1", | ||
"cache_max_num_entries": "100", | ||
"with_text_index": "no", | ||
"only_pso_and_pos_permutations": "no", | ||
"no_patterns": "no", | ||
}, | ||
"index": { | ||
"binary": "IndexBuilderMain", | ||
"with_text_index": "no", | ||
"only_pso_and_pos_permutations": "no", | ||
"no_patterns": "no", | ||
}, | ||
"docker": { | ||
"image": "adfreiburg/qlever", | ||
"container_server": f"qlever.server.{self.name}", | ||
"container_indexer": f"qlever.indexer.{self.name}", | ||
}, | ||
"ui": { | ||
"port": "7000", | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest moving those out of this class right to the beginning of the file.
qlever.py
Outdated
for section in defaults: | ||
for option in defaults[section]: | ||
if not self.config[section].get(option): | ||
self.config[section][option] = defaults[section][option] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there also a check, that all the configurations from the Qleverfile are also valid options?
For example what happens if I use num_treads
in the QLeverfile or a similar typo? I think rasing a hard error would be the most user-friendly, but therefore we need the set of all supported options as a whitelist (similar to the defaults).
qlever.py
Outdated
except Exception: | ||
self.docker_enabled = False | ||
print("Note: `docker info` failed, therefore" | ||
" docker.USE_DOCKER=true not supported") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it USE_DOCKER
or use_docker
(the other settings are lowercase). Or is this whole config case-insensitive?
qlever.py
Outdated
print("Note: QLever binaries not found or failed, therefore" | ||
" docker.USE_DOCKER=false not supported") | ||
self.binaries_work = False | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we immediately terminate the script if we have neither a working Docker nor a working native binary setup? I think this would be the right position to give a useful error message here.
Also: the self.binaries_work
variable is set here, but never used as far as I see.
qlever.py
Outdated
|
||
# Default values for options that are not mandatory in the Qleverfile. | ||
defaults = { | ||
"DEFAULT": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it "DEFAULT" in caps, but the rest is lowercase?
qlever.py
Outdated
curl_cmd = f"curl -s http://localhost:{port}/ping?msg={message}" | ||
exit_code = subprocess.call(curl_cmd, shell=True, | ||
stdout=subprocess.DEVNULL, | ||
stderr=subprocess.DEVNULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use a cross-platform python http library for these things and subprocess
etc. only where really necessary?
Support for ad-freiburg/wharfer would also be nice. I don't have access to docker directly on some machines. Adding wharfer support is probably best done as a separate PR once this PR is merged. I would be willing to contribute this feature. |
@Qup42 |
@hannahbast Sure that's even better! Podman has been on my bucket list for some time now. :) Afaik dockers image format is compatible with podman. The cli options seem to differ quite a bit. So then it would be podman support (in a separate PR). |
The shell script we have so far was a proof of concept to play around with. The usability was great, but given the heavy use of bash-specific code, there were major portability issues with other operating systems. In particular, even the most basic functionality did not work on MacOS.
These are the first steps to rewrite the script in Python. As a welcome side effect, this also allows for cleaner code (bash scripts always end up very messy and cryptic when they grow bigger) and better testability.