-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cbt: creating a common base class for fio subclasses #318
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Chris Harris <[email protected]>
Adding @sseshasa as well since the refactoring changes seem important |
@harriscr can you kindly post the snapshots showing the tests that you run? Also, it would be very useful the class diagrams that you got from I have some problems to understand what you mean by:
I'm guessing the main difference between each FIO subclass instance would be the underlying FIO engine (eg.libaio, librd, etc.) Or you mean "block" vs 'file" IO? Could you illustrate with an example please? Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added comments to the points I consider might need amending/clarification. It looks good to me so far, but will be useful to have further insight from Mark and Seshasayee.
Thanks
'{:0>8}'.format(config.get('iteration')), | ||
'id-{}'.format(digest)) | ||
self.archive_dir = os.path.join( | ||
archive_dir, "results", "{:0>8}".format(config.get("iteration")), "id-{}".format(digest) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder whether a f-string would be better for the future instead of the old .format() style?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I plan to look at benchmark.py in the future, so will look at changing this line to an f-string when I do.
self.log_lat = config.get('log_lat', True) | ||
logger.info("Results dir: %s", self.archive_dir) | ||
self.run_dir = os.path.join( | ||
settings.cluster.get("tmp_dir"), "{:0>8}".format(config.get("iteration")), self.getclass() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the convention of splitting lines on '{', '(', can you share the config file to instruct the formatter? If that can be part of the commit, much better 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am working on a pyproject.toml file which will contain the formatting instructions, as well as linting and typing. I would like to deliver under a separate PR however.
@@ -130,27 +135,27 @@ def prefill(self): | |||
|
|||
def run(self): | |||
if self.osd_ra and self.osd_ra_changed: | |||
logger.info('Setting OSD Read Ahead to: %s', self.osd_ra) | |||
self.cluster.set_osd_param('read_ahead_kb', self.osd_ra) | |||
logger.info("Setting OSD Read Ahead to: %s", self.osd_ra) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to me that OSD read ahead is very Ceph specific (rather than a Benchmark attribute), so we might need to consider whether is convenient to abstract it out into the cluster class instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea.
Currently osd_ra is an attribute of the benchmark in the yaml file, so it would be good to change it to a cluster attribute in the yaml at the same time as moving it to an attribute of the Ceph object
if self.valgrind is not None: | ||
logger.debug('Adding valgrind to the command path.') | ||
logger.debug("Adding valgrind to the command path.") | ||
self.cmd_path_full = common.setup_valgrind(self.valgrind, self.getclass(), self.run_dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder if setting up valgrid as a profile_tool instead would make it better? @markhpc what are your thoughts please?
yaml.dump(config_dict, fd, default_flow_style=False) | ||
|
||
def exists(self): | ||
def exists(self) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the type annotation is going to be a continuous effort since not all the methods have been annotated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. I will type the methods as I need to edit/change them.
This one was changed so the signature matches in the child class and the parent class, as I was changing the child class.
return command | ||
|
||
def _build_prefill_command(self, endpoint_number: int) -> str: | ||
command = f"sudo {self.cmd_path} --rw=write --bs=4M --iodepth={self._configuration_with_defaults.get('prefill_iodepth')}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is assuming that prefill can only be done using block size of 4MB, which I don't object per se, but wondering whether that could be the default value but to allow the flexibility of different block sizes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of this particular PR is not to change the current behaviour. It is something to consider for the future however
if 'recovery_test' in self.cluster.config: | ||
if self.recov_test_type == 'blocking': | ||
if "recovery_test" in self._cluster.config: | ||
if self._recovery_test_type == "blocking": | ||
recovery_callback = self.recovery_callback_blocking | ||
elif self.recov_test_type == 'background': | ||
elif self._recovery_test_type == "background": | ||
recovery_callback = self.recovery_callback_background | ||
self.cluster.create_recovery_test(self.run_dir, recovery_callback, self.recov_test_type) | ||
self._cluster.create_recovery_test(self.run_dir, recovery_callback, self._recovery_test_type) # type: ignore [no-untyped-call] | ||
|
||
if 'recovery_test' in self.cluster.config and self.recov_test_type == 'background': | ||
if "recovery_test" in self._cluster.config and self._recovery_test_type == "background": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably this block of code requires its own method
ps.append(p) | ||
for p in ps: | ||
p.wait() | ||
log.info("Running fio %s test.", self._cli_options["rw"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the original 'self.mode' was more generic to refer to the IO mode (random read, random write, etc), whereas the suggested '._cli_options["rw"]' is too specific to FIO, so not sure if that conveys the intention.
If in the near future we need to compare two different benchmarks over the same workload, we need that the options generic enough to compare apples with apples, so my vote would be towards a generic IO mode rather than specific FIO "rw".
@markhpc , @sseshasa what are your thoughts please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This matches the existing behaviour. self.mode is taken from the options in the yaml file, and is used for the -rw parameter to fio. The original code is:
self.mode = config.get('mode', 'write')
and the new code is:
self._cli_options["rw"] = self._configuration_with_defaults.get("mode", "write")
I'm happy to change the code to take the value from the configuration instead if you think that would be a better approach, but it doesn't change what is printed to the screen, or what the code does
common.sync_files('%s/*' % self.run_dir, self.out_dir) | ||
self.analyze(self.out_dir) | ||
|
||
def cleanup(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|I think its a good opportunity to add method documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't believe I missed this. Will add in.
I'm surprised that the linter didn't pick this up either so may take a look at my settings
host = settings.host_info(client)["host"] | ||
for i in range(self.endpoints_per_client): | ||
def analyze(self, output_directory: str) -> None: | ||
log.info("Converting results to json format.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method documentation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add
Signed-off-by: Chris Harris <[email protected]>
Reverting the change to fio.py as it was made in error This reverts commit 3c75e3d. Signed-off-by: Chris Harris <[email protected]>
Reverting the change to fio.py as it was made in error This reverts commit 3c75e3d. Signed-off-by: Chris Harris <[email protected]>
Reverting the change to fio.py as it was made in error This reverts commit 3c75e3d. Signed-off-by: Chris Harris <[email protected]>
Reverting the change to fio.py as it was made in error This reverts commit 3c75e3d. Signed-off-by: Chris Harris <[email protected]>
teuthology runs are: rados/perf - all passed crimson-rados/perf - 2 failures - 'hit max job timeout' when running radosbench, which is not effected by these changes |
This is the first part of an effort to rationalise the 5 current fio classes into a single class structure. It will mean the users of cbt no longer need to know the subtle differences between the differing fio classes.
The change adds typing for methods and variables (using mypy) and pep8 compliance (using ruff) to the fio* files changed.
This is my first pull request so I expect there is something I've missed in the process.
This has been tested by: