Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(server/config): Implement checked access to server configuration, decoupled from SessionManager #4170

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

whisperity
Copy link
Contributor

@whisperity whisperity commented Feb 21, 2024

This patch introduces the configuration_access library which offers semantically correct and higher-level access to a configuration data structure, via Options as defined in a Schema, as opposed to raw operator[] of random dicts.

Options are registered into an ongoing Schema instance mimicking the interface of standard argparse. An Option carries with it potential semantic enhancements such as a validation function to automatically detect invalid configuration values, or an encapsulated default value.

Refactored the CodeChecker server package to use the new logic for accessing the [$CONFIG_DIRECTORY/]server_config.json file. Originally, the CodeChecker server only came with command-line options, which was extended with a configuration file for authentication purposes in #393. However, #1249 added a serious design blunder, in which the previous session_config.json was mixed together with other non–authentication-related options and became server_config.json. Unfortunately, the changing of the file's "schema" also came with hacking access to non–authentication-related configuration fields into the SessionManager class.

Moving forward with the development of the asynchronous store protocol is blocked by this lack of this separation of concerns, as background worker threads should not be given access to authentication data just for them to access otherwise needed configuration values. Accessing of shared state between different pools without explicit IPC primitives and exclusion can very quickly deadlock the system.

@whisperity whisperity added this to the release 6.24.0 milestone Feb 21, 2024
@whisperity whisperity force-pushed the feat/server/asynchronous-store-protocol/patch/1-refactor-session-manager branch 9 times, most recently from cb20f73 to 9ce1775 Compare August 7, 2024 15:13
@whisperity whisperity marked this pull request as ready for review August 7, 2024 15:40
This commit introduces a new `ServerConfiguration` handler object which
duty is to deal with everything about handling of the configuration
options the servers are started with.
Previously, this was done by the `SessionManager` class since
commit a5119f0
introduced non-authentication-related configuration options into the
previously authentication-only configuration file.

The new infrastructure aims to offer a generally more user-friendly
and more type-safe way of dealing with configuration options in
various parts of the `server` package.
`Option`s are registered simply, akin to adding command-line options
to an `argparse`-based parser.
@whisperity whisperity force-pushed the feat/server/asynchronous-store-protocol/patch/1-refactor-session-manager branch from 9ce1775 to 82c7af9 Compare August 10, 2024 12:42
@whisperity whisperity removed the WIP 💣 Work In Progress label Aug 17, 2024
Copy link
Contributor

@vodorok vodorok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not finished with the review, but I think the first round of questions and notes are worth to publish.

pip install $(grep -iE \
"mypy|pycodestyle|pylint|types" \
analyzer/requirements_py/dev/requirements.txt) \
$(grep -iE \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you searching both of the requirements.txts:?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch updated codechecker_common's dev-requirements with new dependencies, and those new dependencies are needed to be installed for the static analysis to be run in this linting job. The "common tests" now runs only the unit tests for the new code added to codechecker_common. Having static analysis (mypy) run together with the other linters is more appropriate than making the hard test case checking fail because mypy overapproximated something.

{"/Users/0/Privileges/0/Scope", "/Users/2/Privileges/0/ID"},
set(dict(fails).keys()))

def test_update(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please further divide the test methods?

path = parent._path.rstrip('/') + '/' + path.replace("./", '', 1)

options = object.__getattribute__(self, _K_OPTIONS)
for parent_path in map(str, reversed(PurePosixPath(path).parents)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why reversed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pathlib.PurePath.parents returns a generator of parents from inside out, starting with the most direct parent first. However, in order to traverse the mapping we need to go outside in, starting from the outermost parent. Saying "/foo/bar is not defined" (think like "not a directory") when even /foo is missing follows the allegory with file systems and tree hierarchies.

>>> import pathlib
>>> p = pathlib.PurePosixPath("/foo/bar/baz/qux/aaa.txt")
>>> list(p.parents)
[PurePosixPath('/foo/bar/baz/qux'), PurePosixPath('/foo/bar/baz'), PurePosixPath('/foo/bar'), PurePosixPath('/foo'), PurePosixPath('/')]
>>> list(reversed(p.parents))
[PurePosixPath('/'), PurePosixPath('/foo'), PurePosixPath('/foo/bar'), PurePosixPath('/foo/bar/baz'), PurePosixPath('/foo/bar/baz/qux')]

# This code is frightening at first, but, unfortunately, the usual
# 'self.member' syntax must be side-stepped such that __getattr__ and
# __setattr__ can be implemented in a user-friendly way.
object.__setattr__(self, _K_OPTIONS, {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__setattr__ is not overridden in this class. Why aren't you use the conventional way to add the root member? We are also not inheriting from a class where it is changed. I might be missing something

parent = self.root
path = parent._path.rstrip('/') + '/' + path.replace("./", '', 1)

options = object.__getattribute__(self, _K_OPTIONS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use self.options()?


@property
def options(self) -> OptionDict:
return object.__getattribute__(self, _K_OPTIONS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am failing to comprehend the magic behind this __getattribute__

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the same thing as throughout all the configuration access classes. They are special because they provide the member access . operator's results dynamically. When you say O.foo, it runs getattr(O, "foo"). This can be overloaded by providing O.__getattr__(self, str): the interpreter will first refer the member __getattr__ function from the object's corresponding type, and then call that to perform the resolution. (This gets more involved when there are multiple non-trivial paths in the __mro__ (multiple inheritance), but this patch doesn't abuse that feature!)

Now we have a function that can return a "member" based on a string at runtime, and that thing does not even have to be a real member of the class! The problem now arises if you want to actually access "real" members that the custom __getattr__ isn't taught about. Calling object.__getattribute__ explicitly instructs the program to disregard whatever customisations of __getattr__ there might be and run the "default" member access ("attribute lookup") logic unconditionally.


Additional keyword arguments are forwarded to the `Option` constructor.
"""
if path == '/':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can imagine an example of this validation leading to"funny-looking" options or directories.
Eg.: when there is a [] in the middle preceding the final [] like /2ddirectory[][]/ Or when using, multiple / for specifying the option, .//path/to/option//// and the root parent was passed in as a parent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a few new testcases that verifies that only the correct options and directories can be added.

return opt


def _get_config_json(file_path: Path) -> Dict[str, Any]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this function not part of the Configuration class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only way it would make sense there as if it was a @classmethod or even better, a @staticmethod. However, this function does not consider anything related to the nice schematic access of configurations, so it would look odd to have it in there, a function that "talks" on a completely different abstraction level.

Copy link
Contributor

@bruntib bruntib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This development is really impressive and comprehensive design. Maybe, slightly more than what we need. I would suggest not to change server configuration at this point as it is not unavoidable for async-store feature. Please, rebase the next 3 PRs so this server config is not the part of them. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants