-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rhel10: move os package sets to yaml files #1104
base: main
Are you sure you want to change the base?
Conversation
Have you tried setting |
spec is a simple format for defining image properties. It's yaml-based, supports inheritance, and deduplication across multiple distro versions. This commit defines the format, and adds a library for parsing it. Additionally, there's a spec-flatten command that allows people to flatten spec files.
This commit moves all os pipeline packages in the new spec format. There should be no functional changes in images.
People don't like the packages that we picked for individual types? They can now create their own spec directories, and pass them via OSBUILD_IMAGES_IMAGE_SPECS_DIR. This obviously should not be an environment variable, but this is the easiest first step that we can make.
Thanks! That did the trick for centos, it turns out that the x86_64 image installer was missing For RHEL, it's a bit of a mess, because the package set merging works slightly differently that in images, so the arrays are not sorted in the same way. Didn't we invent something for this for otk? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this a lot! Some tiny ideas inline and maybe worth brainstorming on some naming)
@@ -284,6 +290,47 @@ func (t *ImageType) Manifest(bp *blueprint.Blueprint, | |||
staticPackageSets[name] = getter(t) | |||
} | |||
|
|||
var itSpec struct { | |||
Spec struct { | |||
Packages []string `yaml:"packages"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe:
Packages struct {
Include []string
Exclude []string
}
it would be a deeper nesting but also more symmetric? Plus it seems most image types have some excludes (even tar!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify: the current merge rules and their implementation merge lists just for top-level keys, so moving Include
and Exclude
under Packages
would break everything. However, we can alter the merge rules:
Dictionaries:
If both values are dictionaries, perform a recursive merge:
For each key:
If the key exists in both, merge the corresponding values.
If the key exists in only one, use that value.
Lists:
If both values are lists, merge them:
Append items from the second list to the first.
Deduplicate items, maintaining all unique entries.
Scalars (Strings, Numbers, Booleans, Null):
If both values are scalars, the value from the second structure overrides the first.
Type Mismatch:
If the types of values for a given key differ, the merge is not allowed. This results in an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question is: are these rules a good default? Are there easy to reason about? I have no good answers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, sorry, we can start with the exiting top-level logic, right now the format is internal and as long as we do not have external image defintions we can always refactor. So +1 with starting from that (even though I think we should get something better soon(ish)).
} | ||
|
||
func fileExists(dir fs.FS, filepath string) bool { | ||
f, err := dir.Open(filepath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a small comment there that the more common _, err := os.Stat(filepath); return err == nil
will not work because it operates on a fs.FS? // fs.FS has no Stat() so we actually need to test-open the file
(or something)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea what I was thinking about, it shouldn't be a problem to use Stat
. It's still very much prone to TOCTOU, but I cannot see how to solve this without having all files open, and I guess it's not a huge deal.
@@ -0,0 +1,58 @@ | |||
# Image specs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no strong opinion but maybe it is worth brainstorming the name here a bit. Spec sounds like rpm-spec to me but maybe that is just me. Alternative ideas for the dirname
- imgtypes
- types
- images
- image_definitions
- imgdefs
- imgspecs (if we want to keep specs)
...
Very cool and very necessary. As for naming, specifications or definitions both sounds fine. You've only taken the RHEL packages here but in the Fedora package sets we have a "lot" of conditionals such as only enabling certain packages for certain versions. How would we handle this with the YAML files here? |
I think I'm leaning towards the term "image definitions". We always refered to the go files as definitions, so I think we can continue using that term, just switch the "driver" to yaml. So I think I would rename the
I think that Fedora would benefit from less deduplication than RHEL. So, one directory == one version, no sharing between them. I think that our long-term dream was/is to extract these yamls into a separate repository, and use the traditional branching model. So for now, I suggest we just have three dirs: fedora-40, fedora-41 and fedora-rawhide (or 42, dunno). Later, when they are in a separate repo (and maybe shipped in a separate RPM?), we can either teach images how to load them from system, or keep it simple and just embed them using submodules:
Note that for CentOS/RHEL, I think I would keep the deduplication (as it exists in the go code) as much as possible. We know that it works for us, and for RHEL we are basically the maintainers. We can always reconsider this later. I don't like that I'm basically proposing different models for Fedora and RHEL, but I think it sorta makes sense given the differences between these two distros. |
I'm a bit surprised there's no apparent comparisons or references to other extant image format definitions already used in the Fedora-derivative ecosystem? There are many:
The thing that drove me towards spec files (a conclusion which honestly I didn't want to get to) is documented here https://gitlab.com/fedora/bootc/base-images-experimental/-/blob/main/README-build-rationale.md?ref_type=heads#key-aspect-reusing-spec-files That whole file has also a lot of related discussion. Certainly getting packages out of code into a declarative file format is something that makes obvious sense to me, but introducing a new package-list-in-YAML/toml/XML/specfile that isn't one of the above seems like it needs some rationale. Every single time someone has introduced one it has just been a new file format only understood by one tool. Picking just one is impossible, but my personal inclination is to try to have more tools accept multiple formats - that was the idea behind coreos/rpm-ostree#5119 (and I also have no problem having it accept blueprints too, but honestly they're pretty verbose for this use case). |
The rationale is that this is the first step to externalising our image definitions. These are currently defined as a combination of:
Now there's clearly a lot of history (and technical debt) here. Some of the things in these configuration structures can be simplified, squashed together, etc, but the fact is that they're our current state and our starting point for this process. Regardless, what we expect to end up with is some transformation (simplification) of all of these into an internal structure that's useful for us, for everything we need. I don't think something like kickstart (or any of the other examples in your list, at least the ones I'm familiar with) is going to cut it. Or if it can, then we'd need to start parsing kickstarts to make partition tables, or image configs, or os customizations, and that's not only a lot of work, it's also kinda pointless. We're not going to be working with an internal representation of a kickstart. What we're doing here is taking code structures and pulling them out (and simplifying, either during the extraction of after the fact). We're not creating an end-user-facing configuration format, just making our lives easier in a way that will hopefully make it easier for contributors (and hopefully distro maintainers) to modify image configurations without needing to navigate (or write) code and without needing to recompile go binaries. Oh and XML is out of the question 😄 |
Of course, I think this is an improvement and obviously none of my comments should be construed as trying to block anything. But at the same time, we can and should think about how the evolution of this fits in with at least some of the other ones. In the course of needing to succeed at doing my job I have had to interact with (or at least think about) relatively recently with every single one of the above things, and this would just add a new one to the mix. |
I understand where you're coming from. There's a long-running struggle to figure out the target audience for each configuration format and level. The configuration formats you mentioned are, in my opinion, serving similar purposes to our blueprints and blueprints are definitely the weakest of all of those. I think a conceptual problem here is that every one of the configuration formats you mentioned span the whole range of use cases from lightly customizing a base image configuration to distro/spin image configuration from scratch, and we always kept those two ends separate. A big part of the value of image builder is that (for example) selecting an Now, you're right, the other configs are serving a similar (or sometimes the same) purpose as what we're doing here: they let you define an image configuration pretty much from scratch, with some useful abstractions. But a lot of what we're defining here is coupled with how osbuild works (to varying degrees). Pulling in a configuration format from another project doesn't really serve any purpose. We could write a parser that reads a kickstart's partitioning directives, or a Kiwi |
This is my RFC PR for moving package sets outside Go into yaml.
What I did
Image spec format
I introduced a new yaml-based format called image spec. Spec is a simple format for defining image properties. It's yaml-based, supports inheritance, and deduplication across multiple distro versions.
A quick snippet showing rhel 10's qcow2:
The whole specification is in
specs/README.md
.Conversion of Centos Stream 10 and RHEL 10 and to image spec
Os package sets of both CS10 and RHEL 10 are now fully converted to image specs. There should not be any functional changes. The image specs are embedded using
go:embed
, so everything feels the same.I might need help from @achilleas-k to verify that nothing really changed, the generated manifests have different UUIDs for reasons that I don't understand.
Overriding embedded image specs
Embedded image specs can be fully overriden. Just pass the spec directory via
OSBUILD_IMAGES_IMAGE_SPECS_DIR
, and the library won't use the internal bits.Next steps
We can start moving more and more stuff into yaml. I think that the next piece could be base partition tables:
The ultimate goal is that the Go code doesn't know anything about what's e.g. an ami. The yaml files drive the available image types. There are some gaps in the current spec definition for this to happen, but we can iterate on them.