Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add additional metadata to app images #309

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
131 changes: 131 additions & 0 deletions text/0000-additional-app-metadata.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
# Meta
[meta]: #meta
- Name: Additional Application Image Metadata
- Start Date: 2024-03-05
- Author(s): dmikusa
- Status: Draft
- RFC Pull Request: (leave blank)
- CNB Pull Request: (leave blank)
- CNB Issue: (leave blank)
- Supersedes: N/A

# Summary
[summary]: #summary

When using a `Dockerfile`, it is possible to set additional metadata items like exposed ports and health check information. This metadata can be used by external systems, like container orchestrators, to automatically configure and deploy application images. It is not presently possible to set this metadata using buildpacks.

# Definitions
[definitions]: #definitions

Dockerfiles settings for the proposed metadata:

- Ports can be set with [`EXPOSE`](https://docs.docker.com/reference/dockerfile/#expose)
dmikusa marked this conversation as resolved.
Show resolved Hide resolved
- Health check information can be set with [`HEALTHCHECK`](https://docs.docker.com/reference/dockerfile/#healthcheck)

# Motivation
[motivation]: #motivation

- It is not presently possible to set this metadata using buildpacks.
- Deploying with Docker or other container orchestrators, the system can use this information to more easily run images.
- To support Prometheus' or Grafana Agent's Docker Service Discovery. The exposed port will be used to scrape metrics. If you don't expose (or map) your port, the service discovery doesn't find the container.

# What it is
[what-it-is]: #what-it-is

This RFC proposes that we add additional extension points so that it is possible to expose port and health check information in the application container images that are generated.

As a buildpack author, I will be able to include port and health check information as part of the process type that are defined by a buildpack.

For each process type defined, I can add a list of ports. A port is defined as the port number and an optional protocol. If the protocol is not set, then `TCP` is assumed. To expose both `TCP` and `UDP`, then the port number should be included twice. Once with with each protocol. Valid values for the protocol are `TCP` and `UDP`. This is the same behavior as Docker's `EXPOSE` functionality, which is intentional, as it should enable buildpack metadata to map in the same way as `Dockerfile` metadata.

For each process type defined, I can add health check information. The following health check properties will be available: interval, timeout, start period, start interval, retries, and command. If no health check information is provided, then nothing will be added to the image.

Because the image can only have a single set of ports and health check information, the lifecycle will include the information for the default process type when it exports the application image.

As an end user, if I want to have different ports/health check information exposed on the application image then the user will set the `--default-process` flag to `pack build` and change the default process type.

# How it Works
[how-it-works]: #how-it-works

As a buildpack author, additional information can be written into the [`launch.toml` file](https://github.com/buildpacks/spec/blob/main/buildpack.md#launchtoml-toml). This will include an array of ports and the health check information as described above.

Example:

```
[[processes]]
type = "web"
command = ["java"]
args = ["-jar", "app.jar"]
default = true
working-dir = "/workspace"

[[processes.ports]]
port = 8080
protocol = "TCP"

[[processes.ports]]
port = 8081 # protocol is optional, defaults to "TCP"

[processes.health-check]
interval = "20s"
timeout = "3s"
start-period = "1m"
start-interval = "30s"
retries = 3
use-shell = true
command = "thc"
```

This information is written into the `Config` section of the image metadata.

```
{
...
"Config": {
...
"ExposedPorts": {
"8080/tcp": {},
"9000/udp": {}
},
"Healthcheck": {
dmikusa marked this conversation as resolved.
Show resolved Hide resolved
"Test": [
"CMD-SHELL",
"thc"
],
"Interval": 20000000000,
"Timeout": 3000000000,
"StartPeriod": 60000000000,
"Retries": 3
},
...
}
...
}
```

The `Test` section is an array of the command to run. If `use-shell` is `true`, then the first item in the array is set to `CMD-SHELL`. If `use-shell` is `false` then the first item in the array is set to `CMD`. In addition, values for interval, timeout, start period, and start interval should be converted to nano seconds. This is all to be consistent with the way Docker embeds metadata from a `Dockerfile`, which is a major goal of this proposal.

# Drawbacks
[drawbacks]: #drawbacks

None beyond the time to implement.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any concern that this could cause confusion when multiple process types are defined? Do we need to warn / alert the user somehow about this?

That said I think this should be fine to enable. Buildpack authors don't have to support it if they don't want to.

Copy link
Contributor Author

@dmikusa dmikusa Mar 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ports/health check information is specified by the buildpack per process type, so if I have web and scheduler process types, you can define different ports/health check info for each one. Unfortunately, there's not parity for that in the container image metadata. There's only the capacity to to put metadata for one process type in the container's ports/health check info so we have to pick one of them.

I think the default process type set by buildpacks seems most reasonable and least likely to be surprising (i.e. best choice if we put anything in this container metadata) because if you're going to docker run the image then that's what is run. Given that, I think it stands to reason that the metadata in the container image would reference the thing that is run in this manner.

I'm not sure what opportunities we have to insert some warnings/helpful logging. Where I see this being potentially confusing is that when you run the non-default process type, (I assume, I haven't checked) Docker is still going to see this metadata for the default process type and attempt to use it as ports/health check info (unless the user overrides it). That might be confusing if you don't remember (or know) that the ports/health check metadata is on the container image. Especially the health check. The container would start and could potentially fail if the health check is different between the two process types.

I'm not totally sure what we can do about it, especially at this level. Maybe pack could have an opt-out flag for including this metadata (although buildpacks could do that too)? Buildpacks could also log at build time if this metadata is being set. Ideally, there would be some kind of reminder that it's set at runtime but I don't know that is possible and it would be noise a lot of the time, like you'd only want a warning when something other than the default process type is run & there is metadata set on the container image.

I suppose the other possibility would be that we don't write the metadata if there are multiple process types. Then it's not confusing because there's only one process type. I think this would negate a lot of the benefit though. I can only speak for Paketo Java buildpacks, but I know that would rule out most or all of our buildpacks. We might be able to change things to only include one process type, but right now they write the same command to multiple different process types for some historical reason that predates me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we provide a healthcheck binary, similar to launcher? Something that executes the correct heathcheck for the running process but allows HEALTHCHECK to be the healthcheck for the running process?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that idea. Is that possible? How would it know the correct process type to run?

If you put /cnb/healthcheck into HEALTHCHECK, when the health check is invoked /cnb/healthcheck would need to figure out the process type. It could default to the default process type, but I'm not sure how it would know if a different process type was being used. Is that metadata available somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we provide a healthcheck binary, similar to launcher? Something that executes the correct heathcheck for the running process but allows HEALTHCHECK to be the healthcheck for the running process?

This would actually be really helpful, IF we can make this process skip invoking the exec.d binaries. It should source the environment, but not run exec.d binaries. This is because some exec.d binaries have side effects when they run, so it's not really safe to execute them every time we run the health check, which could be multiple times a minute.

We actually learned this the hard way with the Paketo Health Checker buildpack :(

Having said that, it also seems like a tremendous amount of additional work, although please correct me if that's an invalid assumption, and I wonder if we couldn't just do something in the launcher that would allow us to launch the health check binary like it normally does, but skip the exec.d binaries.

If you put /cnb/healthcheck into HEALTHCHECK, when the health check is invoked /cnb/healthcheck would need to figure out the process type. It could default to the default process type, but I'm not sure how it would know if a different process type was being used. Is that metadata available somewhere?

I was thinking about this more and it seems like we'd know this if it were implemented in the launcher, since the launcher knows what process to run. So maybe that's not really a big deal.

@natalieparellano what do you think about the above two points?


# Alternatives
[alternatives]: #alternatives

None

# Prior Art
[prior-art]: #prior-art

None

# Unresolved Questions
[unresolved-questions]: #unresolved-questions

None

# Spec. Changes (OPTIONAL)
[spec-changes]: #spec-changes

This RFC requires changes to `launch.toml`. They are additive changes, so backward compatible. It adds the ports array and health-check object as described in the example above.
Loading