Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix regressions introduced by PR #359 "Flexible locations for data served by thredds" #478
fix regressions introduced by PR #359 "Flexible locations for data served by thredds" #478
Changes from 2 commits
237eae4
eea8733
3b036c5
0525e47
ba934b8
75f9726
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The preferred way to do this is to use the
birdhouse
executable which should be used as the interface for interacting with the stack (see #428).I know a lot of the other scripts do this for backwards compatibility reasons but going forward I would rather encourage using the
birdhouse
executable.I'm fine with this for the PR since it puts it in line with some of the other scripts in this directory but I'd like to move away from this strategy soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the scripts having that "just in case" they are called directly.
If omitted, a server using the old way would suddenly get some operations working and others not.
That being said, we should promote using
birdhouse
binary rather than calling the scripts directly.However, it needs to be improved (other PRs), since the above command actually executes the script, and in this specific case, that leads to downloading all the test data. So, not that great for a "print configs" command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fmigneault I agree with everything you're saying.
I'm not sure I understand what you mean by:
Do you mind clarifying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh thanks for the reminder, my habits still stayed pre 2.0 version !
I think this is absolutely amazing for various external one-off scripts outside of this repo because then the user do not have to remember the boilerplate to source
read-configs.include.sh
.But for official scripts committed to this repo, I think I would prefer the current backward compatible behavior for the following reasons:
birdhouse
executable and even without being in a checkout). Historically all the scripts inscripts/
actually started as fully standalone scripts without the need to be in a checkout. I wrote them for my own quick usage, then I realized they could be useful for other, so I commit them in the repo for sharing.bootstrap-testdata
is actually being called bybootstrap-instance-for-testsuite
, the single entrypoint for any CI to prep the instance with test data so we do not need to update the CI steps if we change our testdata initialization process), then the script should be able to run "standalone".There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mishaschwartz
I've tried doing
birdhouse configs -c birdhouse/scripts/bootstrap-testdata
, and the operation proceeded to download test data, althoughbirdhouse -h
indicates thatconfigs
only "prints a command that can be used to load configuration". I assumed that means it should only print the configurable variables, but not run the script itself? Must be a side effect of the sourcing mechanism that runs the script anyway.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with this more detailed help, I'm not sure what is expected 😅
What is the use of the
-p
flag (load variant?) vs not having it (execute variant?) ? Shouldn't load be the default for a "configs" operation?I'm somewhat expecting
birdhouse configs
to behave likedocker compose config
orbirdhouse-compose config
. It does any relevant templating replacement and prints the resulting "state" of what would be run (or available variables), but doesn't actually run it.Is that correct expectation?
If not, why would one use configs to run the script, rather than just running the script?
It would be clearer to have something like
birdhouse run -c <script-path>
for that purpose.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you have a minute, run
birdhouse configs -h
and check out the examples at the bottom to see if that clears up some of your questions.To try to answer some of them here:
with
-p
you can essentially source the environment variables into your current process. So for example, we could replace the code that we're discussing right now with this to get the same result:You can do that with
birdhouse compose config
and you'll get the result you're expecting.Not really... think of it this way... we're solving two problems:
-c
we can load all the environment variables and then run a command in a subprocess-p
we can load all of the environment variables in the current process (like we were sourcing read-configs.include.sh)Because now the script doesn't need to require that all relevant environment variables are sourced. Also we can run arbitrary commands/scripts so it makes testing and script writing easier.
That's currently what we can do with
birdhouse configs -c <script-path>
. Are you saying that you'd preferrun
as a subcommand?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. I wasn't clear by saying "I'm somewhat expecting birdhouse configs to behave like docker compose config [...]". I understand the confusion I created.
What I actually meant was not literally the "same command", but "same behavior", as in, it evaluates the configs as if it would do its usual command, but stops just before and shows the "config" instead. In the case of
birdhouse configs
, the "configs" would just print the list of resolved variable names and values. Doing abirdhouse compose ...
right after should resolve exactly the same name/values, but do the compose operation with them. Basically, I was expectingbirdhouse configs
to be equivalent to apip freeze
, or any other<tool> list
command, and adding the-c
meant "using this config file instead of the default" (rather than run this command). Basically, a "show me resolved values" command.Maybe that could be a
birdhouse configs --list
?I've read the examples, and yeah, it's clearer to me what it does now. I guess the
-c
is potentially misleading because it could do way more that just "load configs" depending on what the script contains (as it did in this case downloading test data). Maybe there needs to be more warnings about that.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok I see what you mean now. The only issue I see with implementing this is that all the configuration settings get loaded as environment variables. So I don't see a great way to correctly display all the resolved values without including all the other environment variables that exist on the system as well. You can just do:
This also includes other environment variables that are used internally but may not have much relevance to the user such as
DELAYED_EVAL
andVARS
.If we prefixed all our environment variables with
BIRDHOUSE_
then we could do it but as it stands, we don't have a clean way of implementing this.If you can think of something I'm missing please let me know.
Are you thinking of the
-e
or--env-file
flag here?I mean the help string for this command says "Execute the given command after loading configuration settings", I'm not sure how else to warn the user that the command that they provide will be executed other than by saying "execute the given command"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will merge this PR. Maybe you guys can continue the conversation over #479?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be
:-
instead of:=
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will set
THREDDS_SERVICE_DATA_LOCATION_ON_HOST
if undefined, and then setDATASET_ROOT
with it, which will result in the same behavior in this case.Since the rest of the script doesn't rely on
THREDDS_SERVICE_DATA_LOCATION_ON_HOST
, not a big deal, but we can use:-
since it is more common to indicate the default value.However, maybe consider
BIRDHOUSE_DATA_PERSIST_ROOT
as well:DATASET_ROOT="${THREDDS_SERVICE_DATA_LOCATION_ON_HOST:-${BIRDHOUSE_DATA_PERSIST_ROOT:-/data}/datasets}"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I started to use
:=
instead of:-
to actually setTHREDDS_SERVICE_DATA_LOCATION_ON_HOST
because had the sourcing ofread-configs.include.sh
worked, thatTHREDDS_SERVICE_DATA_LOCATION_ON_HOST
would also exist for real.I was trying to emulate the successful sourcing of
read-configs.include.sh
even in standalone mode, but it was not needed since we don't useTHREDDS_SERVICE_DATA_LOCATION_ON_HOST
afterwards. I can revert to:-
no problem.