-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable local deployment to improve development and testing #484
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why #361 was considered a failed attempt? My current dev setup is running the stack locally with a self-signed certificate, and everything works just fine.
Personally, I have found that bypassing SSL via HTTP lead to code that worked locally but didn't once deployed on actual servers enforcing HTTPS, because requests are not doing exactly the same checks and code paths in requests libraries, and that introduces slight variations in what actually happens when resolving requests.
See also other comments about the strategy employed for the HTTP access.
I have tried using the current master
branch with options already available, and I'm able to access the services in HTTP-only mode. Just need to have the dummy SSL cert file define to patch the volume mount (that on its own would be OK to adjust). I'm interested to see if the operations can be simplified.
Also interested in @tlvu's approach for local dev.
|
||
export INCLUDE_FOR_PORT_80='$([ x"$BIRDHOUSE_ALLOW_UNSECURE_HTTP" = x"True" ] && echo "include /etc/nginx/conf.d/all-services.include;" || echo "include /etc/nginx/conf.d/redirect-to-https.include;")' | ||
export PROXY_INCLUDE_FOR_PORT_80='$([ x"$BIRDHOUSE_ALLOW_UNSECURE_HTTP" = x"True" ] && echo "include /etc/nginx/conf.d/all-services.include;" || echo "include /etc/nginx/conf.d/redirect-to-https.include;")' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the whole issue about using HTTPS avoided by setting BIRDHOUSE_ALLOW_UNSECURE_HTTP=True
? The HTTP→HTTPS would not occur, so requests would be already using port 80 to access the services.
Just tried this on my end and the stack responds using HTTP using either localhost
and my local IP:
> BIRDHOUSE_ALLOW_UNSECURE_HTTP=True pavics-compose up -d
Granted, the BIRDHOUSE_PROXY_SCHEME
updates would be needed to report the http
rather than https
where it was harcoded, but it seems the https.include
rewrite is not needed (the HTTPS server definition is just ignored).
Am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change here is only changing the variable name to follow the naming convention introduced in 2.4.0. This doesn't need to be in the PR but as long as we're fixing things I brought the variable name up to date.
Granted, the BIRDHOUSE_PROXY_SCHEME updates would be needed to report the http rather than https where it was harcoded
Yes, this is the important change.
but it seems the https.include rewrite is not needed (the HTTPS server definition is just ignored).
It's needed if you don't have a self signed SSL certificate. Otherwise the proxy container won't start since nginx fails to parse a valid SSL certificate in the https block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. My comment wasn't clear.
The variable rename is fine for alignment with 2.x, and so is the BIRDHOUSE_PROXY_SCHEME
addition.
What I was referring to is the contents of the variable. I placed the comment here because I found it better illustrated the BIRDHOUSE_ALLOW_UNSECURE_HTTP=True
I was referring to.
Agreed, the SSL certificate file is needed for nginx
config to resolve.
I think that if you use the BIRDHOUSE_ALLOW_UNSECURE_HTTP=True
approach, you don't even need the contents to be an actual SSL cert file. It just needs to be "something" for the volume mount to work (but I'm not sure, to validate).
Might be worth checking out if the following works:
touch ssl-dont-care.pem
BIRDHOUSE_ALLOW_UNSECURE_HTTP=True \
BIRDHOUSE_SSL_CERTIFICATE="$(realpath ssl-dont-care.pem)" \
birdhouse compose up -d
Otherwise, the self-signed certificate described in #484 (comment) can be used to have a more "legit" file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see...
Yes, it has to be a well-formatted certificate file or else nginx complains and won't start.
birdhouse/components/weaver/config/canarie-api/canarie_api_monitoring.py.template
Outdated
Show resolved
Hide resolved
root /usr/share/nginx/html; | ||
} | ||
} | ||
${PROXY_INCLUDE_HTTPS} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See other comment in proxy default.env
for more context.
I think it is better to leave it there regardless and let the HTTPS redirection or not deal with which "server" config gets hit by the requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment here: #484 (comment)
Specifically:
but it seems the https.include rewrite is not needed (the HTTPS server definition is just ignored).
It's needed if you don't have a self signed SSL certificate. Otherwise the proxy container won't start since nginx fails to parse a valid SSL certificate in the https block.
@@ -12,12 +12,10 @@ services: | |||
image: ${PROXY_IMAGE} | |||
container_name: proxy | |||
ports: | |||
- "80:80" | |||
- "443:${PROXY_SECURE_PORT}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the 443 could be left and is simply ignored.
Makes sense to keep this edit depending on other discussions regarding how the redirects are handled and how variables get defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the 443 could be left and is simply ignored.
True, but better not to open up ports if it's not necessary
Makes sense to keep this edit depending on other discussions
Sure that makes sense
birdhouse/scripts/logging.include.sh
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure the changes are entirely backward compatible.
Before, the 1>&2
redirects were done for specific log levels (WARN and above).
This allowed logging outputs to stdout or some file, and only potentially problematic messages to another file. It seems this distinction is not possible anymore, since all goes either to stdout, stderr or the logfile.
Also, if we want both stdout/logfile, we need to ignore both options the script provides, and do birdhouse [...] | tee logfile
ourselves. It seems like specifying both options should do the same as tee
using both outputs simultaneously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before, the 1>&2 redirects were done for specific log levels (WARN and above).
This is not exactly true. Some logs at WARN level and above were redirected to stderr but others weren't. The previous approach was inconsistent.
This allowed logging outputs to stdout or some file, and only potentially problematic messages to another file.
This can be implemented but it should be done in logging.include.sh
, not on an ad hoc basis in the files that call the logger. I'm happy to implement this functionality if it's useful.
It seems like specifying both options should do the same as tee using both outputs simultaneously.
Sure, we can make that work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If some >WARN logs were not redirected, those were unintended bugs.
I agree it should be implemented in logging.include.sh
.
Which levels to log under with std/file output could be controlled by different CLI options, so it doesn't use any hardcoded redirects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes are now fully backwards compatible as of e655404
Alright, them I'm with @tlvu on this one... how is twitcher not complaining about your self-signed certificate? What happens if you try to access geoserver or stac (or any service behind twitcher)? |
I'm passing it: services:
twither:
environment:
REQUESTS_CA_BUNDLE: "${SSL_CERTIFICATE}"
volumes:
- "${SSL_CERTIFICATE}:${SSL_CERTIFICATE}:ro" (repeat for any relevant services) |
Very interesting. Is this documented anywhere? If this change is required, my point here (#484 (comment)) still stands since we're still diverging from a production-like environment to make this work. |
@fmigneault
@mishaschwartz A while back when I was still trying to get self-signed cert to work, I added this to Jenkins https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/5d816cf9582e8db84ad62d5c5e01918c1b8a8af4/runtest#L37-L39 to avoid warning about self-signed cert. Hopefully it still works. |
Thanks @tlvu. This suppresses the warning message but there's still the error that gets raised unfortunately |
Just curious, what kind of error? Can you share? It's been a very long time since I attempted to make self-signed cert work. |
@mishaschwartz @tlvu |
@tlvu It's a >>> import requests
>>> import os
>>> requests.get(os.getenv("BIRDHOUSE_HOST_URL") + "/stac")
...
SSLError: HTTPSConnectionPool(host='<my-ip-addr>', port=443): Max retries exceeded with url: /stac (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1091)')))
>>> import urllib.request
>>> urllib.request.urlopen(os.getenv("BIRDHOUSE_HOST_URL") + "/stac")
...
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate (_ssl.c:1091)> Note that this is probably avoided if you're setting the |
Ok so I just want to summarize a few things here: Strategies for supporting local development:
My personal view on this is as follows:
|
I don't mind either IMO, if the code needs to be adjusted only to support local dev, that is not a good justification to introduce changes. I think the security aspects are not really problematic in either case since those are intended for local development and are purposely bypassing SSL checks with an override variable. You have to know what you're doing to be making setting those configs. We can add warnings (like the variables that printed when default security values are left unchanged) to be clear. |
I think Francis has a point here. I also see Misha point to not open unnecessary port since this will be directly on his laptop and he could be working in public spaces (airport, hotel, coffee shop, ...) I let you guys decide but if you decide to restore the 443 into the main
Yes this makes sense. However, as someone who have previously also spent quite a long time trying to get a fully working dev environment, I also understand Misha's desire to have an easier time setting up a local dev env. How about adding a warning the for the Let's consider this local-test-dev as the poor-man sharing, an intermediate working solution with some drawbacks. It's not ideal but it's better than having nothing at all. Misha, in order to use LetsEncrypt, all you need is a VM or machine stationary at work and VPN in to use it. If no space for additional VM, you can buy a used server for really cheap. My machine at work was actually a used machine. Yes, you heard me right. When I started at Ouranos, I actually requested a used machine to leave budget for more RAM and HDD because I knew I would be runnings VM. Let's take this one for example https://deltaserverstore.com/product/dell-precision-5820/, Upgraded cpu to 6 cores, ram to 64G, disk to 1T SSD, it's $455 at the time of this writing. This machine can easily allow you to run 3 concurrent VMs at the same time. You'll need to find a network port that is in the DMZ, reserve some IP at work, open port 80 and 443 on the internet and you're set. I think your needs justify to make this expense request at work. This would be my suggestion once you've pushed the limit of local-test-dev. |
I'm not sure that's true since you are then forced to create a valid SSL certificate that you aren't going to use. I can see if there are other ways to work around this that don't require these files to be split up.
Are the changes you're objecting to here about splitting up the nginx config files here or introducing the BIRDHOUSE_PROXY_SCHEME variable?
Agreed and I'm happy to add those |
Yes, I have a used computer at work that is set up exactly as you describe. If we're serious about this being a community developed project then this is too high of a bar for entry to allow people to develop this software with us. |
Great ! I wasn't aware of this.
Agreed ! That's why I already approved this PR. We should give choices and not force someone into a particular workflow. As for the Jenkins testsuite, in the current state, all the notebooks and the matching Jupyter image are pretty Ouranos specific. So I would not count this as a blocker for not adopting a deployment/development workflow because the Jenkins testsuite is not working. Jenkins is pretty modular enough to run other notebooks from any repos as well so it's not a complete dead-end if someone wants to integrate Jenkins into their workflow. |
I can confirm that a valid SSL certificate is required in this case. I can think of more complicated ways to do this but the current proposed solution is the least complicated method I can think of. I really don't think that this solution is that complex. It's exactly what we do for all-services.include, cors.include, etc. and all the files are in one place. It's not like it's complicated for a developer to look at one more file when they already have to look at 4+ to understand the configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we only need to document the alternatives from #484 (comment). The rest seems good to go.
Great! I'll work on that today and re-request a review shortly. |
birdhouse/README.rst
Outdated
To deploy locally, enable the :ref:`local-dev-test` component. This will allow you to access the Birdhouse | ||
software in a browser on your local machine using the URL ``http://host.docker.internal``. | ||
|
||
Deploy locally with ``https`` | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure for an external user whether this is clear enough that it is an "either of" approach.
Maybe say something like the following to make it very explicit?
Two options are made avaiable to work around HTTPS/SSL certificate issues locally:
- `Using HTTP scheme deployment`_
- `Using a Self-Signed SSL certificate`_
Using HTTP scheme deployment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Enable the :ref:`local-dev-test` component [...]
Using a Self-Signed SSL certificate
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[...]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all updates and investigations.
Overview
The Birdhouse stack can now be deployed locally and accessed on a browser on the host machine without the need
for an SSL certificate. This is useful for local development and for running tests against the full stack while developing
and in CI environments.
To enable this, add the new
optional-components/local-dev-test
component toBIRDHOUSE_EXTRA_CONF_DIRS
andset the following environment variables in the local environment file:
export BIRDHOUSE_FQDN=host.docker.internal
export BIRDHOUSE_HTTP_ONLY=True
You should also add
host.docker.internal
to your/etc/hosts
file pointing to the loopback address sothat URLs generated by Birdhouse that refer to
host.docker.internal
will resolve properly in a browser:After deploying the stack, you can now interact with the Birdhouse software at
http://host.docker.internal
from the machine that is the docker host.
In order to implement the changes above, the following non-breaking changes have been made to the deployment code:
BIRDHOUSE_HTTP_ONLY
which is not set by default. If set toTrue
theproxy
component will only serve content overhttp
(nothttps
).BIRDHOUSE_PROXY_SCHEME
: default ishttp
ifBIRDHOUSE_HTTP_ONLY
isTrue
, otherwisehttps
PROXY_INCLUDE_HTTPS
: default is unset ifBIRDHOUSE_HTTP_ONLY
isTrue
, otherwiseinclude /etc/nginx/conf.d/https.include;
BIRDHOUSE_ALLOW_UNSECURE_HTTP
: default is nowTrue
ifBIRDHOUSE_HTTP_ONLY
isTrue
manually filtered out before program outputs could be used.
--log-stdout
and--log-file
flags to thebin/birdhouse
interface to allow redirecting logs tostdout or to a specific file instead.
BIRDHOUSE_LOG_FD
can be used to redirect logs to a file descriptor (ex:BIRDHOUSE_LOG_FD=3
)BIRDHOUSE_LOG_FILE
can be used to redirect logs to file (ex:BIRDHOUSE_LOG_FILE=/some/file/on/disk.log
)some logs are written. Instead, set these by exporting them in the parent process that calls
bin/birdhouse
.bin/birdhouse
interface, logs will still bewritten to stdout.
Changes
Non-breaking changes
bin/birdhouse
interfacebin/birdhouse
and environment variables to set logging optionsBreaking changes
https
, this may need to be changed to use theBIRDHOUSE_PROXY_SCHEME
environment variables instead (replacehttps
with${BIRDHOUSE_PROXY_SCHEME}
)Related Issue / Discussion
CONTRIBUTING.rst
?start-local
option to the Makefile?Additional Information
This change is necessary if we are serious about improving the testing infrastructure for this stack.
CI Operations
birdhouse_daccs_configs_branch: master
birdhouse_skip_ci: false