long-running code on cloud services #1
Comments
Hi @cboettig! I'm a great admirer of all the Rocker stuff, it helps me every day.
App Engine spins up and down based on how much CPU is being used, so once the job finishes it should shut down and not charge further. A cron job is just a scheduled request to a URL. In the R API case, it spawns new instances to serve requests to that URL according to a trigger you can configure, say once the underlying instance hits 50% CPU load. In principle this means you can scale up and down as needed. I guess the load balancer is always up and running, but you don't pay for that, just for the CPU resources. I'm in the process of testing this to see how it compares cost-wise to my current setup, moving some of my existing workflows to this more serverless philosophy. My existing setups are more like what you describe in your post, with a master VM setting off slave VMs (described here). In those, I rely on the scripts themselves calling the stop signal via
Wow, that's awesome. I should take a closer look at App Engine. I figured there'd always be some CPU use since the kernel is always doing something, but it's a clever idea to just set a threshold. Not clear how that translates to multi-core, but I'm guessing you can say 'shut off when use drops below 1/(2n) %', e.g. no core running at > 50% load?
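To make the multi-core question concrete, here is a minimal Python sketch of two possible idle checks. Both function names and the sample load figures are hypothetical, and neither is App Engine's actual scaling rule; the point is only that an average-based threshold and a per-core threshold can disagree.

```python
def average_below(per_core_load_percent, threshold_percent=50.0):
    """Return True if the mean load across all cores is below the threshold."""
    if not per_core_load_percent:
        raise ValueError("need at least one core reading")
    mean_load = sum(per_core_load_percent) / len(per_core_load_percent)
    return mean_load < threshold_percent


def no_core_above(per_core_load_percent, threshold_percent=50.0):
    """Return True if no single core is running above the threshold."""
    if not per_core_load_percent:
        raise ValueError("need at least one core reading")
    return all(load <= threshold_percent for load in per_core_load_percent)


# Hypothetical readings: one busy core on a four-core machine.
one_busy_core = [90.0, 5.0, 5.0, 5.0]
print(average_below(one_busy_core))  # the average check calls this idle
print(no_core_above(one_busy_core))  # the per-core check does not
```

A single-threaded job on a large instance looks idle to the average-based check, so a per-core rule is the safer reading of "no core running at > 50% load".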
This is a good place to start: https://cloud.google.com/appengine/docs/standard/python/an-overview-of-app-engine And how it scales in particular: https://cloud.google.com/appengine/docs/standard/python/how-instances-are-managed It's probably not been on your radar, as App Engine used to work only with Python and Java, but with the advent of flexible runtimes that use Docker (e.g. Rocker) any code can use its feature set now. Although flexible runtimes don't qualify for the free tier, it's been cheaper than running a small master cron VM. If it's a long-running process of over 60 seconds, you would want to set up your URL endpoint (via plumber), then trigger it using a task queue, where the max timeout is 24 hours. These are what the
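The "job behind a URL endpoint" idea can be sketched with Python's standard library. This is an illustration only: it uses `http.server` rather than plumber, a plain HTTP request rather than an App Engine cron job or task queue, and the `/run` path, `run_job`, and `trigger` are hypothetical names.

```python
import urllib.request
from http.server import BaseHTTPRequestHandler


def run_job():
    """Hypothetical stand-in for the real, long-running workload."""
    return sum(range(1000))


class JobHandler(BaseHTTPRequestHandler):
    """Runs the job whenever something requests the /run path."""

    def do_GET(self):
        if self.path != "/run":  # hypothetical endpoint path
            self.send_error(404)
            return
        body = str(run_job()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def trigger(url):
    """What a scheduler does in this sketch: request the URL, read the reply."""
    with urllib.request.urlopen(url) as response:
        return response.status, response.read().decode()
```

To try it locally, serve `JobHandler` with `http.server.HTTPServer(("127.0.0.1", 8000), JobHandler).serve_forever()` and call `trigger("http://127.0.0.1:8000/run")` from another process.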
Actually @cboettig, I had a closer look and it's perhaps not suitable for your use case. For flexible environments the minimum number of instances you can have is 1, so it's not possible to scale from 0 (i.e. no charge). You'd need to pay for at least 1 instance running 24/7, which would be around $30, so it's probably better to use a static VM for that as it's cheaper.
@MarkEdmondson1234 Thanks for the follow-up. Yeah, that makes sense. It seems like the standard use case is always to scale your app up and down in response to demand while maintaining 100% uptime, rather than on-demand computing.
Hey @MarkEdmondson1234,
Thanks for your comments in cloudyr/cloudyr.github.io#16 (comment)! Just trying to wrap my head around the approach you have here with respect to the long-running code issue. To expand on this: I often have some CPU- or memory-intensive code that I just want to run on the largest available cloud instance. Usually the code will take a few hours to a few days to run, and of course I want the instance to shut down as soon as the job is done so I'm not paying $2-$3 / hr for resources I'm not using.
The README example seems to document a case of something you want to be up persistently? Or is the Google Compute resource spun up and shut down after each successive cron iteration?
Figuring out how to get machines to kill themselves when done seems tricky, so I usually rely on a tiny 'master' instance which runs the script that spins up the big machine, waits patiently for it to finish, and then shuts it down if it finishes successfully (e.g. http://www.carlboettiger.info/2015/12/17/docker-workflows.html). But maybe that's happening automatically somehow with Google's App Engine project here? I haven't really grokked what all the pieces are doing.
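For reference, the 'master' pattern described above can be sketched roughly as follows in Python. The `start`, `is_finished`, and `stop` callables are hypothetical placeholders for whatever cloud API calls create, poll, and delete the big instance; nothing here is taken from the linked post or from any Google library.

```python
import time


def run_on_big_instance(start, is_finished, stop,
                        poll_seconds=60.0, sleep=time.sleep):
    """Start a worker, wait for its job to finish, then shut it down.

    start, is_finished and stop are hypothetical callables supplied by
    the caller. Returns the number of times the master had to wait.
    """
    handle = start()  # e.g. launch the large instance
    polls = 0
    try:
        while not is_finished(handle):
            polls += 1
            sleep(poll_seconds)  # wait patiently between checks
    finally:
        # Stop the worker even if polling raised, so it cannot keep billing.
        stop(handle)
    return polls
```

One design difference from the description above: this sketch stops the worker in a `finally` block, so it shuts down on failure as well as on success. Moving the `stop` call after the loop would restrict shutdown to successful runs.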