lazy=true seems unnecessarily lazy? #823
Comments
This is a difference of use cases. If you have only a few files it is no problem to keep the files open in the background, but with a few hundred you get memory issues and at some point GDAL can't handle it anymore.
Aha, didn't realize that case was also a thing. Too used to working with Kerchunk I guess :D We probably need a lot more documentation around
Yeah, Rasters has a "no open files" policy. I had a lot of issues hitting open file limits early on. Like 400 and that's it on Linux. So they are really lazy. We should add a
Isn't that what
Yep. It doesn't get around file limits... And it's relying on people using
Sorry to belabour this point - but isn't the point of a finalizer that you don't need to use I get your point about file limits though. Maybe this is something to do only if
You don't need to use close if you don't care about open files I guess. We could add But what's wrong with using an I never find myself limited by this. I don't really get the use case. (ArchGDAL is also designed to work like this everywhere and prefers closures where possible)
Anything interactive is where I think I personally rarely, if ever, am in a position to use ArchGDAL with closures - if you're writing and debugging analysis code, it's very hard to inspect intermediate results within a closure!
Do you have an example of this kind of workflow? I guess I don't understand why you need to open the Raster for analysis... What works on the open file that won't work on the raw raster?
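The finalizer idea debated in the comments above could look roughly like the sketch below. It uses only Julia's standard library; `LazyHandle` is a hypothetical wrapper type made up for illustration and is not part of Rasters.jl or ArchGDAL.jl. The temporary file stands in for a dataset on disk.

```julia
# Hypothetical wrapper that keeps a file handle open for its whole lifetime.
# Illustration only: this is not how Rasters.jl or ArchGDAL.jl are implemented.
mutable struct LazyHandle
    path::String
    io::IOStream
    function LazyHandle(path::AbstractString)
        h = new(String(path), open(path, "r"))
        # Close the handle whenever `h` is garbage collected.
        finalizer(x -> close(x.io), h)
        return h
    end
end

# Stand-in for a dataset on disk (assumption: any readable file will do).
path, io = mktemp()
write(io, "abc")
close(io)

h = LazyHandle(path)
first_byte = read(h.io, UInt8)  # reads reuse the already-open handle

stream = h.io
finalize(h)                     # run the finalizer now instead of waiting for GC
closed = !isopen(stream)
```

As noted in the thread, this does not get around open file limits: without the explicit `finalize` call, the handle stays open until the garbage collector happens to run, so several hundred such objects can still exhaust the limit.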
Currently, lazy = true defaults to always creating a FileArray, which means that any read requires re-opening the dataset, and calling GDAL again if using e.g. a VRT.

Consider this benchmark on a VRT of Copernicus DEM. dem is the raster, opened with lazy = true:

That's pretty bad. Now, see what happens when I simply run the benchmark inside open(dem):

Users will initially try lazy = true, see that the performance is horrible, and move on with their lives, when we could have two orders of magnitude better performance on reads!

Since we have finalizers in Julia, it seems like we could make lazy = true at least open the GDAL dataset, and lazy = false would eagerly read into memory as it currently does? Or is there something I am missing here?
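The two access patterns contrasted in this issue can be sketched with plain file I/O from Julia's standard library. This is an analogy under stated assumptions, not the Rasters.jl implementation: the temporary file stands in for the dataset, and `read_reopening` and `read_many_open_once` are hypothetical helper names. The first mimics a lazy array that stores only a path and reopens the file on every read; the second mimics running all reads inside a single open(dem) closure.

```julia
# Stand-in for a raster file: 1000 Float64 values on disk (made-up data).
path, io = mktemp()
write(io, collect(1.0:1000.0))
close(io)

# Pattern 1: reopen the file for every single read, as a lazy array
# that holds only a path (and no open handle) has to do.
function read_reopening(path::AbstractString, i::Integer)
    open(path, "r") do io
        seek(io, (i - 1) * sizeof(Float64))  # byte offset of element i
        read(io, Float64)
    end
end

# Pattern 2: open once and perform all reads inside the closure.
function read_many_open_once(path::AbstractString, idxs)
    open(path, "r") do io
        map(idxs) do i
            seek(io, (i - 1) * sizeof(Float64))
            read(io, Float64)
        end
    end
end

idxs = 1:100:1000
a = [read_reopening(path, i) for i in idxs]  # one open/close per element
b = read_many_open_once(path, idxs)          # a single open/close in total
```

Both patterns return the same values; they differ only in how often the file is opened. For a format where opening is expensive, such as a VRT that GDAL has to parse, the per-read reopen is where the cost would be expected to accumulate.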