Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up gdf_to_bool_da and gdf_to_bool_ds functions. Added centroid checks and minimum area fraction filtering to docs. #396

Merged
merged 11 commits into from
Dec 23, 2024

Conversation

bdestombe
Copy link
Collaborator

No description provided.

…checks and minimum area fraction filtering to docs.
Copy link
Collaborator Author

@bdestombe bdestombe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The speedup is because all polygons are first merged before intersecting. The intersection with one large polygon in compiled code with RTree is faster than intersecting each polygon separately.

@bdestombe
Copy link
Collaborator Author

Here:

if ix is None:

I see the following code:

# build list of gridcells
if ix is None:
    modelgrid = modelgrid_from_ds(ds, rotated=False)
    ix = GridIntersect(modelgrid, method="vertex")

For Vertex and structured grids, and for rotated and unratated grids.. Shouldn't that be:

if ix is None:
    modelgrid = modelgrid_from_ds(ds)
    ix = GridIntersect(modelgrid)

@bdestombe
Copy link
Collaborator Author

bdestombe commented Dec 22, 2024

Here:

if ix is None:

I see the following code:

# build list of gridcells
if ix is None:
    modelgrid = modelgrid_from_ds(ds, rotated=False)
    ix = GridIntersect(modelgrid, method="vertex")

For Vertex and structured grids, and for rotated and unratated grids.. Shouldn't that be:

if ix is None:
    modelgrid = modelgrid_from_ds(ds)
    ix = GridIntersect(modelgrid)

Ah okay, GridIntersect(modelgrid, method="structured") is deprecated for some reason and later fails if used.

the rotated=False is still a mystery to me and why gdf is translated instead. Is gdf rotated in the correct direction?

@bdestombe
Copy link
Collaborator Author

Docs failed to compile due to:
Command killed due to timeout or excessive memory consumption

@dbrakenhoff
Copy link
Collaborator

dbrakenhoff commented Dec 23, 2024

Hey @bdestombe,

The speedup is because all polygons are first merged before intersecting. The intersection with one large polygon in compiled code with RTree is faster than intersecting each polygon separately.

It's probably true for most cases, but might still be slower for some odd situations where the bounding box of the polygons is about equal to the model grid, but there are still many cells with no intersection. The most extreme case being a thin diagonal across the model grid.

As for method="structured", this was deprecated recently (modflowpy/flopy#2346), and will be removed in the next flopy version. This method is pretty much always inferior to the full shapely implementation for vertex grids.

the rotated=False is still a mystery to me and why gdf is translated instead. Is gdf rotated in the correct direction?

I vaguely recall either me or Ruben implementing this. We discussed it at the time, but I don't remember at the moment why we implemented it this way. But fairly certain there was a reason ;). Maybe it had to do with the fact that it was easier to transform the shapes than it is to transform the model dataset (which is defined in model-coordinates).

From the 'Working with grid rotation' notebook:

  • before intersecting with the grid, GeoDataFrames are automatically transformed to model coordinates.

And I think we have tested this both in tests (though those tests only check if some intersection result is returned), and in the grid rotation notebook with a surface water geodataframe.

@bdestombe
Copy link
Collaborator Author

bdestombe commented Dec 23, 2024

The speedup is because all polygons are first merged before intersecting. The intersection with one large polygon in compiled code with RTree is faster than intersecting each polygon separately.

It's probably true for most cases, but might still be slower for some odd situations where the bounding box of the polygons is about equal to the model grid, but there are still many cells with no intersection. The most extreme case being a thin diagonal across the model grid.

As for method="structured", this was deprecated recently (modflowpy/flopy#2346), and will be removed in the next flopy version. This method is pretty much always inferior to the full shapely implementation for vertex grids.

Thank you for your detailed explaination! Comming from the author himself!

the rotated=False is still a mystery to me and why gdf is translated instead. Is gdf rotated in the correct direction?

I vaguely recall either me or Ruben implementing this. We discussed it at the time, but I don't remember at the moment why we implemented it this way. But fairly certain there was a reason ;). Maybe it had to do with the fact that it was easier to transform the shapes than it is to transform the model dataset (which is defined in model-coordinates).

From the 'Working with grid rotation' notebook:

  • before intersecting with the grid, GeoDataFrames are automatically transformed to model coordinates.

And I think we have tested this both in tests (though those tests only check if some intersection result is returned), and in the grid rotation notebook with a surface water geodataframe.

That makes sense! shall I revert it then, by rotating the multipolygon while cancelling the gridrotation?

@dbrakenhoff
Copy link
Collaborator

Yea, it doesn't really matter of course, but let's keep it consistent for now with what was implemented thus far.

@dbrakenhoff
Copy link
Collaborator

Command killed due to timeout or excessive memory consumption

Was this a weird one-off or did you fix something to avoid this?

@bdestombe
Copy link
Collaborator Author

Command killed due to timeout or excessive memory consumption

Was this a weird one-off or did you fix something to avoid this?

No, I didn't change anything. Probably just a busy server resulting in a slower run.

@bdestombe
Copy link
Collaborator Author

What makes it a bit more complicated is that if the ix is provided by the user, the polygons only should be rotated if ix.mfgrid.angrot is zero

@bdestombe bdestombe merged commit 296e98d into dev Dec 23, 2024
5 checks passed
@bdestombe bdestombe deleted the gdf_to_bool branch December 23, 2024 13:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make use of flopy's support for multipolygons in grid intersection
2 participants