You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We'd like to use Shapely or PyGEOS on the Pythoon side (user side) because this is what GeoPandas outputs after reading files. Therefore we want our C++ function to receive a Shapely or PyGEOS LineString as an argument.
Pybind11 is available to convert from a shapely.coords.CoordinateSequence object to a std::vector<std::tuple<double, double>>:
doublefun(std::vector<std::tuple<double, double>> linestring) {
std::tuple<double, double> point1 = linestring[0];
return std::get<0>(point1); //return x value
}
PYBIND11_MODULE(example, m) {
m.doc() = "pybind11 example plugin"; // optional module docstring
m.def("fun", &fun, "A function");
}
The conversion from Python type to C++ type implies a copy. In addition, we would still have to convert the Linestring to a vector of Point
std::vector<Point> linestring;
for (auto point_as_tuple : linestring_frompy) {
Point p = Point(point_as_tuple); // Construct Point object from tuple
linestring.push_back(p);
}
This conversion would have to be done for each element in the vector dataset.
To avoid the first conversion we could work with native Python types, through Pybind11's pybind11::object class:
doublefun(py::object linestring_frompy) {
py::object coords = linestring_frompy.attr("coords");
int size = coords.attr("__len__")();
// Build linestring from CoordinateSequence Python object
std::vector<Point> linestring;
for (int i = 0; i<size;i++) {
// Provided Point constructor takes a std::tuplePoint p = Point(coords.attr("__getitem__")(i));
linestring.push_back(p);
}
}
Both Shapely and PyGEOS are wrapper around GEOS. For example PyGEOS geometries contain a pointer to a GEOS GEOSGeometry object. Therefore it sounds redundant to convert a PyGEOS Python geometry to a C++ vector of Point, when we could use GEOS on the C++ side and grab the GEOSGeometry pointer attached to the Python object.This seems to be the main idea behind PyGEOS as explained in this blog post by Matthew Rockling (see section Cythonizing GeoPandas).
PyGEOS geometry object have an attribute _ptr which I think is a pointer to GEOSGeometry:
// in pygeos/src/pygeom.cstaticPyMemberDefGeometryObject_members[] = {
{"_ptr", T_PYSSIZET, offsetof(GeometryObject, ptr), READONLY,
"pointer to GEOSGeometry"},
{"_ptr_prepared", T_PYSSIZET, offsetof(GeometryObject, ptr_prepared), READONLY,
"pointer to PreparedGEOSGeometry"},
{NULL} /* Sentinel */
};
However I don't know how to access this pointer from Pybind11, as attr("_ptr") returns another pybind11::object from which I cannot define a new GEOSGeometry pointer.
In conclusion, the ideal solution would be to work directly with GEOS in the C++ side, working with the objects that underly the PyGEOS objects - but I'm stuck for now.
An alternative approach would to stick to Python types (option 1) and define a wrapper around a Shapely/PyGEOS class:
As discussed, a related piece of the puzzle is how best to operate on a whole "geometry" column / PyGEOS vector.
Is this best as passing the column as in as an argument to the C++ code? Can we interpret the numpy array as a vector of pointers to GEOS objects through pybind11? Can we access the coordinates with a minimum of copies directly, and construct a new array of split geometries to return?
Or is it simpler and fast enough to work with pygeos to return a representation of raw coordinates in numpy arrays of (coordinates, offsets) as in pygeos/pygeos#75? The example there is for multipolygons, with offsets to index 1) multipolygon parts 2) exterior/interior rings of the parts 3) coordinates of the rings. For linestrings we would presumably need only one offsets array for the coordinates of the linestrings.
thomas-fred
changed the title
Interface between Shapely/PyGEOS and C++
Vectorise split linestring for PyGEOS objects
May 25, 2022
Subissue of #8
We'd like to use Shapely or PyGEOS on the Pythoon side (user side) because this is what GeoPandas outputs after reading files. Therefore we want our C++ function to receive a Shapely or PyGEOS LineString as an argument.
Pybind11 is available to convert from a
shapely.coords.CoordinateSequence
object to astd::vector<std::tuple<double, double>>
:The conversion from Python type to C++ type implies a copy. In addition, we would still have to convert the Linestring to a vector of
Point
This conversion would have to be done for each element in the vector dataset.
To avoid the first conversion we could work with native Python types, through Pybind11's
pybind11::object
class:Both Shapely and PyGEOS are wrapper around GEOS. For example PyGEOS geometries contain a pointer to a GEOS
GEOSGeometry
object. Therefore it sounds redundant to convert a PyGEOS Python geometry to a C++ vector ofPoint
, when we could use GEOS on the C++ side and grab theGEOSGeometry
pointer attached to the Python object.This seems to be the main idea behind PyGEOS as explained in this blog post by Matthew Rockling (see section Cythonizing GeoPandas).PyGEOS geometry object have an attribute
_ptr
which I think is a pointer toGEOSGeometry
:However I don't know how to access this pointer from Pybind11, as
attr("_ptr")
returns anotherpybind11::object
from which I cannot define a newGEOSGeometry
pointer.In conclusion, the ideal solution would be to work directly with GEOS in the C++ side, working with the objects that underly the PyGEOS objects - but I'm stuck for now.
An alternative approach would to stick to Python types (option 1) and define a wrapper around a Shapely/PyGEOS class:
The text was updated successfully, but these errors were encountered: