The figures below show results from quick, naive benchmarks that compare the total single-threaded execution times of a few functions in shapely 2.0 vs. s2shapely. All benchmarks are run on 10 000 random x,y (lat,lon) points.
(Note: for all functions except "s2shapely.is_geography", access to the wrapped C/C++ geo objects is almost direct. In s2shapely it doesn't go through pybind11's complex conversion logic; see #3 (comment) and #5.)
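For readers who want to reproduce this kind of measurement, here is a minimal sketch of a timing harness. It is not the script used for the figures: `random_points`, `best_time`, and `stand_in` are hypothetical names, and the stand-in workload is a plain Python loop rather than a shapely or s2shapely call.

```python
import random
import timeit


def random_points(n, seed=0):
    """Return n random (lat, lon) pairs in degrees (hypothetical helper)."""
    rng = random.Random(seed)
    return [(rng.uniform(-90.0, 90.0), rng.uniform(-180.0, 180.0)) for _ in range(n)]


def best_time(func, repeat=5, number=1):
    """Smallest total time in seconds over `repeat` runs of `number` calls each."""
    return min(timeit.repeat(func, repeat=repeat, number=number))


points = random_points(10_000)


def stand_in():
    # Placeholder workload: replace with the vectorized call under test.
    return [lat > 0.0 for lat, lon in points]


elapsed = best_time(stand_in)
print(f"{elapsed * 1e3:.3f} ms for {len(points)} points")
```

Taking the minimum over several repeats reduces noise from other processes. To compare the two libraries, the stand-in would be swapped for the same operation in each library, applied to geometry arrays built once outside the timed function.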
The two are hardly comparable (different C/C++ libraries, different binding approaches), but the results already highlight a few things:
- The overhead caused by pybind11's "default" C++ <-> Python conversion is large, as shown by the large difference measured for "is_geo". This could be improved with some workarounds. It clearly has an impact on trivial functions (1st figure) but less so on more computationally expensive tasks (2nd figure).

- The execution of the vectorized inner loop looks relatively similar for the two libraries, as shown by all trivial functions (except "is_geo") in the 1st figure. I guess that native NumPy ufuncs (this is what shapely provides, right?) are a bit more optimized than `pybind11::vectorize`. Maybe using xtensor(-python) `xt::vectorize` and `xt::pyvectorize` could provide some speed-up?

- The results shown in the 2nd figure (equals / intersects predicates) are weird. This is probably explained mostly by the fact that the underlying libraries (GEOS vs. s2geometry / s2geography) are very different from each other. Those are very naive and incomplete benchmarks, though. For s2shapely there is no difference between unprepared and prepared geometries, but I suspect that this is because those are all point geometries.
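The first observation above (conversion overhead matters for trivial functions but much less for expensive ones) can be illustrated with a simple cost model. This is only a sketch: it assumes a fixed per-element conversion cost added to the per-element work, and the numbers are arbitrary, hypothetical time units, not measurements from these benchmarks.

```python
def total_time(n, overhead_per_item, work_per_item):
    """Total time for n elements under a fixed per-element overhead model."""
    return n * (overhead_per_item + work_per_item)


def overhead_share(overhead_per_item, work_per_item):
    """Fraction of the total time spent in conversion overhead."""
    return overhead_per_item / (overhead_per_item + work_per_item)


N = 10_000
OVERHEAD = 1.0          # hypothetical cost of one C++ <-> Python conversion
TRIVIAL_WORK = 0.1      # hypothetical cost of a trivial function body
EXPENSIVE_WORK = 100.0  # hypothetical cost of a predicate such as intersects

share_trivial = overhead_share(OVERHEAD, TRIVIAL_WORK)
share_expensive = overhead_share(OVERHEAD, EXPENSIVE_WORK)

print(f"trivial:   total={total_time(N, OVERHEAD, TRIVIAL_WORK):.0f}, "
      f"overhead share={share_trivial:.1%}")
print(f"expensive: total={total_time(N, OVERHEAD, EXPENSIVE_WORK):.0f}, "
      f"overhead share={share_expensive:.1%}")
```

Under this model the same absolute overhead dominates the trivial case and is nearly invisible in the expensive one, which matches the pattern described for the two figures.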

