Dying in specialization & generics #801

Open
kylebarron opened this issue Sep 29, 2024 · 1 comment

Comments

@kylebarron
Member

I'm currently dying in generics.

It's making it harder to move fast and adds a ton of development overhead. If there's one lesson I should learn from DuckDB Spatial, it's that I'm not moving fast enough.

Any algorithms that return float values, such as area, should only take in a &dyn NativeArray + NativeArrayAccessor<2>.

So we want something like a NativeArrayAccessor<const D: usize>, so that we can implement algorithms with signatures like

pub fn intersection(lhs: &dyn NativeArray + NativeArrayAccessor<2>, rhs: &dyn NativeArray + NativeArrayAccessor<2>) -> GeometryArray
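One wrinkle with the signature above: Rust only accepts `dyn A + B` when `B` is an auto trait, so the two traits would probably need to be tied together with a supertrait bound. Here is a minimal sketch of that idea, assuming a much-simplified `NativeArray` and `NativeArrayAccessor` (the method names, `PolygonArray`, and the `shoelace` helper are all hypothetical stand-ins, not the real geoarrow-rs API). It shows a float-returning algorithm written once against `&dyn NativeArrayAccessor<2>`.

```rust
/// Hypothetical, simplified stand-in for the array-level trait.
pub trait NativeArray {
    fn len(&self) -> usize;
}

/// Hypothetical accessor, generic over dimension only. The supertrait bound
/// lets `&dyn NativeArrayAccessor<2>` also expose the `NativeArray` methods.
pub trait NativeArrayAccessor<const D: usize>: NativeArray {
    /// Exterior ring of the polygon at `index`, or `None` for a null slot.
    fn exterior(&self, index: usize) -> Option<&[[f64; D]]>;
}

/// Hypothetical concrete array: one optional exterior ring per slot.
pub struct PolygonArray<const D: usize> {
    pub rings: Vec<Option<Vec<[f64; D]>>>,
}

impl<const D: usize> NativeArray for PolygonArray<D> {
    fn len(&self) -> usize {
        self.rings.len()
    }
}

impl<const D: usize> NativeArrayAccessor<D> for PolygonArray<D> {
    fn exterior(&self, index: usize) -> Option<&[[f64; D]]> {
        self.rings[index].as_deref()
    }
}

/// Unsigned planar area of a ring via the shoelace formula.
fn shoelace(ring: &[[f64; 2]]) -> f64 {
    let n = ring.len();
    let twice_area: f64 = (0..n)
        .map(|i| {
            let [x0, y0] = ring[i];
            let [x1, y1] = ring[(i + 1) % n];
            x0 * y1 - x1 * y0
        })
        .sum();
    (twice_area / 2.0).abs()
}

/// Written once against the trait object, not once per array type.
pub fn area(array: &dyn NativeArrayAccessor<2>) -> Vec<Option<f64>> {
    (0..array.len())
        .map(|i| array.exterior(i).map(|ring| shoelace(ring)))
        .collect()
}
```

Any concrete array that implements the accessor can be passed as `area(&array)`; the cost is dynamic dispatch per element access, which may or may not matter next to the geometry work itself.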

This also means we can greatly simplify our IndexedArray implementation too. We should switch to IndexedNativeArray (or maybe IndexedNativeArray<const D: usize>) which holds only an `Arc<>

Also, anything like area that is implemented on geo::Geometry (or geos::Geometry) should be implemented solely on &dyn NativeArray + NativeArrayAccessor<2> (or maybe allow 3d as well there) and not on every array type.

The other issue here is that the Geometry scalar is generic over both dimension and the offset size of the underlying NativeArray. This is especially painful as the dimension matters for the Geometry but the offset size does not. It's only an artifact of the underlying storage.
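To make the point concrete, here is a rough sketch of what a scalar looks like when the offset type is fixed and kept inside the array, so the scalar is generic over dimension only. `LineStringArray`, `LineString`, and their methods are hypothetical simplifications for illustration, not the current geoarrow-rs types.

```rust
/// Hypothetical array: flat coordinate buffer plus i32 geometry offsets.
/// The offset type is a storage detail and never appears in the scalar type.
pub struct LineStringArray<const D: usize> {
    coords: Vec<[f64; D]>,
    geom_offsets: Vec<i32>,
}

/// Hypothetical scalar: borrows its coordinates, generic over dimension only.
pub struct LineString<'a, const D: usize> {
    coords: &'a [[f64; D]],
}

impl<const D: usize> LineStringArray<D> {
    /// `geom_offsets` is assumed to be non-decreasing and to end at `coords.len()`.
    pub fn new(coords: Vec<[f64; D]>, geom_offsets: Vec<i32>) -> Self {
        Self { coords, geom_offsets }
    }

    /// Number of geometries (one fewer than the number of offsets).
    pub fn len(&self) -> usize {
        self.geom_offsets.len().saturating_sub(1)
    }

    /// Resolve the offsets here, so callers never see them.
    pub fn value(&self, index: usize) -> LineString<'_, D> {
        let start = self.geom_offsets[index] as usize;
        let end = self.geom_offsets[index + 1] as usize;
        LineString { coords: &self.coords[start..end] }
    }
}

impl<'a, const D: usize> LineString<'a, D> {
    pub fn num_coords(&self) -> usize {
        self.coords.len()
    }

    pub fn coord(&self, index: usize) -> [f64; D] {
        self.coords[index]
    }
}
```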

I'm getting closer to removing support internally for i64 offset arrays. We can support ingesting that data but only represent i32 data internally. To overflow i32 offsets, a single array would need more than 2^31 - 1 (2,147,483,647) coordinates; at 16 bytes per 2D f64 coordinate that is 34,359,738,352 bytes, or about 32 GiB.
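A small sketch of both halves of that paragraph, using only the standard library: the capacity arithmetic for i32 offsets (assuming 2D coordinates stored as two f64 values), and a checked narrowing of i64 offsets at ingest time. The function names are hypothetical; the real ingest path would operate on Arrow offset buffers.

```rust
use std::num::TryFromIntError;

/// Bytes per coordinate, assuming 2D coordinates stored as two f64 values.
pub const BYTES_PER_COORD_2D: u64 = 2 * std::mem::size_of::<f64>() as u64;

/// Size of the coordinate buffer when the last i32 offset sits at `i32::MAX`.
pub fn max_i32_coord_bytes() -> u64 {
    i32::MAX as u64 * BYTES_PER_COORD_2D
}

/// Narrow i64 offsets to i32 on ingest, failing if any offset does not fit.
pub fn narrow_offsets(offsets: &[i64]) -> Result<Vec<i32>, TryFromIntError> {
    offsets.iter().map(|&offset| i32::try_from(offset)).collect()
}
```

With this shape, oversized input is rejected (or could be chunked) at the boundary, and everything downstream only ever sees i32 offsets.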

The binary operations especially are a total disaster right now. This file is 750 lines of code just to implement intersections, when we aren't even doing any of the core implementation ourselves! This should be 10 lines of code, to take in a &dyn NativeArray + NativeArrayAccessor<2>, iterate over its geoarrow::scalar::Geometry<2> objects, and call intersects on each one.
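For a sense of how small the binary operation could get, here is a self-contained sketch under heavy simplifying assumptions: a hypothetical `GeometryAccessor` trait stands in for the combined accessor, and an axis-aligned `Rect` with a bounding-box `intersects` stands in for the real geometry scalar and predicate. None of these names come from geoarrow-rs or geo.

```rust
/// Hypothetical stand-in for the geometry scalar: an axis-aligned box.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect {
    pub min: [f64; 2],
    pub max: [f64; 2],
}

impl Rect {
    /// Boxes intersect when their extents overlap on both axes.
    pub fn intersects(&self, other: &Rect) -> bool {
        (0..2).all(|d| self.min[d] <= other.max[d] && other.min[d] <= self.max[d])
    }
}

/// Hypothetical dyn-compatible accessor shared by every array type.
pub trait GeometryAccessor {
    fn len(&self) -> usize;
    /// Scalar at `index`, or `None` for a null slot.
    fn value(&self, index: usize) -> Option<Rect>;
}

pub struct RectArray(pub Vec<Option<Rect>>);
pub struct PointArray(pub Vec<Option<[f64; 2]>>);

impl GeometryAccessor for RectArray {
    fn len(&self) -> usize {
        self.0.len()
    }
    fn value(&self, index: usize) -> Option<Rect> {
        self.0[index]
    }
}

impl GeometryAccessor for PointArray {
    fn len(&self) -> usize {
        self.0.len()
    }
    /// A point is exposed as a degenerate box.
    fn value(&self, index: usize) -> Option<Rect> {
        self.0[index].map(|p| Rect { min: p, max: p })
    }
}

/// The whole binary operation: pair up scalars, propagate nulls, call the predicate.
pub fn intersects(lhs: &dyn GeometryAccessor, rhs: &dyn GeometryAccessor) -> Vec<Option<bool>> {
    assert_eq!(lhs.len(), rhs.len(), "arrays must have equal length");
    (0..lhs.len())
        .map(|i| lhs.value(i).zip(rhs.value(i)).map(|(a, b)| a.intersects(&b)))
        .collect()
}
```

Because both arguments are trait objects, one function covers every pairing of array types, which is where the per-type combinatorial code would otherwise come from.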

@kylebarron changed the title from "Dying in generics" to "Dying in specialization & generics" on Sep 29, 2024
@kylebarron
Member Author

#803 removed i64 support, so we're making progress towards simplification.
