Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GeoDataFrame init #909

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

GeoDataFrame init #909

wants to merge 6 commits into from

Conversation

AndreiKingsley
Copy link
Collaborator

No description provided.

Copy link
Collaborator

@Jolanrensen Jolanrensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good initial version :) just some comments here and there

dataframe-geo/build.gradle.kts Show resolved Hide resolved
import org.jetbrains.kotlinx.dataframe.api.with

class GeoDataFrame<T : GeoFrame>(val df: DataFrame<T>, val crs: CoordinateReferenceSystem?) {
fun update(updateBlock: DataFrame<T>.() -> DataFrame<T>): GeoDataFrame<T> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider other names? Like

  • updateDf
  • replaceDf (more consistent with what replace does in DF compared to update)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace implies that we have df1 and df2, and we change df1 to df2. But in reality we are only modifying df1 slightly, using it as a receiver.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, what do you think about modifyDf then?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion "Df" is redundant and doesn't look good. Plus it's already clear from signature that it applies to DataFrame

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe .modify {} or .transform {} are okay?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transform doesn't look good to me.

Copy link
Collaborator Author

@AndreiKingsley AndreiKingsley Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@koperagen omitting "df" makes function not so obvious that it's the dataframe update and not something else.

dataframe-geo/build.gradle.kts Show resolved Hide resolved

object Geocoder {

private val url = "https://geo2.datalore.jetbrains.com/map_data/geocoding"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be included in the stock library? As far as I remember this was a reverse engineering endeavour and since it depends on an external URL behaviour might change or stop working entirely

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this experimental module, I am totally fine with including in library.

  1. It's stable for enough time
  2. We not guarantee for a while all our releases of experimental geo-module

But we could also include this solution for the @Jolanrensen question:

  1. Checking the API availability and handling the unavailability with some exception
  2. Checking the API consistency and notification for the users or raise an exception
  3. Customize it with a URL parameter (if this API just will be moved in another place)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Geocoder is not part of the main API and will be removed/reworked completely (via future Kotlin geocoding library we discussed). I kept it, exclusively to play with it. So no effort should be wasted on it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably fine then :) You could add a small note in KDocs too telling this is experimental/temporary, so potential users won't get used to it.

settings.gradle.kts Show resolved Hide resolved
Copy link
Collaborator

@zaleslaw zaleslaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, I am approving these changes, but want to summarize some thoughts and make this piece of code more mature and maintanable

repositories {
// geo repositories should come before Maven Central
maven("https://maven.geotoolkit.org")
maven("https://repo.osgeo.org/repository/release")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it true, that these maven repo links will be hidden in the module, and the final user don't need to refer to them somehow in its Gradle script? (just for self checking)

Copy link
Collaborator Author

@AndreiKingsley AndreiKingsley Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the user will need to add the repository. However, I believe we can fix this with shadow jar.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check whether you can do that before this is merged

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave it that way for now, then I'll try to get rid of it. When using a descriptor in notebook, the user won't need to do this.

useJUnitPlatform()
}
kotlin {
jvmToolchain(11)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we downgrade it somehow to 8?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a global problem - DataFrame uses 8 and the Jupyter API uses 11. Therefore, in the future it will be necessary to allocate separate module(s) for integrations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but for now, for convenience, I'd keep it 11

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, you can probably remove this piece, as kotlin.jvmToolChain(11) is already called from the main build.gradle.kts file; It configures all subprojects automatically to target jvm 8 in the published jars while setting the jvm toolchain (so the version gradle uses to build and test the project) to 11.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

import org.locationtech.jts.geom.Polygon

@DataSchema
interface GeoFrame {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is lead in the wrong direction. It's not GeoFrame. It's close to wrappers for Geometry or Form or GeometryForm

I agree, that geometry is produced from Geo, but it's misleading me

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we have Geometry from jts, what do you think about Form or Figure or GeometryFigure/GeometryType or GeometryColumnType, verbose but clear

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WithGeometry?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


object Geocoder {

private val url = "https://geo2.datalore.jetbrains.com/map_data/geocoding"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this experimental module, I am totally fine with including in library.

  1. It's stable for enough time
  2. We not guarantee for a while all our releases of experimental geo-module

But we could also include this solution for the @Jolanrensen question:

  1. Checking the API availability and handling the unavailability with some exception
  2. Checking the API consistency and notification for the users or raise an exception
  3. Customize it with a URL parameter (if this API just will be moved in another place)

return transformation.transform(this)
}

fun Geometry.translate(valueX: Double, valueY: Double): Geometry {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you consider renaming to somthing like shift/move for simplicity

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "translate" is a common name in the math/geometry-world

Copy link
Collaborator

@zaleslaw zaleslaw Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main issue, that we are doing this not for math guys, but for Kotlin developers who are not familiar with all these concepts, and I am trying to make the naming simpler and more obvious

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a well-established concept, so I can't agree:
https://en.wikipedia.org/wiki/Translation_(geometry)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Andrei on this one. "translate" is common enough I think

}

companion object {
val DEFAULT_CRS = CRS.decode("EPSG:4326", true)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be ENUM of different constants

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it even exists in some form in GeoTools (see DefaultEngineeringCRS). But here is a more global question - whether it is necessary to use geotools API directly or should we wrap it completely.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest not wrap for now, we could not avoid the learning of that


fun GeoDataFrame<*>.writeGeoJson(path: String): Unit = writeGeoJson(File(path))

fun GeoDataFrame<*>.writeGeoJson(file: File) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add for IO part some JUnit tests for reading/writing files close to your examples https://gist.github.com/AndreiKingsley/8eada06f174c681e953f8d939248d153

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@Kotlin Kotlin deleted a comment from zaleslaw Nov 6, 2024
@@ -0,0 +1,79 @@
package org.jetbrains.kotlinx.dataframe.geo.jts
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo in filename: *Extensions



fun GeoDataFrame.Companion.readGeoJson(path: String): GeoDataFrame<*> {
return readGeoJson(asURL(path))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing the linter will complain about is that these should be expression functions. One ktLintFormat can replace them all at once tho

import org.jetbrains.kotlinx.dataframe.api.update
import org.jetbrains.kotlinx.dataframe.api.with

/**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good docs :D

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave you a compliment haha

*/
class GeoDataFrame<T : WithGeometry>(val df: DataFrame<T>, val crs: CoordinateReferenceSystem?) {
/**
* Updates the `GeoDataFrame` using a specified transformation block on the underlying DataFrame.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only thing I'd mention in these functions is that it doesn't simply "update" (which would mean the class is mutable), but it returns a new GeoDataFrame copy with the dataframe being replaced

)
}

override fun equals(other: Any?): Boolean {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing hashCode and toString functions

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

}

@get:JvmName("geometry")
val <T : WithGeometry> ColumnsContainer<T>.geometry: DataColumn<Geometry>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the T is not needed, right? ColumnsContainer<WithGeometry> has the same effect

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, for the generated accessor we have both ColumnsContainer and DataRow extensions, so probably you should add DataRow<WithGeometry>.geometry: Geometry etc. too :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@koperagen could plugin help here, to generate all these extensions?

@Jolanrensen
Copy link
Collaborator

Left a couple more comments :) Thanks for fixing some stuff already!
Also, don't forget to update from master. Most important recent changes include:

  • a README.md file in each module with a small description of what the module does
  • binary compatibility checker, you can add the plugin, run apiDump and add the api file to git just before merging this PR

import org.jetbrains.kotlinx.dataframe.geo.WithGeometry
import org.locationtech.jts.geom.Geometry

fun SimpleFeatureCollection.toGeoDataFrame(): GeoDataFrame<*> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document it with KDocs, looks like it will not change too much in future

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth to add in comment the summarization of the underlying algorithm

import org.jetbrains.kotlinx.dataframe.geo.GeoDataFrame
import org.locationtech.jts.geom.Geometry

fun GeoDataFrame<*>.toSimpleFeatureCollection(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a KDoc, with tiny explanation (especially if some edge cases are not working for now)

Copy link
Collaborator

@zaleslaw zaleslaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me it looks much more mature today, please fix comments/linter and it could be merged, I believe

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants