Revisit accessor name generation #922

Open
Jolanrensen opened this issue Oct 16, 2024 · 0 comments
Labels
research This requires a deeper dive to gather a better understanding

@Jolanrensen
Collaborator

Brought to attention by #911

Column names can contain any symbol. This is important to support reading and writing any format.
Accessors, however, don't support all symbols due to limitations of the JVM.

Identifiers need to follow the spec:

  • (Letter | '_') {Letter | '_' | UnicodeDigit} is allowed without backticks
    • Letter: any unicode character of categories Lu, Ll, Lt, Lm or Lo
    • UnicodeDigit: any unicode character of category Nd
  • '`' QuotedSymbol {QuotedSymbol} '`'
    • QuotedSymbol: any character excluding CR, LF and '`' (so a backtick itself cannot be written inside a name quoted with backticks)
  • ., ;, [, ], /, <, >, :, \\ are never allowed

Source: https://kotlinlang.org/spec/syntax-and-grammar.html#identifiers
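The rules above can be sketched as a small classifier. This is only an illustration, not the generator's actual code: `IdentifierKind`, `forbiddenChars`, `isSpecLetter` and `classify` are hypothetical names, and the sketch ignores Kotlin keywords (which also need backticks) and characters outside the Basic Multilingual Plane.

```kotlin
// Hypothetical helper types and names, for illustration only.
enum class IdentifierKind { PLAIN, NEEDS_BACKTICKS, NOT_ALLOWED }

// The characters listed above as never allowed, plus the backtick, CR and LF,
// which cannot appear inside a backtick-quoted name.
val forbiddenChars = setOf('.', ';', '[', ']', '/', '<', '>', ':', '\\', '`', '\r', '\n')

// "Letter" in the spec: Unicode categories Lu, Ll, Lt, Lm or Lo.
fun isSpecLetter(c: Char): Boolean = when (c.category) {
    CharCategory.UPPERCASE_LETTER,
    CharCategory.LOWERCASE_LETTER,
    CharCategory.TITLECASE_LETTER,
    CharCategory.MODIFIER_LETTER,
    CharCategory.OTHER_LETTER -> true
    else -> false
}

fun classify(name: String): IdentifierKind {
    // An empty name or any forbidden character cannot be expressed, even with backticks.
    if (name.isEmpty() || name.any { it in forbiddenChars }) return IdentifierKind.NOT_ALLOWED

    // (Letter | '_') {Letter | '_' | UnicodeDigit}
    val first = name.first()
    val plainStart = first == '_' || isSpecLetter(first)
    val plainRest = name.drop(1).all {
        it == '_' || isSpecLetter(it) || it.category == CharCategory.DECIMAL_DIGIT_NUMBER
    }
    return if (plainStart && plainRest) IdentifierKind.PLAIN else IdentifierKind.NEEDS_BACKTICKS
}
```

With this sketch, `classify("colName")` falls in the first category, `classify("my col")` needs backticks, and `classify("name.first")` cannot be used as an accessor name without conversion.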

To support QuotedSymbol characters, our generator automatically inserts backticks where needed.
For disallowed characters, we use the following conversion:

[image: table of character conversions]

With this conversion, columns from the data become accessible like this:

  • "my::colName" -> df.`my - colName`
  • "Dwayne `The Rock` Johnson" -> df.`Dwayne 'The Rock' Johnson`
  • "name.first" -> df.`name first`
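A minimal sketch that reproduces these three examples, assuming only the replacements visible in them ("::" to " - ", "`" to "'", "." to " "). The names `exampleReplacements` and `toAccessorName` are hypothetical, the real generator uses a larger conversion table, and keywords are not handled here.

```kotlin
// Replacements inferred only from the three examples above (an assumption);
// the generator's full conversion table is not reproduced here.
val exampleReplacements = listOf(
    "::" to " - ",
    "`" to "'",
    "." to " ",
)

fun toAccessorName(columnName: String): String {
    // Apply the replacements in order.
    val converted = exampleReplacements.fold(columnName) { acc, (old, new) ->
        acc.replace(old, new)
    }
    // Names matching (Letter | '_') {Letter | '_' | UnicodeDigit} need no backticks.
    val isPlain = Regex("""[\p{L}_][\p{L}\p{Nd}_]*""").matches(converted)
    return if (isPlain) converted else "`${converted}`"
}
```

Under these assumptions, `toAccessorName("name.first")` yields the backtick-quoted name `name first`, while a simple name such as `age` is returned unchanged.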

These conversions are chosen to cause as few clashes as possible, but some of the choices are confusing.
For instance, "." becomes " " instead of "_".
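To make the trade-off concrete, here is a small sketch that reports which column names would end up with the same accessor name for a given replacement of ".". `findClashes` is a hypothetical helper, and it only models the "." conversion, not the rest of the table.

```kotlin
// Groups column names by the accessor name they would get when "." is replaced
// by `dotReplacement`, and keeps only the groups where two or more columns collide.
fun findClashes(columnNames: List<String>, dotReplacement: String): Map<String, List<String>> =
    columnNames
        .groupBy { it.replace(".", dotReplacement) }
        .filterValues { it.size > 1 }
```

For the columns "name.first", "name first" and "name_first", replacing "." with " " makes the first two collide, while replacing it with "_" makes the first and third collide. Either choice can clash; which one is rarer in real data is part of what needs research.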

This needs some research and feedback.

Jolanrensen added the research label Oct 16, 2024
Jolanrensen added this to the Backlog milestone Oct 16, 2024