Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve resolution using expected type #379

Open
serras opened this issue Jun 17, 2024 · 41 comments
Open

Improve resolution using expected type #379

serras opened this issue Jun 17, 2024 · 41 comments

Comments

@serras
Copy link
Contributor

serras commented Jun 17, 2024

This is an issue to discuss improve resolution using expected type. The current full text of the proposal can be found here.

We propose an improvement of the name resolution rules of Kotlin based on the expected type of the expression. The goal is to decrease the amount of qualifications when the type of the expression is known.

@serras
Copy link
Contributor Author

serras commented Jun 17, 2024

⚠️ The proposal mentions some non-goals for the KEEP, in particular relating to improving resolution in calls. Please read that section carefully.

@FilipDolnik
Copy link

FilipDolnik commented Jun 17, 2024

Hi!

First of all, I want to say that I really like this idea, and I've been missing this feature in Kotlin for a while.

I have one question/comment about the decision not to use the Swift syntax .value (or something similar).

I use both Kotlin and Swift, with Kotlin as my primary language. I personally strongly prefer the .value syntax because it makes it easier to understand exactly where the value comes from. On the other hand, I tend to avoid using the workaround with static imports in Kotlin, which can be used to simulate this feature (without the .), because I've run into too many situations when it creates confusion. (As the document mentions, developers ignore the import section.)

To be fair, this is partially because I only use the PascalCase naming convention for enum cases (which works better with the naming convention of sealed classes). Therefore, it's easy to create name collisions with another class or, worse, an object.

The document mentions two issues why this wasn't the chosen solution:

The big drawback of this choice is that new syntax ambiguities may arise, and these are very difficult to predict upfront. Code that compiled perfectly may now have an ambiguous reading by the compiler.

Furthermore, Kotliners already use a pattern to avoid re-writing the name of an enumeration over and over, namely importing all the enumeration entries, which makes them available without any . at the front. It seems better to support the style that the community already uses than trying to make people change their habits.

Let's start with the second one. The issue here is that there is currently no other way to do it, so just because people use it doesn't necessarily mean it's the better option or that they wouldn't want to switch to the new syntax (also, the old syntax doesn't go away).

As for the first issue, I can't think of any situation where this new syntax would create a problem because I don't think Kotlin has any expression and type declaration that can start with a .. However, I understand that this is difficult to prove and predict. But I think this is solvable by having a longer beta cycle and the usual optional compiler flag + warnings, then enabling the flag by default, as is the case for other features.

It wouldn't make sense to bother with the issues mentioned above if there weren't any advantages to compensate for them. However, I think this dot syntax has significant benefits, or to put it better; the currently proposed syntax has some significant drawbacks.

First of all, clarity. The . syntax makes it immediately obvious what the expression is. Compared to the proposed syntax, in which case it can be basically anything from anywhere (like an object, class, etc.), and you have to rely on the IDE (or manually looking at the imports) to recognize that. (my main issue with the static imports)

Second: With ., it's possible to use this feature even in a scope that contains this name conflict (with the proposed syntax, you have to use the whole name or resolve the name collision).

Third: Having the . syntax should make it less likely that we will run into some unsolvable issue when this feature is extended to cover all the other use cases (like, for example, the overloading resolution).

Last but not least, the proposed syntax allows for creating nasty semantic bugs that the compiler cannot always catch. For example, given this code:

sealed interface I {  
      
    object SomeClass : I
      
    interface SomeInterface : I
}
import some.package.*

fun foo(i: I) = when (i) {  
    SomeClass -> println("some case")  
    else -> println("default")  
}

Now add this object:

package some.package

object SomeClass : I.SomeInterface

The result is a change in the semantics of the foo function due to a name collision.

Yeah, this is a convoluted example, and in most cases, the result will be a compilation error (still not good, but better). However, I wouldn't be surprised if there were more realistic examples.

I also don't like that this feature's behavior depends on the imports. So many times, I have experienced that the indices in IDEA get corrupted, and the IDE removes some "unnecessary" imports. I somehow have a feeling that this feature could be especially prone to this problem. The result would again be a potentially silent change to the program's behavior in the worst case.

Note that this issue will occur more frequently if the project uses the camelCase naming convention for things like enum cases. I know that this is not in line with the official code style, but it will be the default used by Swift developers who use Kotlin as their secondary language and expect this feature not to have this problem.

@serras
Copy link
Contributor Author

serras commented Jun 17, 2024

As for the first issue, I can't think of any situation where this new syntax would create a problem because I don't think Kotlin has any expression and type declaration that can start with a ..

The problem arises when you combine .value with infix functions. Right now the following expression:

thing
  .foo

is not ambiguous: access member foo on receiver thing. However, once we bring .foo as a potential expression into the mix, we would need some heuristic to decide whether we should parse it as an infix function applied to .foo or at the access member.

Second: With ., it's possible to use this feature even in a scope that contains this name conflict

I don't 100% follow here. The current proposal allows for situations like the following, where the resolution of ON depends on the type of Status.

enum class Status { ON, OFF }
enum class Switch { ON, OFF }

fun f(s: Status) = when(s) {
  ON -> ...
  OFF -> ...
}

@daniel-rusu
Copy link

daniel-rusu commented Jun 18, 2024

I like how this KEEP reduces clutter and improves readability for most scenarios.

Note that value equalities (==, !=) are not part of that group, since they are always resolved to kotlin.Any.equals

Inline value classes have a special equals operator function that isn't overriding Any.equals (behind a feature flag). Similarly to the Color example from the KEEP, will constants from the companion object of an inline value class benefit from this feature?

Some concerns:

  1. Bringing constants or enums into the scope seems natural. However, bringing functions is quite surprising.
  2. Since this has the lowest precedence, I'm really concerned that introducing new unrelated code can auto-magically change the behavior of existing code without even realizing it. For example, if someone creates a function with the same signature, my existing code automatically binds to that new function. This is especially likely in large projects where others can define public top-level functions (or constants of the same type) in a completely different part of the codebase (which is common with junior developers) and then the behavior accidentally changes.
  3. Although the lowest-precedence rule is there to resolve ambiguities, it seems counter-intuitive to existing precedence rules. For example, when having similar-looking code by using a lambda with receiver (eg. the with function), function calls on the receiver take precedence over top-level functions with the same signature. However, the opposite is true with the lowest-precedence rule which is surprising.

Reversing the precedence order would address concerns 2 & 3 but I'm not sure if that introduces other problems.

Alternatively, I think that using the .VALUE or .function() syntax would address all these concerns as it makes it obvious where the function or value is coming from and avoids accidentally changing the binding as the syntax signifies that it's coming from the current scope. I think that using this new feature alongside infix functions should be much less common, so using heuristics to resolve that scenario along with the .VALUE syntax seems like a much safer approach (perhaps lower precedence after current compiler rules).

@gildor
Copy link
Contributor

gildor commented Jun 18, 2024

I think .value syntax is disregarded too soon, it would be great to add a section about it just to discuss potential advantages of it. Like already mentioned above and the fact that it can be easier to use for other important cases like function call with expected type switch(.ON)

One more advantage of .value is very clear for IDE, I understand that it may be improved for proposed syntax to prioritize expected type, but .<caret><TAB> makes it very clear and get only relevant suggestions for auto complete

Having it explicit is actually looks pretty good to me and .value syntax looks pretty natural for this use case and already well-tested in Swift and I don't see it as an issue for Kotlin developers, rather opposite, when existing importa approach is mixed with proposed expected type

@FilipDolnik
Copy link

FilipDolnik commented Jun 18, 2024

As for the first issue, I can't think of any situation where this new syntax would create a problem because I don't think Kotlin has any expression and type declaration that can start with a ..

The problem arises when you combine .value with infix functions. Right now the following expression:

thing
  .foo

is not ambiguous: access member foo on receiver thing. However, once we bring .foo as a potential expression into the mix, we would need some heuristic to decide whether we should parse it as an infix function applied to .foo or at the access member.

Oh, I see. So the issue is that there can be a infix fun SomeType.foo(a: Int) in SomeEnum.

I think that the use of infix function calls in this way should be prohibited. I'm not convinced that such a pattern should exist because it is confusing even for the developer, not just the parser.

I also do not think that it would be used that often. Infix functions are generally used relatively rarely outside of DSLs, and in DSLs, it's common to achieve similar syntax (without the dot) by using contexts. Regular code can also do the same using the static imports.

Alternatively, we can give priority to the other scopes - just like the proposal does for the name collisions. This kind of collision will be much rarer, but it would complicate the parsing, and I'm not sure it's worth it.

Second: With ., it's possible to use this feature even in a scope that contains this name conflict

I don't 100% follow here. The current proposal allows for situations like the following, where the resolution of ON depends on the type of Status.

enum class Status { ON, OFF }
enum class Switch { ON, OFF }

fun f(s: Status) = when(s) {
  ON -> ...
  OFF -> ...
}

The issue I mentioned is about a situation in which you already have the case "ON" in scope. In that case it would always take precedence even though it's not the correct type (assuming I understand the KEEP correctly).

For example, following is not possible:

sealed interface Json {

  class String : Json

  class List : Json
}


fun f(json: Json) = when (json) {
  is String -> ...
  is List -> ...
}

(However, it would be possible with the . syntax.)
This is also a good example of the last issue I mentioned:

fun f(json: Json) = when (json) {
  is Json.String -> ...
  is List -> ...
}

This code would work just fine until you import List in the file.

@CLOVIS-AI
Copy link

Hi!

I really like this proposal, I don't have much to say about it.

I'm 50/50 on member functions being included. The main use-case I see for including member functions is when building complex objects:

AComplexObject(
    ANestedComplexObject(
        …
    )
)

but since the current conventions are to create constructors/factories as top-level entities, and this KEEP explicitly excludes argument resolution, this isn't a very compelling use-case. I wonder if excluding member functions would be more confusing for beginners ("why does it work for some things and not for others") than including them.

One potential risk of this proposal is the difficulty of understanding when exactly it is OK to drop the qualifier, which essentially corresponds to understanding the propagation of the expected type through the compiler.

I think this is mostly mitigated by the fact that IntelliJ will always let you write the name of the enum element, auto-complete, and will add the qualified type automatically if necessary (which is the current behavior).


Now, on the .VALUE syntax discussion.

Currently, there are two main ways enums are used:

  • With explicit naming (Foo.Bar),
  • With star-import (Bar).

If we are in a codebase that uses star-import (and many of them do), the arrival of this language feature with .VALUE syntax would mean:

  • Replacing all usages of Bar by .Bar
  • Removing all star-imports

To me, this is clearly a regression. If a codebase has to be made more verbose to accommodate a new language feature made to simplify this exact use-case, then I believe it is a failure of the language feature.

Before:

import TransferType.*

when (transfer) {
    Entry ->InternalTransfer ->Exit -> …
}

After:

when (transfer) {
    .Entry -> …
    .InternalTransfer -> …
    .Exit -> …
}

Why add syntax when the original code worked already?

I much prefer avoiding adding any new syntax at all, and simply codifying the existing pattern and conventions into the compiler.

Without the .VALUE syntax, migrating code looks like:

  • The star-import is now flagged as unused by the IDE.

That's it. Before:

import TransferType.*

when (transfer) {
    Entry ->InternalTransfer ->Exit -> …
}

After:

when (transfer) {
    Entry ->InternalTransfer ->Exit -> …
}

Adding new syntax means having to teach users about the new syntax. Sure, the .VALUE syntax is easy to explain, but it also doesn't provide enough value to require taking time from a beginner. Allowing the compiler to make the existing code patterns safer and less verbose with no user intervention seems to me like a much better choice.

So, I'm in favor of keeping the KEEP as-is: purely about resolution, with no new syntax at all.

@FilipDolnik
Copy link

If we are in a codebase that uses star-import (and many of them do), the arrival of this language feature with .VALUE syntax would mean:

  • Replacing all usages of Bar by .Bar
  • Removing all star-imports

The static imports do not go away, so existing code doesn't have to be migrated at all, even with the .value syntax.

Why add syntax when the original code worked already?

I'm not convinced that it does, at least not universally. The static imports create name collisions. The syntax without the dot has the same issue, potentially much worse than the static imports.

Depending on the kind of code you write they can be very common. For example when you write mapping between DTOs and domain objects - they tend to use the same names for nested types.

To me, this is clearly a regression. If a codebase has to be made more verbose to accommodate a new language feature made to simplify this exact use-case, then I believe it is a failure of the language feature.

I think readability is more important than reducing verbosity (especially when we talk about an extra dot). I consider the .value syntax to be much easier to understand, and it doesn't have the issues with name collisions. Also, as was mentioned above, it works better with code completion in the IDE, so that's already a significant improvement over the static imports (and the proposed syntax).

@gildor
Copy link
Contributor

gildor commented Jun 18, 2024

Without the .VALUE syntax, migrating code looks like:
The star-import is now flagged as unused by the IDE.

I'm not sure that it's true, considering that the expected type resolution has the lowest priority, so at least from compiler point of view, import is used
Of course IDE could be smart enough to detect it, but still

doesn't provide enough value to require taking time from a beginner

Beginners will follow what IDE would suggest, now it's a non-static version, if it would change to .value, they will use it instead, I think it's very easy to understand
And the value comes not from saving one dot, or the potential learning curve for beginners (which is questionable IMO), but from readability and making it explicit, also to support more complex use cases like function invocation with expected type, methods (I would much prefer for example Button(.width(10.dp)) instead of implicit Button(width(10.dp)) or current Button(Modifier.width(10.dp)), especially when I have multiple params Button(.width(10.dp), .BLUE, stringResource(.string.OK))

@CLOVIS-AI
Copy link

I guess this is going to be an issue of taste, but I don't think

Button(.width(10.dp), .BLUE, stringResource(.string.OK))

looks like Kotlin, whereas this KEEP aims towards:

Button(width(10.dp), BLUE, stringResource(string.OK))

I don't think this KEEP (which is about type resolution, which is almost a detail to most users) should be the start of such a major syntactical change to the language. And anyway, this KEEP explicitly excludes function calls.

@serras
Copy link
Contributor Author

serras commented Jun 18, 2024

Let me try to spell out some of the concerns and how important they were in the current design.

☣️ Full backward compatibility: every program which currently compiles should keep compiling with the same behavior and no additional warnings.

  • Just to clarify, we do not expect people to change existing code when this feature is enabled. So there's no need to "migrate", in the strongest sense of the word; you can of course reduce some imports by taking this into account.
  • Note that even though we may want to guide users into a certain pattern, we cannot just decide "people should not write code this way", and break some existing codebase.

🛑 Toolability: it should not become way more difficult to provide tooling for this feature.

⚠️ Forward compatibility: if a program compiles with this flag on, it should be more or less resilient to changes.

  • It should suffer from the same problems as the rest of Kotlin code. For example, right now you can change resolution if you introduce a function in a class which hides an imported extension function, since the former has more priority than the latter.

⚠️ User expectations: the feature should not guide users into wrong expectations, as far as possible.


About .value syntax:

  • Backward compatibility
    • This variant introduces new syntax for type-guided resolution, so once we've parsed the file, there's no overlap with any other resolution, so we ensure backward compatibility.
    • Alas, it's not clear that we can actually parse files in a fully backward compatible way if we introduce that syntax. The reason is that foo<newline>.bar is now unambiguously a field access, but it may be get another interpretation with .value syntax. Note that using <newline>. is very common in Kotlin code.
    • As a result, any variation using this syntax should have a good answer to the question "how to parse things unambiguously?"
  • Toolability: one possibility would be to make the parsing dependent on some compiler flag. However, that means that now parsing the file depends on some external input (Gradle file), so tools need to be updated to consult this (which is not always trivial).
    • You can see that other KEEPs extend syntax in a backward compatible way, so we can update the parser to the larger language, and then emit a warning if a new feature is being used.
  • Forward compatibility: again, .value syntax is the most explicit, so the chances of changing resolution are lower.
    • And since all type-guided resolution is at the same level, we would get at most an ambiguity error, not a resolution change.
  • User expectations: one very important difference with Swift is that we take a restrictive route, in which we are very clear about when you can expect types to guide resolution. Swift, on the other hand, allows .value syntax everywhere, and tries its best to decide which is the correct way to resolve it.
    • I'm afraid people would expect .value to work everywhere, instead of the handful of cases in this KEEP.

About having the last priority:

  • Backward compatibility: this is the main motivation. Any other choice would potentially break existing code.
  • Forward compatibility: it's indeed the case that if you introduce another import (as done with Json.String above), your resolution may change. Note again that this is not the only place in Kotlin where this can happen.
    • We expect the IDE and tooling to help here. If suddenly a type resolves differently in a when condition, you'll get a message about "this condition is always false" in most cases (since you're pattern matching on a value of a particular type and then checking the type from a completely different hierarchy).

@gildor
Copy link
Contributor

gildor commented Jun 18, 2024

@CLOVIS-AI problem comes when different params use expected type, and some just regular function calls.

Yes, this proposal doesn't support method invocation of expected type, but it's a very powerful feature; I see that it may save more code than in the case of when/enums, and the decision now will affect how it is implemented in the future, and even if it possible to implement.

I would say that I'm not against the proposal; I just would like the KEEP to have a comparison for .value syntax (advantages and disadvantages) because I still see serious advantages, mostly because it's very explicit and concise.

@serras
Copy link
Contributor Author

serras commented Jun 18, 2024

Bringing constants or enums into the scope seems natural. However, bringing functions is quite surprising.

Note that there's a fine line into what counts as function here and what not. The main issue is that a constant may not have an explicit functional type, but there may still be an .invoke operator defined on it.

@FilipDolnik
Copy link

User expectations: one very important difference with Swift is that we take a restrictive route, in which we are very clear about when you can expect types to guide resolution. Swift, on the other hand, allows .value syntax everywhere, and tries its best to decide which is the correct way to resolve it.
I'm afraid people would expect .value to work everywhere, instead of the handful of cases in this KEEP.

Sure, but the code wouldn't compile, so people would quickly learn about this difference. Also, I thought this might be only a temporary limitation that might change with some future KEEP. Therefore, it makes sense to consider what that future would look like, and I feel the current syntax will be very limiting. (For example, for the Modifier builder mentioned above.)

Alas, it's not clear that we can actually parse files in a fully backward compatible way if we introduce that syntax. The reason is that foo<newline>.bar is now unambiguously a field access, but it may be get another interpretation with .value syntax. Note that using <newline>. is very common in Kotlin code.
As a result, any variation using this syntax should have a good answer to the question "how to parse things unambiguously?"

Swift also supports the foo<newline>.bar syntax, so they had to solve the same issue. I'm not sure, but I think that Swift always gives precedence to a member call, even if such a member doesn't exist. I believe that this approach makes sense for Kotlin as well, and it would be fully backward compatible.

It's true that this would be slightly more restrictive and more or less prevent the usage of this feature in implicit returns in multi-line lambdas. However, I'd question how frequent this pattern would be in practice.

It should suffer from the same problems as the rest of Kotlin code. For example, right now you can change resolution if you introduce a function in a class which hides an imported extension function, since the former has more priority than the latter.

Yes, the issue with forward compatibility already exists, but I think it makes sense to consider not contributing more to this problem if there are alternatives.

We expect the IDE and tooling to help here. If suddenly a type resolves differently in a when condition, you'll get a message about "this condition is always false" in most cases (since you're pattern matching on a value of a particular type and then checking the type from a completely different hierarchy).

I agree that you will usually (but not always) notice the issue because of a compilation error or at least a warning. However, my main problem is that if such a collision happens, you cannot use this feature without the .value syntax (whereas with the .value syntax, there is no ambiguity).

You also need to go and fix your code, which can become annoying depending on how often you run into the issue - which I think can be more frequent than expected, especially if this feature is used with functions and properties rather than just the enum cases and classes.

@serras
Copy link
Contributor Author

serras commented Jun 18, 2024

Let me reiterate that Swift had that syntax from the beginning, so whatever the chose back then has been stable. We have a different starting point, with lots of code in the wild, so we need to guarantee no change in parsing. Any proposal for new syntax should give strong reasons that this is the case. I’ve tried to come with such rules myself, and I never managed to convince myself of that property (which is of course not a proof of anything, but I wanted to mention it to dismiss that the team hasn’t looked at this possibility).

I'd question how frequent this pattern would be in practice

Let me take this as example (not trying to pinpoint the particular comment itself). Even if the pattern is not very frequent, a small percentage of Kotlin users is already a large amount of people that would be affected. This is why we strongly prefer full backward compatibility unless the benefits are really big.

@FilipDolnik
Copy link

Let me reiterate that Swift had that syntax from the beginning, so whatever the chose back then has been stable. We have a different starting point, with lots of code in the wild, so we need to guarantee no change in parsing. Any proposal for new syntax should give strong reasons that this is the case. I’ve tried to come with such rules myself, and I never managed to convince myself of that property (which is of course not a proof of anything, but I wanted to mention it to dismiss that the team hasn’t looked at this possibility).

I'd question how frequent this pattern would be in practice

Let me take this as example (not trying to pinpoint the particular comment itself). Even if the pattern is not very frequent, a small percentage of Kotlin users is already a large amount of people that would be affected. This is why we strongly prefer full backward compatibility unless the benefits are really big.

That's why I proposed to give priority to the member call in this case instead of this new feature. In other words:

val a: () -> SomeEnum = {
     foo()
     .someCase
}

would call the someCase property on the result of foo() instead of SomeEnum.someCase. This is also how this is handled in Swift.

@daniel-rusu
Copy link

daniel-rusu commented Jun 18, 2024

@serras I just tested with Kotlin 1.9.23 and confirmed that the current variable resolution is opposite to the way it would behave with this KEEP.

Consider this example:

Color.kt

package com.danrusu.resolution

data class Color(val r: Int, val g: Int, val b: Int) {
    companion object {
        val RED = Color(255, 0, 0)
        val GREEN = Color(0, 255, 0)
        val BLUE = Color(0, 0, 255)
    }
}

CheckResolution.kt

package com.danrusu.resolution
import com.danrusu.resolution.Color.Companion.RED

val RED = Color(200, 0, 0) // Same name & type as in the companion object

fun main() {
    printRedIntensity(RED)
}

fun printRedIntensity(color: Color) {
    println("Red: ${color.r}")
}

If I comment out the import then RED resolves to the local top-level variable. However, with the import, RED resolves to the variable defined in the companion object. The KEEP suggests the opposite behavior since it would have the lowest precedence.

I expect this scenario to be much more common especially with variables & functions defined in other files. Adding unrelated variables with the same name & type or functions with the same signature will silently change the behavior based on the KEEP without any compiler errors or warnings to the developer adding these somewhere else.

Edit:
The same resolution happens with this code example which more closely matches the KEEP:

private fun describeColor(color: Color) = when (color) {
    RED -> println("Red ${color.r}")
    BLUE -> println("Blue")
    else -> println("Other color")
}

Using existing syntax in Kotlin 1.9.23, the import has a higher precedence than top-level variables that have the same name & type. It would be very confusing and error-prone if we use the new feature and then the precedence is reversed and behavior silently changes even though the code looks the same.

@serras
Copy link
Contributor Author

serras commented Jun 18, 2024

I just tested with Kotlin 1.9.23 and confirmed that the current variable resolution is opposite to the way it would behave with this KEEP.

Please be careful, as there's a special rule in the compiler that states that an imported identifier has priority over locally-defined ones (in fact, you can "import" the local declaration to reverse this fact). If you want to have a more realistic example, you need to define values in two different files, and import it in a third one.

@daniel-rusu
Copy link

daniel-rusu commented Jun 18, 2024

Please be careful, as there's a special rule in the compiler that states that an imported identifier has priority over locally-defined ones (in fact, you can "import" the local declaration to reverse this fact). If you want to have a more realistic example, you need to define values in two different files, and import it in a third one.

I just tested that now:

Color.kt

package com.danrusu.resolution

data class Color(val r: Int, val g: Int, val b: Int) {
    companion object {
        val RED = Color(255, 0, 0)
        val GREEN = Color(0, 255, 0)
        val BLUE = Color(0, 0, 255)
    }
}

OtherFile.kt

package com.danrusu.resolution

val RED = Color(200, 0, 0) // Same name & type as in the companion object

Usage.kt

package com.danrusu.resolution
import com.danrusu.resolution.Color.Companion.RED

fun describeColor(color: Color) = when (color) {
    RED -> println("Red ${color.r}")
    else -> println("Other color")
}

With 3 files, I see the same behavior where the import has higher precedence so RED resolves to the companion object and commenting out the import resolves to the top-level declaration from OtherFile.kt. This looks contradictory to the behavior from similarly-looking code with the KEEP.

@serras
Copy link
Contributor Author

serras commented Jun 19, 2024

With 3 files, I see the same behavior

Sorry, my bad, I said "three files", but I meant "three packages". But yes, indeed imported elements have a high priority. But I don't see how this directly relates to the KEEP: in a similar situation the value is resolved to the imported RED, as we only get to the type-directed scope if nothing else matches.

@serras
Copy link
Contributor Author

serras commented Jun 19, 2024

I've taken some time to compare Swift's and Kotlin's grammar, to figure out whether and how .value syntax could be accommodated.


Swift defines implicit member expressions using the following grammar rules, where postfix-expression include all things which go "after the dot", like function calls, and so on.

primary-expression → implicit-member-expression
primary-expression → /* other rules */

implicit-member-expression → . identifier
implicit-member-expression → . identifier . postfix-expression

postfix-expression → function-call-expression
postfix-expression → explicit-member-expression
postfix-expression → /* other rules */

Note that this is not enough to disambiguate all the expressions, so the reference contains an additional note:

If a period appears at the beginning of a line, it’s understood as part of an explicit member expression, not as an implicit member expression. For example, the following listing shows chained method calls split over several lines:

let x = [10, 3, 20, 15, 4]
    .sorted()
    .filter { $0 > 5 }
    .map { $0 * 100 }

Swift allows defining custom operators, but they need to be made from a special set of symbols:

operator → operator-head operator-characters?
operator → dot-operator-head dot-operator-characters

operator-head → / | = | - | + | ! | * | % | < | > | & | | | ^ | ~ | ?
operator-head → /* other rules */

Let's now move to Kotlin's grammar for expressions. The grammar is made out of levels, following the usual technique for avoiding left-recursion in formal grammars. We show here the relevant fragments.

expression:
  disjunction

disjunction:
  conjunction {{NL} '||' {NL} conjunction}

/* more levels in between */

infixFunctionCall:
  rangeExpression {simpleIdentifier {NL} rangeExpression}

rangeExpression:
  additiveExpression {('..' | '..<') {NL} additiveExpression}

/* more levels in between */

postfixUnaryExpression:
  primaryExpression {postfixUnarySuffix}

postfixUnarySuffix:
  navigationSuffix
  | /* other rules */

primaryExpression:
  parenthesizedExpression
  | simpleIdentifier
  | /* other rules */

navigationSuffix:
  memberAccessOperator {NL} (simpleIdentifier | parenthesizedExpression | 'class')

memberAccessOperator:
  ({NL} '.')
  | ({NL} safeNav)
  | '::'

In addition, we have the following rule about tokenization of whitespace. Note that Kotlin's grammar clearly differentiates between newlines (NL token) and other whitespace (WS token); and is specific about where NL is allowed.

Note that syntax grammar ignores tokens DelimitedComment, LineComment and WS.

WS:
  <one of the following characters: SPACE U+0020, TAB U+0009, Form Feed U+000C>

As a result, this means that an expression of the form foo .bar is parsed as an identifier and then a navigation suffix.

          postfixUnarySuffix
         /                 \
primaryExpression      navigationSuffix
        |             /                 \
 simpleIdentifier  memberAccessOperator  simpleIdentifier   
        |             |                        |
      "foo"          "."                     "bar"

The only possibility I see with the current grammar is to make .value another kind of primaryExpression. But that only covers the expression part, we would also need to make userType have a . in front.

primaryExpression:
  parenthesizedExpression
  | simpleIdentifier
  | '.' simpleIdentifier
  | /* other rules */

userType:
  ['.'] simpleUserType {{NL} '.' {NL} simpleUserType}

That means that you'd be able to use .value right after another keyword, so the following would be allowed:

when (x) {
  .ENTRY -> ...
  is .Foo -> ...
}

return .Left(3)

val x: Either<Int, String> = .Left(3)

As hinted in one of the comments above, the main problem would be implicit returns. In the following fragment the .Left is "attached" to 1.

val x: (Int) -> Either<Int, String> = { n ->
  val m = n + 1
  .Left(m)
}
// equivalent to
val x: (Int) -> Either<Int, String> = { n ->
  val m = n + 1.Left(m)
}

It would also be impossible to use an infix function followed by .value expression. Although this KEEP does not look into the function call problem, this might be relevant in the future.


This lies a possible path to include .value as the syntax for type-guided resolution. The main caveat seems that newlines may "hide" the actual parse tree, and lead people to believe that they are creating a .value expression when they are not. This is an actual risk.

However, the main question still remain: is this syntax something worth having in Kotlin, or does the current syntax serves us well? Personally, I think that including a major new syntactic form in the language is too big of a change. Some of the reasons why .value is better (for example, because it helps the IDE) feel like they should be solved in the IDE side instead.

@SPC-code
Copy link

I have mixed impression. On one hand, the proposed syntax is optional so it should be possible just to skip this part if needed. On the other hand, I tried to understand how and when I should be able to omit qualifier and failed. It could create additional entry level barrier. Also, I do not see how this feature interacts with other language features like namespaces/statics and context parameters.

As I stated before, I do not like language features that solve only one problem.

Also, I am not sure if it is related, I just remembered another proposal: #209. If we combine it with namespaces, then the feature you want to implement could be brought down to auto-import of a namespace as a file or class level receiver.

@FilipDolnik
Copy link

@serras

Thanks for looking into how the .value syntax could be implemented. I tried something similar yesterday and also concluded that the best way is to add another derivation rule (potentially with a new non-terminal symbol) to the primaryExpression and userType that would start with the .. The solution should have the properties mentioned above, and it seems that it doesn't introduce ambiguities into the grammar, meaning it shouldn't change how existing code would be parsed.

The main caveat seems that newlines may "hide" the actual parse tree, and lead people to believe that they are creating a .value expression when they are not. This is an actual risk.

I don't think this will be a problem in practice because the probability that you create compilable code, in this case, is very low, much lower than the probability of name collisions that occur with the proposed syntax. Also, the IDE will help you recognize this issue thanks to auto-completion and, more importantly, by automatically adding a tab in front of the . (which is how this already works in both IDEA and Xcode)

Some of the reasons why .value is better (for example, because it helps the IDE) feel like they should be solved in the IDE side instead.

The .value syntax has a slight inherent advantage in that you can start writing with ., and the IDE will show you autocomplete with only the relevant entries. Everything else is likely possible with either syntax.

However, I want to point out that this argument can be dangerous in certain situations because without bringing the IDE team into the discussion, we could get into a situation in which the language design team says: This is a problem that IDE will solve so it's a problem for the IDE team. Then the IDE team says this is really difficult to solve in IDE, so it's a problem for the language design team. (It's not necessarily the case here, but I just wanted to point that out.)

However, the main question still remain: is this syntax something worth having in Kotlin, or does the current syntax serves us well? Personally, I think that including a major new syntactic form in the language is too big of a change.

I'm not necessarily fixated on the .value syntax; I'm more against the proposed syntax due to some inherent problems it has. However, I don't have any other proposals for the syntax - but I'd be curious to know if there are some.

My primary issue with the proposed syntax is that it pollutes the scope with additional names (with all the problems this can cause that were mentioned above), which will be even more pronounced if Kotlin adds this support for function calls.

Kotlin already has a considerable issue with scope pollution. This problem was also a big part of all the discussion about replacing context receivers, where the user at least has control over how much they pollute their scope. So, I think that we should try not to contribute more to this problem if there are viable alternatives without other significant downsides.

@edrd-f
Copy link

edrd-f commented Jun 19, 2024

I don't think making these imports implicit is a good idea. I very often review code on GitHub or other platforms and it won't be easy to understand where things are coming from.

Can't this be solved solely as an IDE feature, with the compiler providing suggestions for the most relevant types? For example, in when (someSealedType) { is <cursor>, the IDE could suggest all subtypes of the sealed type and when the user selects one, it adds an import for it. This way everything keeps working as is now, but the default auto-completion would apply the unqualified form of the symbols.

@serras
Copy link
Contributor Author

serras commented Jun 20, 2024

One general rule we try to follow is that developers can easily track where some of the names are coming from. In this case, we do so by propagating only types which are somehow close to the place where the new resolution takes place. That is, if you write:

fun f(s: Status) = when (s) {
  is Status.Ok -> ...
}

the Status is already there in the parameter (so not that very far away). To give another example of this difference, you need to write:

fun f(): Status = Ok
fun g() = Status.Ok

because in g we have no information about the potential types, so the user must still be explicit. The goal here is to eliminate "obvious" redundancies in the code.


The solution should have the properties mentioned above, and it seems that it doesn't introduce ambiguities into the grammar, meaning it shouldn't change how existing code would be parsed.

But it introduces some "traps" about how users may expect their code to be parsed. Thinking about it, the problem of implicit returns from blocks is quite bad; for example the following is wrongly parsed as -(n.Right(abs)).

fun blah(n: Int): Either<String, Int> = when {
  n < 0 -> {
    val abs = - n
    .Right(abs)
  }
}

My primary issue with the proposed syntax is that it pollutes the scope with additional names.

Interestingly, I see this quite the opposite. Imports (or context receivers) pollute in the sense that you cannot hide the new names in any way once you have them. This new resolution is more precise, as it adds only a few new names, based on the information it already has about the type.

Can't this be solved solely as an IDE feature, with the compiler providing suggestions for the most relevant types?

Definitely a good IDE experience can help. But it's not the end of the story, as the new resolution in this KEEP takes the types into account, so it can help you reduce conflicts whenever two different subclasses share a name.

@FilipDolnik
Copy link

To give another example of this difference, you need to write:

fun f(): Status = Ok
fun g() = Status.Ok

because in g we have no information about the potential types, so the user must still be explicit. The goal here is to eliminate "obvious" redundancies in the code.

Yeah, but the problems mentioned more or less do not exist in these simple expressions. Here is an example of how this feature will definitely end up being used:

fun foo(): Status {
  // Many lines of code
  ...

  return Ok
}

In this example, it's no longer that easy to track where this expression comes from, and it can get much worse even without introducing this feature to function calls (where it can be quite difficult to determine the type):

fun foo(): SomeClass {
  // Many lines of code
  ...

  return from("something")
}

In this case, you can no longer make reasonable assumptions about the origin of this function - even after you read through all of the code around this expression. This will be true for basically all classes and functions. It's not like with the contexts where you at least know that this feature is used in the given function.

But it still can get worse:

// In some other file
fun foo(action: () -> SomeClass) {

}

fun callFoo() {
  foo {
    from("something")
  }
}

It's not even clear here if this function returns a value and of which type, let alone that the from function is actually from the static scope of SomeClass.

Now that I think about this example. What happens if I add the following function into some scope accessible from callFoo:

fun from(p: Any): SomeClass = ...

Will it get the priority even though this function wouldn't get priority under standard overloading resolution rules? (Assuming I understand the KEEP correctly, this seems to be the case, which is highly counterintuitive)

But it introduces some "traps" about how users may expect their code to be parsed. Thinking about it, the problem of implicit returns from blocks is quite bad; for example the following is wrongly parsed as -(n.Right(abs)).

fun blah(n: Int): Either<String, Int> = when {
 n < 0 -> {
   val abs = - n
   .Right(abs)
 }
}

I agree it's not ideal, but I'd be okay with this limitation if other solutions cannot be found. As I mentioned above, the code will generally not compile, and the IDE will immediately add a tab while you write this expression. So, this limitation is clearly communicated to developers.

Additionally, this limitation prevents developers from abusing this feature with implicit returns from complex multiline blocks, which I already try to avoid. If the developer insists on not writing the full name, they can always add the static import.

However, maybe we can find some other special character that would work better instead of ., in which case it might be preferable even at the cost of not having a precedent from another language.

@Peanuuutz
Copy link

Peanuuutz commented Jun 20, 2024

On second thought, I realize the reason behind the trap introduced by .value is that it occurs to be the first token after an expression (which means it would collide), and whether it's valid also depends on whether the type is known.

enum class E { A }

expr()
.A // No pass - Collide & Type unknown

val value = .A // No pass - Type unknown

val value: E = .A // Pass

val value = when {
    true -> .A // No pass - Type unknown
}

val value: E = when {
    true -> .A // Pass
}

val value = when {
    true -> {
        expr()
        .A // No pass - Collide & Type unknown
    }
}

val value: E = when {
    true -> {
        expr()
        .A // No pass - Collide
    }
}

val lambda = {
    expr()
    .A // No pass - Collide & Type unknown
}

val lambda: () -> E = {
    expr()
    .A // No pass - Collide
}

fun foo(e: E): E {
     expr()
     .A // No pass - Collide & Type unknown

     return .A // Pass
}

foo(.A) // Pass

foo(
    .A // Pass
)

when (value) {
    .A -> {} // Pass
}

As you can see, the rule could be regarded as obvious but strange. It makes me wonder if we should limit that only to when and function argument, as in these scenarios it's very easy to reason about and keeps good compatibility (both backward and forward) while for other valid situations it doesn't save that much.

@FilipDolnik
Copy link

I realize the reason behind the trap introduced by .value is that it occurs to be the first token after a value, and whether it's valid also depends on whether the type is known.

I'm not sure I understand. The limitation is that you cannot use this syntax if there is some other expression in front of the .value expression, even if this expression is on a line above.

All of the examples you shared work if the type can be derived (except for the function calls, but those might work in the future). This is not a surprising behavior and doesn't seem to cause issues in languages like Swift. The same is the case even without the . in front of the expression.

To clarify:

fun foo(e: E): E {
     .A // First & Unknown - this will not compile because the type of the expression cannot be derived

     return .A // Not first & Known - this will compile just fine even though there is an expression above because the "return" keyword acts as a delimiter.
}

@Peanuuutz
Copy link

Peanuuutz commented Jun 20, 2024

The limitation is that you cannot use this syntax if there is some other expression in front of the .value expression.

I exactly meant that. ;)

Note that some places don't require or can't have return:

val l: () -> E = {
    expr()
    .A // No pass - Collide
}

val e: E = when {
    true -> {
        expr()
        .A // No pass - Collide
    }
}

@serras
Copy link
Contributor Author

serras commented Jun 20, 2024

To me, I think the proponents of the .value syntax need some more convincing arguments that the restrictions and potential for error (for example, when you try to use it after an expression in the previous lines) are a worthy trade-off with the potential clarity it brings.

Apart from the syntactic part, it feels that most people here agree this is a useful feature to have (not surprising, as it was highly voted in YouTrack).

@Peanuuutz
Copy link

I feel like it's too unobvious for the syntax without . - if you get to know the type, you can directly use the members. This only works well if the use-site is close to the type declaration, but further on it gets really confusing as to where the thing comes from, which now requires a brand new resolution scope to be explained just for that. It also introduces much more possibilities to naming collision with current scope. These points already make me nervous.
At least the syntax with . can make that stand out.

@edrd-f
Copy link

edrd-f commented Jun 20, 2024

To me, I think the proponents of the .value syntax need some more convincing arguments that the restrictions and potential for error (for example, when you try to use it after an expression in the previous lines) are a worthy trade-off with the potential clarity it brings.

Agreed.

Apart from the syntactic part, it feels that most people here agree this is a useful feature to have (not surprising, as it was highly voted in YouTrack).

Also agreed. However, that is only one criterion to decide if it should be added to the language or not. Union types, for example, is a highly requested feature, but there are good reasons not to add it. Also, people who vote for a feature are often looking at the positive aspects of it and are not weighing the costs like we're doing here.

Any feature that causes detriments to compilation speed, type system integrity, program correctness, IDE experience, etc. must have big benefits. I don't think that's the case here, and that's why I think it should be an IDE feature.

@edrd-f
Copy link

edrd-f commented Jun 20, 2024

These are very good points, and my biggest concern. There are already pitfalls with the later addition of member functions replacing extension function calls without any errors. We shouldn't double down on these behaviors, otherwise the language will be less and less reliable over time.

Yeah, but the problems mentioned more or less do not exist in these simple expressions. Here is an example of how this feature will definitely end up being used:

fun foo(): Status {
  // Many lines of code
  ...

  return Ok
}

In this example, it's no longer that easy to track where this expression comes from, and it can get much worse even without introducing this feature to function calls (where it can be quite difficult to determine the type):

fun foo(): SomeClass {
  // Many lines of code
  ...

  return from("something")
}

In this case, you can no longer make reasonable assumptions about the origin of this function - even after you read through all of the code around this expression. This will be true for basically all classes and functions. It's not like with the contexts where you at least know that this feature is used in the given function.

But it still can get worse:

// In some other file
fun foo(action: () -> SomeClass) {

}

fun callFoo() {
  foo {
    from("something")
  }
}

It's not even clear here if this function returns a value and of which type, let alone that the from function is actually from the static scope of SomeClass.

Now that I think about this example. What happens if I add the following function into some scope accessible from callFoo:

fun from(p: Any): SomeClass = ...

Will it get the priority even though this function wouldn't get priority under standard overloading resolution rules? (Assuming I understand the KEEP correctly, this seems to be the case, which is highly counterintuitive)

@gildor
Copy link
Contributor

gildor commented Jun 24, 2024

I do support this feature, as .value or without.
But now, thinking again and comparing KEEP and @edrd-f's proposal about doing this purely on the IDE level with automatic static import, I indeed don't see convincing reasons why it cannot be an IDE improvement until this feature has more use cases covered.
Essentially now the only advantage is that import is not added, but we pay for it by the fact that it very implicit and indeed hard to read especially for classes with multiple enum/sealed classes in the scope (which is normal for View Models or other heavy classes which manage a lot of different states, often with identical names like Error/Loading)

@serras
Copy link
Contributor Author

serras commented Jun 24, 2024

@gildor note that this feature is more powerful than a simple IDE improvement. If you add an import to ON, every reference to ON in the code resolves to that particular import. But with this feature, you may have ON resolve to different enumeration types depending on the context.

@fvasco
Copy link

fvasco commented Jun 24, 2024

@serras

If you add an import to ON, every reference to ON in the code resolves to that particular import.

It looks an anti feature to me.

@gildor
Copy link
Contributor

gildor commented Jun 25, 2024

@serras yes, it's fair, indeed.
But it reduces readability, probably not for all cases, but definitely for more complex ones

@serras
Copy link
Contributor Author

serras commented Jul 1, 2024

I forgot to update the thread here after a discussion we had around a week ago in the team:

  • The KEEP now clarifies what happens with lambdas (diff)
  • We still see a lot of problems with .value syntax, so the plan is to roll the current version (without any qualification), and then process the feedback on that approach. If it turns out the current approach creates confusion, then we'll try some other initial prefix which doesn't create parsing ambiguities with existing code.

@Peanuuutz
Copy link

On second thought, ...

I've added more cases where this feature could show up in my comment before. Please review.

... It makes me wonder if we should limit that only to when and function argument, as in these scenarios it's very easy to reason about and keeps good compatibility (both backward and forward) while for other valid situations it doesn't save that much.

What were the comments on this? @serras

@serras
Copy link
Contributor Author

serras commented Jul 1, 2024

@Peanuuutz in general, parsing cannot depend on typing. Parsing is performed independently as a first phase, and typing and resolution come later. So your suggestion is not feasible without major changes on the architecture.

About limiting when, we discussed a bit how much the type should "travel" from a particular point. Everybody was eager to make something like val color: Color = RED (instead of = Color.RED) work. But then you say, why shouldn't val color: Color = if (blah) RED else BLUE work? So it's hard to give a good cut-off here.

@Peanuuutz
Copy link

Peanuuutz commented Jul 2, 2024

Thanks. That I understand now.

This lies a possible path to include .value as the syntax for type-guided resolution. The main caveat seems that newlines may "hide" the actual parse tree, and lead people to believe that they are creating a .value expression when they are not. This is an actual risk.

I do feel like the benefit of .value (the clearness it brings over value) is worth the risk as all the reasonable examples I could imagine will be a compile error anyway such as "x doesn't exist on the return type of foo()" or "The return value is missing or the type mismatches". To me it seems more error-proof than value as there are way more possibilities things could go wrong (especially in our DSL Disney Land), as stated before there are different kinds of scopes which could interfere the resolution.

For value to mess up:

  • There's a member in the scope with the same signature.

For .value to mess up:

  • The previous line must end up as an expression.
  • There's a member in the return type of the previous expression with the same signature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants