Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chat: room lifecycle specification #200

Draft
wants to merge 52 commits into
base: main
Choose a base branch
from
Draft

chat: room lifecycle specification #200

wants to merge 52 commits into from

Conversation

AndyTWF
Copy link
Contributor

@AndyTWF AndyTWF commented Aug 5, 2024

  • Adds chat room lifecycle feature specification.
  • Also adds feature specs for presence, messages and room reactions

@AndyTWF AndyTWF changed the title wip: chat features chat: room lifecycle specification Aug 14, 2024
@lawrence-forooghian
Copy link
Collaborator

lawrence-forooghian commented Aug 27, 2024

I’ve started implementing this spec in the Swift SDK. In order to be sure that this spec is sufficiently detailed and unambigious, I’ve made the deliberate decision not to consult the JS implementation. I’ll post feedback here as it comes up 🙂

** @(CHA-RS1h)@ The @RELEASING@ status means that the room is being released and the underlying resources are being cleaned up.
** @(CHA-RS1i)@ The @RELEASED@ status means that the room has been cleaned up and the object can no longer be used.
* @(CHA-RS2)@ A room must expose its current status.
** @(CHA-RS2a)@ @[Testable]@ A room must expose its current status, a single value from the list provided above.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it a conscious decision to not describe the API of the SDK in terms of signatures / types (e.g. the way that we do in the main spec, which also has the IDL)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IDL we have for the core SDKs is quite java-centric and in that sense may not be the most idiomatic approach for other ecosystems, so I tried to write the spec in a way that describes the intended interactions and behaviour, rather than prescribe an exact implementation.

Is it worth a note somewhere in the spec to treat the JS implementation as a reference implementation and explain the intention?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe having IDL for chat-SDK will be super helpful.
Almost all of the languages these days are encouraging use of types.
So, having language-agnostic types will surely help.


* @(CHA-RL1)@ A room must be explicitly @ATTACHED@ to.
** @(CHA-RL1a)@ @[Testable]@ If the room is already in the @ATTACHED@ status, then this operation is no-op.
** @(CHA-RL1b)@ @[Testable]@ If the room is in the @RELEASING@ status, the operation shall be rejected with error code @102102@.
Copy link
Collaborator

@lawrence-forooghian lawrence-forooghian Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear what rejected with error code @102102@ means here. (I know from looking at the JS interface that it reuses the ErrorInfo type from the core SDK, but it would be good to make that explicit.)

Copy link
Collaborator

@lawrence-forooghian lawrence-forooghian Aug 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if the intention is to reuse ErrorInfo, then all of the places where this spec says to throw such an error should also specify the statusCode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Collaborator

@lawrence-forooghian lawrence-forooghian Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see where the status codes are specified; it seems like you've only added one for CHA-RL1h2. It seems like there should be status codes specified for:

  • everything in the "Chat-specific Error Codes" section
  • all the places where we specify a numerical error code directly in the spec point (e.g. CHA-RC1c, CHA-RC2b, CHA-RL1b, CHA-RL1c).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've provided the status codes for chat-specific errors. For generic codes e.g. 40000 - the standard convention of matching the status code to the error code should be followed (in this case, 400).

** @(CHA-RL1f)@ @[Testable]@ The underlying @contributors@ will have their Realtime Channels attached in sequence.
** @(CHA-RL1g)@ When all @contributors@ Realtime Channels successfully attach, the operation is now complete and the room is considered attached.
*** @(CHA-RL1g1)@ @[Testable]@ The room status shall be transitioned to @ATTACHED@.
*** @(CHA-RL1g2)@ @[Testable]@ Any contributors to room that have a pending discontinuity event against them, must be notified of this fact, using the error that caused the original discontinuity.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a spec point that describes what this means? (I might have not got there yet.) If so, it would be good to have a reference to it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sorta felt that it didn't fit into a spec point, but I've added some discussion/blurb at the head of the lifecycle section about it!

** @(CHA-RL1g)@ When all @contributors@ Realtime Channels successfully attach, the operation is now complete and the room is considered attached.
*** @(CHA-RL1g1)@ @[Testable]@ The room status shall be transitioned to @ATTACHED@.
*** @(CHA-RL1g2)@ @[Testable]@ Any contributors to room that have a pending discontinuity event against them, must be notified of this fact, using the error that caused the original discontinuity.
*** @(CHA-RL1g3)@ @[Testable]@ Any transient disconnect timeouts shall be cleared.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question re whether there is a spec reference that could be added here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some guidance / rationale text!

*** @(CHA-RL1g3)@ @[Testable]@ Any transient disconnect timeouts shall be cleared.
** @(CHA-RL1h)@ If a one of the @contributors@ Realtime Channels fails to attach, then the operation has failed and must be rolled back.
*** @(CHA-RL1h1)@ @[Testable]@ The @attach@ call must throw an @ErrorInfo@. The code for this error is determined by the feature that failed (see the @ErrorCodes@ enum in the JS SDK for more information).
*** @(CHA-RL1h2)@ @[Testable]@ If the underlying Realtime Channel entered the @SUSPENDED@ state, then the room status will transition to @SUSPENDED@, using the Realtime Channel error as the @cause@ field of the @ErrorInfo@.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be more explicit — does it mean "if the call to contributor.attach() fails, then check the contributor’s state. If it's SUSPENDED, then transition to SUSPENDED, using the contributor’s errorReason property as the cause field of the error of the emitted status change"?

Also, what code and statusCode should this error have?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified on all points

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the Realtime Channel error" is still a bit unclear — it could mean the error that the contributor attach() call failed with. Can we explicitly say to use the contributor’s errorReason?

Copy link
Contributor Author

@AndyTWF AndyTWF Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I’m not sure what you’ve changed — CHA-RL1h2 doesn't seem to have changed from bcb7390 to b4a495e?

Copy link
Collaborator

@lawrence-forooghian lawrence-forooghian Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it meant to be using the same language as CHA-RL1h4 now uses — that is, to use the error thrown by channel attach()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, have clarified to use same language as CHA-RL1h4

** @(CHA-RL3k)@ @[Testable]@ If a room lifecycle operation is already in progress, this operation shall wait until the current operation completes before continuing, subject to @CHA-RL7@.
* @(CHA-RL5)@ A room must @RETRY@ whenever it enters the @SUSPENDED@ state.
** @(CHA-RL5a)@ @[Testable]@ When entering a retry-loop, the room must first @DETACH@ all contributors underlying realtime channels.
** @(CHA-RL5b)@ @[Testable]@ If the operation above fails, then it must be retried after a short wait.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after a short wait

How long?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

250ms - added

* @(CHA-RL5)@ A room must @RETRY@ whenever it enters the @SUSPENDED@ state.
** @(CHA-RL5a)@ @[Testable]@ When entering a retry-loop, the room must first @DETACH@ all contributors underlying realtime channels.
** @(CHA-RL5b)@ @[Testable]@ If the operation above fails, then it must be retried after a short wait.
** @(CHA-RL5c)@ @[Testable]@ If the operation above fails because a channel has entered the @FAILED@ state, then the retry loop must stop and the room must be placed in the @FAILED@ status.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With what error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've linked CHA-RL5a to CHA-RL2f, and removed the error part here.

** @(CHA-RL5b)@ @[Testable]@ If the operation above fails, then it must be retried after a short wait.
** @(CHA-RL5c)@ @[Testable]@ If the operation above fails because a channel has entered the @FAILED@ state, then the retry loop must stop and the room must be placed in the @FAILED@ status.
** @(CHA-RL5d)@ Once all channels have been detached, the room waits until the original channel that caused the retry loop naturally enters the @ATTACHED@ state.
** @(CHA-RL5e)@ @[Testable]@ If the channel state becomes @FAILED@, then the room status is transitioned to @FAILED@ and the retry loop stops.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then the room status is transitioned to FAILED

with what error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified as the error associated with the ChannelStateChange

** @(CHA-RL5e)@ @[Testable]@ If the channel state becomes @FAILED@, then the room status is transitioned to @FAILED@ and the retry loop stops. The error associated with the transition is the error from the Realtime @ChannelStateChange@.
** @(CHA-RL5f)@ @[Testable]@ If the channel state becomes @ATTACHED@, then the room status is transitioned to @ATTACHING@ to begin a new attachment cycle.
*** @(CHA-RL5f1)@ @[Testable]@ If the attachment cycle succeeds, then the room is transitioned to @ATTACHED@ and any discontinuity errors are broadcast to contributors.
*** @(CHA-RL5f2)@ @[Testable]@ If the attachment cycle fails because a channel entered @FAILED@, then the room is transitioned to @FAILED@ and the retry loop terminates.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then the room is transitioned to FAILED

with which error?

** @(CHA-RL5f)@ @[Testable]@ If the channel state becomes @ATTACHED@, then the room status is transitioned to @ATTACHING@ to begin a new attachment cycle.
*** @(CHA-RL5f1)@ @[Testable]@ If the attachment cycle succeeds, then the room is transitioned to @ATTACHED@ and any discontinuity errors are broadcast to contributors.
*** @(CHA-RL5f2)@ @[Testable]@ If the attachment cycle fails because a channel entered @FAILED@, then the room is transitioned to @FAILED@ and the retry loop terminates.
*** @(CHA-RL5f3)@ @[Testable]@ If the attachment cycle fails because a channel entered @SUSPENDED@, then the room is transitioned to @SUSPENDED@ and the retry loop restarted using the @SUSPENDED@ channel as the subject.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the room is transitioned to SUSPENDED

with which error?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

7 participants