Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: Resolve "server has to remove ungracefully disconnected zombie clients" #112

Conversation

fraunhofer-iis-bot
Copy link
Collaborator

In GitLab by @vornkat-iis on May 8, 2023, 12:48

Closes #49

@fraunhofer-iis-bot
Copy link
Collaborator Author

In GitLab by @vornkat-iis on May 8, 2023, 14:23

added 1 commit

  • a157d5d - Minimal keepalive settings

Compare with previous version

@fraunhofer-iis-bot
Copy link
Collaborator Author

In GitLab by @vornkat-iis on May 8, 2023, 14:23

requested review from @Michael-M-Baron

@fraunhofer-iis-bot
Copy link
Collaborator Author

In GitLab by @vornkat-iis on May 8, 2023, 14:24

I pushed first version that should for the moment solve your issue. @Michael-M-Baron can you verify? The numbers should be tuned to your application.
I will add configuration options in next commits

@fraunhofer-iis-bot
Copy link
Collaborator Author

In GitLab by @vornkat-iis on May 8, 2023, 14:44

added 1 commit

Compare with previous version

@fraunhofer-iis-bot
Copy link
Collaborator Author

In GitLab by @vornkat-iis on May 8, 2023, 15:21

added 1 commit

Compare with previous version

@fraunhofer-iis-bot
Copy link
Collaborator Author

In GitLab by @vornkat-iis on May 26, 2023, 17:08

added 3 commits

  • f7bd190 - Make keepalive configurable by libjapi user
  • 1171bc2 - Extend docs
  • 943311d - Add test and error handling for parameters

Compare with previous version

@fraunhofer-iis-bot
Copy link
Collaborator Author

In GitLab by @vornkat-iis on May 26, 2023, 17:32

added 1 commit

Compare with previous version

@fraunhofer-iis-bot
Copy link
Collaborator Author

In GitLab by @vornkat-iis on May 26, 2023, 17:36

Functionality tests are still missing. I have a draft locally but am not fully sure how to do this.

@vornkat-iis
Copy link
Collaborator

Added my draft here, maybe it helps one day to finish this issue. Tests pass right now because all relevant lines are commented out. Feel free to adapt

@Michael-M-Baron
Copy link
Collaborator

Michael-M-Baron commented Nov 5, 2024

Had a look at the current state of this branch and it does not seem to fix the problem. What I did in my application:

  • Set max allowed client to 1
  • Enable TCP keep alive (japi_set_tcp_keepalive(ctx, 1, 60, 10, 3))
  • Start server on different machine
  • Connect to server with client
  • Break down network connection on server side (ip link set eth0 down)
  • Try doing something with still open client (After some timeouts client will not work anymore)
  • Restart network on server side (ip link set eth0 up)

At this point, I can ping my server again but not connect with my client (wait for a couple of minutes and tried several times meanwhile).

My Suggestion:
Currently only the server socket is manipulated with the TCP-Probing. I think this has to be done with the clients and the server logic updated. I am going to test around.

…ew clients when maximum number of clients is reached
…ull-disconnected-zombie-clients' into 49-server-has-to-remove-ungracefull-disconnected-zombie-clients
…Server in a thread, setup first client and seocnd client with a maximum allowed number of clients of 1. The acutal test of the TCP-Keep alive functionality is still not performed, since the client 1 socket is closed gracefully (May be mocking is needed).
@Michael-M-Baron
Copy link
Collaborator

Had a look at the current state of this branch and it does not seem to fix the problem. What I did in my application:

* Set max allowed client to 1

* Enable TCP keep alive (`japi_set_tcp_keepalive(ctx, 1, 60, 10, 3)`)

* Start server on different machine

* Connect to server with client

* Break down network connection on server side (`ip link set eth0 down`)

* Try doing something with still open client (After some timeouts client will not work anymore)

* Restart network on server side (`ip link set eth0 up`)

At this point, I can ping my server again but not connect with my client (wait for a couple of minutes and tried several times meanwhile).

My Suggestion: Currently only the server socket is manipulated with the TCP-Probing. I think this has to be done with the clients and the server logic updated. I am going to test around.

I figured out the problem:

  • The current libjapi keep-alive mechanism was in fact already working.
  • My application uses websockets which kept the connection alive although the ethernet connection was shut down

I added some more debug prints and updated the unit tests to cover the TCP keep alive mechanism. Breaking down the client socket connection for testing however cannot by simply done in the tests (May be with mocking), so the actual mechanism is not tested yet in the unit tests (I added TODO marks and going to open a separate ticket for that).

@Michael-M-Baron Michael-M-Baron merged commit cf009d0 into dev Nov 6, 2024
1 check passed
@Michael-M-Baron Michael-M-Baron deleted the 49-server-has-to-remove-ungracefull-disconnected-zombie-clients branch November 6, 2024 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

server has to remove ungracefull disconnected zombie clients
3 participants