SSH Connection Timeouts During Full Deployment #857

Closed
jfanals opened this issue Jun 23, 2024 · 2 comments · May be fixed by #859

Comments

@jfanals

jfanals commented Jun 23, 2024

I'm experiencing intermittent SSH connection timeouts during the full kamal deploy process, specifically in the pull method. Interestingly, running kamal build pull separately works fine, but the same operation fails during the full deployment.

Environment

  • Kamal version: 1.7.1

Steps to Reproduce

  1. Run kamal app remove && kamal deploy --verbose
  2. Observe that the process fails during the pull method with a timeout error
  3. Run kamal build pull separately
  4. Observe that it works without issues

Error Message

 DEBUG [4d2890b0] Running /usr/bin/env echo [2024-06-23T14:33:29Z] [myuser] Pulled image with version 3fbef4c6366422100f25a27d0376790eebd81461 >> .kamal/frontend_v2-audit.log on 5.161.193.74
 DEBUG [4d2890b0] Command: /usr/bin/env echo [2024-06-23T14:33:29Z] [myuser] Pulled image with version 3fbef4c6366422100f25a27d0376790eebd81461 >> .kamal/frontend_v2-audit.log
  Finished all in 231.3 seconds
  ERROR (IO::TimeoutError): Exception while executing on host 5.161.193.74: Blocking operation timed out!
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/net-ssh-7.2.3/lib/net/ssh/buffered_io.rb:64:in `recv'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/net-ssh-7.2.3/lib/net/ssh/buffered_io.rb:64:in `fill'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/net-ssh-7.2.3/lib/net/ssh/connection/session.rb:275:in `block in ev_do_handle_events'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/net-ssh-7.2.3/lib/net/ssh/connection/session.rb:271:in `each'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/net-ssh-7.2.3/lib/net/ssh/connection/session.rb:271:in `ev_do_handle_events'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/net-ssh-7.2.3/lib/net/ssh/connection/event_loop.rb:117:in `ev_select_and_postprocess'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/net-ssh-7.2.3/lib/net/ssh/connection/event_loop.rb:30:in `process'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/net-ssh-7.2.3/lib/net/ssh/connection/session.rb:226:in `process'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/net-ssh-7.2.3/lib/net/ssh/connection/session.rb:179:in `block in loop'
<internal:kernel>:187:in `loop'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/net-ssh-7.2.3/lib/net/ssh/connection/session.rb:179:in `loop'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/sshkit-1.22.2/lib/sshkit/backends/netssh.rb:182:in `block in execute_command'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/sshkit-1.22.2/lib/sshkit/backends/connection_pool.rb:65:in `with'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/kamal-1.7.1/lib/kamal/sshkit_with_ext.rb:84:in `with_ssh'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/sshkit-1.22.2/lib/sshkit/backends/netssh.rb:146:in `execute_command'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/sshkit-1.22.2/lib/sshkit/backends/abstract.rb:148:in `block in create_command_and_execute'
<internal:kernel>:90:in `tap'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/sshkit-1.22.2/lib/sshkit/backends/abstract.rb:148:in `create_command_and_execute'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/sshkit-1.22.2/lib/sshkit/backends/abstract.rb:80:in `execute'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/kamal-1.7.1/lib/kamal/cli/build.rb:61:in `block in pull'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/sshkit-1.22.2/lib/sshkit/backends/abstract.rb:31:in `instance_exec'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/sshkit-1.22.2/lib/sshkit/backends/abstract.rb:31:in `run'
/Users/myuser/.local/share/mise/installs/ruby/3.3.3/lib/ruby/gems/3.3.0/gems/kamal-1.7.1/lib/kamal/sshkit_with_ext.rb:117:in `block (2 levels) in execute'

Additional Information

  • The issue occurs consistently during full deployment but not when running kamal build pull separately
  • SSH agent is not running, but SSH_AUTH_SOCK is set
  • SSH configuration shows forward_agent: true

Questions

  1. Why might the SSH connection work for kamal build pull but fail during kamal deploy?
  2. Are there any known issues with SSH connections timing out during longer processes?
  3. Could there be a problem with how the SSH connection is maintained or reused during the full deployment process?

Attempted Solutions

  • Increased SSH timeout to 300 seconds
  • Added more verbose logging and error handling
  • Verified SSH key presence and configuration

Any assistance in resolving this issue or suggestions for further debugging would be greatly appreciated.

@ccastillop

This response from an AI assistant worked for me:

To keep your SSH connection alive for a longer time without being disconnected, you can configure both the client-side and server-side settings.

1. Client-Side Configuration

You can configure your local machine to send keep-alive packets to the server to prevent the connection from being dropped.

  • Open (or create) the SSH configuration file on your local machine:

    nano ~/.ssh/config
  • Add the following configuration:

    Host *
      ServerAliveInterval 60
      ServerAliveCountMax 240

    Explanation:

    • ServerAliveInterval 60: If the client has received nothing from the server for 60 seconds, it sends a keep-alive message and expects a reply.
    • ServerAliveCountMax 240: The client disconnects once 240 consecutive keep-alive messages go unanswered, so an unresponsive server is tolerated for about 4 hours (240 × 60 seconds).
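
To check which keep-alive values the OpenSSH client actually resolves for a host, `ssh -G` prints the effective configuration without connecting. The sketch below multiplies the two values to get the tolerated unresponsive window in seconds. The function name `serveralive_window` and the host `deploy-host` are placeholders, not from this thread. The stack trace above shows Kamal connecting through net-ssh, so what the OpenSSH client reports may not match exactly what Kamal applies.

```shell
# Reads `ssh -G <host>` output on stdin and prints
# ServerAliveInterval * ServerAliveCountMax: the number of seconds an
# unresponsive server is tolerated before the client disconnects.
# `serveralive_window` is a hypothetical helper name.
serveralive_window() {
  awk '
    $1 == "serveraliveinterval" { interval = $2 }
    $1 == "serveralivecountmax" { count = $2 }
    END { print interval * count }
  '
}

# Usage (deploy-host is a placeholder):
#   ssh -G deploy-host | serveralive_window
```

A result of 0 would suggest that keep-alives are disabled for that host, since OpenSSH's default ServerAliveInterval is 0.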

2. Server-Side Configuration

If the issue is caused by server settings, you can adjust the server-side SSH configuration.

  • Edit the SSH daemon configuration on the server:

    sudo nano /etc/ssh/sshd_config
  • Look for (or add) the following settings:

    ClientAliveInterval 60
    ClientAliveCountMax 240

    Explanation:

    • ClientAliveInterval 60: If the server has received nothing from the client for 60 seconds, it sends a keep-alive message and expects a reply.
    • ClientAliveCountMax 240: The server disconnects the client once 240 consecutive keep-alive messages go unanswered (about 4 hours).
  • After making the changes, restart the SSH service:

    sudo systemctl restart ssh
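
Before restarting, it may be worth checking the edited file for syntax errors, since a broken sshd_config can lock you out of the server. A minimal sketch, assuming the Debian/Ubuntu unit name `ssh` used above (other distributions often name it `sshd`). The function name and the two override variables are hypothetical and exist only so the steps can be tried without touching a real daemon.

```shell
# Validate sshd_config, then restart only if validation passes.
# SSHD_CHECK and SSHD_RESTART are hypothetical override hooks; the defaults
# are the real commands (`sshd -t` runs sshd in config test mode).
restart_sshd_if_valid() {
  check=${SSHD_CHECK:-"sudo sshd -t"}
  restart=${SSHD_RESTART:-"sudo systemctl restart ssh"}
  if $check; then
    $restart
  else
    echo "sshd_config failed validation; not restarting" >&2
    return 1
  fi
}

# Usage on the server:
#   restart_sshd_if_valid
```

After a successful restart, `sudo sshd -T | grep -i clientalive` should print the values the daemon is using.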

These changes should help keep your SSH connection alive for longer. Configuring both the client and the server improves the chances that the connection survives a long-running step.

@jericopulvera

Please reopen. I have encountered this issue.

I'm building a Next.js application that takes over 300 seconds to build, and after the build process, the error occurs.

  INFO [40fc85fb] Finished in 332.421 seconds with exit status 0 (successful).  
  Finished all in 399.9 seconds  
  ERROR (IO::TimeoutError): Exception while executing on host 54.xx.xxx.xx: Blocking operation timed out!  

I have tried the above SSH configurations and also restarted the SSH service using the following commands.

sudo launchctl stop com.openssh.sshd
sudo launchctl start com.openssh.sshd

I have to run kamal deploy a second time to complete the deployment; the second run skips the build because of the cache.
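
The manual re-run described above could be scripted with a small retry wrapper. This is only a workaround sketch and does nothing about the underlying timeout; `retry` is a hypothetical helper, not part of Kamal.

```shell
# Re-run a command up to N times, stopping at the first success.
# `retry` is a hypothetical helper, not part of Kamal.
retry() {
  attempts=$1
  shift
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $n of $attempts failed" >&2
    n=$((n + 1))
  done
  return 1
}

# Usage: retry 2 kamal deploy
```

With two attempts, the second run would pick up the cached build, as in the report above.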
