fix(ssh): add timeouts to SSH/WebSocket connections and per-channel state #219

Open: drew wants to merge 3 commits into main from ssh-flakyness/dn
drew (Collaborator) commented Mar 11, 2026

Summary

  • Add timeout handling across the SSH connection stack to prevent indefinite hangs during connection establishment
  • Add SSH_PROXY_ACCEPT_TIMEOUT (5s) and SSH_PROXY_CONNECT_TIMEOUT (10s) for SSH proxy operations
  • Add UPSTREAM_HANDSHAKE_TIMEOUT (10s) for WebSocket upstream connections
  • Refactor SshHandler to maintain per-channel state (HashMap<ChannelId, ChannelState>) instead of global state, properly isolating pty_master, input_sender, and pty_request per channel
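
To illustrate the per-channel refactor described in the last bullet, here is a minimal sketch of state keyed by channel. It is not the PR's code: `ChannelId` here is a hypothetical stand-in for the SSH library's channel identifier, the `Handler` type and its method names are invented for illustration, and the real `pty_master`, `input_sender`, and `pty_request` fields are replaced with simple placeholders so the example depends only on the standard library.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the SSH library's channel identifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ChannelId(pub u32);

/// State owned by a single channel. Placeholder fields only:
/// `input` stands in for input_sender, `pty_requested` for pty_request.
#[derive(Debug, Default)]
pub struct ChannelState {
    /// Bytes forwarded to this channel's process; `None` once EOF was seen.
    pub input: Option<Vec<u8>>,
    pub pty_requested: bool,
}

/// Illustrative handler that keeps one `ChannelState` per open channel.
#[derive(Debug, Default)]
pub struct Handler {
    channels: HashMap<ChannelId, ChannelState>,
}

impl Handler {
    /// Register a fresh state entry for a newly opened channel.
    pub fn open(&mut self, id: ChannelId) {
        let state = ChannelState { input: Some(Vec::new()), pty_requested: false };
        self.channels.insert(id, state);
    }

    /// Route data to the matching channel only; returns false if the
    /// channel is unknown or its input was already closed.
    pub fn data(&mut self, id: ChannelId, bytes: &[u8]) -> bool {
        match self.channels.get_mut(&id) {
            Some(ChannelState { input: Some(buf), .. }) => {
                buf.extend_from_slice(bytes);
                true
            }
            _ => false,
        }
    }

    /// Close the input side of the matching channel, leaving others intact.
    pub fn eof(&mut self, id: ChannelId) {
        if let Some(state) = self.channels.get_mut(&id) {
            state.input = None;
        }
    }

    /// Drop the state for one channel and return it, if it existed.
    pub fn cleanup(&mut self, id: ChannelId) -> Option<ChannelState> {
        self.channels.remove(&id)
    }

    /// Inspect what has been routed to a channel so far.
    pub fn input_of(&self, id: ChannelId) -> Option<&[u8]> {
        self.channels.get(&id)?.input.as_deref()
    }
}
```

With this shape, an EOF or cleanup on one channel cannot touch another channel's entry, which is the isolation the refactor is after.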

Changes

  • navigator-server/src/grpc.rs: Add timeouts to start_single_use_ssh_proxy for accept and connect operations
  • navigator-server/src/ssh_tunnel.rs: Extract establish_upstream function, add TunnelSetupError for cleaner error handling with proper HTTP status codes (504 for timeouts, 502 for other errors)
  • navigator-sandbox/src/ssh.rs: Refactor to per-channel state, add logging for reader drain timeouts
  • navigator-cli/src/ssh.rs and edge_tunnel.rs: Add upstream handshake timeouts
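
The handshake timeout and the 504/502 mapping mentioned above could look roughly like the sketch below. Several things are assumptions rather than the PR's actual code: the variant names of `TunnelSetupError`, the `status_code` and `await_handshake` helpers, and the idea that a successful handshake response is the line `OK`. The real code is presumably async; a standard-library channel with `recv_timeout` stands in for the upstream connection so the example stays self-contained.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Value taken from the PR summary (10s).
pub const UPSTREAM_HANDSHAKE_TIMEOUT: Duration = Duration::from_secs(10);

/// Illustrative error type; the variant names are assumptions.
#[derive(Debug, PartialEq, Eq)]
pub enum TunnelSetupError {
    /// The upstream did not answer within the allowed time.
    Timeout,
    /// Any other failure while setting up the tunnel.
    Upstream(String),
}

impl TunnelSetupError {
    /// Map the error to an HTTP status: 504 for timeouts, 502 otherwise.
    pub fn status_code(&self) -> u16 {
        match self {
            TunnelSetupError::Timeout => 504,
            TunnelSetupError::Upstream(_) => 502,
        }
    }
}

/// Wait for the upstream handshake response, bounded by `timeout`.
/// `rx` is a stand-in for the upstream connection; "OK" as the success
/// response is an assumption made for this sketch.
pub fn await_handshake(
    rx: &Receiver<String>,
    timeout: Duration,
) -> Result<(), TunnelSetupError> {
    match rx.recv_timeout(timeout) {
        Ok(line) if line.trim() == "OK" => Ok(()),
        Ok(line) => Err(TunnelSetupError::Upstream(format!(
            "unexpected handshake response: {}",
            line
        ))),
        Err(RecvTimeoutError::Timeout) => Err(TunnelSetupError::Timeout),
        Err(RecvTimeoutError::Disconnected) => Err(TunnelSetupError::Upstream(
            "upstream closed before handshake".to_string(),
        )),
    }
}
```

Keeping the timeout as its own variant lets the HTTP layer distinguish "upstream is slow" (504) from "upstream answered badly or went away" (502) without inspecting error strings.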

Testing

All SSH-related tests pass, including new tests for:

  • channel_data_routes_only_to_matching_channel
  • channel_eof_only_closes_matching_channel_input
  • cleanup_channel_removes_only_matching_state
  • establish_upstream_times_out_waiting_for_handshake_response
  • establish_upstream_rejects_non_ok_handshake_response

drew self-assigned this Mar 11, 2026
drew force-pushed the ssh-flakyness/dn branch 2 times, most recently from 429c0ca to fbf375b, March 18, 2026 06:30
drew requested a review from a team as a code owner March 18, 2026 06:30
drew added 2 commits March 23, 2026 15:29
…tate

Add timeout handling across the SSH connection stack to prevent indefinite hangs during connection establishment:

  • Add SSH_PROXY_ACCEPT_TIMEOUT (5s) and SSH_PROXY_CONNECT_TIMEOUT (10s)
  • Add UPSTREAM_HANDSHAKE_TIMEOUT (10s) for WebSocket upstream connections
  • Extract establish_upstream for cleaner connection setup
  • Refactor SshHandler to maintain per-channel state instead of global state, properly isolating pty_master, input_sender, and pty_request per ChannelId

This prevents clients from hanging indefinitely when connection establishment fails or times out.

drew force-pushed the ssh-flakyness/dn branch from fbf375b to fa9b617, March 23, 2026 22:52