Open
Labels
bug (Something isn't working)
Description
Summary:
WSS handler fails to reject malformed connections gracefully, causing TCP socket exhaustion and service degradation.
Incident Details:
- When: 2025-10-24 02:04 UTC
- Duration: ~5 hours (self-resolved)
- Impact: All WSS endpoints unresponsive, TLS handshakes timing out
- Scope: Multiple production nodes simultaneously affected
Evidence:
- TCP socket count: 200 → 1,600+ (8x increase)
- Log volume: 11K → 127K lines/hour (11x increase)
- Message throughput: Unchanged (~25K/hour)
- WSS errors: 1,733 occurrences of AsyncStream Error: "Incomplete data sent or received" in one hour
- P2P functionality: Unaffected
Root Cause:
Malformed/incomplete WSS connection attempts trigger errors in wstransport.nim (lines 294, 296):
Http Error: "Timed out parsing request"
AsyncStream Error: "Incomplete data sent or received"
Connections are accepted but never properly closed, causing socket leak and Recv-Q buildup.
Expected Behavior:
WSS handler should reject invalid connections quickly and close sockets properly, preventing resource exhaustion.
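The expected behavior can be illustrated with a minimal sketch. This is not the nwaku/wstransport.nim code; it is a Python asyncio analogue under stated assumptions. The names `handle_connection` and `HANDSHAKE_TIMEOUT` and the 5-second value are hypothetical. The point is that every exit path, including a parse timeout or a truncated request, goes through the same cleanup that closes the socket.

```python
import asyncio

# Hypothetical bound on how long a client may take to send the request head.
HANDSHAKE_TIMEOUT = 5.0


async def handle_connection(reader, writer, timeout=HANDSHAKE_TIMEOUT):
    """Handle one accepted connection and always close its socket.

    Returns "accepted", "rejected", or "dropped" so the outcome can be logged.
    """
    try:
        # Bound the time spent waiting for the HTTP upgrade request head.
        head = await asyncio.wait_for(reader.readuntil(b"\r\n\r\n"), timeout)
        if b"upgrade: websocket" not in head.lower():
            # Well-formed HTTP but not a WebSocket upgrade: reject explicitly.
            writer.write(b"HTTP/1.1 400 Bad Request\r\nConnection: close\r\n\r\n")
            await writer.drain()
            return "rejected"
        # A real handler would run the WebSocket session here, still inside
        # the try block, so the finally clause closes the socket afterwards.
        return "accepted"
    except (
        asyncio.TimeoutError,          # request head never completed in time
        asyncio.IncompleteReadError,   # peer closed before sending a full head
        asyncio.LimitOverrunError,     # head exceeded the reader's buffer limit
        ConnectionError,               # reset or broken pipe
    ):
        return "dropped"
    finally:
        # Runs on every path, so failed handshakes cannot leak the socket.
        writer.close()
        try:
            await writer.wait_closed()
        except ConnectionError:
            pass
```

With `asyncio.start_server`, the callback would simply await `handle_connection(reader, writer)`. The equivalent in the Nim handler would be to make sure the error branches for "Timed out parsing request" and "Incomplete data sent or received" also close the underlying stream.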
Version:
- Image: harbor.status.im/wakuorg/nwaku:deploy-status-prod (2 months old)
Status
In Progress