502 Errors During Envoy Gateway Upgrade from 1.4.2 to 1.5.1, But No 502 in Data Plane Logs #7256
Unanswered
lideheng6379-del
asked this question in
Q&A
Replies: 1 comment
-
|
if you can't see any |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
Problem Description:
While performing a rolling upgrade, a small number of 502 errors were observed on the client side. However, when I checked the data plane (Envoy proxy) logs, there were no entries with response_code: 502. This makes it difficult to trace the root cause of these errors.
Deployment YAML:
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
name: proxy-config
namespace: upgrade
spec:
shutdown:
drainTimeout: "90s"
minDrainDuration: "30s"
provider:
type: Kubernetes
kubernetes:
envoyDeployment:
replicas: 3
container:
resources:
requests:
cpu: 1
memory: 2Gi
tke.cloud.tencent.com/eni-ip: "1"
limits:
cpu: 1
memory: 2Gi
tke.cloud.tencent.com/eni-ip: "1"
pod:
annotations:
tke.cloud.tencent.com/networks: tke-route-eni
envoyService:
name: gw
annotations:
service.cloud.tencent.com/direct-access: "true"
service.kubernetes.io/tke-existed-lbid: "lb-hrl7kfs3"
logging:
level:
default: info
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
name: gwc
spec:
controllerName: gateway.envoyproxy.io/gatewayclass-controller
parametersRef:
group: gateway.envoyproxy.io
kind: EnvoyProxy
name: proxy-config
namespace: upgrade
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
name: gw
namespace: upgrade
spec:
gatewayClassName: gwc
listeners:
- name: http
protocol: HTTP
port: 80
allowedRoutes:
namespaces:
from: Same
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
name: nginx
namespace: upgrade
spec:
parentRefs:
- name: gw
rules:
- backendRefs:
- name: nginx
namespace: test
port: 80
matches:
- path:
type: PathPrefix
value: /
Load Test Command:
hey -c 10 -z 300s http://xxxx:80
Load Test Results:
Total requests: 123,282
[200] responses: 123,272
[502] responses: 10
Summary:
Total: 300.0124 secs
Slowest: 10.0390 secs
Fastest: 0.0171 secs
Average: 0.0243 secs
Requests/sec: 410.9230
Status code distribution:
[200] 123272 responses
[502] 10 responses
Question:
Why would 502 errors occur on the client side during the upgrade, but there are no corresponding 502 logs in the Envoy data plane?
Is this expected during a rolling upgrade, or is there a way to trace these errors more effectively?
Any suggestions on how to further debug or avoid this issue would be appreciated.
Thanks!
Beta Was this translation helpful? Give feedback.
All reactions