Description
Describe the bug
Since we turned on HA (more than one replica), our sensors crash with the error below at irregular intervals, frequently enough to be noticeable. The crash always follows a log message from the sensor saying that it has lost the leader election.
{"level":"info","ts":"2025-10-06T21:02:27.896064717Z","logger":"argo-events.sensor","caller":"leaderelection/leaderelection.go:176","msg":"Becoming a Follower, stand by ...","sensorName":"helm-build"}
{"level":"fatal","ts":"2025-10-06T21:02:27.89615058Z","logger":"argo-events.sensor","caller":"sensors/listener.go:80","msg":"leader lost: helm-build-sensor-rpcht-5f64f9cf97-dpbwb","sensorName":"helm-build","stacktrace":"github.com/argoproj/argo-events/pkg/sensors.(*SensorContext).Start.func2\n\t/home/runner/work/argo-events/argo-events/pkg/sensors/listener.go:80\ngithub.com/argoproj/argo-events/pkg/shared/leaderelection.(*natsEventBusElector).RunOrDie.func1\n\t/home/runner/work/argo-events/argo-events/pkg/shared/leaderelection/leaderelection.go:179\ngithub.com/argoproj/argo-events/pkg/shared/leaderelection.(*natsEventBusElector).RunOrDie\n\t/home/runner/work/argo-events/argo-events/pkg/shared/leaderelection/leaderelection.go:199\ngithub.com/argoproj/argo-events/pkg/sensors.(*SensorContext).Start\n\t/home/runner/work/argo-events/argo-events/pkg/sensors/listener.go:73\ngithub.com/argoproj/argo-events/pkg/sensors/cmd.Start\n\t/home/runner/work/argo-events/argo-events/pkg/sensors/cmd/start.go:85\ngithub.com/argoproj/argo-events/cmd/commands.init.0.NewSensorCommand.func2\n\t/home/runner/work/argo-events/argo-events/cmd/commands/sensor.go:14\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1019\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1148\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/runner/go/pkg/mod/github.com/spf13/[email protected]/command.go:1071\ngithub.com/argoproj/argo-events/cmd/commands.Execute\n\t/home/runner/work/argo-events/argo-events/cmd/commands/root.go:19\nmain.main\n\t/home/runner/work/argo-events/argo-events/cmd/main.go:8\nruntime.main\n\t/opt/hostedtoolcache/go/1.24.4/x64/src/runtime/proc.go:283"}
We are using the NATS JetStream EventBus.
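Reading the stack trace above, the sequence appears to be: the elector logs "Becoming a Follower" (leaderelection.go:176), then calls a hook registered in `SensorContext.Start` (listener.go:80), and that hook logs at fatal level, which ends the process. The Go sketch below models only that control flow. All names in it (`Callbacks`, `runElector`, `countExits`) are hypothetical and are not the argo-events API, and the "stand by" variant is an assumption about what a non-crashing hook might do, not a proposed patch.

```go
package main

import "fmt"

// Callbacks is a hypothetical stand-in for leader-election hooks.
// It is NOT the argo-events type; names are for illustration only.
type Callbacks struct {
	OnStartedLeading func()
	OnStoppedLeading func()
}

// runElector replays a fixed sequence of role changes and fires the
// matching hook for each one. A real elector would react to the event
// bus instead of iterating over a slice.
func runElector(roles []string, cb Callbacks) {
	for _, role := range roles {
		switch role {
		case "leader":
			cb.OnStartedLeading()
		case "follower":
			cb.OnStoppedLeading()
		}
	}
}

// countExits returns how many times the replica's process would exit
// while replaying roles. With fatalOnLoss set, the stopped-leading hook
// behaves like a fatal log call; otherwise the replica only stands by.
func countExits(roles []string, fatalOnLoss bool) int {
	exits := 0
	cb := Callbacks{
		OnStartedLeading: func() {},
		OnStoppedLeading: func() {
			if fatalOnLoss {
				exits++ // a real fatal log call would terminate the process here
			}
		},
	}
	runElector(roles, cb)
	return exits
}

func main() {
	roles := []string{"leader", "follower", "leader", "follower"}
	fmt.Println("exits with fatal hook:", countExits(roles, true))
	fmt.Println("exits with stand-by hook:", countExits(roles, false))
}
```

In this model every loss of leadership counts as one exit, which matches the observation that each crash follows a "Becoming a Follower" message.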
To Reproduce
Steps to reproduce the behavior:
Create an event bus, e.g.:
---
apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  annotations:
  name: test
  namespace: argo-workflows
spec:
  jetstream:
    version: 2.10.10
Create an eventsource; here's a simple one from your examples:
---
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: file
spec:
  eventBusName: test
  template:
    container:
      volumeMounts:
        - mountPath: /test-data
          name: test-data
    volumes:
      - name: test-data
        emptyDir: {}
  file:
    example:
      watchPathConfig:
        directory: /test-data/
        path: x.txt
      eventType: CREATE
Create a sensor with more than 1 replica, again just using one of your examples:
---
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: file
spec:
  template:
    serviceAccountName: operate-workflow-sa
  replicas: 2
  eventBusName: test
  dependencies:
    - name: test-dep
      eventSourceName: file
      eventName: example
  triggers:
    - template:
        name: file-workflow-trigger
        k8s:
          operation: create
          source:
            resource:
              apiVersion: argoproj.io/v1alpha1
              kind: Workflow
              metadata:
                generateName: file-watcher-
              spec:
                entrypoint: print-message
                templates:
                  - container:
                      args:
                        - "hello"
                      command:
                        - echo
                      image: busybox
                    name: print-message
          parameters:
            - src:
                dependencyName: test-dep
                dataKey: name
              dest: spec.templates.0.container.args.0
      retryStrategy:
        steps: 3
After an unpredictable amount of time the error occurs: sometimes immediately, sometimes not for hours.
Expected behavior
The Sensor should not crash.
Environment (please complete the following information):
- Kubernetes: v1.33.5
- Argo WF: 3.7.2
- Argo Events: v1.9.7
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.