bug: Potential etcd events lost in PR #12514 (#13067)

@dramenk

Description

Current Behavior

I have been experiencing resource consumption issues due to etcd compaction errors, which led me to review the solution proposed in PR #12514.

It appears that APISIX fetches the latest global revision A by reading /phantomkey (a non-existent key?) before initiating a watch. If the subsequent watch times out without receiving any events, the next iteration uses the previously fetched revision A as the start_revision.
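
The logic described above can be sketched as a toy Python model. This is an interpretation of the description, not APISIX code: `ToyEtcd`, `read_missing_key`, `watch`, and `next_start_revision` are hypothetical names, and resuming at A + 1 (rather than A) follows the wording used later in this issue and is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEtcd:
    """In-memory stand-in for etcd: every write bumps one global revision."""
    revision: int = 0
    log: list = field(default_factory=list)  # (revision, key) pairs

    def put(self, key):
        self.revision += 1
        self.log.append((self.revision, key))

    def read_missing_key(self):
        # Reading a key that does not exist still reports the current revision.
        return self.revision

    def watch(self, prefix, start_revision):
        # Events under `prefix` at or after start_revision; [] models a timeout.
        return [(rev, key) for rev, key in self.log
                if rev >= start_revision and key.startswith(prefix)]


def next_start_revision(etcd, prefix, start_revision):
    """One healthy loop iteration, as interpreted from the description."""
    latest = etcd.read_missing_key()             # revision A, fetched before the watch
    events = etcd.watch(prefix, start_revision)
    if events:
        return events[-1][0] + 1                 # resume after the last delivered event
    return latest + 1                            # idle watch: skip ahead to A + 1 (assumed)


etcd = ToyEtcd()
etcd.put("/apisix/routes/1")                     # revision 1, inside the watched prefix
for i in range(5):
    etcd.put(f"/other/{i}")                      # revisions 2..6, outside the prefix

rev = next_start_revision(etcd, "/apisix/routes", 1)     # delivers revision 1
idle = next_start_revision(etcd, "/apisix/routes", rev)  # nothing to deliver
print(rev, idle)
```

In this healthy case, skipping from 2 to 7 drops nothing, because revisions 2 to 6 hold no events under the watched prefix.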

However, I have a question. Suppose a "silent" error occurs during a watch that was started with start_revision B, for example a network interruption that happens just before etcd sends the historical events, so the watch never receives them. If we then update the start_revision for the next watch to A+1, wouldn't the historical events between B and A be permanently lost? Or am I misunderstanding the current implementation?
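
The scenario in question can be made concrete with a small Python simulation. It is a sketch under stated assumptions, not a reproduction against APISIX or etcd: the `watch` function, the `silent_failure` flag, and the revision numbers are all hypothetical, and the simulation assumes that a silently dropped watch looks exactly like an idle timeout to the caller.

```python
def watch(log, start_revision, silent_failure=False):
    """Return events at or after start_revision. A silent failure returns
    nothing, which the caller cannot tell apart from an idle timeout."""
    if silent_failure:
        return []
    return [(rev, key) for rev, key in log if rev >= start_revision]


# Event log of the watched prefix: (revision, key). Values are made up.
log = [(5, "/routes/a"), (6, "/routes/b"), (7, "/routes/c")]

B = 5   # start_revision of the current watch
A = 7   # latest global revision, fetched before the watch

events = watch(log, B, silent_failure=True)   # connection drops before replay
assert events == []                           # indistinguishable from a timeout

# Policy questioned in this issue: advance to A + 1 after an "empty" watch.
advanced = watch(log, A + 1)
# Shown only for contrast: retry from the unchanged revision B.
retried = watch(log, B)

lost = [rev for rev, key in log
        if B <= rev <= A and (rev, key) not in advanced]
print("advanced to A+1 sees:", advanced)
print("retry from B sees:   ", retried)
print("lost revisions:      ", lost)
```

Under these assumptions, the watch that advances to A + 1 never sees revisions 5 to 7. Retrying from B is shown only as a contrast, not as a proposed fix, since holding an old start_revision can run into the compaction errors mentioned at the top of this issue.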

Expected Behavior

No response

Error Logs

No response

Steps to Reproduce

null

Environment

  • APISIX version (run apisix version):
  • Operating system (run uname -a):
  • OpenResty / Nginx version (run openresty -V or nginx -V):
  • etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info):
  • APISIX Dashboard version, if relevant:
  • Plugin runner version, for issues related to plugin runners:
  • LuaRocks version, for installation issues (run luarocks --version):

Metadata

Assignees

No one assigned

Labels

question (label for questions asked by users)

Type

No type

Projects

Status: 📋 Backlog

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests