fix deadline callback migration to improve query performance by wjddn279 · Pull Request #63640 · apache/airflow

wjddn279 · 2026-03-15T13:24:25Z

related: #63532

Through my testing, I confirmed that the bottleneck of this migration is not serialization, but the update query being executed one row at a time.

Even after ruling out the other suspected bottlenecks, such as serialization (including Python dict deserialization), query performance was still poor. After changing the update to a bulk update, I measured an improvement from 2,000 rows/sec to 40,000 rows/sec on PostgreSQL, and 20,000 rows/sec on MySQL.

Although the existing serialization code had no significant impact on performance, it was replaced with code that simply reads the values from the existing column's dict (a slight performance improvement).
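To illustrate the difference between a row-by-row update and a bulk update, here is a minimal sketch. It is not the migration's code: it uses an in-memory SQLite database as a stand-in, a simplified hypothetical `deadline` table, and made-up values, and the helper names `batched` and `bulk_update` are assumptions for illustration. It sends one `UPDATE` per batch of rows, with the new values passed as bound parameters in a `VALUES` list, instead of one `UPDATE` per row.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


def bulk_update(conn, rows, batch_size=100):
    """Apply (deadline_id, callback_id, missed) rows with one UPDATE per batch."""
    for chunk in batched(rows, batch_size):
        # One "(?, ?, ?)" group per row; the values travel as bound parameters.
        placeholders = ", ".join(["(?, ?, ?)"] * len(chunk))
        params = [value for row in chunk for value in row]
        conn.execute(
            f"""
            WITH v(deadline_id, callback_id, missed) AS (VALUES {placeholders})
            UPDATE deadline
            SET callback_id = (SELECT v.callback_id FROM v WHERE v.deadline_id = deadline.id),
                missed = (SELECT v.missed FROM v WHERE v.deadline_id = deadline.id)
            WHERE id IN (SELECT deadline_id FROM v)
            """,
            params,
        )


# Hypothetical, simplified table standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deadline (id INTEGER PRIMARY KEY, callback_id TEXT, missed INTEGER)"
)
conn.executemany("INSERT INTO deadline (id) VALUES (?)", [(i,) for i in range(1, 251)])

# 250 rows are updated with 3 statements (100 + 100 + 50) instead of 250.
updates = [(i, f"cb-{i}", i % 2) for i in range(1, 251)]
bulk_update(conn, updates)
```

The batch size is kept small here so the number of bound parameters stays under SQLite's per-statement limit; a real migration would choose it for the target database, and the exact bulk-update syntax differs between PostgreSQL, MySQL, and SQLite.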

Was generative AI tooling used to co-author this PR?

Yes (please specify the tool below)

Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
When adding dependency, check compliance with the ASF 3rd Party License Policy.
For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

ephraimbuddy · 2026-03-19T07:19:39Z

...ore/src/airflow/migrations/versions/0094_3_2_0_replace_deadline_inline_callback_with_fkey.py

+                        UPDATE deadline2
+                        SET callback_id = v.callback_id, missed = v.missed
+                        FROM (VALUES {values_clause}) AS v(deadline_id, callback_id, missed)
+                        WHERE deadline2.id = v.deadline_id


This fails on real data upgrade.

Yes, there was a discussion about this, and it seems to fail depending on the environment. Since there's another PR with an alternative fix, I'll close this one.

fix deadline callback migration to improve query performance

77b51d1

wjddn279 requested a review from ephraimbuddy as a code owner March 15, 2026 13:24

boring-cyborg bot added area:db-migrations PRs with DB migration area:deadline-alerts AIP-86 (former AIP-57) labels Mar 15, 2026

vatsrahul1001 mentioned this pull request Mar 15, 2026

Migration 0094 upgrade& downgrade is slow at scale due to row-by-row Python deserialization of deadline callbacks #63532

Closed

2 tasks

Merge branch 'main' into improve-deadline-callback-upgrade-performance

66b05b9

ephraimbuddy reviewed Mar 19, 2026

wjddn279 closed this Mar 19, 2026

wjddn279 wants to merge 2 commits into apache:main from wjddn279:improve-deadline-callback-upgrade-performance
