MDEV-40128 Use per-cell latch in lock_move_reorganize_page() by iMineLink · Pull Request #5297 · MariaDB/server

iMineLink · 2026-06-29T16:05:15Z

lock_move_reorganize_page() was acquiring lock_sys.latch in exclusive mode (via LockMutexGuard) for the entire body of phase 2 (lock chain iteration, bitmap reset, and lock_rec_add_to_queue() calls). The function however only touches record locks belonging to a single page, which all live in a single lock_sys.rec_hash cell. Holding that cell latch in exclusive mode via LockGuard is sufficient:

The cell latch protects the cell's lock chain and the bitmaps of the lock_t objects in it (lock_rec_bitmap_reset and the new bit set by lock_rec_add_to_queue()).
It also protects lock->type_mode, including the LOCK_WAIT bit. The canonical clear in lock_reset_lock_and_trx_wait() runs under the cell latch, and lock_grant() invokes it before taking trx->mutex, so the bit is cell-latch state rather than trx->mutex state. Phase 1 only clears the bit and leaves trx->lock.wait_lock intact; the copy in old_locks keeps LOCK_WAIT and phase 2 re-adds the lock with it, so the wait relationship (guarded by lock_sys.wait_mutex) is preserved across the move. Neither trx->mutex nor wait_mutex is required here.
Each owning trx's mutex is acquired per-iteration to protect that trx's trx_locks list and lock_heap during lock_rec_add_to_queue().

The global exclusive latch was over-strong: it blocked every concurrent lock_sys.rd_lock() acquirer in lock_rec_lock() and lock_table() server-wide for the duration of the reorganize, contributing disproportionately to the lock_sys.latch convoy under heavy concurrency.

The TMLockGuard fast-path empty check at the top of the function is preserved; for cells with no locks the cost is still just a TSX-elided read.

CLAassistant · 2026-06-29T16:05:29Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

gemini-code-assist

Code Review

This pull request optimizes the locking behavior in lock_move_reorganize_page by replacing the global LockMutexGuard with a more granular LockGuard on the specific hash cell. This prevents blocking concurrent lock acquisitions server-wide. Additionally, detailed comments have been added to explain the lock protection model and the handling of the LOCK_WAIT bit. There are no review comments, and I have no additional feedback to provide.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Copilot

Pull request overview

This PR reduces lock contention during InnoDB page reorganize lock relocation by replacing a server-wide lock_sys exclusive latch with an exclusive latch on only the relevant rec_hash cell for the affected page. This aligns the locking scope with the data actually touched (record locks for a single page), aiming to reduce global lock convoying under high concurrency.

Changes:

Replace LockMutexGuard (global lock_sys.wr_lock) with LockGuard (per-rec_hash cell latch) in lock_move_reorganize_page().
Preserve the existing TSX-elided empty-fast-path check via TMLockGuard.
Add expanded rationale comments documenting correctness of cell-latch protection (including LOCK_WAIT handling) and concurrency implications.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

dr-m

Good catch, and great source code comments.

lock_move_reorganize_page() was acquiring lock_sys.latch in exclusive mode (via LockMutexGuard) for the entire body of phase 2 (lock chain iteration, bitmap reset, and lock_rec_add_to_queue() calls). The function however only touches record locks belonging to a single page, which all live in a single lock_sys.rec_hash cell. Holding that cell latch in exclusive mode via LockGuard is sufficient: - The cell latch protects the cell's lock chain and the bitmaps of the lock_t objects in it (lock_rec_bitmap_reset and the new bit set by lock_rec_add_to_queue()). - It also protects lock->type_mode, including the LOCK_WAIT bit. The canonical clear in lock_reset_lock_and_trx_wait() runs under the cell latch, and lock_grant() invokes it before taking trx->mutex, so the bit is cell-latch state rather than trx->mutex state. Phase 1 only clears the bit and leaves trx->lock.wait_lock intact; the copy in old_locks keeps LOCK_WAIT and phase 2 re-adds the lock with it, so the wait relationship (guarded by lock_sys.wait_mutex) is preserved across the move. Neither trx->mutex nor wait_mutex is required here. - Each owning trx's mutex is acquired per-iteration to protect that trx's trx_locks list and lock_heap during lock_rec_add_to_queue(). The global exclusive latch was over-strong: it blocked every concurrent lock_sys.rd_lock() acquirer in lock_rec_lock() and lock_table() server-wide for the duration of the reorganize, contributing disproportionately to the lock_sys.latch convoy under heavy concurrency. The TMLockGuard fast-path empty check at the top of the function is preserved; for cells with no locks the cost is still just a TSX-elided read.

iMineLink requested a review from Copilot June 29, 2026 16:05

Copilot started reviewing on behalf of iMineLink June 29, 2026 16:05 View session

gemini-code-assist Bot reviewed Jun 29, 2026

View reviewed changes

iMineLink requested a review from dr-m June 29, 2026 16:06

Copilot AI reviewed Jun 29, 2026

View reviewed changes

iMineLink added the MariaDB Corporation label Jun 29, 2026

dr-m approved these changes Jun 30, 2026

View reviewed changes

iMineLink force-pushed the MDEV-40128 branch from 2972fb8 to 232e3da Compare June 30, 2026 10:54

iMineLink merged commit 232e3da into 11.8 Jun 30, 2026
17 of 19 checks passed

iMineLink deleted the MDEV-40128 branch June 30, 2026 11:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

MDEV-40128 Use per-cell latch in lock_move_reorganize_page()#5297

MDEV-40128 Use per-cell latch in lock_move_reorganize_page()#5297
iMineLink merged 1 commit into
11.8from
MDEV-40128

iMineLink commented Jun 29, 2026

Uh oh!

CLAassistant commented Jun 29, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Copilot AI left a comment

Uh oh!

dr-m left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

4 participants

Uh oh!

Uh oh!

Conversation

iMineLink commented Jun 29, 2026

Uh oh!

CLAassistant commented Jun 29, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

dr-m left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

4 participants