Interconnecting with the UBShmTransport Based on the LD/ST Shared Memory Semantics. #3290

Open
zchuango wants to merge 22 commits into apache:master from zchuango:ubshm_transport_dev

@zchuango zchuango commented May 9, 2026

What problem does this PR solve?

Issue Number: #3226 #3167 #3217

Problem Summary:
After recent efforts, the UB-Ring framework has been successfully integrated with the BRPC transport framework. High-performance, low-latency communication based on load/store (LD/ST) semantics is now supported. I am happy to contribute this to the community and look forward to feedback and reviews. @wwbmmm @chenBright

What is changed and the side effects?

Changed:

  1. The ubring framework is added. It implements low-latency data communication based on shared-memory LD/ST semantics.
  2. The ubring framework currently supports two modes: POSIX IPC shared memory and ubs-mem remote shared memory.
  3. The ub_shm_type parameter selects between the IPC and ubs-mem backends. Currently, ubs-mem can run on the Kunpeng 950 supernode, which supports the ub protocol.

Side effects:

  • Performance effects: N/A

  • Breaking backward compatibility:


Check List:

Comment thread src/brpc/ubshm_transport.h Outdated
#include "brpc/transport.h"

namespace brpc {
class UBShmTransport : public Transport {
Contributor

The class does not need to be indented.

Contributor Author

Okay, I'll fix it. By the way, does the current brpc project have a standard formatting file like .clang-format? @wwbmmm

Contributor

There's no standard formatting file like .clang-format; just follow the existing code style.

Copilot AI left a comment

Pull request overview

Adds a new UBRing-based shared-memory transport mode to brpc (IPC + optional ubs-mem backend) and wires it into the Socket/Transport framework, along with docs and a performance example.

Changes:

  • Introduce UBRing transport (SOCKET_MODE_UBRING) with endpoint handshake, polling, and ring manager infrastructure.
  • Add shared-memory backend abstraction (POSIX IPC + ubs-mem via dlopen’d SDK stubs/headers) plus timer utilities.
  • Update build/docs/examples to expose the feature and provide a basic performance harness.

Reviewed changes

Copilot reviewed 43 out of 43 changed files in this pull request and generated 15 comments.

| File | Description |
| --- | --- |
| src/brpc/ubshm/ubs_mem/ubshmem_stub.cpp | Adds stub implementations of ubs-mem APIs for non-ubs environments/UT. |
| src/brpc/ubshm/ubs_mem/ubs_mem.h | Introduces ubs-mem C API header used by the UBS backend integration. |
| src/brpc/ubshm/ubs_mem/ubs_mem_def.h | Defines ubs-mem types/constants used by the UBS backend integration. |
| src/brpc/ubshm/ubs_mem/declare_shm_ubs.h | Declares the dynamically loaded ubs-mem function pointer table. |
| src/brpc/ubshm/ubr_trx.h | Defines core UBR transaction structures and states. |
| src/brpc/ubshm/ubr_msg.h | Defines UBR message chunk format used by the ring transport. |
| src/brpc/ubshm/ub_ring.h | Declares UBRing read/write and lifecycle APIs used by the endpoint. |
| src/brpc/ubshm/ub_ring_manager.h | Declares global manager for UBR transactions and link bookkeeping. |
| src/brpc/ubshm/ub_ring_manager.cpp | Implements UBR transaction manager and UB event callback plumbing. |
| src/brpc/ubshm/ub_helper.h | Declares UBRing global init/availability helpers. |
| src/brpc/ubshm/ub_helper.cpp | Implements global init/fini, availability flags, and polling init. |
| src/brpc/ubshm/ub_endpoint.h | Declares UB shared-memory endpoint and polling infrastructure. |
| src/brpc/ubshm/ub_endpoint.cpp | Implements handshake, polling loop, and I/O integration with Socket/InputMessenger. |
| src/brpc/ubshm/timer/timer_mgr.h | Declares timer module used by UBS cleanup/recovery flows. |
| src/brpc/ubshm/timer/timer_mgr.cpp | Implements epoll/kqueue-based timer dispatch for UBRing subsystems. |
| src/brpc/ubshm/shm/shm_ubs.h | Declares UBS backend shared-memory operations. |
| src/brpc/ubshm/shm/shm_ubs.cpp | Implements UBS backend via dynamically loaded ubs-mem SDK. |
| src/brpc/ubshm/shm/shm_mgr.h | Declares backend-agnostic SHM manager interface. |
| src/brpc/ubshm/shm/shm_mgr.cpp | Implements SHM manager selecting IPC vs UBS backend via flag. |
| src/brpc/ubshm/shm/shm_ipc.h | Declares POSIX IPC SHM backend operations. |
| src/brpc/ubshm/shm/shm_ipc.cpp | Implements POSIX IPC SHM backend operations. |
| src/brpc/ubshm/shm/shm_def.h | Adds SHM structs/constants used across SHM backends and UBRing. |
| src/brpc/ubshm/common/thread_lock.h | Adds RAII-style mutex/spin/rwlock/semaphore guard macros. |
| src/brpc/ubshm/common/common.h | Adds common macros/types/constants used throughout UBRing code. |
| src/brpc/ubshm_transport.h | Declares UBShmTransport implementing the Transport interface. |
| src/brpc/ubshm_transport.cpp | Implements transport selection between UBRing and TCP fallback paths. |
| src/brpc/transport_factory.cpp | Wires SOCKET_MODE_UBRING into transport creation/context init. |
| src/brpc/socket.h | Adds UB endpoint/connect friend declarations for Socket integration. |
| src/brpc/socket_mode.h | Adds SOCKET_MODE_UBRING enum value. |
| src/brpc/rdma_transport.cpp | Adjusts RDMA transport’s TCP fallback member initialization (currently broken). |
| src/brpc/input_messenger.h | Adds UB endpoint friend declaration to support message processing hooks. |
| src/brpc/input_messenger.cpp | Extends RDMA-special message queuing behavior to UBRing sockets. |
| src/brpc/controller.h | Guards latency_us() against unset begin time. |
| README.md | Adds docs link for UBRing. |
| README_cn.md | Adds docs link for UBRing (CN). |
| example/ubring_performance/test.proto | Adds proto for UBRing performance test example. |
| example/ubring_performance/server.cpp | Adds UBRing-capable perf test server example. |
| example/ubring_performance/client.cpp | Adds UBRing-capable perf test client example. |
| example/ubring_performance/CMakeLists.txt | Adds standalone CMake build for the performance example. |
| docs/en/ubring.md | Documents build/run/configuration and backend selection for UBRing. |
| docs/cn/ubring.md | Chinese documentation for UBRing build/run/configuration. |
| CMakeLists.txt | Adds WITH_UBRING option and compile definition wiring. |

Comment thread src/brpc/rdma_transport.cpp Outdated
Comment thread src/brpc/ubshm_transport.cpp
Comment thread src/brpc/ubshm/ub_endpoint.cpp
Comment thread src/brpc/ubshm/ub_endpoint.cpp Outdated
Comment thread src/brpc/ubshm/ub_endpoint.cpp Outdated
Comment thread src/brpc/ubshm/ub_ring_manager.cpp
Comment thread src/brpc/ubshm/shm/shm_ubs.cpp Outdated
Comment thread src/brpc/ubshm/shm/shm_mgr.cpp
Comment thread CMakeLists.txt Outdated
Comment thread src/brpc/ubshm_transport.cpp
Comment thread docs/cn/ubring.md
g_last_time.store(0, butil::memory_order_relaxed);

brpc::ServerOptions options;
options.socket_mode = FLAGS_use_ubring? brpc::SOCKET_MODE_UBRING : brpc::SOCKET_MODE_TCP;
Contributor

It would be better for brpc::ServerOptions socket_mode to default to TCP mode.

Contributor Author

This follows the code style of example/rdma_performance; switching to TCP mode as the default also works fine.

return -1;
}
ubring::GlobalUBInitializeOrDie();
if (!ubring::InitPollingModeWithTag(bthread_self_tag())) {
Contributor

Does ubring only support polling mode?

Contributor Author

Yes. LD/ST shared memory has this limitation, so only polling mode is currently supported. An event-waiting mode would require support from the OS kernel or the hardware.
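
To illustrate why load/store shared memory tends to imply polling, here is a minimal sketch, assuming one writer and one reader sharing a single slot. The `Slot` type and the `Publish`/`TryPoll` functions are hypothetical and are not taken from this PR; in a real deployment the slot would live in a shared-memory mapping rather than in ordinary process memory.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical one-slot mailbox (illustrative only, not the PR's layout).
struct Slot {
    std::atomic<uint32_t> ready{0};  // 0 = empty, 1 = payload is valid
    uint64_t payload = 0;
};

// Writer: a plain store of the data, then a release store of the flag so the
// payload is visible before the flag is observed as set.
void Publish(Slot& s, uint64_t value) {
    s.payload = value;
    s.ready.store(1, std::memory_order_release);
}

// Reader: one polling step. Nothing wakes the reader up, so it has to load
// the flag itself. Returns true and fills *out when a message is present.
bool TryPoll(Slot& s, uint64_t* out) {
    if (s.ready.load(std::memory_order_acquire) == 0) {
        return false;  // nothing yet; the caller polls again later
    }
    *out = s.payload;
    s.ready.store(0, std::memory_order_release);  // hand the slot back
    return true;
}
```

A polling thread would call `TryPoll` in a loop. With no interrupt or completion queue behind the memory, there is nothing for a blocking wait to sleep on.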

Comment thread docs/cn/ubring.md

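### 2. UBS-Mem Remote Shared Memory (ub_shm_type = 2)

This mode uses ubs-mem (Unified Block Storage Memory), an open-source remote shared-memory framework from openEuler. It supports shared-memory communication between nodes within a rack, similar to RDMA but with simpler deployment requirements.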
Contributor

Can you list the libraries that need to be used?

Contributor Author

Okay, I'll list the required libraries later.

@zchuango
Contributor Author

The code has been updated to address the review comments. Please check and review it again. @wwbmmm @chenBright @yanglimingcn

while (curIov < iovcnt && pktRemainN > 0) {
iovRemain = (iov[curIov].iov_len - curIovPos);
fulled = iovRemain > pktRemainN ? pktRemainN : iovRemain;
memcpy((msg->payload.inner + (curPktLen - (uint8_t)pktRemainN)),
Contributor

Does ub simply copy the memory from iobuf to complete the transfer?

Contributor

So, is it correct to understand that there is a memory copy involved?

Contributor Author

> Does ub simply copy the memory from iobuf to complete the transfer?

This memcpy fills the UbrMsgFormat to build a 64-byte remote packet (4B header + 60B body). Copy64Byte(...) then flushes it to the remote node's dataMsg buffer (see this line: Copy64Byte((int8_t *)&dataMsg[_trx->ubrTx.writePos], (int8_t *)msg);). The 64B limit comes from the UB transport's atomic semantics: remote writes must be indivisible.

> Does ub simply copy the memory from iobuf to complete the transfer?

Yes, that's correct. Memory copies are involved at two distinct stages:

  • Local assembly: The memcpy composes the payload into the 64-byte UbrMsgFormat message (header + body) in local memory.
  • Remote transfer: Copy64Byte then writes that assembled 64-byte block to the remote buffer.
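
The two stages above can be sketched as follows. This is a minimal illustration, assuming the 4-byte header + 60-byte body split quoted in the reply; the `Msg64`, `Span`, `PackMessages` and `WriteToRing` names and the header layout are hypothetical. A plain `memcpy` stands in for the PR's Copy64Byte, whose implementation is not shown in this thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical 64-byte message: 4-byte header + 60-byte body.
struct Msg64 {
    uint8_t len;          // number of valid body bytes (0..60)
    uint8_t reserved[3];  // rest of the assumed 4-byte header
    uint8_t body[60];
};
static_assert(sizeof(Msg64) == 64, "message must be exactly 64 bytes");

// Stand-in for struct iovec, to keep the sketch self-contained.
struct Span {
    const uint8_t* data;
    size_t len;
};

// Stage 1 (local assembly): gather scattered input into 64-byte messages.
std::vector<Msg64> PackMessages(const std::vector<Span>& spans) {
    std::vector<Msg64> out;
    Msg64 cur{};
    for (const Span& s : spans) {
        size_t pos = 0;
        while (pos < s.len) {
            size_t n = std::min(s.len - pos, sizeof(cur.body) - cur.len);
            std::memcpy(cur.body + cur.len, s.data + pos, n);
            cur.len = static_cast<uint8_t>(cur.len + n);
            pos += n;
            if (cur.len == sizeof(cur.body)) {  // body full: emit and reset
                out.push_back(cur);
                cur = Msg64{};
            }
        }
    }
    if (cur.len > 0) out.push_back(cur);  // trailing partial message
    return out;
}

// Stage 2 (transfer): copy each assembled message into a ring slot.
void WriteToRing(const std::vector<Msg64>& msgs, Msg64* ring, size_t slots) {
    for (size_t i = 0; i < msgs.size() && i < slots; ++i) {
        std::memcpy(&ring[i], &msgs[i], sizeof(Msg64));
    }
}
```

In this sketch every payload byte is copied twice: once into the local message and once into the ring slot.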

Contributor

Can I then assume that ub, unlike RdmaEndpoint, is not a zero-copy operation?

Contributor Author

Not quite. Both are DMA implementations at the hardware level; the difference is in access semantics and usage model.

  • UB mem uses memory-semantic DMA: remote memory is directly addressable via standard load/store instructions, just like accessing local RAM. No kernel involvement, no explicit verbs—data movement is implicit and transparent.

  • RdmaEndpoint uses message-semantic DMA: while the NIC's DMA engine also bypasses the CPU, the application must explicitly trigger transfers via verbs (send/recv/read/write) with memory registration and QP management.

Core idea: Both achieve zero-copy through DMA, but UB mem is implicit memory access (ld/st), RDMA is explicit message passing (verbs).

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

return _tcp_transport->CutFromIOBufList(buf, ndata);
}

int UBShmTransport::WaitEpollOut(butil::atomic<int> *_epollout_butex,
Contributor

Can polling mode avoid frequent event registration? Registering one event per link seems reasonable. But registering UBShmTransport::WaitEpollOut would be too frequent, wouldn't it?

Contributor Author

Yes, this is borrowed from the TCP transport's WaitEpollOut pattern, using _io_event for notification. The fundamental issue is that UB shared memory lacks hardware-generated events. With RDMA, the NIC deposits completions into a CQ for asynchronous notification; UB SHM has no such hardware signaling path. Instead, we have to poll the UBRing buffer state (via isWritable and similar checks) to detect message readiness, which makes the current approach functional but admittedly not the most elegant.
I would like to brainstorm a cleaner design with you on this. Feel free to ping me on WeChat if you're open to a deeper dive. @yanglimingcn
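
As one possible starting point for that discussion, here is a minimal sketch of polling a ring's writable state with a bounded spin instead of registering an event for every wait. The `RingState` layout and the `IsWritable`/`WaitWritable` helpers are hypothetical, assume a single writer and a single reader, and do not reflect the PR's actual ring structures.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical ring indices. Positions grow monotonically; the slot index
// would be position % kSlots.
struct RingState {
    static constexpr uint32_t kSlots = 8;
    std::atomic<uint32_t> write_pos{0};  // advanced by the writer
    std::atomic<uint32_t> read_pos{0};   // advanced by the reader
};

// The ring is writable while the writer is fewer than kSlots ahead of the
// reader. Unsigned subtraction keeps this correct across wraparound.
bool IsWritable(const RingState& r) {
    uint32_t w = r.write_pos.load(std::memory_order_relaxed);
    uint32_t rd = r.read_pos.load(std::memory_order_acquire);
    return w - rd < RingState::kSlots;
}

// Polls up to max_spins times. Returns false when the ring stays full, so the
// caller can yield or retry later rather than block on an event.
bool WaitWritable(const RingState& r, int max_spins) {
    for (int i = 0; i < max_spins; ++i) {
        if (IsWritable(r)) return true;
    }
    return false;
}
```

The spin bound trades latency against CPU use. A small bound falls back to the caller's retry path quickly, while a large one keeps the thread busy but reacts sooner once the reader frees a slot.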

Contributor

my WeChat id "yanglm_28"

wwbmmm commented May 18, 2026

LGTM

5 participants