Async (asyncio) support for the wrapper: a community contribution we'd love your guidance on #1251

@AhmadMasry

Describe the feature

Hi! 👋 First off, thank you for this project; it's been a joy to build on. We'd love to contribute an asyncio counterpart of the wrapper, under a new aws_advanced_python_wrapper.aio subpackage, that aims for feature parity with the sync wrapper across the shipped plugins, backed by async DBAPI drivers (psycopg v3 async, aiomysql) and async SQLAlchemy (create_async_engine).

The async support lives in its own new aio subpackage, and we were careful not to remove or rename any existing public symbol; current sync code should keep working unchanged. We'll be honest, though: it isn't purely additive. While building it we bumped into a few sync-side things that we thought could be improved and took the liberty of suggesting changes for (detailed below). Those are entirely our subjective opinion, and we'd genuinely welcome you pushing back, reshaping them, or dropping them; we're very happy to split them out or leave them for you to decide.

Use Case

The wrapper is currently sync-only (mysql.connector / psycopg sync). What motivated us is the rise of agentic applications: agents and LLM-backed services are overwhelmingly built on asyncio (concurrent tool calls, token streaming, many in-flight model/DB requests per user request), and that's driven a big expansion in async Python usage (FastAPI, async SQLAlchemy + AsyncSession, aiohttp). Today those stacks can't really use the wrapper's Aurora features (failover, read/write splitting, IAM auth, EFM, …) without blocking the event loop, so teams end up choosing between Aurora-aware connectivity and non-blocking async I/O. We'd love to help remove that trade-off and bring the wrapper's full feature set to the async ecosystem.

Proposed Solution

We've put together a complete, working reference implementation (branch link below). It's been in active development for about 2 months, and we'd be delighted to shape it however suits your roadmap and review process.

Async parity scope:

  • Connectivity/HA: Failover v2, read/write splitting, simple read/write splitting, EFM (host monitoring), Aurora connection tracker, cluster topology monitor (with panic mode), custom endpoint, stale DNS, Aurora initial-connection strategy, blue/green deployment, Limitless, fastest-response strategy, developer plugin.
  • Auth: IAM, AWS Secrets Manager, Federated (SAML) + Okta.
  • Infra: AsyncConnectionProvider / AsyncConnectionProviderManager, AsyncSessionStateService (autocommit/read-only capture+restore across failover/RWS), async IdP factory registry, auto-detection of MultiAz / GlobalAurora host-list providers.
  • Drivers: psycopg v3 async, aiomysql.
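To make the capture+restore idea from the Infra bullet concrete, here is a minimal sketch of how session state might be recorded and then replayed onto a replacement connection after failover. Every name in it (SessionState, SessionStateTracker, FakeAsyncConnection and its setters) is hypothetical and chosen for illustration; it is not the branch's AsyncSessionStateService API.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionState:
    # None means "the application never set this", so nothing is replayed.
    autocommit: Optional[bool] = None
    read_only: Optional[bool] = None


class FakeAsyncConnection:
    """Hypothetical stand-in for an async driver connection."""

    def __init__(self, host: str) -> None:
        self.host = host
        self.autocommit = False
        self.read_only = False

    async def set_autocommit(self, value: bool) -> None:
        self.autocommit = value

    async def set_read_only(self, value: bool) -> None:
        self.read_only = value


class SessionStateTracker:
    """Records what the application set so it can be replayed later."""

    def __init__(self) -> None:
        self.state = SessionState()

    async def set_autocommit(self, conn: FakeAsyncConnection, value: bool) -> None:
        await conn.set_autocommit(value)
        self.state.autocommit = value  # capture

    async def set_read_only(self, conn: FakeAsyncConnection, value: bool) -> None:
        await conn.set_read_only(value)
        self.state.read_only = value  # capture

    async def restore(self, new_conn: FakeAsyncConnection) -> None:
        # Replay only the settings the application explicitly changed.
        if self.state.autocommit is not None:
            await new_conn.set_autocommit(self.state.autocommit)
        if self.state.read_only is not None:
            await new_conn.set_read_only(self.state.read_only)


async def demo() -> FakeAsyncConnection:
    tracker = SessionStateTracker()
    old = FakeAsyncConnection("writer-1")
    await tracker.set_autocommit(old, True)
    # Simulated failover: the wrapper swaps in a fresh connection.
    new = FakeAsyncConnection("writer-2")
    await tracker.restore(new)
    return new
```

Running `asyncio.run(demo())` returns the replacement connection with autocommit carried over and read-only left at the driver default, since it was never set.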

SQLAlchemy (sync and async). We actually started this before the library had any SQLAlchemy support, and when you shipped SQLAlchemy ORM for MySQL (#1224), we happily folded your change into our branch and filled in the bits our version also covered (your mysql_orm_dialect.py was relocated/rewritten, but the mysql+aws_wrapper_mysqlconnector:// URL still works exactly as before). We also added SQLAlchemy support for PostgreSQL (sync and async), which we don't think is in the library yet; hopefully a useful bonus. Async dialects resolve via SQLAlchemy's get_async_dialect_cls hook (chosen by create_async_engine, no URL flag) using the dialect+driver URL convention, and ORM works on the default driver (no use_pure needed).

Validation: 1986 unit tests, plus the full integration suite against real Aurora (PostgreSQL + MySQL) on Python 3.13 and 3.14. Failover, RWS, IAM, Secrets Manager, EFM, and SQLAlchemy ORM all pass, with no connection leaks observed.

Other Information

A few sync-side suggestions (very much our opinion 🙏)

This is the part we're least sure about, so please treat all of it as suggestions, not assertions; you know this codebase and its intended behavior far better than we do, and we may simply be wrong about some of these. While building the async side we ran into a handful of sync-side things that we felt could be improved, so we drafted changes for them. They happen to be default-on, which is exactly why we want to flag them loudly rather than slip them in. No public symbol was removed or renamed, but existing sync users would notice:

| What we changed | Why we thought it'd help | Can it be turned off? |
| --- | --- | --- |
| Query-timeout / connection-abort / MySQL liveness-ping paths now shut sockets down thread-safely instead of freeing them while a worker is still using them | We saw this lead to a cross-thread use-after-free (process-level SIGSEGV) under failover/timeout races, but please verify; it may not reproduce in your environments | not currently, but easy to gate if you'd prefer |
| errors.py gains PEP 249 mixin bases, so except OperationalError/InterfaceError/NotSupportedError (and SQLAlchemy/Django) classify these errors as expected | felt more PEP 249-conformant, but we recognize this changes matching semantics | not gated; happy to reconsider |
| Opt-out connection retry at the SQLAlchemy-pool / Django-backend / reader-failover / topology-monitor-login layers (to ride out Aurora's post-failover boot window) | smoothed over transient connect rejections in our testing | yes: connection_retry_max_attempts / _max_backoff_s, default-on |
| SQLAlchemy pool invalidation on failover so a dead connection isn't handed back | avoided a rollback-on-dead-socket error for SA users | only affects SQLAlchemy pool connections; no-op otherwise |
| RWS seeds its role cache on connect, and (default-on) rechecks a picked reader's actual role against topology lag | tried to avoid landing on the writer when topology lags a failover | recheck via rws_recheck_reader_role (default True); seeding always on |
| MySQL exception classifier recognizes pymysql-shaped errors + a guard for a TypeError we hit in the classifier | needed for the async path; the guard also seemed safer for sync | not gated |
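For reviewers weighing the PEP 249 mixin change, the following sketch shows the general mechanism and why it alters matching semantics. The PEP 249 names (Error, DatabaseError, OperationalError) come from the spec; WrapperError and FailoverLikeError are hypothetical stand-ins, not the actual classes in errors.py.

```python
# Minimal PEP 249-style hierarchy (normally supplied by the driver module).
class Error(Exception):
    pass


class DatabaseError(Error):
    pass


class OperationalError(DatabaseError):
    pass


# Hypothetical wrapper-specific base, standing in for the library's own root.
class WrapperError(Exception):
    pass


# The mixin idea: the wrapper error ALSO inherits a PEP 249 class, so
# generic `except OperationalError` handlers now catch it.
class FailoverLikeError(WrapperError, OperationalError):
    pass


def classify(exc: Exception) -> str:
    """Mimics how a framework might bucket errors by PEP 249 type."""
    try:
        raise exc
    except OperationalError:
        return "operational"
    except Exception:
        return "other"
```

With the mixin, `classify(FailoverLikeError("x"))` lands in the operational bucket, while a plain `WrapperError` does not. That is the behavior change: handlers that previously never saw wrapper errors under `except OperationalError` would start matching them.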

We'd be more than happy to pull any/all of these into separate, smaller PRs (the socket/SIGSEGV ones could stand alone), tweak the defaults to opt-in, or simply remove them if you'd rather own that area yourselves. Your call entirely. 🙂
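As a reference point for the connection-retry row in the table above, here is a generic capped exponential backoff loop. It is a sketch of the technique only: the function name, the `max_attempts` / `max_backoff_s` parameters, the 0.1 s starting delay, and the choice to retry on ConnectionError are all assumptions for illustration, not the branch's implementation or its configuration keys.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    max_attempts: int = 3,
    max_backoff_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or max_attempts is exhausted."""
    delay = 0.1  # assumed initial delay; doubles after every failure
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(min(delay, max_backoff_s))  # cap the wait
            delay *= 2
    raise ValueError("max_attempts must be at least 1")
```

Setting `max_attempts=1` makes the helper a plain pass-through, which is one way an opt-out could behave if the default-on retry is not wanted.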

A few other notes

  1. PR shape: whatever's easiest for you. The async diff is large (~222 files, +44.5k/−2.7k). We can send it as one PR, stage it (core async → drivers/SQLAlchemy → auth → HA plugins), and/or split out the sync-side bits above. Totally flexible.
  2. aiomysql IAM TLS posture. aiomysql doesn't auto-negotiate TLS, so the async IAM path builds the SSL context explicitly but matches the sync driver's default: encrypt the cleartext token, don't verify the cert by default, with verification opt-in via ssl_ca (no CA bundled). Flagging in case you'd prefer verify-by-default upstream.
  3. EFM v2 on async MySQL. Sync MySQL dropped EFM v2 because it couldn't abort a connection (chore: add initial connection to the default plugin list and remove efm2 for mysql #1242). The async aiomysql path keeps it, since asyncio cancellation makes monitor-driven teardown feasible, but we'd defer to you if you'd rather keep parity with the sync decision.
  4. Built with AI assistance. In the interest of full transparency: this was developed with Claude (AI) assistance over the ~2 months. Glad to share the project's CLAUDE.md (the guardrails/conventions we worked under) if it's useful for your review.
  5. We've tentatively targeted 3.1.0 in the CHANGELOG [Unreleased] section, but that's just a placeholder; we'll use whatever you prefer.
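The TLS posture described in note 2 can be expressed with the standard library's ssl module roughly as follows. The helper name `build_iam_ssl_context` is hypothetical, and only the `ssl_ca` parameter name is taken from the note; this is a sketch of the posture, not the branch's code.

```python
import ssl
from typing import Optional


def build_iam_ssl_context(ssl_ca: Optional[str] = None) -> ssl.SSLContext:
    """Encrypt always; verify the server certificate only if a CA is given."""
    if ssl_ca is not None:
        # Opt-in verification: trust the supplied CA bundle and keep
        # hostname checking and CERT_REQUIRED (the library defaults).
        return ssl.create_default_context(cafile=ssl_ca)

    # Default posture from the note: TLS on, certificate not verified.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

Flipping to verify-by-default would mean returning `ssl.create_default_context()` unmodified when `ssl_ca` is omitted, which relies on the system trust store containing the relevant CA.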

Reference branch: AhmadMasry/aws-advanced-python-wrapper @ feat/async-parity, or view the full diff vs main (no PR opened yet; happy to open one in whatever shape you prefer).

Thank you so much for reading this far, and for considering it! We're not attached to any particular approach here; we'd love to do this with you, in whatever shape works best for the project. Happy to hop on anything, answer questions, or rework as needed. 🙏

Acknowledgements

  • I may be able to implement this feature request (reference implementation is complete and validated, and we'd love to collaborate on shaping it)
  • This feature might incur a breaking change (no public symbol removed/renamed; a few default-on sync behavior changes we've flagged above for your review)

| Field | Value |
| --- | --- |
| Wrapper version | 3.1.0 (branch) |
| Python version | 3.13 & 3.14 |
| OS | Amazon Linux 2023 (Graviton) + macOS |
