Add ChaChaPoly AEAD-4 encryption with nonce persistence #1677

Open
weebl2000 wants to merge 13 commits into meshcore-dev:dev from weebl2000:feature/aead-4-encryption

@weebl2000
Contributor

@weebl2000 weebl2000 commented Feb 12, 2026

Build firmware: Build from this branch

Testing

  • Currently testing on a Heltec v4 companion and a Heltec v4 repeater. It's working so far. It would be great if others could try different devices, especially more constrained ones with less RAM/flash storage.

Summary

Adds ChaCha20-Poly1305 (AEAD-4) encryption alongside the existing AES-128-ECB + HMAC-2 scheme, plus session key negotiation for Perfect Forward Secrecy. Updated nodes send AEAD-4 to peers that advertise support and fall back to ECB for legacy peers. All nodes can decode both formats. Old nodes continue to work unchanged.

Nonces are persisted to flash so they survive reboots without risk of reuse. Session keys are negotiated via ephemeral X25519 Diffie-Hellman and persisted immediately on establishment.

Relates to #259.

What This Means in Practical Terms

The current encryption has a few weaknesses that this PR addresses:

  • Message tampering is too easy to attempt. The existing 2-byte authentication code means an attacker only needs about 65,000 guesses to forge a valid-looking message. At LoRa speeds that's roughly 9 hours of continuous attempts. The new 4-byte tag raises this to over 4 billion guesses — at LoRa rates, that would take over a century.

  • Identical messages look identical on the air. The current block cipher (ECB mode) produces the same ciphertext for the same plaintext, which can reveal patterns — for example, you could tell when someone sends the same message twice. The new scheme produces completely different ciphertext every time, even for identical messages.

  • Addressing fields are now protected. Currently, only the message body is authenticated. With AEAD, the payload type and addressing hashes (which identify sender and recipient) are included in the authentication check, so an attacker cannot swap or modify them without detection. Outer routing fields like TTL and hop path are intentionally left unauthenticated so repeaters can still forward packets through the mesh.

  • Messages get slightly smaller. ECB pads every message up to a 16-byte boundary, wasting airtime. The new scheme has no padding, so most messages shrink by a few bytes on the wire.

  • Compromise of a node doesn't reveal past messages. Session key negotiation establishes fresh shared secrets via ephemeral key exchange. Even if a node's long-term private key is later compromised, previously recorded traffic cannot be decrypted (Perfect Forward Secrecy).

  • Nothing breaks. Updated nodes send AEAD-4 to peers that advertise support, and fall back to ECB for legacy peers. Old nodes are completely unaffected — they never receive AEAD-4 messages because the sender checks their capability first.

  • Nodes advertise their capabilities. Updated nodes include a flag in their advertisements saying "I understand the new encryption." When two updated nodes discover each other, they automatically start using AEAD-4 for their communication.

  • Nonces survive reboots. Per-peer nonce counters are saved to flash periodically and before clean reboots. After a dirty reset (power loss, watchdog, brownout), nonces are bumped forward by a safety margin to guarantee no reuse.
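
The forgery figures in the first bullet follow from the size of the tag space. The sketch below reproduces the arithmetic under an assumed rate of one forgery attempt per second; the rate is an illustrative assumption, since the description does not state one. At that rate a 2-byte tag is exhausted in about 18.2 hours (about 9.1 hours on average), and a 4-byte tag in about 136 years (about 68 years on average).

```cpp
#include <cassert>
#include <cstdint>

// Number of distinct values of a MAC/tag truncated to `tag_bytes` bytes.
inline uint64_t tagSpace(unsigned tag_bytes) {
  assert(tag_bytes < 8);  // keep the shift within 64 bits
  return 1ULL << (8 * tag_bytes);
}

// Seconds needed to try every tag value once.
// `attempts_per_sec` is an assumption supplied by the caller.
inline double secondsToExhaust(unsigned tag_bytes, double attempts_per_sec) {
  return static_cast<double>(tagSpace(tag_bytes)) / attempts_per_sec;
}
```

The expected time to a first successful forgery is half of `secondsToExhaust()`, because on average the attacker hits the right tag halfway through the space.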

Wire Format

Current ECB:

[HMAC:2] [ECB_ciphertext:N×16]     (padded to block boundary)

New AEAD-4 (same position in payload):

[nonce:2] [ciphertext:M] [tag:4]    (exact plaintext length, no padding)

Average overhead: ~6 bytes (AEAD) vs ~9.5 bytes (ECB). Most messages get smaller.
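
A minimal sketch of the size arithmetic behind those averages, assuming ECB adds no padding when the plaintext is already a multiple of 16 (consistent with the "0-15 bytes" padding figure in the comparison table). The function names are illustrative, not taken from the code.

```cpp
#include <cassert>
#include <cstddef>

// Legacy layout: [HMAC:2] [ciphertext padded up to a multiple of 16].
inline size_t ecbWireLen(size_t plain_len) {
  size_t padded = ((plain_len + 15) / 16) * 16;
  return 2 + padded;
}

// AEAD-4 layout: [nonce:2] [ciphertext:plain_len] [tag:4].
inline size_t aeadWireLen(size_t plain_len) {
  return 2 + plain_len + 4;
}
```

AEAD-4 is smaller whenever ECB would have padded by more than 4 bytes, which covers 11 of the 16 possible residues. For example, a 17-byte plaintext takes 34 bytes under ECB and 23 under AEAD-4, while a 13-byte plaintext takes 18 and 19 respectively.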

Cryptographic Design

Per-message key derivation (eliminates nonce-reuse catastrophe):

msg_key[32] = HMAC-SHA256(shared_secret, nonce || dest_hash || src_hash)

The shared_secret is either the static ECDH secret or a session key (see Session Key Negotiation below).

Including dest_hash || src_hash makes keys direction-dependent — Alice→Bob and Bob→Alice derive different keys even with the same nonce value (for 255/256 peer pairs; the 1/256 where dest_hash == src_hash is a residual limitation of 1-byte hashes).

IV construction (12 bytes, from on-wire fields):

iv[12] = { nonce_hi, nonce_lo, dest_hash, src_hash, 0, 0, 0, 0, 0, 0, 0, 0 }
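
The IV layout above can be sketched as follows. `buildAeadIv` is a hypothetical helper name; the byte order (high nonce byte first) follows the `nonce_hi, nonce_lo` ordering in the layout.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Build the 12-byte IV from on-wire fields:
// { nonce_hi, nonce_lo, dest_hash, src_hash, then 8 zero bytes }.
inline std::array<uint8_t, 12> buildAeadIv(uint16_t nonce, uint8_t dest_hash, uint8_t src_hash) {
  std::array<uint8_t, 12> iv{};  // value-initialised, so the tail stays zero
  iv[0] = static_cast<uint8_t>(nonce >> 8);
  iv[1] = static_cast<uint8_t>(nonce & 0xFF);
  iv[2] = dest_hash;
  iv[3] = src_hash;
  return iv;
}
```

Because every input is already on the wire, the receiver can rebuild the same IV without any extra transmitted bytes.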

Associated data (authenticated but not encrypted):

  • Peer messages: header || dest_hash || src_hash
  • Anonymous requests: header || dest_hash
  • Group messages: header || channel_hash

Route type bits are masked out of the header in associated data (header & ~PH_ROUTE_MASK), since routing mode changes per hop as repeaters forward packets.
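
A sketch of the peer-message case, assuming `PH_ROUTE_MASK` is `0x03` (the value used in the decode verification later in this thread). `buildPeerAssoc` is a hypothetical helper name.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumed value: the two low header bits carry the route type.
constexpr uint8_t PH_ROUTE_MASK = 0x03;

// Associated data for peer messages: masked header, dest_hash, src_hash.
inline std::array<uint8_t, 3> buildPeerAssoc(uint8_t header, uint8_t dest_hash, uint8_t src_hash) {
  std::array<uint8_t, 3> assoc = {{
      static_cast<uint8_t>(header & ~PH_ROUTE_MASK),  // drop per-hop route bits
      dest_hash,
      src_hash}};
  return assoc;
}
```

With the mask applied, a packet encrypted before `sendFlood`/`sendDirect` set the route bits yields the same associated data as the packet the receiver sees.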

Nonce management: 16-bit counter per peer, persisted to flash. See "Nonce Persistence" section below.

Session Key Negotiation (Perfect Forward Secrecy)

Session keys provide Perfect Forward Secrecy by establishing fresh shared secrets via ephemeral X25519 Diffie-Hellman. Compromise of either node's long-term private key cannot recover traffic encrypted with a session key.

Protocol (2 messages + implicit confirmation)

Initiator                                   Responder
    |  1. REQ [REQ_TYPE_SESSION_KEY_INIT]        |
    |     [ephemeral_pub_A:32] (AEAD-4)          |
    | -----------------------------------------> |  derive session_key, persist, dual-decode
    |  2. RESPONSE [RESP_TYPE_SESSION_KEY_ACCEPT] |
    |     [ephemeral_pub_B:32] (static ECDH)     |
    | <----------------------------------------- |
    |  derive session_key, persist, nonce=1       |
    |  3. Any normal message (session key)       |
    | -----------------------------------------> |  confirm: drop old key

The INIT is encrypted with AEAD-4 (static ECDH or existing session key). The ACCEPT is always encrypted with the static ECDH secret, because the initiator hasn't derived the session key yet.

Key Derivation

ephemeral_secret = X25519(their_ephemeral_pub, my_ephemeral_prv)
session_key[32]  = HMAC-SHA256(static_shared_secret, ephemeral_secret)

Uses existing ed25519_key_exchange() (X25519 Montgomery ladder) from lib/ed25519. No new dependencies.

Who Initiates

  • Companion ↔ Repeater/Room/Sensor: Companion initiates, server responds
  • Companion ↔ Companion: Either side can initiate, both can respond

Repeaters, room servers, and sensors only implement the responder role — they never initiate session key negotiation.

Automatic Triggers

Session key negotiation is triggered automatically based on message count. The trigger check runs inside getEncryptionNonceFor() — the single funnel all encrypted sends pass through — so no send path can silently skip it. Negotiation is deferred to the next loop() tick to avoid re-entrancy.

| Hop count | Current key | Trigger | Retry after failure |
|---|---|---|---|
| 0 (direct) | Static ECDH | Every 100 msgs | 100 msgs |
| 0 (direct) | Session key | nonce > 60000, then every 100 msgs | 100 msgs |
| 1–9 | Static ECDH | Every 500 msgs | 500 msgs |
| 1–9 | Session key | nonce > 60000, then every 300 msgs | 300 msgs |
| 10+ | Static ECDH | Every 1000 msgs | 1000 msgs |
| 10+ | Session key | nonce > 60000, then every 300 msgs | 300 msgs |

3 INIT attempts per negotiation (3-minute timeout each).
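
One reading of the trigger table, as a sketch. The value 60000 for `NONCE_REKEY_THRESHOLD` is taken from the table; the `msgs_since_last_attempt` counter and both function names are hypothetical and only illustrate the tiering.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint16_t NONCE_REKEY_THRESHOLD = 60000;  // from the trigger table

// Messages between negotiation attempts for a given hop count and key type.
inline uint16_t rekeyInterval(uint8_t hops, bool has_session_key) {
  if (hops == 0) return 100;            // direct peers: cheap round trip
  if (has_session_key) return 300;      // multi-hop, session key in use
  return hops < 10 ? 500 : 1000;        // multi-hop, still on static ECDH
}

// True when a negotiation should be scheduled for the next loop() tick.
inline bool shouldStartRekey(uint8_t hops, bool has_session_key,
                             uint16_t nonce, uint16_t msgs_since_last_attempt) {
  // An existing session key is only replaced once its nonce budget runs low.
  if (has_session_key && nonce <= NONCE_REKEY_THRESHOLD) return false;
  return msgs_since_last_attempt >= rekeyInterval(hops, has_session_key);
}
```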

Nonce Lifecycle

  • New contacts: Static ECDH nonce seeded from RNG in range 1000–50000
  • Session key nonce: Starts at 1 on establishment, full 65535 budget per session
  • Nonce exhaustion: Fall back to static ECDH, keep retrying negotiation at tier intervals

Encryption Key Selection

All node types use paired getEncryptionKey() / getEncryptionNonce() functions that return the correct key and nonce based on current session state:

has_session_key && sends_since_last_recv < 50  → AEAD with session key
has_session_key && sends_since_last_recv >= 50 → AEAD with static ECDH (stale probe)
CONTACT_FLAG_AEAD && nonce OK                  → AEAD with static ECDH
CONTACT_FLAG_AEAD && nonce exhausted           → ECB (pending renegotiation)
else                                           → ECB (legacy peer)
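
The selection rules above can be sketched as a single function. The struct, enum, and function names are hypothetical; only the conditions and the threshold of 50 come from the rules listed.

```cpp
#include <cassert>
#include <cstdint>

enum class SendMode { AeadSessionKey, AeadStaticEcdh, Ecb };

// Hypothetical snapshot of the per-peer state the rules depend on.
struct PeerState {
  bool has_session_key;
  bool aead_capable;      // CONTACT_FLAG_AEAD is set
  bool static_nonce_ok;   // static ECDH nonce not exhausted
  uint8_t sends_since_last_recv;
};

inline SendMode selectSendMode(const PeerState& p) {
  if (p.has_session_key) {
    // Below 50 unanswered sends the session is trusted; beyond that, probe with static ECDH.
    return p.sends_since_last_recv < 50 ? SendMode::AeadSessionKey
                                        : SendMode::AeadStaticEcdh;
  }
  if (p.aead_capable) {
    return p.static_nonce_ok ? SendMode::AeadStaticEcdh : SendMode::Ecb;
  }
  return SendMode::Ecb;  // legacy peer
}
```

Returning the key choice and the nonce source from one place keeps the two from getting out of step, which is the point of the paired functions.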

Decode Order

has_session_key: session_key → prev_session_key (dual-decode) → static ECDH → ECB
CONTACT_FLAG_AEAD: static ECDH → ECB
else: ECB → static ECDH

Dual-Decode Window

When the responder accepts a session key INIT, it enters DUAL_DECODE state: the new session key is active for sending, but both old and new keys are accepted for decoding. Once the initiator sends a message encrypted with the new session key (message 3), the responder confirms the transition and drops the old key.

This makes ACCEPT packet loss safe — the responder stays in dual-decode, the initiator times out and retries, and no messages are lost.

Stale Session Detection

If a node sends 50 consecutive messages without receiving any session-key-encrypted reply, it falls back to static ECDH for sending (the peer may have lost the session key). At 100 unanswered sends, it falls back to ECB. At 255, it clears the AEAD capability flag and removes the session key entirely. The counter resets to 0 on any successful session-key-encrypted message from the peer.
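
The three tiers can be sketched as a mapping from the unanswered-send counter to an action. The enum and function names are hypothetical, and the saturating 8-bit counter is an assumption suggested by the 255 ceiling.

```cpp
#include <cassert>
#include <cstdint>

enum class StaleAction { UseSessionKey, UseStaticEcdh, UseEcb, DropSession };

// Map the count of consecutive unanswered sends to the fallback tier.
inline StaleAction staleAction(uint8_t unanswered) {
  if (unanswered == 255) return StaleAction::DropSession;   // clear flag, remove key
  if (unanswered >= 100) return StaleAction::UseEcb;
  if (unanswered >= 50) return StaleAction::UseStaticEcdh;
  return StaleAction::UseSessionKey;
}

// Count one more send, saturating at 255 (assumed 8-bit counter).
inline uint8_t noteSend(uint8_t unanswered) {
  return unanswered == 255 ? 255 : static_cast<uint8_t>(unanswered + 1);
}
```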

Session Key Persistence

Session keys use a two-tier storage model: a small RAM pool for active sessions and a larger flash-backed store for less recently used entries.

RAM pool: 8 slots (MAX_SESSION_KEYS_RAM), managed as an LRU cache. Each access touches a counter so the least-recently-used entry can be evicted when the pool is full. Entries in INIT_SENT state (ephemeral keys only) are never evicted — they must complete or time out.
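
A minimal sketch of that eviction rule, not the actual `SessionKeyPool` code. The `Slot` and `RamPool` types are hypothetical; only the pool size, the LRU counter, and the "never evict INIT_SENT" rule come from the description.

```cpp
#include <cassert>
#include <cstdint>

constexpr int MAX_SESSION_KEYS_RAM = 8;

enum class SlotState : uint8_t { Free, InitSent, Active };

struct Slot {
  SlotState state = SlotState::Free;
  uint32_t last_used = 0;  // value of the pool clock at the last access
};

struct RamPool {
  Slot slots[MAX_SESSION_KEYS_RAM];
  uint32_t clock = 0;

  // Mark a slot as most recently used.
  void touch(Slot& s) { s.last_used = ++clock; }

  // Returns a free slot if one exists, otherwise the least-recently-used
  // slot that is not mid-handshake, otherwise nullptr.
  Slot* allocate() {
    Slot* victim = nullptr;
    for (Slot& s : slots) {
      if (s.state == SlotState::Free) return &s;
      if (s.state == SlotState::InitSent) continue;  // must complete or time out
      if (!victim || s.last_used < victim->last_used) victim = &s;
    }
    return victim;
  }
};
```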

Flash store: Up to 48 entries (MAX_SESSION_KEYS_FLASH), persisted to /sess_keys (companion) or /s_sess_keys (server firmware).

Variable-length records: Entries without a previous session key (no dual-decode) use 39 bytes (SESSION_KEY_RECORD_MIN_SIZE); entries with a previous key use 71 bytes (SESSION_KEY_RECORD_SIZE). The SESSION_FLAG_PREV_VALID flag bit distinguishes the two.

Without prev_key: [pub_key_prefix:4] [flags:1] [nonce:2] [session_key:32]         = 39 bytes
With prev_key:    [pub_key_prefix:4] [flags:1] [nonce:2] [session_key:32] [prev_session_key:32] = 71 bytes
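
The two record layouts can be sketched as one writer that appends the previous key only when the flag is set. The flag's bit position and the nonce byte order in the file are assumptions, and `SessionRecord` / `writeRecord` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr size_t SESSION_KEY_RECORD_MIN_SIZE = 39;
constexpr size_t SESSION_KEY_RECORD_SIZE = 71;
constexpr uint8_t SESSION_FLAG_PREV_VALID = 0x01;  // assumed bit position

struct SessionRecord {
  uint8_t pub_key_prefix[4];
  uint8_t flags;
  uint16_t nonce;
  uint8_t session_key[32];
  uint8_t prev_session_key[32];
};

// Serialises `r` into `out`, which must hold SESSION_KEY_RECORD_SIZE bytes.
// Returns the number of bytes written: 39 or 71.
inline size_t writeRecord(const SessionRecord& r, uint8_t* out) {
  assert(out != nullptr);
  size_t n = 0;
  std::memcpy(out + n, r.pub_key_prefix, 4); n += 4;
  out[n++] = r.flags;
  out[n++] = static_cast<uint8_t>(r.nonce >> 8);    // byte order is an assumption
  out[n++] = static_cast<uint8_t>(r.nonce & 0xFF);
  std::memcpy(out + n, r.session_key, 32); n += 32;
  if (r.flags & SESSION_FLAG_PREV_VALID) {
    std::memcpy(out + n, r.prev_session_key, 32); n += 32;
  }
  return n;
}
```

A reader can recover the record length from the flags byte at offset 4 before reading the rest, which is what makes mixed-length records safe to scan.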

On-demand flash lookup: When findSessionKey() misses the RAM pool, it reads the flash file to look for a matching entry. If found, the entry is loaded into RAM (evicting LRU if needed) and returned.

Merge-save strategy: When persisting, the code reads existing flash entries, filters out any that are already in the RAM pool or have been explicitly removed, then writes the merged result (RAM entries + surviving flash-only entries). This prevents flash from resurrecting deleted entries while preserving entries that were evicted from RAM.

Removed-entry tracking: When a session key is explicitly removed (e.g., invalidation after static ECDH fallback), its prefix is recorded in a small tracking array. The merge-save step skips these prefixes so the deleted entry doesn't reappear from stale flash data. The tracking array is cleared after each successful save.
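
The merge-save and removed-entry rules together can be sketched as below. This uses `std::vector` for brevity, which the firmware itself may well avoid; `Entry`, `Prefix`, and `mergeForSave` are hypothetical names, and the payload is reduced to a nonce to keep the example short.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Prefix = std::array<uint8_t, 4>;

struct Entry {
  Prefix prefix;
  uint16_t nonce;  // stand-in for the full record payload
};

// Returns the list to write back: all RAM entries, then flash-only entries
// that were neither superseded by RAM nor explicitly removed.
inline std::vector<Entry> mergeForSave(const std::vector<Entry>& ram,
                                       const std::vector<Entry>& flash,
                                       const std::vector<Prefix>& removed,
                                       size_t max_entries) {
  std::vector<Entry> out = ram;  // RAM copies are the freshest
  for (const Entry& f : flash) {
    if (out.size() >= max_entries) break;
    bool in_ram = std::any_of(ram.begin(), ram.end(),
                              [&](const Entry& r) { return r.prefix == f.prefix; });
    bool was_removed = std::find(removed.begin(), removed.end(), f.prefix) != removed.end();
    if (in_ram || was_removed) continue;
    out.push_back(f);
  }
  return out;
}
```

After a successful write the caller would clear `removed`, matching the "cleared after each successful save" rule.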

Nonce Persistence

Nonces are persisted to a dedicated file on flash (/nonces for companion radios, /s_nonces for server firmware).

Periodic saves: After every NONCE_PERSIST_INTERVAL (50) messages to a given peer, the nonce file is written. A dirty flag tracks whether any nonce has advanced since the last save.

Clean reboot: Software restarts and deep sleep wakes load the persisted nonces as-is. An onBeforeReboot() callback in CommonCLI flushes any dirty nonces before the restart.

Dirty reboot: Power-on, watchdog, and brownout resets are detected via wasDirtyReset() (platform-specific: esp_reset_reason() on ESP32, RESETREAS register on NRF52). After a dirty reset, all loaded nonces are bumped forward by NONCE_BOOT_BUMP (100), which is at least 2× the persist interval, guaranteeing that even the worst-case unpersisted nonce is safely skipped. Session key nonces also receive the boot bump; if the bump causes a wrap, the nonce is forced to 65535 to trigger renegotiation.
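
A sketch of the boot-bump rule for session key nonces, using the constants named above. `bumpSessionNonce` is a hypothetical helper; clamping to 65535 follows the "forced to 65535" rule for wraps.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint16_t NONCE_PERSIST_INTERVAL = 50;
constexpr uint16_t NONCE_BOOT_BUMP = 100;
static_assert(NONCE_BOOT_BUMP >= 2 * NONCE_PERSIST_INTERVAL,
              "bump must cover the worst-case unpersisted advance");

// Returns the nonce to resume from after loading `persisted` from flash.
inline uint16_t bumpSessionNonce(uint16_t persisted, bool dirty_reset) {
  if (!dirty_reset) return persisted;  // clean reboot: the value was flushed
  uint32_t bumped = static_cast<uint32_t>(persisted) + NONCE_BOOT_BUMP;
  // On overflow, clamp to the maximum so renegotiation is triggered.
  return bumped > 0xFFFF ? static_cast<uint16_t>(0xFFFF)
                         : static_cast<uint16_t>(bumped);
}
```

Since a save happens every 50 messages, at most 49 nonces can have been used without being written, so skipping 100 always lands past them.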

Format: Simple array of {pub_key_prefix[6], nonce[2]} entries, matched to in-memory contacts/clients on load.

Security Comparison

| Property | ECB + HMAC-2 (current) | AEAD-4 (new) | AEAD-4 + Session Key |
|---|---|---|---|
| Confidentiality | Identical blocks → identical ciphertext | Unique keystream per message | Same |
| Forgery resistance | 1/65K (~9 hours at LoRa rates) | 1/4.3B (~136 years) | Same |
| Key usage | 16 of 32 bytes (AES-128) | Full 32 bytes (ChaCha20-256) | Same |
| Addressing authentication | None | Payload type & address hashes via AAD | Same |
| MAC timing | memcmp (timing side-channel) | secure_compare (constant-time) | Same |
| Padding waste | 0-15 bytes per message | None | None |
| Perfect Forward Secrecy | No | No | Yes |
| Nonce reuse on reboot | N/A (no nonces) | Mitigated by persistence + boot bump | Same |

Scope

| Payload type | AEAD-4 decode | AEAD-4 send | Session keys | Notes |
|---|---|---|---|---|
| TXT_MSG, REQ, RESPONSE, PATH | Yes | Yes (if peer advertises AEAD) | Yes | Per-peer secret, no collision risk |
| ANON_REQ | Yes | No (no prior capability exchange) | No | Ephemeral ECDH secret |
| GRP_TXT, GRP_DATA | Yes | No (see group considerations) | No | Shared channel key |

All node types (companion radio, repeater, room server, sensor) support AEAD-4 decode, AEAD-4 send, and session key negotiation (companion initiates or responds; server firmware responds only).

Group Message Considerations

Group channels share a single key among all members. With a 2-byte nonce and multiple senders, cross-sender nonce collisions follow the birthday bound (~300 messages for 50% probability on an active channel). A collision leaks P1 ⊕ P2 for that specific message pair via crib-dragging, but:

  • No key recovery — per-message key derivation via HMAC-SHA256 is one-way
  • No cascade — each collision is isolated, doesn't affect other messages
  • Bounded threat model — the attacker must not have the channel PSK (if they do, they can already read everything)

This is mainly beneficial for public/hashtag channels where the PSK is already widely known and the ECB pattern leakage and weak MAC are a greater concern than the bounded nonce collision risk.
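
The birthday figure quoted above can be checked with the standard approximation n ≈ sqrt(2 · N · ln(1 / (1 − p))) for a nonce space of N values. This is a sketch of that formula, assuming nonces behave as if drawn uniformly at random across senders; the function name is illustrative.

```cpp
#include <cassert>
#include <cmath>

// Approximate number of messages before a nonce collision occurs with
// probability `p`, for a nonce of `nonce_bits` bits.
inline double messagesForCollision(int nonce_bits, double p) {
  assert(p > 0.0 && p < 1.0);
  double space = std::ldexp(1.0, nonce_bits);  // 2^nonce_bits
  return std::sqrt(2.0 * space * std::log(1.0 / (1.0 - p)));
}
```

For a 2-byte nonce this gives about 301 messages at 50%. For a 4-byte nonce it gives about 77,000 messages at 50%, and 65,536 messages corresponds to roughly a 39% collision probability.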

Potential future mitigations explored and deferred:

  • Per-sender derived keys (HMAC(channel_secret, sender_pub_key)) — eliminates cross-sender collisions but requires receivers to know all senders' public keys, changing the group security model from "know the PSK = full access" to "know the PSK + sender discovery = access." Ruled out as a usability regression.
  • Expanded nonce (4 bytes instead of 2) — pushes birthday bound to ~65,000 messages (~2 years at 100 msgs/day). Costs 2 extra bytes of airtime and creates a different wire format for groups vs peers.
  • Sender hash byte on wire — differentiates senders for key derivation at 1 byte cost, but leaks sender identity metadata (traffic correlation, identification via adverts) that is currently hidden inside the encrypted payload.

Decode Order

Adaptive per-peer: for peers with CONTACT_FLAG_AEAD set, try AEAD-4 first then ECB fallback. For unknown/legacy peers, try ECB first then AEAD-4 fallback. When a session key exists, decode order is: session key → prev session key (dual-decode window) → static ECDH → ECB. This avoids the 1/65536 ECB false-positive rate on AEAD packets (nonce bytes matching truncated HMAC) for known AEAD peers, while minimizing wasted CPU for legacy peers.

Capability Advertisement

  • feat1 bit 0 (FEAT1_AEAD_SUPPORT) is set in adverts for all node types (chat, repeater, room, sensor)
  • Receivers record peer capability in ContactInfo.flags bit 1 (CONTACT_FLAG_AEAD)
  • Old nodes parse feat1 but ignore the value (forward-compatible via existing AdvertDataParser)

Files Changed

Core Library

  • src/MeshCore.h — AEAD constants, session key constants (SESSION_KEY_SIZE, REQ_TYPE_SESSION_KEY_INIT, RESP_TYPE_SESSION_KEY_ACCEPT, NONCE_REKEY_THRESHOLD, SESSION_KEY_* thresholds and limits), two-tier pool sizing (MAX_SESSION_KEYS_RAM=8, MAX_SESSION_KEYS_FLASH=48), variable-length record sizes (SESSION_KEY_RECORD_SIZE, SESSION_KEY_RECORD_MIN_SIZE), SESSION_FLAG_PREV_VALID
  • src/Utils.h / src/Utils.cpp — aeadEncrypt() and aeadDecrypt() using ChaChaPoly
  • src/Mesh.h — getPeerFlags(), getPeerNextAeadNonce(), getPeerSessionKey(), getPeerPrevSessionKey(), onSessionKeyDecryptSuccess(), getPeerEncryptionKey(), getPeerEncryptionNonce() virtuals; aead_nonce param on createDatagram/createPathReturn
  • src/Mesh.cpp — AEAD send path in createDatagram/createPathReturn; session key → prev session key → static ECDH → ECB adaptive decode order
  • src/helpers/ContactInfo.h — uint16_t aead_nonce field, nextAeadNonce() helper
  • src/helpers/SessionKeyPool.h — SessionKeyEntry struct and SessionKeyPool class (LRU-managed RAM pool with last_used tracking, eviction that skips INIT_SENT entries, removed-entry tracking for merge-save safety)

Companion Radio (BaseChatMesh)

  • src/helpers/BaseChatMesh.h / BaseChatMesh.cpp — Advertise AEAD, track peer capability, AEAD send for all peer message types, nonce persistence, session key negotiation (both initiator and responder roles), encryption key/nonce funnel (getEncryptionKeyFor/getEncryptionNonceFor), deferred rekey trigger via _pending_rekey_idx

Server-Side (ClientACL + examples)

  • src/helpers/ClientACL.h / ClientACL.cpp — Server-side AEAD nonce tracking and persistence, session key responder (handleSessionKeyInit), paired encryption key/nonce selection (getEncryptionKey/getEncryptionNonce), flash-backed session key wrappers with merge-save, peer-index forwarding helpers
  • src/helpers/CommonCLI.h / CommonCLI.cpp — Advertise AEAD for repeaters/rooms/sensors; onBeforeReboot() callback for nonce/session key flush
  • examples/simple_repeater/MyMesh.h / MyMesh.cpp — AEAD + session key support, nonce persistence, session key INIT handling in onPeerDataRecv
  • examples/simple_room_server/MyMesh.h / MyMesh.cpp — Same
  • examples/simple_sensor/SensorMesh.h / SensorMesh.cpp — Same

Platform Support

  • src/helpers/ArduinoHelpers.h — wasDirtyReset() helper (ESP32/NRF52 reset reason detection)
  • examples/companion_radio/DataStore.h / DataStore.cpp — Nonce and session key file I/O, variable-length session key records, merge-save with flash-backed lookup (loadSessionKeyByPrefix)
  • examples/companion_radio/MyMesh.h / MyMesh.cpp — Wire up nonce/session key persistence and reboot callback, flash-backed session key overrides (loadSessionKeyRecordFromFlash, mergeAndSaveSessionKeys)

Build Verification

  • ESP32 (Heltec_v3_companion_radio_ble): builds successfully
  • ESP32 (Heltec_v3_repeater): builds successfully
  • ESP32 (Heltec_v3_room_server): builds successfully
  • NRF52 (Xiao_nrf52_companion_radio_ble): builds successfully

Future Work

  • Group messages: send AEAD-4 (all updated nodes can already decode it)
  • ANON_REQ: remain ECB (no prior capability exchange possible)
  • rekey <peer> CLI command for manual session key renegotiation

@weebl2000 weebl2000 changed the base branch from main to dev February 12, 2026 00:08
@weebl2000 weebl2000 force-pushed the feature/aead-4-encryption branch from 06320d0 to 7f3da6a Compare February 12, 2026 00:19
Add ChaCha20-Poly1305 AEAD decryption with 4-byte auth tag for peer
messages and group channels, falling back to ECB for backward
compatibility. Sending remains ECB-only in this phase.

- Per-message key derivation: HMAC-SHA256(secret, nonce||dest||src)
- Direction-dependent keys prevent bidirectional keystream reuse
- 12-byte IV from nonce + dest_hash + src_hash
- Advertise AEAD capability via feat1 bit 0 in adverts
- Track peer AEAD support in ContactInfo.flags
- Seed aead_nonce from HW RNG on contact creation and load
@weebl2000 weebl2000 force-pushed the feature/aead-4-encryption branch from 7f3da6a to 26bdb41 Compare February 12, 2026 00:20
Send ChaChaPoly-encrypted messages to peers with CONTACT_FLAG_AEAD set,
and try AEAD decode first for those peers (avoiding 1/65536 ECB
false-positive). Legacy peers continue to use ECB in both directions.

- Add aead_nonce parameter to createDatagram/createPathReturn (default 0 = ECB)
- Add getPeerFlags/getPeerNextAeadNonce virtual methods for decode-order selection
- Add ContactInfo::nextAeadNonce() helper (returns nonce++ if AEAD, 0 otherwise)
- Update all BaseChatMesh send paths to pass nonce for AEAD-capable peers
- Adaptive decode order: AEAD-first for known AEAD peers, ECB-first for others
@weebl2000 weebl2000 force-pushed the feature/aead-4-encryption branch from eee6fd5 to 6526793 Compare February 12, 2026 01:04
The header's route type bits (PH_ROUTE_MASK) are zero when
createDatagram/createPathReturn encrypt with AEAD, but get changed to
ROUTE_TYPE_FLOOD (1) or ROUTE_TYPE_DIRECT (2) by sendFlood/sendDirect
afterwards. The receiver builds assoc from the received header (with
route bits set), so the tag check always fails and every AEAD packet
is silently dropped.

Mask out route type bits in assoc data on all 5 encrypt/decrypt sites.
Also track AEAD decode success to enable peer capability auto-detection.
@weebl2000 weebl2000 force-pushed the feature/aead-4-encryption branch from 881d18d to 7637e64 Compare February 12, 2026 01:19
@jimdigriz

jimdigriz commented Feb 12, 2026

> Per-message key derivation (eliminates nonce-reuse catastrophe):
>
> msg_key[32] = HMAC-SHA256(shared_secret, nonce || dest_hash || src_hash)

I do not understand how this prevents nonce re-use. After 65k messages from A->B the nonce looks like it will be reused.

I do not understand why concatenation with src/dst would change this.

The concatenation means you are partitioning the nonce value per (uni-directional) flow, in effect running different counters for A->B, B->A and C->A. Right?

> Nonce management: 16-bit counter per peer, seeded from hardware RNG on boot and on contact load. Not persisted to flash — always fresh on each boot cycle.

What happens for devices without access to a good early boot entropy source?

What if two different reboots generate the same nonce?

What happens for A->B if:

  • reboot initialises nonce=20
  • 3 messages are sent from A->B
  • reboot initialises nonce=15
  • 10 messages are sent from A->B

What does this method improve over a plain incremental counter?

Why not persist the nonce once every 100 messages, and on reboot increment by 200 (rounded down to nearest 100)? When the nonce wraps, regenerate the key.

@weebl2000
Contributor Author

Yeah, it doesn't stop nonce re-use. I think in the end we might need more bytes for nonces.

@jimdigriz

jimdigriz commented Feb 12, 2026

> But in the end maybe we need more bytes for nonces.

You do not, you can also change the key.

Just negotiate a dedicated key for this. It is a lot easier to understand and make safe.

It would require a round trip but then only need to be done every 65k messages; you could then also share that key for both directions (ie. A->B and B->A).

Then when nonce=0 negotiate a new key, which allows you to pick if you want to persist the nonce or reset to zero on boot.

@weebl2000
Contributor Author

weebl2000 commented Feb 12, 2026

> It would require a round trip but then only need to be done every 65k messages; you could then also share that key for both directions (ie. A->B and B->A).

Might be a good option. But the protocol will become a bit more complex and brittle. Then again, we can always fallback to ECB if nothing was negotiated.

@jcjones jcjones left a comment

Not a casual review, but I like the design, and the directionality of the KDF. Good doc comments, too.

@weebl2000
Contributor Author

weebl2000 commented Feb 12, 2026

Thanks for all the comments so far. I will look into them. Just tested this branch with a Heltec v4 repeater and Heltec v4 companion client, and I can confirm communicating between them works using AEAD-4.

It's a request for status from the repeater and the repeater response is understood correctly by the client.

AEAD-4 Packet Decode Verification

Wire Format

[header:1] [path_len:1] [path:N] [dest_hash:1] [src_hash:1] [nonce:2] [ciphertext:M] [tag:4]

Sent Packet — REQ (23 bytes)

Raw: 0200DD130E659F0B0C02D86AC2508DF6B7B3B671F6638A

| Field | Hex | Value |
|---|---|---|
| Header | 02 | Route=DIRECT(2), Type=REQ(0), Ver=0 |
| Path length | 00 | 0 (no path) |
| dest_hash | DD | Destination peer |
| src_hash | 13 | Source |
| AEAD nonce | 0E 65 | 3685 |
| Ciphertext | 9F 0B 0C 02 D8 6A C2 50 8D F6 B7 B3 B6 | 13 bytes plaintext |
| Tag | 71 F6 63 8A | Poly1305 (truncated to 4 bytes) |

Format confirmed AEAD-4: the 17 bytes after the 2-byte nonce field (13 ciphertext + 4 tag) are not a multiple of 16, ruling out legacy ECB.
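
The field split in the table above can be reproduced with a small parser. This is a sketch that assumes the path length byte is a plain byte count (it is 0 in both captures) and that the nonce is big-endian, as the decoded value 3685 implies. `Aead4View` and `parseAead4` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Non-owning view of the fields in an AEAD-4 peer datagram.
struct Aead4View {
  uint8_t header, path_len, dest_hash, src_hash;
  uint16_t nonce;
  const uint8_t* ciphertext;
  size_t ciphertext_len;
  const uint8_t* tag;  // 4 bytes
};

// Splits [header][path_len][path][dest][src][nonce:2][ciphertext][tag:4].
// Returns false if the packet is too short to hold the fixed fields.
inline bool parseAead4(const uint8_t* pkt, size_t len, Aead4View& out) {
  if (pkt == nullptr || len < 2) return false;
  out.header = pkt[0];
  out.path_len = pkt[1];
  size_t i = 2 + static_cast<size_t>(out.path_len);  // skip the hop path
  if (len < i + 2 + 2 + 4) return false;              // hashes + nonce + tag
  out.dest_hash = pkt[i++];
  out.src_hash = pkt[i++];
  out.nonce = static_cast<uint16_t>((pkt[i] << 8) | pkt[i + 1]);
  i += 2;
  out.ciphertext = pkt + i;
  out.ciphertext_len = len - i - 4;
  out.tag = pkt + len - 4;
  return true;
}
```

This only separates the fields; it does not verify the tag or decrypt anything.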

Received Packet — RESPONSE (70 bytes)

Raw: 060013DD830B84757DB841545969BA39A62BDD0D6AD9E2CD70B25208219F964F51E8AFB0E800130BBAFC23C9C0712B7E28CE72DE17508E30A3359222A2A7DD4B2375E5AE33AC

| Field | Hex | Value |
|---|---|---|
| Header | 06 | Route=DIRECT(2), Type=RESPONSE(1), Ver=0 |
| Path length | 00 | 0 (no path) |
| dest_hash | 13 | Receiver |
| src_hash | DD | Responding peer |
| AEAD nonce | 83 0B | 33547 |
| Ciphertext | 84 75 ... 4B 23 75 | 60 bytes plaintext |
| Tag | E5 AE 33 AC | Poly1305 (truncated to 4 bytes) |

Note: legacy ECB is structurally possible here (the 64 bytes after the 2-byte nonce field are a multiple of 16), but context confirms AEAD-4.

Associated Data

Per the route-mask fix, assoc data masks out route type bits:

| Packet | assoc bytes | Derivation |
|---|---|---|
| REQ | {0x00, 0xDD, 0x13} | (0x02 & ~0x03)=0x00, dest, src |
| RESPONSE | {0x04, 0x13, 0xDD} | (0x06 & ~0x03)=0x04, dest, src |

Observations

  • Both packets use AEAD-4 wire format: [nonce:2] [ciphertext:N] [tag:4]
  • dest/src hashes (0xDD, 0x13) correctly swapped between REQ and RESPONSE
  • Both routed DIRECT with empty path (single hop, no relaying)
  • Nonce values (3685, 33547) are non-zero, consistent with independent per-peer counters seeded from HW RNG

- Fix potential unsigned overflow in createDatagram size check by
subtracting constants from MAX_PACKET_PAYLOAD instead of adding to
data_len
- Add upper-bound validation on src_len and assoc_len in aeadEncrypt and
aeadDecrypt
- Log peer name on AEAD nonce wraparound for debug builds
Prevent nonce reuse after reboots by persisting per-peer nonce counters
to a dedicated /nonces (companion) or /s_nonces (server) file. On dirty
reset (power-on, watchdog, brownout), nonces are bumped by NONCE_BOOT_BUMP
(100) to cover any unpersisted messages. Clean wakes (deep sleep, software
restart) load nonces as-is.

- Add nonce persistence to BaseChatMesh (companion) and ClientACL (server)
- Add wasDirtyReset() helper to ArduinoHelpers.h for platform-specific
  reset reason detection (ESP32/NRF52)
- Add onBeforeReboot() callback to CommonCLI for pre-reboot nonce flush
- Wire nonce persistence into all firmware variants: companion radio,
  repeater, room server, and sensor
- Only clear dirty flag on successful file write
@weebl2000 weebl2000 changed the title Add ChaChaPoly AEAD-4 decryption support (Phase 1) Add ChaChaPoly AEAD-4 encryption with nonce persistence Feb 13, 2026
@ignisf

ignisf commented Feb 14, 2026

@weebl2000 thank you for your contributions. Have you considered jumping straight to something proven like the double ratchet instead? Used in Signal.

@weebl2000
Contributor Author

> @weebl2000 thank you for your contributions. Have you considered jumping straight to something proven like the double ratchet instead? Used in Signal.

I don't think double ratchet is practical, we would need to send a new key every message and rely on strict ordering of messages. With LoRa packet limits and out-of-order delivery it would be a disaster.

I'm working on session key negotiation though. That will fix the nonce problem, but requires an exchange first.

@3dpgg 3dpgg left a comment

Thanks for putting together a PR to address AES-ECB! I had a cursory look, so not all files yet.

// No session key — standard AEAD-first decode for AEAD-capable peers
len = Utils::aeadDecrypt(secret, data, macAndData, macAndDataLen, assoc, 3, dest_hash, src_hash);
if (len > 0) decoded_aead = true;
else len = Utils::MACThenDecrypt(secret, data, macAndData, macAndDataLen);

If a peer indicates support for AEAD, and we fail to decrypt using AEAD, do we really need to fall back to trying ECB? I'm struggling to imagine why such a peer would support AEAD but intentionally want to use ECB, given its problems. In such a case, surely the peer would instead decline to set the AEAD flag in the first place.

Contributor Author

I put it in for cases where peers flash old firmware and no longer support newer encryption, but haven't advertised yet. In future this fallback might be removed.

if (len > 0) {
decoded_aead = true;
} else {
len = Utils::MACThenDecrypt(secret, data, macAndData, macAndDataLen);

In this scenario, we have a session key but it failed to decrypt, and the previous session key also failed to decode. And then we also failed to decode using AEAD on the long-term secret. And now here, we fall back to ECB.

But if we had a session key, presumably that means we were already talking to a peer that supported AEAD. Why would such an AEAD-capable peer be falling back to ECB? If we don't fall-back to ECB, do we lose anything?

I ask this from two perspectives: removing unnecessary computation for corrupted packets, and tightening up what inputs are permitted from AEAD_capable peers.

Contributor Author

Same as above, mainly useful during transition phase where clients flash older firmware.

src/Mesh.cpp Outdated

Can we remove the comment about "4 matches", since I think the max is actually 8. It confused me when trying to understand the context.

Contributor Author

It was already there in dev, but I can change it.

if (len > 0) decoded_aead = true;
else len = Utils::MACThenDecrypt(secret, data, macAndData, macAndDataLen);
} else {
// Legacy ECB-first decode

It would be ideal if there was a flag or overrideable method that controls whether this peer will ever use the legacy ECB-first decoding method. That way, peers can decide to refuse ECB outright.

Contributor Author

I think it'll be safer to add this later. Definitely want to have this tested in the field widely before allowing it to be disabled.

int len = Utils::MACThenDecrypt(secret, data, macAndData, pkt->payload_len - i);
int macAndDataLen = pkt->payload_len - i;

// Try ECB first (Phase 1), then AEAD-4 fallback.

For this phase 1, could it have been AEAD first and then ECB fallback? If there's a reason for ECB first, I don't immediately understand it yet. May be worth filling in a comment that the order does/doesn't matter.

Contributor Author

I was assuming in the beginning most clients won't support AEAD yet, so we try old encryption first. Can probably be reverse order when most clients support it.
