
feat(spartan): configurable HA validator replica count #22384

Merged
alexghr merged 4 commits into merge-train/spartan from claudebox/configurable-validator-ha-count
Apr 8, 2026

@AztecBot commented Apr 7, 2026

Summary

Makes validator and HA validator pod counts independently configurable. Adds two new optional variables:

  • VALIDATOR_PRIMARY_REPLICA_COUNT: Override pod count for the primary validator release (defaults to VALIDATOR_REPLICAS)
  • VALIDATOR_HA_REPLICA_COUNT: Override pod count for HA validator releases (defaults to VALIDATOR_REPLICAS)

VALIDATOR_REPLICAS remains the canonical "node slot count" used for key derivation and publisher key stride. The new variables only affect how many pods each release runs.
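The fallback rule can be sketched in shell with default parameter expansion. This is only an illustration of the semantics described above, not the code from deploy_network.sh or main.tf (the Terraform side uses coalesce(), per the Changes list); the helper name `resolve_replica_count` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the override if it is set and non-empty,
# otherwise print the default.
resolve_replica_count() {
  # $1 = override (may be empty), $2 = default
  echo "${1:-$2}"
}

VALIDATOR_REPLICAS=4
VALIDATOR_PRIMARY_REPLICA_COUNT=2
VALIDATOR_HA_REPLICA_COUNT=""   # empty: falls back to VALIDATOR_REPLICAS

primary_pods="$(resolve_replica_count "$VALIDATOR_PRIMARY_REPLICA_COUNT" "$VALIDATOR_REPLICAS")"
ha_pods="$(resolve_replica_count "$VALIDATOR_HA_REPLICA_COUNT" "$VALIDATOR_REPLICAS")"

echo "primary pods: $primary_pods"
echo "HA pods:      $ha_pods"
```

With the values above, the primary release takes its override while the HA release inherits VALIDATOR_REPLICAS.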

staging-public configuration

Sets staging-public to run 2 primary validators + 4 HA validators:

VALIDATOR_REPLICAS=4              # 4 node slots (256 attesters)
VALIDATOR_PRIMARY_REPLICA_COUNT=2 # 2 primary pods
VALIDATOR_HA_REPLICAS=1           # 1 HA release
VALIDATOR_HA_REPLICA_COUNT=4      # 4 HA pods

This means attester slots 0-1 are served by both primary and HA, while slots 2-3 are served only by HA nodes. When HA runs a different image (via VALIDATOR_HA_DOCKER_IMAGE), this forces mixed-version consensus.
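The slot coverage described here can be checked with a small loop. It is a sketch that assumes pod i of a release serves node slot i, which is how the description reads but is not confirmed by the diff; all variable names are illustrative.

```shell
#!/bin/sh
# Assumption: pod index i of a release serves node slot i.
slots=4          # VALIDATOR_REPLICAS
primary_pods=2   # VALIDATOR_PRIMARY_REPLICA_COUNT
ha_pods=4        # VALIDATOR_HA_REPLICA_COUNT

coverage=""
slot=0
while [ "$slot" -lt "$slots" ]; do
  served=""
  # A slot is covered by a release when a pod with that index exists.
  if [ "$slot" -lt "$primary_pods" ]; then served="primary"; fi
  if [ "$slot" -lt "$ha_pods" ]; then served="${served:+$served+}HA"; fi
  coverage="${coverage:+$coverage }${slot}:${served:-none}"
  echo "slot $slot: ${served:-none}"
  slot=$((slot + 1))
done
```

Slots 0 and 1 report both releases, and slots 2 and 3 report HA only, matching the staging-public layout.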

Changes

  • variables.tf: Added VALIDATOR_PRIMARY_REPLICA_COUNT and VALIDATOR_HA_REPLICA_COUNT (both number, default null)
  • main.tf: Moved validator.replicaCount from shared settings to per-release, using coalesce() to fall back to VALIDATOR_REPLICAS
  • deploy_network.sh: Passes both new variables through env → tfvars
  • staging-public.env: Set to 2 primary + 4 HA
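One way the env → tfvars passthrough could work is sketched below. The PR page does not show how deploy_network.sh actually does it, so this is an assumption: the hypothetical helper `tfvar_line` emits `null` for an unset variable so that a coalesce() fallback on the Terraform side would take effect.

```shell
#!/bin/sh
# Hypothetical helper: print one tfvars assignment, using null when the
# environment value is empty so Terraform can fall back to its default.
tfvar_line() {
  # $1 = variable name, $2 = value (may be empty)
  if [ -n "$2" ]; then
    printf '%s = %s\n' "$1" "$2"
  else
    printf '%s = null\n' "$1"
  fi
}

VALIDATOR_PRIMARY_REPLICA_COUNT=2
VALIDATOR_HA_REPLICA_COUNT=""   # not set in this example

tfvar_line VALIDATOR_PRIMARY_REPLICA_COUNT "$VALIDATOR_PRIMARY_REPLICA_COUNT"
tfvar_line VALIDATOR_HA_REPLICA_COUNT "$VALIDATOR_HA_REPLICA_COUNT"
```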

@AztecBot added labels ci-draft (Run CI on draft PRs) and claudebox (owned by claudebox; it can push to this PR) Apr 7, 2026
AztecBot added 3 commits April 7, 2026 20:42
…eration

VALIDATOR_REPLICAS is the default pod count per release. When
VALIDATOR_PRIMARY_REPLICA_COUNT or VALIDATOR_HA_REPLICA_COUNT override it,
the keystone (web3signer) and deploy script must generate attester keys
and addresses for max(primary, HA) nodes to cover all pods.

- Terraform: add max_validator_nodes local, use it for web3signer
  NODE_COUNT and publisher key index spacing
- deploy_network.sh: compute VALIDATOR_INDICES dynamically from
  max node count, fix TOTAL_VALIDATOR_PUBLISHERS to sum actual pod
  counts across releases
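The two quantities in this commit message can be sketched as follows, using the staging-public pod counts. The variable names are illustrative, and the number of publishers per pod is not stated in the PR, so the sketch stops at the total pod count.

```shell
#!/bin/sh
# Print the larger of two integers.
max() {
  if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi
}

primary_pods=2   # pods in the primary release
ha_releases=1    # VALIDATOR_HA_REPLICAS
ha_pods=4        # pods per HA release

# Keys and addresses must cover the largest release.
max_validator_nodes="$(max "$primary_pods" "$ha_pods")"

# Pods actually running across all releases.
total_pods=$((primary_pods + ha_releases * ha_pods))

echo "max_validator_nodes=$max_validator_nodes"
echo "total_pods=$total_pods"
```

Sizing key generation by the maximum rather than by VALIDATOR_REPLICAS is what lets a release with more pods than the default still find a key for every pod.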
…CAS as primary count

VALIDATOR_REPLICAS already serves as the primary pod count — no need for
a separate override. Now only two variables control pod counts:
- VALIDATOR_REPLICAS: primary release pod count (was already the default)
- VALIDATOR_HA_REPLICA_COUNT: HA release pod count (defaults to VALIDATOR_REPLICAS)

staging-public: VALIDATOR_REPLICAS=2, VALIDATOR_HA_REPLICA_COUNT=4
@alexghr marked this pull request as ready for review April 8, 2026 08:49
@alexghr merged commit a5f9e38 into merge-train/spartan Apr 8, 2026
15 checks passed
@alexghr deleted the claudebox/configurable-validator-ha-count branch April 8, 2026 08:49