| 1 | +# A/B Testing Example |
| 2 | + |
| 3 | +A minimal hook server that demonstrates how to **A/B test and canary-deploy tool versions** by routing tool calls to different servers. |
| 4 | + |
| 5 | +## What It Shows |
| 6 | + |
| 7 | +- **Pre-execution hook**: Route tool calls to different servers/versions based on experiment config |
| 8 | +- **Consistent hashing**: Same user always gets the same variant (sticky assignment) |
| 9 | +- **Weighted traffic splitting**: Control what percentage of traffic goes to each variant |
| 10 | +- **Tool registry integration**: Fetch available tools from an external API (e.g., [Arcade](https://docs.arcade.dev/en/references/api)) |
| 11 | +- **Statistics tracking**: Monitor how many requests each variant receives |
| 12 | + |
| 13 | +## Quick Start |
| 14 | + |
| 15 | +```bash |
| 16 | +# Run with experiment config |
| 17 | +go run ./examples/ab_testing -config experiments.yaml |
| 18 | +``` |
| 19 | + |
| 20 | +## Config File Format |
| 21 | + |
| 22 | +```yaml |
| 23 | +# Optional: external tool registry for discovering tools |
| 24 | +registry_url: "https://api.example.com" |
| 25 | +registry_key: "your-api-key" |
| 26 | + |
| 27 | +experiments: |
| 28 | +  # Canary deployment: 10% traffic to new version |
| 29 | +  - name: "search-v2-canary" |
| 30 | +    enabled: true |
| 31 | +    toolkit: "Search" |
| 32 | +    tool: "WebSearch" |
| 33 | +    mode: canary |
| 34 | +    variants: |
| 35 | +      - name: "stable" |
| 36 | +        weight: 90 |
| 37 | +        version: "1.0.0" |
| 38 | +      - name: "canary" |
| 39 | +        weight: 10 |
| 40 | +        version: "2.0.0" |
| 41 | +        server_name: "search-v2" |
| 42 | +        server_uri: "http://search-v2.internal:8080" |
| 43 | +        server_type: "arcade" |
| 44 | + |
| 45 | +  # 50/50 A/B test |
| 46 | +  - name: "email-provider-compare" |
| 47 | +    enabled: true |
| 48 | +    toolkit: "Email" |
| 49 | +    tool: "*" |
| 50 | +    mode: ab |
| 51 | +    variants: |
| 52 | +      - name: "provider-a" |
| 53 | +        weight: 50 |
| 54 | +      - name: "provider-b" |
| 55 | +        weight: 50 |
| 56 | +        server_name: "email-alt" |
| 57 | +        server_uri: "http://email-alt.internal:8080" |
| 58 | +        server_type: "arcade" |
| 59 | +``` |
| 60 | + |
| 61 | +## How It Works |
| 62 | + |
| 63 | +### Variant Selection |
| 64 | +1. When a tool call matches an active experiment (by toolkit and tool patterns), a variant is selected |
| 65 | +2. Selection uses consistent hashing: `SHA256(user_id + ":" + experiment_name)` |
| 66 | +3. The hash is mapped to a variant based on configured weights |
| 67 | +4. The same user always gets the same variant for a given experiment |
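
The steps above can be sketched in Go roughly as follows. This is an illustrative sketch, not the example's actual source: the `Variant` type, the `pickVariant` function, and the choice to reduce the first 8 bytes of the digest modulo the total weight are all assumptions made for the illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// Variant mirrors the name and weight fields of a variant in the config.
// (Hypothetical type; the real server may model this differently.)
type Variant struct {
	Name   string
	Weight int
}

// pickVariant deterministically maps (userID, experiment) to a variant name.
// It returns "" when no variant has a positive weight.
func pickVariant(userID, experiment string, variants []Variant) string {
	total := 0
	for _, v := range variants {
		if v.Weight > 0 {
			total += v.Weight
		}
	}
	if total == 0 {
		return ""
	}

	// Hash the same key the README describes: user_id + ":" + experiment_name.
	sum := sha256.Sum256([]byte(userID + ":" + experiment))

	// Assumption: interpret the first 8 bytes as a big-endian integer and
	// reduce it to a bucket in [0, total).
	bucket := int(binary.BigEndian.Uint64(sum[:8]) % uint64(total))

	// Walk the cumulative weights until the bucket falls inside a variant.
	for _, v := range variants {
		if v.Weight <= 0 {
			continue
		}
		if bucket < v.Weight {
			return v.Name
		}
		bucket -= v.Weight
	}
	return "" // not reached when total > 0
}

func main() {
	variants := []Variant{{"stable", 90}, {"canary", 10}}
	counts := map[string]int{}
	for i := 1; i <= 1000; i++ {
		user := fmt.Sprintf("user-%d", i)
		counts[pickVariant(user, "search-v2-canary", variants)]++
	}
	// The split should land near 90/10, and rerunning gives identical counts.
	fmt.Println(counts)
}
```

Because the assignment depends only on the user ID, the experiment name, and the weights, it stays sticky across restarts without any stored state. Changing the weights can move some users to a different variant.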
| 68 | + |
| 69 | +### Server Routing |
| 70 | +- If the selected variant has a `server_uri`, the pre-hook overrides the server routing |
| 71 | +- This lets an experiment route calls to a different backend server or a different tool version |
| 72 | +- If no server override is specified, the tool executes normally (useful for tracking only) |
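
The routing rule above amounts to a single check on the selected variant. The sketch below shows one way to express it. The `VariantConfig` and `ServerOverride` types and the `overrideFor` function are hypothetical names, and the actual shape of the pre-hook response is not shown in this README, so treat the struct fields as placeholders.

```go
package main

import "fmt"

// VariantConfig holds the optional routing fields a variant may carry
// (server_name, server_uri, server_type in the config file).
type VariantConfig struct {
	Name       string
	ServerName string
	ServerURI  string
	ServerType string
}

// ServerOverride is a hypothetical in-memory form of the routing override;
// the real pre-hook response format may differ.
type ServerOverride struct {
	Name string
	URI  string
	Type string
}

// overrideFor returns the override for a variant, or nil when the variant has
// no server_uri and the tool should execute on its normal server.
func overrideFor(v VariantConfig) *ServerOverride {
	if v.ServerURI == "" {
		return nil
	}
	return &ServerOverride{Name: v.ServerName, URI: v.ServerURI, Type: v.ServerType}
}

func main() {
	stable := VariantConfig{Name: "stable"}
	canary := VariantConfig{
		Name:       "canary",
		ServerName: "search-v2",
		ServerURI:  "http://search-v2.internal:8080",
		ServerType: "arcade",
	}
	for _, v := range []VariantConfig{stable, canary} {
		if o := overrideFor(v); o != nil {
			fmt.Printf("%s: route to %s (%s)\n", v.Name, o.URI, o.Type)
		} else {
			fmt.Printf("%s: no override, track only\n", v.Name)
		}
	}
}
```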
| 73 | + |
| 74 | +### Statistics |
| 75 | +- GET `/stats` returns per-experiment, per-variant request counts |
| 76 | +- This shows the actual traffic distribution across variants |
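
Per-experiment, per-variant counters like these can be kept in a mutex-guarded nested map. The following is a minimal sketch under that assumption. The `Stats` type and its methods are hypothetical, and the JSON layout it emits is not necessarily the one the example's `/stats` endpoint returns.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// Stats counts requests per experiment and per variant.
type Stats struct {
	mu     sync.Mutex
	counts map[string]map[string]int
}

func NewStats() *Stats {
	return &Stats{counts: make(map[string]map[string]int)}
}

// Record increments the counter for one (experiment, variant) pair.
func (s *Stats) Record(experiment, variant string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.counts[experiment] == nil {
		s.counts[experiment] = make(map[string]int)
	}
	s.counts[experiment][variant]++
}

// Snapshot returns a deep copy so callers can read it without holding the lock.
func (s *Stats) Snapshot() map[string]map[string]int {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make(map[string]map[string]int, len(s.counts))
	for exp, variants := range s.counts {
		out[exp] = make(map[string]int, len(variants))
		for name, n := range variants {
			out[exp][name] = n
		}
	}
	return out
}

// ServeHTTP answers GET requests with the current counts as JSON.
func (s *Stats) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodGet {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(s.Snapshot())
}

func main() {
	stats := NewStats()
	stats.Record("search-v2-canary", "stable")
	stats.Record("search-v2-canary", "stable")
	stats.Record("search-v2-canary", "canary")

	// Exercise the handler in-process instead of opening a port.
	rec := httptest.NewRecorder()
	stats.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/stats", nil))
	fmt.Print(rec.Body.String())
}
```

In a real server the handler would be mounted with something like `http.Handle("/stats", stats)`, and `Record` would be called from the pre-hook after a variant is chosen.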
| 77 | + |
| 78 | +## Testing |
| 79 | + |
| 80 | +```bash |
| 81 | +# Start with example config |
| 82 | +go run ./examples/ab_testing -config experiments.yaml & |
| 83 | + |
| 84 | +# Send pre-hook requests for different users |
| 85 | +for i in $(seq 1 20); do |
| 86 | +  curl -s -X POST http://localhost:8888/pre \ |
| 87 | +    -H "Content-Type: application/json" \ |
| 88 | +    -d "{ |
| 89 | +      \"execution_id\": \"exec-$i\", |
| 90 | +      \"tool\": {\"name\": \"WebSearch\", \"toolkit\": \"Search\", \"version\": \"1.0.0\"}, |
| 91 | +      \"context\": {\"user_id\": \"user-$i\"}, |
| 92 | +      \"inputs\": {\"query\": \"test\"} |
| 93 | +    }" | python3 -m json.tool |
| 94 | +  echo |
| 95 | +done |
| 96 | + |
| 97 | +# Check statistics |
| 98 | +curl -s http://localhost:8888/stats | python3 -m json.tool |
| 99 | +``` |
| 100 | + |
| 101 | +## Tool Registry Integration |
| 102 | + |
| 103 | +The server can fetch available tools from an external tool registry API: |
| 104 | + |
| 105 | +```bash |
| 106 | +# Configure registry URL in config, then fetch |
| 107 | +curl -s -X POST http://localhost:8888/registry/fetch | python3 -m json.tool |
| 108 | +``` |
| 109 | + |
| 110 | +This is useful for discovering what tools and versions are available before setting up experiments. |