feat: first-run wizard + auto-detect inference — turn new users into inference buyers on Day 1 #307

@bussyjd

Description

Summary

Improve obol-stack onboarding by auto-detecting available inference endpoints and enabling new users to buy inference immediately — before they even set up their own node.

Inspired by hermes-agent PR NousResearch/hermes-agent#4194, which fixes first-run wizard triggering and adds model auto-detection for local endpoints.

Motivation

Current onboarding assumes the user wants to serve inference. But the fastest path to value is letting them buy inference first:

  1. User installs obol-stack
  2. Wizard detects no local GPU / no llama-server running
  3. Instead of stopping, it queries public ERC-8004 registry (via 8004scan.io API or BaseScan) for available inference ServiceOffers
  4. Shows a list: "These models are available on the network. Want to connect?"
  5. User picks one, pays via x402 per-request — they're using the network in 60 seconds
  6. Later, they add their own GPU and become a seller too

This is the DePIN flywheel: make it trivially easy to be a buyer first, then convert buyers into sellers.

Design

Phase 1: Local auto-detection (from hermes-agent #4194)

When obol setup runs:

  1. Probe local endpoints — scan localhost for common inference servers:

    • llama-server (port 8080)
    • ollama (port 11434)
    • vllm (port 8000)
    • litellm (port 4000)
    • Any custom endpoint the user provides
  2. Auto-detect models — if a server responds, hit its /v1/models endpoint:

    • One model found → auto-select, confirm with user
    • Multiple models → numbered list, user picks
    • No server → skip to Phase 2
  3. Don't silently borrow credentials — if the user has never configured obol-stack, don't assume existing services are intended for obol. Always ask.
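
The probe-then-detect flow above could look roughly like this. A minimal sketch under the assumptions in this issue — the ports come from the list above, but the function names (`ProbeLocal`, `ParseModels`) are illustrative, not a final API:

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"time"
)

// Default ports for the common local inference servers listed above.
var defaultEndpoints = map[string]int{
	"llama-server": 8080,
	"ollama":       11434,
	"vllm":         8000,
	"litellm":      4000,
}

// modelsResponse mirrors the OpenAI-compatible /v1/models payload.
type modelsResponse struct {
	Data []struct {
		ID string `json:"id"`
	} `json:"data"`
}

// ParseModels extracts model IDs from a /v1/models response body.
func ParseModels(body []byte) ([]string, error) {
	var resp modelsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	ids := make([]string, 0, len(resp.Data))
	for _, m := range resp.Data {
		ids = append(ids, m.ID)
	}
	return ids, nil
}

// ProbeLocal hits /v1/models on each known port with a short timeout
// and returns the servers that answered, with their model lists.
func ProbeLocal() map[string][]string {
	client := &http.Client{Timeout: 500 * time.Millisecond}
	found := map[string][]string{}
	for name, port := range defaultEndpoints {
		resp, err := client.Get(fmt.Sprintf("http://localhost:%d/v1/models", port))
		if err != nil {
			continue // nothing listening on this port
		}
		body, _ := io.ReadAll(resp.Body)
		resp.Body.Close()
		if models, err := ParseModels(body); err == nil {
			found[name] = models
		}
	}
	return found
}

func main() {
	fmt.Println(ProbeLocal())
}
```

The wizard then branches on the size of the result: one model → confirm, several → numbered list, empty map → Phase 2.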

Phase 2: Network discovery — become a buyer

If no local inference is found (or user wants more options):

  1. Query public ERC-8004 registry for registered inference ServiceOffers:

    • Primary: 8004scan.io API (if available)
    • Fallback: BaseScan API for ERC-8004 token metadata
    • Filter by OASF skill: natural_language_processing/natural_language_generation/text_completion
  2. Display available services with pricing:

    Available inference on the network:
    
    1. qwen3.5-27b-q6k @ node-xyz — $0.001/req  — 15/15 ToolCall-15
    2. llama-3-70b-q4  @ node-abc — $0.002/req  — 14/15 ToolCall-15
    3. mistral-24b     @ node-def — $0.0008/req — 13/15 ToolCall-15
    
    Connect to one? (payment via x402 USDC on Base)
    
  3. One-command connect:

    obol buy http --from node-xyz --model qwen3.5-27b

    This configures the local Hermes agent to route inference through the selected node, paying per-request via x402.
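
Concretely, the connect step could boil down to writing a small routing config for the Hermes agent. A sketch — every field name here is an assumption, not the actual Hermes schema:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// BuyConfig routes local inference calls through a remote seller,
// paying per request via x402.
type BuyConfig struct {
	SellerID    string  `json:"seller_id"`     // node address or ERC-8004 ID
	Model       string  `json:"model"`
	MaxPriceUSD float64 `json:"max_price_usd"` // per-request ceiling
	Payment     string  `json:"payment"`       // e.g. "x402-usdc-base"
}

// NewBuyConfig mirrors the flags of `obol buy http`.
func NewBuyConfig(from, model string, maxPrice float64) BuyConfig {
	return BuyConfig{
		SellerID:    from,
		Model:       model,
		MaxPriceUSD: maxPrice,
		Payment:     "x402-usdc-base",
	}
}

func main() {
	out, _ := json.MarshalIndent(NewBuyConfig("node-xyz", "qwen3.5-27b", 0.002), "", "  ")
	fmt.Println(string(out))
}
```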

Phase 3: Buyer-to-seller conversion

Once a user is buying inference and sees the economics:

  • "You've spent $4.20 on inference this week. Your RTX 3090 could earn $12/week serving Qwen3.5-27B."
  • obol sell http --upstream llama-server --price 0.001 — one command to flip from buyer to seller
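
The nudge itself is just formatting over two numbers. A sketch — the spend would come from x402 payment history, while the earnings estimate would need current network pricing, which is out of scope here:

```go
package main

import "fmt"

// FlipPrompt renders the buyer-to-seller nudge from Phase 3.
// spentUSD comes from observed x402 payments; estEarnUSD is an
// estimate from network pricing (not modeled here).
func FlipPrompt(spentUSD, estEarnUSD float64, gpu, model string) string {
	return fmt.Sprintf(
		"You've spent $%.2f on inference this week. Your %s could earn $%.0f/week serving %s.",
		spentUSD, gpu, estEarnUSD, model)
}

func main() {
	fmt.Println(FlipPrompt(4.20, 12, "RTX 3090", "Qwen3.5-27B"))
}
```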

Implementation Notes

First-run guard (from hermes-agent #4194)

The key insight from the hermes-agent PR: check whether obol-stack itself has been configured before looking at what's running on the machine. If the config is empty or default, trigger the wizard regardless of what services are detected.

// Don't skip setup just because llama-server is running
if !config.IsConfigured() {
    runSetupWizard()  // always ask on first run
}

ERC-8004 public discovery

For Phase 2, we need a lightweight client that can query registered ServiceOffers without running the full reth-indexer:

// discovery/public.go
type PublicDiscovery interface {
    ListServiceOffers(skill string) ([]ServiceOffer, error)
}

// Implementations:
// - 8004ScanClient (preferred, purpose-built API)
// - BaseScanClient (fallback, reads ERC-8004 NFT metadata)
// - RethIndexerClient (if running locally)

obol buy http (new command)

Mirror of obol sell http but for the buyer side:

# --max-price: won't pay more than this per request
# --configure-hermes: auto-configure Hermes agent to use this endpoint
obol buy http \
  --from <node-address-or-erc8004-id> \
  --model <model-name> \
  --max-price 0.002 \
  --configure-hermes

This generates the x402 payment headers for each request and proxies through to the seller's endpoint.

Dependency on existing PRs

Test Plan

  • First-run wizard triggers on fresh install even if llama-server is running
  • Local endpoint probe correctly identifies llama-server, ollama, vllm, litellm
  • Model auto-detection lists available models from /v1/models
  • Single model auto-selects with user confirmation
  • Public ERC-8004 query returns registered inference ServiceOffers
  • obol buy http configures x402 payment and proxies requests
  • Buyer-to-seller prompt triggers after N requests or $X spent
  • End-to-end: fresh install → buy inference → use via Hermes agent

Labels

component:onboarding component:inference component:erc8004 priority:high
