A lightweight, browser-based semantic search engine that runs entirely client-side without sending data to external servers.
- 100% Client-Side Processing - All computation happens in your browser
- Real-Time Semantic Search - Results update as you type
- No External API Dependencies - Works offline after initial model load
- Instant Document Management - Add and delete documents with automatic indexing
- Query Highlighting - Search terms are highlighted in results
- Relevance Scoring - Visual indication of match quality
Browser Vector Search uses transformer-based embeddings to perform semantic search without requiring a server:
- Embedding Generation - Converts text to 512-dimensional vectors using Jina Embeddings
- Vector Normalization - Prepares vectors for similarity comparison
- Cosine Similarity - Measures semantic relevance between queries and documents
- IndexedDB Storage - Persists documents and embeddings between sessions
- Debounced Search - Optimizes performance as you type
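The normalization and cosine-similarity steps above can be sketched in a few lines of plain JavaScript. This is a minimal illustration, not the project's actual code: the helper names (`normalize`, `dot`, `rankDocuments`) and the `{ text, embedding }` record shape are assumptions, and toy 3-dimensional vectors stand in for the real 512-dimensional embeddings.

```javascript
// Scale a vector to unit length (L2 norm of 1). Returns a new array.
function normalize(vec) {
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  // Leave an all-zero vector unchanged to avoid dividing by zero.
  return norm === 0 ? vec.slice() : vec.map((x) => x / norm);
}

// For unit-length vectors, cosine similarity reduces to a plain dot product.
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Rank stored documents against a query embedding, best match first.
// `docs` is a hypothetical array of { text, embedding } records whose
// embeddings were normalized when they were indexed.
function rankDocuments(queryEmbedding, docs, topK = 5) {
  const q = normalize(queryEmbedding);
  return docs
    .map((doc) => ({ text: doc.text, score: dot(q, doc.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Toy vectors in place of real model output.
const docs = [
  { text: 'cats', embedding: normalize([1, 0, 0]) },
  { text: 'dogs', embedding: normalize([1, 1, 0]) },
  { text: 'cars', embedding: normalize([0, 0, 1]) },
];
console.log(rankDocuments([2, 0, 0], docs, 2));
```

Normalizing once at indexing time means each query only costs one dot product per stored document, which is the reason the vectors are normalized before comparison.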
- @xenova/transformers - For efficient model inference and tokenization
- IndexedDB - Browser-based document storage
- Cloudflare Workers - Serves static assets and model files
- Vanilla JavaScript - No framework dependencies
- Node.js (for development)
- Wrangler CLI (for Cloudflare deployment)
- Clone the repository:
  git clone https://github.com/vakharwalad23/browser-vecsearch.git
  cd browser-vecsearch
- Install dependencies:
  npm install
- Start the development server:
  npx wrangler dev
- Deploy to Cloudflare:
  npx wrangler deploy
- Add Documents:
  - Enter text in the document input field
  - Click "Add Document" to index content
- Search:
  - Type in the search box
  - Results update automatically as you type
  - View relevance scores for each match
- Manage Documents:
  - Delete documents as needed
  - Add new documents at any time
Change the model in the pipeline call in public/index.html:
const extractorPipeline = await pipeline('feature-extraction', 'Xenova/your-preferred-model', {
  quantized: true,
});
The core similarity function can be adjusted in public/index.html:
function cosineSimilarity(a, b) {
  // You can use the built-in function or customize
  return cos_sim(a, b);
}
- Model: Jina Embeddings v2 Small (via Xenova/transformers.js)
- Vector Size: 512 dimensions
- Search Algorithm: Cosine similarity with normalized vectors
- Performance Optimizations:
  - Debounced search (300ms)
  - Batched document processing
  - Normalized vectors for faster comparison
  - Non-blocking UI with progress indicators
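As a rough illustration of the debouncing and batching listed above, the sketch below shows one common way to write each in vanilla JavaScript. The names `debounce`, `toBatches`, and `runSearch` are hypothetical and chosen for this example; the project's own implementation may differ.

```javascript
// Delay calls to `fn` until `waitMs` milliseconds pass without a new call.
function debounce(fn, waitMs = 300) {
  let timer = null;
  return (...args) => {
    clearTimeout(timer); // cancel the pending call, if any
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Split `items` into consecutive batches of at most `size` elements.
function toBatches(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// `runSearch` is a hypothetical stand-in for the app's search routine.
const runSearch = (query) => console.log(`searching for: ${query}`);
const debouncedSearch = debounce(runSearch, 300);

// A burst of keystrokes: only the last call reaches runSearch.
debouncedSearch('sem');
debouncedSearch('semantic');

// Five documents processed two at a time.
console.log(toBatches(['a', 'b', 'c', 'd', 'e'], 2));
```

Debouncing avoids computing a query embedding on every keystroke, and processing documents in small batches gives the browser a chance to repaint between batches.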
All processing happens locally in the browser. Your documents and search queries never leave your device; network access is needed only for the initial model download.
License: MIT
- Hugging Face Transformers
- ONNX Runtime Web
- Xenova Transformers.js
- Jina AI
- Cloudflare Workers
Created by Dhruv Vakharwala • https://www.dhruvvakharwala.dev
