{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreicsv2oj2pk6zy2zl37zxfx5rwymgdc2bd2ukdmsv3qkupcwnmjjpu",
"uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mppumcuu6tz2"
},
"coverImage": {
"$type": "blob",
"ref": {
"$link": "bafkreia2fhaogjsmdgqrbljrj7xovomv47sca2sh3yilnig7sksg35vfl4"
},
"mimeType": "image/webp",
"size": 72158
},
"path": "/anshul_basia/i-added-semantic-search-to-my-react-app-without-a-backend-and-its-under-1ms-4pd3",
"publishedAt": "2026-07-03T04:54:36.000Z",
"site": "https://dev.to",
"tags": [
"javascript",
"react",
"webdev",
"showdev",
"altor-vec",
"web worker guide",
"altorlab.dev/getting-started",
"altorlab.dev/api",
"altorlab.dev/guides/react/document-search",
"altorlab.dev/vs/fuse-js",
"altorlab.dev/migrate-from/algolia",
"github.com/altor-lab/altor-vec",
"@huggingface"
],
"textContent": "How I built browser-native semantic vector search using WASM — no server, no API keys, no per-query cost. Full code walkthrough with React.\n\nMy documentation site had search that sent every user query to Algolia. Fine for open source (free tier), annoying for a paid product where $1/1K searches compounds faster than you expect.\n\nFuse.js was the obvious alternative — runs in the browser, zero config. But Fuse.js is fuzzy _text_ matching. A user typing \"cancel my plan\" will never find a doc titled \"end your subscription.\" That's the gap I wanted to close.\n\nWhat I wanted: search that understands _meaning_ , runs entirely in the browser, and has zero per-query cost.\n\nSo I built altor-vec — HNSW vector search compiled to 54KB of WASM. No server. No API keys. The index is a static file on your CDN.\n\n## Fuse.js vs semantic search — the actual difference\n\nFuse.js asks: \"does this string look like that string?\" (Bitap / Levenshtein distance)\n\naltor-vec asks: \"does this _meaning_ resemble that meaning?\" (HNSW + embeddings)\n\n`Query: \"how do I cancel\"\n\nFuse.js matches: docs containing the word \"cancel\"\naltor-vec matches: docs about cancellation, ending service,\nunsubscribing, account closure, billing stop`\n\nThe cost of semantic search: you need embeddings — a one-time build step. If you want typo tolerance over a short autocomplete list, Fuse.js is the right call. 
If you want understanding, keep reading.\n\n## Step 1: Generate the index at build time\n\n\n // scripts/build-search-index.mjs\n import { pipeline } from '@huggingface/transformers';\n import { WasmSearchEngine } from 'altor-vec/node';\n import fs from 'fs';\n\n // Free embedding model, runs in Node.js — no API call needed\n const embed = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');\n\n const docs = [\n { id: 0, text: 'How to cancel your subscription and manage billing' },\n { id: 1, text: 'Account settings and profile preferences' },\n { id: 2, text: 'Getting started with the API and authentication' },\n { id: 3, text: 'Troubleshooting login and password reset' },\n // ...your actual content\n ];\n\n const vectors = [];\n for (const doc of docs) {\n const out = await embed(doc.text, { pooling: 'mean', normalize: true });\n vectors.push(...Array.from(out.data));\n }\n\n // Build HNSW index\n const engine = WasmSearchEngine.from_vectors(\n new Float32Array(vectors),\n 384, // dimensions (all-MiniLM-L6-v2 output size)\n 16, // M — connections per node\n 200, // ef_construction — build quality\n 50 // ef_search — query recall\n );\n\n // Serialize to a binary file\n fs.writeFileSync('./public/search-index.bin', Buffer.from(engine.serialize()));\n console.log(`Built index: ${docs.length} docs`);\n\n\nWire it into your build:\n\n\n\n {\n \"scripts\": {\n \"prebuild\": \"node scripts/build-search-index.mjs\",\n \"build\": \"vite build\"\n }\n }\n\n\nNow every deploy regenerates the index. 
For a docs site with a few hundred pages, this takes a few seconds.\n\n## Step 2: The React component\n\n\n import { useState, useEffect, useRef, useCallback } from 'react';\n import init, { WasmSearchEngine } from 'altor-vec';\n import { pipeline } from '@huggingface/transformers';\n\n // `docs` must be in the same order as in the build script (hits are resolved via docs[id]),\n // and each entry needs the `title` and `url` rendered below\n export function SearchWidget({ docs }) {\n const engineRef = useRef(null);\n const embedRef = useRef(null);\n const [results, setResults] = useState([]);\n const [loading, setLoading] = useState(true);\n const timerRef = useRef(null);\n\n useEffect(() => {\n async function setup() {\n await init(); // loads the 54KB WASM module\n const res = await fetch('/search-index.bin');\n engineRef.current = WasmSearchEngine.from_bytes(\n new Uint8Array(await res.arrayBuffer())\n );\n // Embedding model runs in-browser, cached after first load (~23MB)\n embedRef.current = await pipeline(\n 'feature-extraction',\n 'Xenova/all-MiniLM-L6-v2'\n );\n setLoading(false);\n }\n setup();\n }, []);\n\n const handleSearch = useCallback((query) => {\n // Debounce: embedding takes ~50ms, so don't fire on every keystroke\n clearTimeout(timerRef.current);\n timerRef.current = setTimeout(async () => {\n if (!query.trim() || !engineRef.current) return setResults([]);\n const out = await embedRef.current(query, { pooling: 'mean', normalize: true });\n const hits = JSON.parse(\n engineRef.current.search(new Float32Array(out.data), 5)\n );\n setResults(hits.map(([id, score]) => ({ ...docs[id], score })));\n }, 200);\n }, [docs]);\n\n if (loading) return <p>Loading search…</p>;\n\n return (\n <div>\n <input\n type=\"search\"\n placeholder=\"Search docs…\"\n onChange={e => handleSearch(e.target.value)}\n />\n <ul>\n {results.map(r => (\n <li key={r.id}>\n <a href={r.url}>{r.title}</a>\n </li>\n ))}\n </ul>\n </div>\n );\n }\n\n\nNo API routes. No environment variables. No billing dashboard.\n\n> **Production tip**: move the engine + embedding model into a Web Worker so search never blocks the main thread. 
See the web worker guide.\n\n## The numbers\n\n * **Query time**: <1ms p95 for the index lookup over 10K vectors (384 dimensions) in Chrome; embedding the query adds the ~50ms noted in Step 2\n * **WASM size**: 54KB gzipped, loads in ~100ms on a 4G connection\n * **Index size**: ~17MB for 10K documents (served from CDN, cached after first load)\n * **Embedding model**: ~23MB first load, then cached in browser storage\n * **Per-query cost**: $0\n\nThe first load is heavier than Fuse.js because you're downloading a model. If that's a dealbreaker and your queries come from a known set (suggested searches, \"related pages\" lookups), precompute those query embeddings at build time as well and skip the in-browser model entirely; then query time is just the WASM search.\n\n## When NOT to use this\n\nI want to be honest about the tradeoffs:\n\n * **Millions of documents** → the index file gets too big to serve efficiently. Use a server.\n * **Real-time index updates** → the index is rebuilt at deploy time. Not suitable for user-generated content that changes constantly.\n * **Private/sensitive content** → if documents are secret, you can't ship the index to every user's browser.\n * **Need Algolia-style faceting and merchandising** → altor-vec doesn't have that. Algolia is genuinely better for search-as-a-product.\n\nFor documentation sites, internal tools, marketing sites, personal projects, and anywhere the content is public and updates on deploys: this works very well.\n\n## Get started\n\n\n npm install altor-vec\n\n\n * **Getting started** (5-minute guide): altorlab.dev/getting-started\n * **API reference**: altorlab.dev/api\n * **React full guide** (with Web Worker, debounce, error handling): altorlab.dev/guides/react/document-search\n * **vs Fuse.js** (detailed comparison): altorlab.dev/vs/fuse-js\n * **Migrating from Algolia**: altorlab.dev/migrate-from/algolia\n * **GitHub**: github.com/altor-lab/altor-vec\n\nI've been running this on my own docs for a few months. 
First-load is heavier than Fuse.js, but after the model caches, search latency is genuinely sub-millisecond — you feel the difference.\n\nIf you try it and hit something broken, open a GitHub issue or drop a comment here. Happy to help debug.",
"title": "I added semantic search to my React app without a backend (and it's under 1ms)"
}