Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreicsv2oj2pk6zy2zl37zxfx5rwymgdc2bd2ukdmsv3qkupcwnmjjpu",
    "uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mppumcuu6tz2"
  },
  "coverImage": {
    "$type": "blob",
    "ref": {
      "$link": "bafkreia2fhaogjsmdgqrbljrj7xovomv47sca2sh3yilnig7sksg35vfl4"
    },
    "mimeType": "image/webp",
    "size": 72158
  },
  "path": "/anshul_basia/i-added-semantic-search-to-my-react-app-without-a-backend-and-its-under-1ms-4pd3",
  "publishedAt": "2026-07-03T04:54:36.000Z",
  "site": "https://dev.to",
  "tags": [
    "javascript",
    "react",
    "webdev",
    "showdev",
    "altor-vec",
    "web worker guide",
    "altorlab.dev/getting-started",
    "altorlab.dev/api",
    "altorlab.dev/guides/react/document-search",
    "altorlab.dev/vs/fuse-js",
    "altorlab.dev/migrate-from/algolia",
    "github.com/altor-lab/altor-vec",
    "@huggingface"
  ],
  "textContent": "How I built browser-native semantic vector search using WASM — no server, no API keys, no per-query cost. Full code walkthrough with React.\n\nMy documentation site had search that sent every user query to Algolia. Fine for open source (free tier), annoying for a paid product where $1/1K searches compounds faster than you expect.\n\nFuse.js was the obvious alternative — runs in the browser, zero config. But Fuse.js is fuzzy _text_ matching. A user typing \"cancel my plan\" will never find a doc titled \"end your subscription.\" That's the gap I wanted to close.\n\nWhat I wanted: search that understands _meaning_, runs entirely in the browser, and has zero per-query cost.\n\nSo I built altor-vec — HNSW vector search compiled to 54KB of WASM. No server. No API keys. The index is a static file on your CDN.\n\n##  Fuse.js vs semantic search — the actual difference\n\nFuse.js asks: \"does this string look like that string?\" (Bitap / Levenshtein distance)\n\naltor-vec asks: \"does this _meaning_ resemble that meaning?\" (HNSW + embeddings)\n\n`Query: \"how do I cancel\"\n\nFuse.js matches: docs containing the word \"cancel\"\naltor-vec matches: docs about cancellation, ending service,\nunsubscribing, account closure, billing stop`\n\nThe cost of semantic search: you need embeddings — a one-time build step. If you want typo tolerance over a short autocomplete list, Fuse.js is the right call. 
If you want understanding, keep reading.\n\n##  Step 1: Generate the index at build time\n\n\n    // scripts/build-search-index.mjs\n    import { pipeline } from '@huggingface/transformers';\n    import { WasmSearchEngine } from 'altor-vec/node';\n    import fs from 'fs';\n\n    // Free embedding model, runs in Node.js — no API call needed\n    const embed = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');\n\n    const docs = [\n      { id: 0, text: 'How to cancel your subscription and manage billing' },\n      { id: 1, text: 'Account settings and profile preferences' },\n      { id: 2, text: 'Getting started with the API and authentication' },\n      { id: 3, text: 'Troubleshooting login and password reset' },\n      // ...your actual content\n    ];\n\n    const vectors = [];\n    for (const doc of docs) {\n      const out = await embed(doc.text, { pooling: 'mean', normalize: true });\n      vectors.push(...Array.from(out.data));\n    }\n\n    // Build HNSW index\n    const engine = WasmSearchEngine.from_vectors(\n      new Float32Array(vectors),\n      384,   // dimensions (all-MiniLM-L6-v2 output size)\n      16,    // M — connections per node\n      200,   // ef_construction — build quality\n      50     // ef_search — query recall\n    );\n\n    // Serialize to a binary file\n    fs.writeFileSync('./public/search-index.bin', Buffer.from(engine.serialize()));\n    console.log(`Built index: ${docs.length} docs`);\n\n\nWire it into your build:\n\n\n\n    {\n      \"scripts\": {\n        \"prebuild\": \"node scripts/build-search-index.mjs\",\n        \"build\": \"vite build\"\n      }\n    }\n\n\nNow every deploy regenerates the index. 
For a docs site with a few hundred pages, this takes a few seconds.\n\n##  Step 2: The React component\n\n\n    import { useState, useEffect, useRef, useCallback } from 'react';\n    import init, { WasmSearchEngine } from 'altor-vec';\n    import { pipeline } from '@huggingface/transformers';\n\n    export function SearchWidget({ docs }) {\n      const engineRef = useRef(null);\n      const embedRef = useRef(null);\n      const [results, setResults] = useState([]);\n      const [loading, setLoading] = useState(true);\n      const timerRef = useRef(null);\n\n      useEffect(() => {\n        async function setup() {\n          await init(); // loads the 54KB WASM module\n          const res = await fetch('/search-index.bin');\n          engineRef.current = WasmSearchEngine.from_bytes(\n            new Uint8Array(await res.arrayBuffer())\n          );\n          // Embedding model runs in-browser, cached after first load (~23MB)\n          embedRef.current = await pipeline(\n            'feature-extraction',\n            'Xenova/all-MiniLM-L6-v2'\n          );\n          setLoading(false);\n        }\n        setup();\n      }, []);\n\n      const handleSearch = useCallback((query) => {\n        // Debounce — embedding takes ~50ms, don't fire on every keystroke\n        clearTimeout(timerRef.current);\n        timerRef.current = setTimeout(async () => {\n          if (!query.trim() || !engineRef.current) return setResults([]);\n          const out = await embedRef.current(query, { pooling: 'mean', normalize: true });\n          const hits = JSON.parse(\n            engineRef.current.search(new Float32Array(out.data), 5)\n          );\n          setResults(hits.map(([id, score]) => ({ ...docs[id], score })));\n        }, 200);\n      }, [docs]);\n\n      if (loading) return <p>Loading search…</p>;\n\n      return (\n        <div>\n          <input\n            type=\"search\"\n            placeholder=\"Search docs…\"\n            onChange={e => handleSearch(e.target.value)}\n  
        />\n          <ul>\n            {results.map(r => (\n              <li key={r.id}>\n                <a href={r.url}>{r.title}</a>\n              </li>\n            ))}\n          </ul>\n        </div>\n      );\n    }\n\n\nNo API routes. No environment variables. No billing dashboard.\n\n> **Production tip**: move the engine + embedding model into a Web Worker so search never blocks the main thread. See the web worker guide.\n\n##  The numbers\n\n  * **Query time**: <1ms p95 for 10K vectors (384 dimensions) in Chrome\n  * **WASM size**: 54KB gzipped — loads in ~100ms on a 4G connection\n  * **Index size**: ~17MB for 10K documents (served from CDN, cached after first load)\n  * **Embedding model**: ~23MB first load, then cached in browser storage\n  * **Per-query cost**: $0\n\n\n\nThe first load is heavier than Fuse.js because you're downloading a model. If that's a dealbreaker, precompute all embeddings at build time and skip the in-browser model entirely — then query time is literally just the WASM search.\n\n##  When NOT to use this\n\nI want to be honest about the tradeoffs:\n\n  * **Millions of documents** → the index file gets too big to serve efficiently. Use a server.\n  * **Real-time index updates** → the index is rebuilt at deploy time. Not suitable for user-generated content that changes constantly.\n  * **Private/sensitive content** → if documents are secret, you can't ship the index to every user's browser.\n  * **Need Algolia-style faceting and merchandising** → altor-vec doesn't have that. 
Algolia is genuinely better for search-as-a-product.\n\n\n\nFor documentation sites, internal tools, marketing sites, personal projects, and anywhere the content is public and updates on deploys: this works very well.\n\n##  Get started\n\n\n    npm install altor-vec\n\n\n  * **Getting started** (5-minute guide): altorlab.dev/getting-started\n  * **API reference**: altorlab.dev/api\n  * **React full guide** (with Web Worker, debounce, error handling): altorlab.dev/guides/react/document-search\n  * **vs Fuse.js** (detailed comparison): altorlab.dev/vs/fuse-js\n  * **Migrating from Algolia**: altorlab.dev/migrate-from/algolia\n  * **GitHub**: github.com/altor-lab/altor-vec\n\n\n\nI've been running this on my own docs for a few months. First-load is heavier than Fuse.js, but after the model caches, search latency is genuinely sub-millisecond — you feel the difference.\n\nIf you try it and hit something broken, open a GitHub issue or drop a comment here. Happy to help debug.",
  "title": "I added semantic search to my React app without a backend (and it's under 1ms)"
}