{
"$type": "site.standard.document",
"bskyPostRef": {
"cid": "bafyreicba2qzvmj2op6jiu6ye4eia4wd6qsis4j4cvbnzq2hep6u34lm2i",
"uri": "at://did:plc:25rdn5elo5izoxrmtis34zuk/app.bsky.feed.post/3mpa5aevebls2"
},
"coverImage": {
"$type": "blob",
"ref": {
"$link": "bafkreibaxk3bjhjr25x4gsz2ncxyg673vsintsuty3p6me2uxvb7vq5dgu"
},
"mimeType": "image/webp",
"size": 87780
},
"path": "/hamzanabdev/how-i-built-a-personal-ai-knowledge-base-with-amazon-aurora-pgvector-and-nextjs-aws-h0-hackathon-19jf",
"publishedAt": "2026-06-26T23:25:05.000Z",
"site": "https://dev.to",
"tags": [
"aws",
"aurora",
"h0hackathon",
"pgvector",
"https://chatscroll.vercel.app",
"https://chatscroll.vercel.app/aws-showcase"
],
"textContent": "I built ChatScroll for the AWS H0 Hackathon — an app that\nlets you save AI answers as searchable \"Scrolls\" using\nAmazon Aurora PostgreSQL with pgvector for semantic search.\n\n## The Problem\n\nEvery day people ask AI assistants valuable questions and\nget great answers — then lose them forever. Chat history\nis linear, unsearchable, and ephemeral. I kept re-Googling\nthe same questions knowing I had already found the answer\nsomewhere but couldn't find it again.\n\n## The Solution\n\nChatScroll transforms AI conversations into a personal\nknowledge library. Save any AI answer as a \"Scroll\",\norganize it automatically, and find it later with\nsemantic search.\n\n## The Core Technical Challenge\n\nMaking search understand MEANING not just keywords. When\nyou search \"blood thinner medication\" it should find your\nwarfarin scroll even though \"blood thinner\" doesn't appear\nin the title.\n\n## How pgvector on Aurora Solves This\n\nAmazon Aurora PostgreSQL with the pgvector extension stores\n3072-dimensional vector embeddings for every saved Scroll.\n\nWhen a user saves a Scroll:\n\n 1. The answer text is sent to Google's gemini-embedding-001\n 2. The model returns a 3072-dimensional vector\n 3. The vector is stored in Aurora alongside the content\n\n\n\nWhen a user searches:\n\n 1. The search query is converted to a vector\n 2. Aurora finds the most similar vectors using cosine distance\n 3. 
Results are ranked by semantic similarity\n\n\n\n\n -- Semantic search with threshold\n WHERE 1 - (embedding <=> $queryVec) > 0.5\n ORDER BY embedding <=> $queryVec\n LIMIT 5\n\n\n## Three PostgreSQL Features Working Together\n\nWhat makes Aurora special for this use case is three\nPostgreSQL features working together: two extensions\n(pgvector and ltree) and the built-in tsvector type.\n\n**pgvector**: stores 3072-dim embeddings, enables cosine\nsimilarity search between vectors\n\n**ltree**: stores folder paths as dot-separated label trees\n(`programming.containers`), enables subtree queries without\nrecursive CTEs\n\n**tsvector**: PostgreSQL's built-in full-text search type,\nranked via ts_rank and combined with pgvector for hybrid search\n\n## The Dual Database Architecture\n\nI made a deliberate choice to use TWO AWS databases:\n\n**Amazon Aurora PostgreSQL** for structured data:\n\n * Scrolls with embeddings\n * Folder hierarchy (ltree)\n * User accounts (Cognito sub)\n * Conversation metadata\n\n\n\n**Amazon DynamoDB** for chat messages:\n\n * PK: conversationId\n * SK: timestamp#messageId\n * TTL: 90-day auto-expiry\n * PAY_PER_REQUEST billing\n\n\n\nThis separation keeps Aurora lean for complex queries\nwhile DynamoDB handles the high-volume chat stream.\n\n## The Result\n\nSearching \"containerization technology\" correctly surfaces\nthe Docker scroll. Searching \"blood thinner medication\"\nfinds warfarin, with no programming results contaminating it.\n\nScoping semantic search to the same folder category\nkeeps results relevant to the topic at hand.\n\n## Try It\n\nLive app: https://chatscroll.vercel.app\nAWS Architecture: https://chatscroll.vercel.app/aws-showcase\n\nI created this content for the purposes of entering\nthe AWS H0 Hackathon.\n\n#H0Hackathon #AWS #Aurora #pgvector #Vercel #NextJS",
"title": "How I Built a Personal AI Knowledge Base with Amazon Aurora pgvector and Next.js — AWS H0 Hackathon"
}