Tokenization
Tokenization is the first step in any LLM pipeline: converting raw text into a sequence of integer IDs that the model actually processes. Understanding tokenization helps you reason about context wind…
Sahil Kapoor's Playbook · May 17 · 3 min read
LangChain, vLLM, Ollama, Prompt Engineering

System Prompt
In the chat completions API format used by OpenAI, Anthropic, Google, and others, messages come in three roles: system, user, and assistant. The system message is the system prompt; it's processed fir…
Sahil Kapoor's Playbook · May 17 · 3 min read
Cursor, Windsurf, LangChain, MCP (Model Context Protocol)

Prompt Engineering
Prompt engineering is the discipline of communicating effectively with large language models. Because LLMs are trained to predict plausible continuations of text, how you frame a request has an enormo…
Sahil Kapoor's Playbook · May 17 · 3 min read
System Prompt, Cursor, Windsurf, LangChain

OpenRouter
A unified API gateway for large language models that lets you call 100+ LLMs from different providers through a single OpenAI-compatible endpoint with automatic fallback and cost routing.
Sahil Kapoor's Playbook · May 17 · 2 min read
Ollama, vLLM, Inference Endpoint, LangChain

OpenHands
OpenHands (formerly OpenDevin) is an open-source platform for AI software engineering agents. Unlike Cursor or Windsurf, which are IDEs with AI assistance, OpenHands is a platform where AI agents opera…
Sahil Kapoor's Playbook · May 17 · 3 min read
Cursor, Windsurf, Ollama, MCP (Model Context Protocol)

Ollama
Ollama makes running open-source LLMs as straightforward as running a Docker container. You pull a model, and it starts serving a local REST API that your code can call: no cloud, no API key, no per-t…
Sahil Kapoor's Playbook · May 17 · 3 min read
vLLM, LangChain, Cursor, OpenHands

MCP (Model Context Protocol)
Model Context Protocol (MCP) is an open standard introduced by Anthropic in late 2024 that defines a universal way for large language models to communicate with external systems, files, databases, API…
Sahil Kapoor's Playbook · May 17 · 3 min read
Cursor, Windsurf, OpenHands, LangChain