Inference Endpoint
An inference endpoint is the serving layer for a trained model. After training (or downloading) an LLM, you need infrastructure to accept requests, run the forward pass, and return outputs at scale. T…
Sahil Kapoor's Playbook · May 17 · 3 min read
vLLM · Tokenization · Ollama · OpenRouter

LangChain
LangChain is an open-source framework that provides building blocks for LLM applications. Rather than calling OpenAI's API directly and wiring everything by hand, LangChain gives you composable abstra…
Sahil Kapoor's Playbook · May 17 · 3 min read
Ollama · System Prompt · MCP (Model Context Protocol) · Prompt Engineering