Fine-tuning

Sahil Kapoor's Playbook May 12, 2026

Fine-tuning is the process of taking a pretrained model and continuing to train it on a target dataset, so its weights adapt to a specific task, domain, style, or output format. It is the standard way to customise a model when prompting alone is not enough.
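As a rough illustration of "continuing to train" from pretrained weights, here is a minimal sketch using a toy linear model in NumPy. The weight values, the dataset, and the learning rate are all made-up assumptions for the example, not anything from a real model; the point is only that training starts from existing weights and adapts them to the target data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for pretrained weights (hypothetical values).
w_pretrained = np.array([1.0, -2.0, 0.5])

# Target dataset generated by a slightly different (hypothetical) task.
w_target = np.array([1.5, -1.0, 0.0])
X = rng.normal(size=(100, 3))
y = X @ w_target

def mse(w):
    """Mean squared error of weights w on the target dataset."""
    return float(np.mean((X @ w - y) ** 2))

# Fine-tuning: start from the pretrained weights, not from scratch.
w = w_pretrained.copy()
learning_rate = 0.1
for _ in range(200):
    grad = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient of the MSE
    w -= learning_rate * grad

loss_before = mse(w_pretrained)
loss_after = mse(w)
print(f"target loss before: {loss_before:.4f}, after: {loss_after:.6f}")
```

Real fine-tuning uses the same loop shape with a neural network, an optimiser such as Adam, and mini-batches, but the starting point is the same: the pretrained weights.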

Variants

  • Full fine-tuning. Updates every parameter. Offers the most capacity to adapt, at the highest compute and memory cost.
  • LoRA (Low-Rank Adaptation). Trains a small low-rank update on top of frozen weights. Fast, cheap, and the result is a small adapter file.
  • QLoRA. LoRA applied to a quantised (typically 4-bit) base model, which cuts memory use enough to fine-tune large models on a single GPU.
  • Instruction tuning. Fine-tuning on instruction-response pairs to make a base model follow instructions.
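  • RLHF and DPO. Methods for aligning model outputs with human preferences. RLHF trains a reward model and optimises against it with reinforcement learning; DPO optimises directly on preference pairs. Both are used in modern chat models.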
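To make the LoRA entry above more concrete, here is a minimal NumPy sketch of the low-rank update idea. The layer sizes, rank, and scaling factor `alpha` are illustrative assumptions, and the random matrix `W` is only a stand-in for a real pretrained weight. It shows the frozen weight, the two small trainable factors, and why the adapter is so much smaller than the full weight matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, chosen for the example only.
d_in, d_out, rank, alpha = 64, 32, 4, 8.0

# Frozen pretrained weight (random stand-in here).
W = rng.normal(size=(d_out, d_in))

# Trainable low-rank factors. B starts at zero, so the adapter
# changes nothing until training updates it.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))

def lora_forward(x, W, A, B, alpha, rank):
    """Base output plus the scaled low-rank update (alpha / rank) * B @ A."""
    return x @ W.T + (alpha / rank) * (x @ A.T) @ B.T

x = rng.normal(size=(5, d_in))
y_base = x @ W.T
y_lora = lora_forward(x, W, A, B, alpha, rank)

# Only A and B would be trained and saved as the adapter.
full_params = W.size
adapter_params = A.size + B.size
print(f"full: {full_params} params, adapter: {adapter_params} params")

# After training, the update can be merged into W for inference.
B_trained = rng.normal(scale=0.01, size=(d_out, rank))  # pretend-trained values
W_merged = W + (alpha / rank) * B_trained @ A
y_merged = x @ W_merged.T
y_adapter = lora_forward(x, W, A, B_trained, alpha, rank)
```

With these sizes the adapter holds 384 values against 2048 in the full matrix, and the gap widens quickly as layer dimensions grow while the rank stays small.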

Related Terms: RAG, Embeddings.
