External Publication

Fine-tuning

Sahil Kapoor's Playbook, May 12, 2026

Fine-tuning is the process of taking a pretrained model and continuing to train it on a target dataset, so its weights adapt to a specific task, domain, style, or output format. It is the standard way to customise a model when prompting alone is not enough.
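
As a toy illustration of "continuing to train" from existing weights, the sketch below fine-tunes a one-variable linear model on a small target dataset. The starting weights, dataset, learning rate, and function names are all made up for the example; real fine-tuning applies the same idea to far more parameters through a deep-learning framework.

```python
# Toy sketch: continue gradient descent from "pretrained" weights of the
# model y = w * x + b so they adapt to a new target dataset.
# All numbers and names here are hypothetical, chosen for illustration.

def predict(w, b, x):
    return w * x + b

def mse(w, b, data):
    """Mean squared error of the model on (x, y) pairs."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

def fine_tune(w, b, data, lr=0.05, steps=500):
    """Run plain gradient descent on MSE, starting from the given weights."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Assumed "pretrained" weights, learned elsewhere on a related task.
pretrained_w, pretrained_b = 2.0, 0.0

# Assumed target dataset, generated from y = 2.5 * x + 1.
target_data = [(0.0, 1.0), (1.0, 3.5), (2.0, 6.0), (3.0, 8.5)]

loss_before = mse(pretrained_w, pretrained_b, target_data)
tuned_w, tuned_b = fine_tune(pretrained_w, pretrained_b, target_data)
loss_after = mse(tuned_w, tuned_b, target_data)

print(f"loss before: {loss_before:.4f}, loss after: {loss_after:.6f}")
print(f"weights moved from ({pretrained_w}, {pretrained_b}) "
      f"to ({tuned_w:.3f}, {tuned_b:.3f})")
```

Training starts from the pretrained values rather than from a random initialisation, so the model only has to move a short distance to fit the target data.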

Variants

Full fine-tuning. Updates every parameter. Maximum capacity, maximum cost.
LoRA (Low-Rank Adaptation). Trains a small low-rank update on top of frozen weights. Fast, cheap, and the result is a small adapter file.
QLoRA. LoRA applied to a frozen, quantised (typically 4-bit) base model, which cuts memory use enough to fine-tune large models on a single GPU.
Instruction tuning. Fine-tuning on instruction-response pairs to make a base model follow instructions.
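RLHF and DPO. Preference-based methods that align model outputs with human preferences. RLHF trains a reward model and optimises against it with reinforcement learning; DPO optimises directly on preference pairs without a separate reward model. Both are used in modern chat models.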
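
To make the LoRA entry above concrete, here is a minimal pure-Python sketch of a low-rank update on a frozen weight matrix. It assumes the common formulation h = Wx + (alpha / r) * B(Ax), with B initialised to zero so the adapter starts as a no-op. The helper names, dimensions, and the stand-in "trained" values of B are hypothetical; in practice this is done with tensors in a framework.

```python
import random

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def lora_forward(W, A, B, x, alpha, r):
    """Frozen base output plus the scaled low-rank update B(Ax)."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

def merge(W, A, B, alpha, r):
    """Fold the adapter into the base weights: W + (alpha / r) * BA."""
    scale = alpha / r
    BA = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, BA)]

def param_counts(d_out, d_in, r):
    """Trainable parameters: full fine-tuning vs. a rank-r adapter."""
    return d_out * d_in, r * (d_in + d_out)

random.seed(0)
d_out, d_in, r, alpha = 4, 6, 2, 4  # illustrative sizes only

W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]   # frozen
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]    # trainable
B = [[0.0] * r for _ in range(d_out)]                                   # trainable, starts at zero
x = [1.0] * d_in

# With B at zero, the adapted layer reproduces the base layer exactly.
out_initial = lora_forward(W, A, B, x, alpha, r)
out_base = matvec(W, x)

# Stand-in for training: pretend optimisation moved B away from zero.
B = [[random.gauss(0, 0.1) for _ in range(r)] for _ in range(d_out)]
out_adapted = lora_forward(W, A, B, x, alpha, r)
out_merged = matvec(merge(W, A, B, alpha, r), x)

full, lora = param_counts(4096, 4096, 8)
print(f"full fine-tuning: {full} params, rank-8 adapter: {lora} params")
```

Only A and B would be saved as the adapter file, which is why it stays small; for a 4096 x 4096 matrix at rank 8, the adapter holds 65,536 values against 16,777,216 for the full matrix. Merging the adapter into W gives the same outputs as applying it separately.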


Related Terms: RAG, Embeddings.

