Tag: fine-tuning

All the articles with the tag "fine-tuning".

Fine-tuning Phi-2 with DPO on the Anthropic HH Dataset

29 Feb, 2024

Fine-tuning Microsoft's Phi-2 with Direct Preference Optimization (DPO) on Anthropic's Helpful and Harmless (HH) dataset, using LoRA and 8-bit quantization.