Search Results
4 results for “ZeRO”
DeepSpeed
DeepSpeed is an open-source deep learning optimisation library developed by Microsoft that enables efficient distributed training and inference of large-scale neural networks through memory and compute optimisations.
Instruction Tuning
Instruction tuning is a supervised fine-tuning technique that trains large language models on datasets of instruction-response pairs, enabling models to follow natural language directions and generalise to unseen tasks in a zero-shot or few-shot setting.
Prompt Engineering
Prompt engineering is the practice of designing and optimising the input instructions given to large language models to elicit accurate, relevant, and well-structured outputs for a given task or application.
Zero-Shot Learning
Zero-shot learning is a machine learning paradigm in which a model makes predictions for categories it has never seen during training by leveraging semantic descriptions or attribute representations.