Few-Shot Learning
Few-shot learning is a machine learning paradigm in which a model learns to perform new tasks or recognise new classes from only a small number of labelled training examples, often just one to five samples per class.
Few-shot learning is a branch of machine learning concerned with training models that can generalise effectively from a very small number of labelled examples. Unlike conventional supervised learning, which typically requires thousands or millions of labelled samples to achieve acceptable performance, few-shot learning enables an AI system to recognise new categories or perform new tasks after seeing as few as one to five examples per class. The paradigm addresses one of the most persistent practical constraints in machine learning: the cost and difficulty of obtaining large, annotated datasets.
The concept is closely related to one-shot learning, where only a single example per class is available, and zero-shot learning, where a model must handle entirely unseen classes without any examples at all. Together, these techniques form a spectrum of low-data learning strategies that are increasingly relevant as AI moves into specialised domains where labelled data is scarce, expensive to annotate, or sensitive, as with medical records.
Motivation and Problem Setting
Conventional deep learning models require large labelled datasets to learn robust feature representations. A standard image classifier may need on the order of a thousand or more labelled photographs per category to achieve high accuracy; ImageNet, for example, provides roughly 1,000 training images per class. This requirement is impractical for many real-world applications: a radiologist cannot label thousands of scans for each rare disease, and a wildlife conservation team cannot collect thousands of images of an endangered species.
Few-shot learning addresses this by treating learning itself as a skill that can be acquired. Instead of training a model from scratch on a small dataset, few-shot methods typically train a model on many related tasks, an approach known as meta-learning or "learning to learn", so that the model acquires transferable representations or adaptation procedures rather than knowledge of one fixed set of classes. When presented with a new task containing only a few examples, the model reuses what it learned across those earlier tasks.
The standard evaluation framework is called the N-way K-shot task. In this setup, the model is given K labelled examples (the support set) for each of N classes, and is then asked to classify query examples from those same N classes. A 5-way 1-shot task, for instance, presents five classes with one example each.
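The episode construction described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name sample_episode, the n_query parameter (how many query examples to draw per class), and the representation of a dataset as a plain list of labels are all assumptions made for the example.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, rng=None):
    """Sample one N-way K-shot episode from a list of class labels.

    Returns (support, query), each a list of (example_index, label) pairs.
    All names here are hypothetical, chosen for illustration only.
    """
    rng = rng or random.Random()

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # A class is usable only if it can fill both its support and query slots.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query examples")

    support, query = [], []
    for c in rng.sample(eligible, n_way):
        chosen = rng.sample(by_class[c], k_shot + n_query)
        support += [(i, c) for i in chosen[:k_shot]]   # K labelled examples
        query += [(i, c) for i in chosen[k_shot:]]     # held out for evaluation
    return support, query
```

For a 5-way 1-shot episode with three query examples per class, the call would be sample_episode(labels, 5, 1, 3), giving five support pairs and fifteen query pairs with no example shared between the two sets.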
Key Approaches
Metric-Based Methods
Metric-based few-shot learning trains a model to learn an embedding space in which similar examples are close together and dissimilar examples are far apart. At inference time, a new example is classified by comparing its embedding to the embeddings of the support set examples.
Prototypical networks, introduced by Snell et al. in 2017, compute a prototype representation for each class as the mean embedding of its support examples. Each query example is assigned to the class whose prototype is nearest in the embedding space, measured by Euclidean distance. Matching networks instead use an attention mechanism to compare a query example to each support example individually rather than to class means.
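The prototype computation and nearest-prototype rule can be illustrated as follows. This sketch assumes the embeddings have already been produced by some trained encoder (not shown) and are plain lists of floats; the function names class_prototypes and classify are hypothetical, and the learned embedding network that makes the method work in practice is deliberately omitted.

```python
import math


def class_prototypes(support):
    """Compute one prototype per class as the mean of its support embeddings.

    support: list of (embedding, label) pairs, where embedding is a list of floats.
    Returns a dict mapping each label to its mean embedding.
    """
    sums, counts = {}, {}
    for emb, label in support:
        if label not in sums:
            sums[label] = [0.0] * len(emb)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], emb)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(query_emb, protos):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(protos, key=lambda label: math.dist(query_emb, protos[label]))
```

With two support embeddings per class, for example [0, 0] and [0, 2] for class "a", the prototype for "a" is their mean [0, 1], and a query embedding is labelled by whichever prototype it falls nearest to.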
Optimisation-Based Methods
Model-Agnostic Meta-Learning (MAML), introduced by Finn et al. in 2017, takes a different approach. Rather than learning a fixed embedding space, MAML learns a set of initial model parameters that can be quickly adapted to a new task with only a few gradient update steps. The model is trained to find an initialisation that is maximally sensitive to new task information, enabling rapid fine-tuning from just a handful of examples.
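The two nested updates in MAML can be made concrete with a deliberately tiny example. The sketch below assumes a one-parameter model y = w * x and a family of regression tasks that differ only in their true slope, so that the gradient and second derivative can be written by hand; these choices, along with the names grad, hessian, and maml_train and all hyperparameter values, are illustrative assumptions and not taken from Finn et al. Real implementations rely on automatic differentiation over neural network parameters.

```python
import random


def grad(w, xs, slope):
    """Derivative in w of the mean squared error of y = w * x on one task."""
    return sum(2 * x * (w * x - slope * x) for x in xs) / len(xs)


def hessian(xs):
    """Second derivative of the same loss; constant in w for this toy model."""
    return sum(2 * x * x for x in xs) / len(xs)


def maml_train(task_slopes, inner_lr=0.1, outer_lr=0.05, steps=500, k_shot=5, seed=0):
    """Meta-learn an initial w that adapts well after one inner gradient step."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        slope = rng.choice(task_slopes)                      # sample a task
        support = [rng.uniform(-1, 1) for _ in range(k_shot)]
        query = [rng.uniform(-1, 1) for _ in range(k_shot)]

        # Inner loop: one gradient step on the task's support set.
        w_adapted = w - inner_lr * grad(w, support, slope)

        # Outer loop: differentiate the query loss at w_adapted with respect
        # to the initial w. By the chain rule, d(w_adapted)/dw equals
        # 1 - inner_lr * hessian(support), which is the second-order term.
        meta_grad = grad(w_adapted, query, slope) * (1 - inner_lr * hessian(support))
        w -= outer_lr * meta_grad
    return w
```

Trained on tasks with slopes 1.0 and 3.0, the learned initialisation settles between the two slopes, a point from which a single gradient step moves the parameter towards whichever task is presented.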
Augmentation-Based Methods
Augmentation approaches address the data scarcity problem directly by generating synthetic training examples. Generative models such as variational autoencoders and generative adversarial networks can hallucinate plausible new samples from the few available examples. These synthetic samples are then used to train a conventional classifier.
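The overall shape of this pipeline, expanding a small support set with synthetic examples before training a classifier, can be sketched as below. Note the substitution: training a variational autoencoder or GAN is beyond a short example, so Gaussian noise added to each real feature vector is used here as a deliberately simple stand-in for a learned generator. The function name augment and its parameters are hypothetical.

```python
import random


def augment(support, n_synthetic, noise_std=0.1, seed=0):
    """Return the support set plus n_synthetic noisy copies of each example.

    support: list of (features, label) pairs, where features is a list of floats.
    Gaussian jitter stands in for a learned generative model (an assumption
    made for illustration; it is not what a VAE or GAN would produce).
    """
    rng = random.Random(seed)
    augmented = list(support)  # keep the real examples first
    for features, label in support:
        for _ in range(n_synthetic):
            synthetic = [x + rng.gauss(0.0, noise_std) for x in features]
            augmented.append((synthetic, label))  # synthetic sample inherits the label
    return augmented
```

The augmented list can then be passed to any conventional classifier in place of the original few examples.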
Large Pre-Trained Models
The emergence of large foundation models such as GPT-4, Claude, and Gemini has transformed few-shot learning in natural language processing. These models, trained on vast corpora, can perform few-shot tasks through in-context learning: simply providing a few examples in the prompt is sufficient for the model to infer the task and generalise to new inputs without any parameter updates. This approach, popularised by GPT-3, is now a dominant paradigm in applied NLP.
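In practice, in-context learning amounts to formatting the labelled examples and the new input into a single prompt. The following sketch shows one way to do that; the "Input:" and "Output:" template and the function name build_few_shot_prompt are assumptions chosen for illustration, and no particular model API is implied.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt from (input, output) example pairs.

    The prompt ends with an open "Output:" line for the model to complete.
    The template is a hypothetical convention, not a required format.
    """
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Classify the sentiment of each input as positive or negative.",
    [("great film", "positive"), ("dull plot", "negative")],
    "loved it",
)
```

The resulting string contains the two demonstrations followed by the unanswered query, and would be sent to the model unchanged; the model's parameters are not updated at any point.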
Applications
Few-shot learning has found traction across numerous applied domains. In medical imaging, it enables classification of rare pathologies where annotated examples are limited. In drug discovery, it helps predict the properties of novel molecular structures. In computer vision, it powers fine-grained recognition systems that must distinguish between highly similar subcategories. In robotics, it enables agents to learn new manipulation tasks from a small number of demonstrations.
In the domain of language, few-shot in-context learning underpins many of the most capable language model behaviours observed in 2024 and 2025, including code generation, structured data extraction, and multi-step reasoning from minimal prompt examples.
Benchmarks and Evaluation
Standard benchmarks for few-shot image classification include miniImageNet and tieredImageNet, both derived from the ImageNet dataset. The Omniglot dataset, comprising handwritten characters from 50 alphabets, is widely used for one-shot classification. For language tasks, benchmarks such as GLUE and SuperGLUE, originally designed for fine-tuned models, are now also used to evaluate models in a few-shot setting, following the methodology of the GPT-3 paper (Brown et al., 2020), in which a handful of examples are placed in the prompt.
References
- Snell, J., Swersky, K., and Zemel, R. (2017). Prototypical Networks for Few-shot Learning. Advances in Neural Information Processing Systems 30 (NeurIPS 2017).
- Finn, C., Abbeel, P., and Levine, S. (2017). Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. Proceedings of ICML 2017.
- Brown, T. et al. (2020). Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems 33 (NeurIPS 2020).
- Wang, Y. et al. (2020). Generalizing from a Few Examples: A Survey on Few-Shot Learning. ACM Computing Surveys, 53(3).
- MDEC. (2024). AI Talent Development Programme Report. Malaysia Digital Economy Corporation, Kuala Lumpur.