AIWiki
Malaysia

Search Results

16 results for "transformer"

Foundations

Attention Mechanism

A neural network technique that enables models to dynamically weight the relevance of different parts of an input sequence when producing each output element, forming the core of transformer architectures.

6 min read · Updated May 2026

Models

BERT

BERT (Bidirectional Encoder Representations from Transformers) is a pre-trained transformer-based language model developed by Google that reads text bidirectionally to understand word context in natural language tasks.

6 min read · Updated June 2026

Foundations

Encoder-Decoder Architecture

A neural network design pattern that compresses an input sequence into an internal representation using an encoder, and then generates an output sequence from that representation using a decoder, foundational to machine translation, summarisation, and many other sequence-to-sequence tasks.

6 min read · Updated May 2026

Foundations

Flash Attention

FlashAttention is an IO-aware exact attention algorithm that restructures the standard attention computation into memory-efficient tiled blocks, so that attention memory grows linearly rather than quadratically with sequence length, which also reduces wall-clock time for transformer models on long sequences.

6 min read · Updated June 2026

Companies & Tools

Hugging Face

An American AI company and open-source platform that hosts machine learning models, datasets, and applications, widely described as the "GitHub of machine learning" for its role as the central repository of the open AI community.

5 min read · Updated May 2026

Infrastructure

KV Cache

A KV cache (key-value cache) is a memory optimisation used in transformer inference that stores pre-computed key and value tensors from the attention mechanism, eliminating redundant recomputation when generating tokens sequentially.

6 min read · Updated June 2026

Foundations

Large Language Models

Large language models (LLMs) are AI systems trained on vast corpora of text to predict and generate natural language. They underpin modern chatbots, code assistants, and generative AI applications.

5 min read · Updated May 2026

Foundations

Layer Normalisation

Layer normalisation is a technique that normalises the inputs across the features of a single training example, stabilising and accelerating the training of deep neural networks, especially transformers.

4 min read · Updated June 2026

Applications

LoRA (Low-Rank Adaptation)

LoRA is a parameter-efficient fine-tuning technique that adapts large pre-trained models by injecting small trainable low-rank matrices into transformer layers while the original weights stay frozen, sharply reducing the number of trainable parameters, typically with little or no loss in task performance.

6 min read · Updated May 2026

Foundations

Mamba (Structured State Space Model)

Mamba is a selective state space model architecture that achieves linear-time sequence modelling, offering a computationally efficient alternative to the Transformer for long-context tasks.

6 min read · Updated June 2026

Foundations

Mixture of Experts

Mixture of Experts (MoE) is a machine learning architecture in which a model routes each input to a small subset of specialised sub-networks called experts, enabling large model capacity at a fraction of the per-input compute cost of a dense model with the same parameter count.

6 min read · Updated June 2026

Foundations

Natural Language Processing

Natural language processing (NLP) is the subfield of AI concerned with enabling computers to understand, interpret, manipulate, and generate human language in both text and speech form.

3 min read · Updated May 2026

Applications

Optical Character Recognition

A computer vision technology that converts images of typed, handwritten, or printed text into machine-readable digital text, increasingly powered by deep learning and transformer-based vision models.

5 min read · Updated May 2026

Infrastructure

Sentence Transformers

Sentence Transformers are neural network models that encode sentences, paragraphs, or short documents into fixed-length dense vector embeddings optimised for semantic similarity comparison.

6 min read · Updated June 2026

Foundations

Transformer Architecture

A neural network architecture introduced in 2017 that uses self-attention mechanisms to process sequential data in parallel, forming the foundation of modern large language models and multimodal AI systems.

7 min read · Updated May 2026

Foundations

Vision Transformer

The Vision Transformer (ViT) is a deep learning model that applies the transformer architecture, originally designed for NLP, directly to sequences of image patches, achieving results on visual recognition tasks that match or exceed strong convolutional networks when pre-trained on large datasets.

5 min read · Updated June 2026