Mamba (Structured State Space Model)

Mamba is a selective state space model architecture that achieves linear-time sequence modelling, offering a computationally efficient alternative to the Transformer for long-context tasks.

6 min read | Last updated June 2026 | Foundations

Mamba is a deep learning architecture based on selective state space models (SSMs) that processes sequential data with linear computational complexity relative to sequence length. Introduced in December 2023 by Albert Gu and Tri Dao in the paper Mamba: Linear-Time Sequence Modeling with Selective State Spaces, it emerged as one of the most significant architectural alternatives to the Transformer since the original attention-based model was proposed in 2017. Unlike the quadratic attention mechanism of Transformers, Mamba scales efficiently to very long sequences, making it attractive for tasks involving genomics, audio, and long-document language modelling.

Background: State Space Models

State space models originate from control theory and signal processing, where they describe how a system evolves over time given a sequence of inputs. In the context of deep learning, a structured SSM maps an input sequence to an output sequence through a latent hidden state. The core dynamics of a continuous-time SSM are:

$$h'(t) = A\,h(t) + B\,x(t), \qquad y(t) = C\,h(t)$$

where $A$, $B$, and $C$ are learnable matrices, $x(t)$ is the input, $h(t)$ is the hidden state, and $y(t)$ is the output. After discretisation, the same model becomes a step-by-step recurrence. During training, SSMs can be unrolled as a convolutional operation for parallelism; during inference they run as a recurrence for constant-memory generation. Earlier SSM architectures such as S4 (Structured State Space for Sequence Modelling) applied these ideas to deep learning but struggled to match Transformer performance on language tasks because they treated every token identically: the model could not selectively focus on or ignore specific inputs.
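The two execution modes can be illustrated with a minimal NumPy sketch. It assumes a diagonal state matrix, a single input channel, and zero-order-hold discretisation; these are simplifications chosen for the example, and the function names are hypothetical rather than taken from any library.

```python
import numpy as np

def discretize_zoh_diag(a, b, delta):
    """Zero-order-hold discretisation for a diagonal state matrix.

    a: (N,) diagonal of A (non-zero, negative for stability)
    b: (N,) input vector B, delta: scalar step size.
    """
    a_bar = np.exp(delta * a)
    b_bar = (a_bar - 1.0) / a * b
    return a_bar, b_bar

def ssm_recurrent(a_bar, b_bar, c, x):
    """Inference mode: h_t = a_bar * h_{t-1} + b_bar * x_t, y_t = c . h_t."""
    h = np.zeros_like(a_bar)
    y = np.empty(len(x))
    for t, x_t in enumerate(x):
        h = a_bar * h + b_bar * x_t   # fixed-size state, independent of t
        y[t] = c @ h
    return y

def ssm_convolutional(a_bar, b_bar, c, x):
    """Training mode: the same map as a causal convolution.

    The kernel entry for lag k is c . (a_bar**k * b_bar).
    """
    length = len(x)
    lags = np.arange(length)[:, None]              # (L, 1)
    kernel = (a_bar[None, :] ** lags * b_bar) @ c  # (L,)
    return np.convolve(x, kernel)[:length]         # keep the causal part
```

Both functions compute the same outputs for fixed parameters. The convolutional form is only available because the parameters do not change from token to token, which is the restriction that Mamba later lifts.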

The Mamba Innovation: Selective State Spaces

The defining contribution of Mamba is input-dependent SSM parameters. Rather than holding $B$, $C$, and the discretisation step $\Delta$ fixed across all tokens, Mamba makes these parameters functions of the current input $x_t$. This mechanism, termed selective state spaces, allows the model to retain relevant context and discard irrelevant information at each step. The selection mechanism is analogous to what attention provides in Transformers (a content-aware routing of information) but without the quadratic cost of computing pairwise token similarities.
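A minimal sketch of the selection idea follows, written as a plain sequential loop in NumPy. It is not the reference implementation: the projection matrices `W_B`, `W_C`, and `W_delta` are hypothetical names, the step-size projection is full rank for simplicity, and the input term uses a simple Euler-style step, all of which are assumptions made for illustration.

```python
import numpy as np

def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the step size positive.
    return np.logaddexp(0.0, z)

def selective_scan(x, A, W_B, W_C, W_delta):
    """Sequential selective SSM over one sequence.

    x: (L, D) inputs; A: (D, N) negative entries (one diagonal per channel)
    W_B, W_C: (D, N) projections; W_delta: (D, D) projection (assumed shapes).
    """
    L, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))
    y = np.empty((L, D))
    for t in range(L):
        x_t = x[t]
        delta = softplus(x_t @ W_delta)      # (D,) input-dependent step size
        B_t = x_t @ W_B                      # (N,) input-dependent B
        C_t = x_t @ W_C                      # (N,) input-dependent C
        A_bar = np.exp(delta[:, None] * A)   # (D, N) per-token decay in (0, 1)
        # Simplified Euler-style discretisation of the input term (assumption).
        Bx = delta[:, None] * B_t[None, :] * x_t[:, None]
        h = A_bar * h + Bx                   # state update
        y[t] = h @ C_t                       # (D,) readout
    return y
```

A small step size keeps the decay close to 1, so the previous state is largely preserved and the current token contributes little; a large step size lets the current token overwrite more of the state. Because the step size depends on the token itself, the model can decide per token how much to remember.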

The architecture pairs the selective SSM with a hardware-aware parallel scan algorithm that avoids materialising large intermediate tensors, enabling efficient GPU execution despite the recurrent structure.
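The reason a recurrence can be parallelised at all is that the update h_t = a_t * h_{t-1} + b_t is an associative operation on pairs (a, b). The NumPy sketch below shows that property with a simple Hillis-Steele style scan. It illustrates only the algebra, not the fused GPU kernel or its memory handling, and the function names are hypothetical.

```python
import numpy as np

def combine(left, right):
    """Compose two affine updates h -> a*h + b, applying `left` first."""
    a_l, b_l = left
    a_r, b_r = right
    return a_l * a_r, a_r * b_l + b_r

def scan_sequential(a, b):
    """Reference: h_t = a_t * h_{t-1} + b_t with h_{-1} = 0."""
    h = np.zeros_like(b[0])
    out = np.empty_like(b)
    for t in range(len(b)):
        h = a[t] * h + b[t]
        out[t] = h
    return out

def scan_parallel(a, b):
    """Inclusive scan in ceil(log2(L)) vectorised passes.

    In each pass, position i is combined with position i - shift,
    so every pass could run across all positions at once on parallel hardware.
    """
    a = a.copy()
    b = b.copy()
    shift = 1
    while shift < len(a):
        a_new, b_new = a.copy(), b.copy()
        a_new[shift:], b_new[shift:] = combine(
            (a[:-shift], b[:-shift]), (a[shift:], b[shift:])
        )
        a, b = a_new, b_new
        shift *= 2
    return b
```

This version does more total work than the sequential loop but needs only a logarithmic number of dependent steps. Work-efficient variants exist; the one shown here was chosen for brevity.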

Mamba-2 and Structured State Space Duality

In 2024, Dao and Gu published a follow-up paper introducing Mamba-2, which established a formal connection between structured SSMs and a restricted class of linear attention mechanisms, a result called Structured State Space Duality (SSD). This result allowed the authors to design a simplified SSM layer, the SSD layer, that supports larger state dimensions, is reported to run 2 to 8 times faster than the selective scan used in the original Mamba, and integrates more naturally with the tensor-parallel training strategies used for large models.
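The duality can be checked numerically on a toy case. The NumPy sketch below assumes a single input channel and a positive scalar decay per time step, and compares the recurrent form with an attention-like form in which a lower-triangular decay mask multiplies the matrix of C-B inner products. Function names are hypothetical and the code is not the SSD algorithm itself, only the identity it builds on.

```python
import numpy as np

def ssd_recurrent(a, B, C, x):
    """h_t = a_t * h_{t-1} + B_t * x_t, y_t = C_t . h_t.

    a: (L,) positive scalar decays; B, C: (L, N); x: (L,).
    """
    h = np.zeros(B.shape[1])
    y = np.empty(len(x))
    for t in range(len(x)):
        h = a[t] * h + B[t] * x[t]
        y[t] = C[t] @ h
    return y

def ssd_matrix(a, B, C, x):
    """Same map written as y = (mask * (C @ B.T)) @ x.

    mask[i, j] is the product a_{j+1} * ... * a_i for i >= j, and 0 otherwise.
    """
    log_cum = np.cumsum(np.log(a))                     # requires a > 0
    decay = np.exp(log_cum[:, None] - log_cum[None, :])
    mask = np.tril(decay)                              # causal, decaying mask
    scores = C @ B.T                                   # (L, L), like Q @ K.T
    return (mask * scores) @ x
```

In the matrix form, `C` plays the role of queries, `B` the role of keys, and `x` the role of values, with the decay mask replacing the softmax. The matrix form costs quadratic time in sequence length, so it is useful for short blocks, while the recurrent form carries state between them.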

Performance Characteristics

The original paper reports about 5 times higher inference throughput than Transformer models of similar size. In terms of language modelling perplexity, a Mamba model at 3 billion parameters matches a Transformer at roughly 6 billion parameters, while being approximately 40 percent cheaper to run. These gains become more pronounced as sequence length grows, because self-attention compute grows quadratically with sequence length and its key-value cache grows linearly, whereas Mamba's compute grows linearly and its inference state stays a fixed size.

| Property | Transformer | Mamba |
|---|---|---|
| Sequence-mixing complexity | O(n^2) | O(n) |
| Memory at inference | O(n) KV cache | O(state size), fixed |
| Parallelism in training | Full | Via parallel scan |
| Content-aware routing | Yes (attention) | Yes (selective SSM) |
| Positional encoding needed | Yes | No |
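The inference-memory row can be made concrete with back-of-the-envelope arithmetic. The sketch below uses standard size formulas; the function names and any dimensions passed to them are hypothetical, and the SSM estimate ignores the small convolution buffer that Mamba layers also keep.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Keys and values cached for every past token in every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value

def ssm_state_bytes(n_layers, d_inner, d_state, bytes_per_value=2):
    """One (d_inner x d_state) recurrent state per layer, independent of length."""
    return n_layers * d_inner * d_state * bytes_per_value

def kv_to_ssm_ratio(seq_len, n_layers, n_kv_heads, head_dim, d_inner, d_state):
    """How many times larger the KV cache is than the SSM state at this length."""
    kv = kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim)
    ssm = ssm_state_bytes(n_layers, d_inner, d_state)
    return kv / ssm
```

Doubling `seq_len` doubles the key-value cache, while the SSM state size does not take sequence length as an input at all. Real deployments vary with batch size, precision, and attention variants such as grouped-query attention, so these formulas give rough estimates only.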

Hybrid Architectures

Following Mamba's release, several research groups proposed hybrid architectures that interleave Mamba layers with Transformer attention layers. Models such as Jamba (from AI21 Labs) and Zamba combine the long-range efficiency of SSM layers with the associative recall strengths of attention layers. These hybrids often outperform pure Mamba or pure Transformer models of equivalent parameter count, suggesting that the two mechanisms are complementary rather than mutually exclusive.
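The interleaving idea can be sketched as a layer schedule. The helper below is hypothetical, and the spacing parameter is illustrative rather than the published configuration of any particular model.

```python
def hybrid_layer_schedule(n_layers, attention_every):
    """Return layer types with one attention layer after every
    (attention_every - 1) Mamba layers."""
    if attention_every < 1:
        raise ValueError("attention_every must be at least 1")
    return [
        "attention" if (i + 1) % attention_every == 0 else "mamba"
        for i in range(n_layers)
    ]
```

With a spacing of 4, for example, three out of every four layers are SSM layers, so most of the stack keeps a fixed-size state and only the attention layers maintain a key-value cache.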

Applications Beyond Language

The linear-time property of Mamba makes it especially valuable in domains with very long sequences. In genomics, sequences of DNA bases can extend to millions of tokens; the Caduceus model applied Mamba to DNA language modelling. In audio, SSMs have modelled raw waveforms at high sample rates without the memory bottlenecks that constrain Transformer-based audio models. Vision applications include video understanding, where frame-level tokens accumulate rapidly.

Malaysian Context — SSM Research and Adoption

Malaysia's AI research community has engaged with Mamba and state space models through university-based labs and technology companies tracking global architectural developments. Universiti Teknologi Malaysia (UTM), which launched Malaysia's first dedicated Faculty of Artificial Intelligence in 2024, includes deep learning architecture research as part of its postgraduate curriculum and has faculty working on efficient sequence modelling for Malay-language NLP tasks.

The Malaysia Digital Economy Corporation (MDEC) supports applied AI research through its various grant mechanisms under the MyDigital Blueprint, which prioritises investment in foundational AI capabilities. Efficient architectures such as Mamba are of particular interest because they reduce the compute cost of inference, making AI more accessible to small and medium enterprises in Malaysia that cannot afford the infrastructure demanded by large Transformer models.

Local cloud adoption through platforms such as Amazon Bedrock and Azure AI — both of which have Malaysian data residency options — means Malaysian developers can experiment with Mamba-based models as they appear on inference endpoints. AWS, through its partnership network including VSTECS, supports Malaysian enterprises in accessing cloud infrastructure for AI workloads, including those involving long-document or time-series inputs where Mamba-type models hold an efficiency advantage.

In the fintech sector, institutions regulated by Bank Negara Malaysia (BNM) are exploring efficient sequence models for transaction history analysis. Long financial time series are a natural fit for SSM-based architectures, and BNM's Risk Management in Technology (RMiT) policy framework encourages the adoption of computationally efficient AI approaches that can be explained and audited.

References

Gu, A., and Dao, T. (2023). Mamba: Linear-Time Sequence Modeling with Selective State Spaces. arXiv:2312.00752.
Dao, T., and Gu, A. (2024). Transformers are SSMs: Generalized Models and Efficient Algorithms through Structured State Space Duality. arXiv:2405.21060.
Gu, A., Goel, K., and Ré, C. (2021). Efficiently Modeling Long Sequences with Structured State Spaces. arXiv:2111.00396.
MindStudio. (2025). What Is Mamba 3? The State Space Model Architecture That Challenges Transformers. MindStudio Blog.

Tags: mamba, state space model, SSM, sequence modelling, architecture

Type: Sequence model architecture
Introduced: December 2023
Developed by: Albert Gu and Tri Dao
Complexity: Linear O(n) in sequence length
Related: Transformer, LSTM, S4
