Beam Search

Beam search is a heuristic search algorithm used in sequence generation that keeps a fixed number of the most promising partial sequences at each step, balancing output quality against computational cost.

4 min read | Last updated June 2026 | Foundations

Overview

Beam search is a heuristic algorithm for generating output sequences in tasks such as machine translation, speech recognition and text generation. When a model produces text one token at a time, the number of possible sequences grows exponentially with length, so examining every possibility is infeasible. Beam search offers a practical compromise: at each step it retains a fixed number of the most probable partial sequences, called the beam, and expands only those.
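To make the scale of the problem concrete, the short sketch below compares the number of sequences an exhaustive search would face with the number of candidate extensions beam search scores. The vocabulary size of 50,000, the length of 20 tokens and the beam width of 5 are illustrative assumptions, not figures from any particular model.

```python
def exhaustive_sequences(vocab_size: int, length: int) -> int:
    """Number of distinct token sequences of exactly `length` tokens."""
    return vocab_size ** length


def beam_extensions_scored(vocab_size: int, length: int, beam_width: int) -> int:
    """Upper bound on candidate extensions scored by beam search over `length` steps."""
    if length <= 0:
        return 0
    # The first step expands one empty prefix; each later step expands
    # at most `beam_width` prefixes, each with `vocab_size` possible next tokens.
    return vocab_size + (length - 1) * beam_width * vocab_size


# Illustrative (assumed) numbers.
VOCAB, LENGTH, WIDTH = 50_000, 20, 5
print(f"exhaustive: {exhaustive_sequences(VOCAB, LENGTH):.2e}")  # roughly 9.5e93
print(f"beam search: {beam_extensions_scored(VOCAB, LENGTH, WIDTH):,}")  # 4,800,000
```

The exhaustive count grows exponentially with length, while the beam search count grows only linearly in both length and beam width.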

How it works

Generation proceeds from left to right. At each step the model assigns a probability to every possible next token for each candidate sequence currently in the beam. Beam search scores all the resulting extended sequences and keeps only the top candidates, where the number kept is the beam width, often a small value such as 4 or 5. The process repeats until sequences reach an end-of-sequence marker or a maximum length.
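The loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production decoder: the toy probability table `TOY`, the helper `toy_next_log_probs` and the `<eos>` marker are all hypothetical stand-ins for a real model, and scores are summed log-probabilities without any normalisation.

```python
import math
from typing import Callable, Dict, List, Tuple

EOS = "<eos>"  # assumed end-of-sequence marker
Hypothesis = Tuple[Tuple[str, ...], float]  # (tokens, summed log-probability)

# Hypothetical toy "model": next-token probabilities for a given prefix.
TOY: Dict[Tuple[str, ...], Dict[str, float]] = {
    (): {"the": 0.5, "a": 0.4, EOS: 0.1},
    ("the",): {"cat": 0.4, "dog": 0.35, EOS: 0.25},
    ("a",): {"cat": 0.9, EOS: 0.1},
}


def toy_next_log_probs(prefix: Tuple[str, ...]) -> Dict[str, float]:
    # Any prefix not in the table ends the sequence with probability 1.
    probs = TOY.get(prefix, {EOS: 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


def beam_search(
    next_log_probs: Callable[[Tuple[str, ...]], Dict[str, float]],
    beam_width: int,
    max_len: int,
) -> List[Hypothesis]:
    beam: List[Hypothesis] = [((), 0.0)]  # start from the empty sequence
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Extend every sequence in the beam by every possible next token.
        candidates: List[Hypothesis] = []
        for seq, score in beam:
            for token, log_p in next_log_probs(seq).items():
                candidates.append((seq + (token,), score + log_p))
        # Keep only the `beam_width` highest-scoring extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = []
        for seq, score in candidates[:beam_width]:
            (finished if seq[-1] == EOS else beam).append((seq, score))
        if not beam:  # every surviving sequence has ended
            break
    finished.extend(beam)  # sequences cut off at max_len
    return sorted(finished, key=lambda c: c[1], reverse=True)


for width in (1, 2):
    tokens, score = beam_search(toy_next_log_probs, width, max_len=5)[0]
    print(width, tokens, round(math.exp(score), 3))
```

On this toy table, a width of 1 commits to "the" at the first step and ends with a sequence of probability 0.2, while a width of 2 also keeps "a" alive and finds "a cat" with probability 0.36.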

A beam width of 1 reduces beam search to greedy decoding, which simply picks the single most likely token at each step and can miss better overall sequences. Wider beams explore more alternatives and usually improve quality up to a point, after which returns diminish while computation keeps growing roughly in proportion to the beam width. Sequence scores are usually sums of token log-probabilities, and every additional token adds a negative term, so longer sequences tend to score lower. Implementations therefore typically apply length normalisation, dividing the accumulated score by a function of sequence length to avoid an unfair bias toward short outputs.
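The effect of length normalisation can be seen in a small sketch. Dividing by length raised to a power `alpha` is only one simple choice of normalising function, and both the function name and the two hypotheses with their per-token probabilities are made up for illustration.

```python
import math
from typing import List, Tuple


def length_normalised_score(log_prob_sum: float, length: int, alpha: float = 1.0) -> float:
    """Divide a summed log-probability by length**alpha (alpha=0 means no normalisation)."""
    return log_prob_sum / (length ** alpha)


# Hypothetical finished hypotheses: (tokens, per-token probabilities).
hypotheses: List[Tuple[List[str], List[float]]] = [
    (["ok", "<eos>"], [0.5, 0.6]),
    (["that", "is", "fine", "<eos>"], [0.7, 0.8, 0.7, 0.7]),
]


def rank(alpha: float) -> List[Tuple[List[str], float]]:
    """Score each hypothesis and return them best-first."""
    scored = []
    for tokens, probs in hypotheses:
        raw = sum(math.log(p) for p in probs)  # summed log-probability
        scored.append((tokens, length_normalised_score(raw, len(tokens), alpha)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


print("raw:       ", rank(alpha=0.0)[0][0])  # the two-token hypothesis wins
print("normalised:", rank(alpha=1.0)[0][0])  # the four-token hypothesis wins
```

Without normalisation the shorter hypothesis wins because it multiplies fewer probabilities together, even though each of its tokens is individually less likely than those of the longer one.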

Relationship to sampling

Beam search is deterministic: for a given model, input and beam width it always returns the same high-probability output, which suits tasks where the output is tightly constrained by the input, such as translation. For open-ended generation, however, it can produce repetitive or generic text. Modern large language models therefore frequently use stochastic decoding methods such as top-k sampling, nucleus (top-p) sampling and temperature scaling, which trade some probability for diversity and creativity. Beam search and sampling are sometimes combined, for example by sampling the candidates kept at each step instead of always taking the top-ranked ones.
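For contrast with beam search, the stochastic methods named above can be sketched as a single function that draws one next token. This is a simplified illustration using only the standard library: the function name, the logits and the order in which the filters are applied are assumptions, and real libraries differ in details such as whether probabilities are renormalised between the top-k and top-p steps.

```python
import math
import random
from typing import Dict, Optional


def sample_next_token(
    logits: Dict[str, float],
    temperature: float = 1.0,  # must be > 0; lower values sharpen the distribution
    top_k: Optional[int] = None,
    top_p: Optional[float] = None,
    rng: Optional[random.Random] = None,
) -> str:
    rng = rng or random.Random()
    # Temperature scaling: divide logits before the softmax.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda item: item[1], reverse=True)
    # Top-k: keep only the k most probable tokens.
    if top_k is not None:
        ranked = ranked[:top_k]
    # Top-p (nucleus): keep the smallest prefix whose cumulative probability reaches p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, p in ranked:
            kept.append((tok, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = kept
    tokens = [tok for tok, _ in ranked]
    weights = [p for _, p in ranked]  # random.choices renormalises the weights
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical logits for four candidate tokens.
logits = {"a": 2.0, "b": 1.0, "c": 0.0, "d": -1.0}
rng = random.Random(0)
print([sample_next_token(logits, temperature=0.8, top_k=3, rng=rng) for _ in range(5)])
```

Unlike beam search, repeated calls with different random seeds can return different tokens, which is the source of the added diversity.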

Strengths and limitations

Beam search typically finds sequences with higher overall probability than greedy decoding, although this is not guaranteed, while remaining far cheaper than exhaustive search. Its limitations include a tendency to favour bland or repetitive text in open-ended settings, sensitivity to the chosen beam width, and the well-documented observation that the highest-probability sequence is not always the highest-quality one. In neural machine translation, output quality can even decline as the beam widens, a phenomenon sometimes called the beam search curse.

Applications

Beam search remains standard in neural machine translation systems, automatic speech recognition, image captioning and structured prediction tasks where output accuracy matters more than diversity. It is implemented in widely used libraries such as Hugging Face Transformers.

Malaysian Context — Beam Search in Language Technology

Beam search is central to machine translation and speech systems that serve Malaysia's multilingual population, where accurate decoding across Malay, English, Mandarin, Tamil and regional languages is essential. National efforts to build Malay-capable language models, including MaLLaM and the ILMU initiative associated with local organisations and supported by agencies such as MIMOS and MDEC, rely on decoding strategies like beam search for translation and transcription tasks.

Government and public-service applications, such as automated translation of official documents and speech-to-text for Bahasa Melayu, benefit from beam search's accuracy-oriented behaviour. Telecommunications and media companies including Astro and TM apply sequence decoding in subtitling, captioning and voice interfaces.

Malaysian universities teach beam search within natural language processing and deep learning courses, and the country's broader investment in sovereign language models, encouraged under the MyDIGITAL blueprint and the national AI agenda, depends on robust decoding methods to make these systems usable across local languages and dialects.

References

Sutskever, I., Vinyals, O. and Le, Q. (2014). Sequence to Sequence Learning with Neural Networks. NeurIPS.
Freitag, M. and Al-Onaizan, Y. (2017). Beam Search Strategies for Neural Machine Translation. Workshop on Neural Machine Translation.
Holtzman, A., Buys, J., Du, L., Forbes, M. and Choi, Y. (2020). The Curious Case of Neural Text Degeneration. ICLR.

Tags: decoding, sequence generation, search algorithm, nlp

Type	Heuristic search algorithm
Domain	Sequence decoding
Key parameter	Beam width
Common use	Machine translation, speech recognition
Related	Machine translation, Large language models