What is AIWiki Malaysia?

AIWiki Malaysia is a free, open AI knowledge base covering artificial intelligence concepts, tools, models, and use cases — written specifically for Malaysian professionals and students. It is maintained by AITG Sdn Bhd, an AI company based in Penang.

Who maintains AIWiki Malaysia?

AIWiki Malaysia is maintained by AITG Sdn Bhd (Registration: 202601016521 (1678618-W)), an AI company headquartered in George Town, Penang, Malaysia. The editorial team continuously updates and expands the knowledge base.

Llama

Llama is a family of open-weight large language models developed by Meta AI. Since Llama 2, the models have been released under a community licence that allows most researchers and developers to download, fine-tune, and deploy them for both research and commercial use.

6 min read | Last updated May 2026 | Models

Llama (Large Language Model Meta AI) is a family of open-weight foundation models created by Meta AI, the artificial intelligence research division of Meta Platforms. Since the release of the first version in February 2023, Llama has become one of the most widely downloaded open-weight model families in the world, surpassing one billion cumulative downloads by 2025.[^1] Unlike proprietary systems such as GPT-4 or Gemini, Llama models are released with publicly accessible weights, enabling researchers, developers, and enterprises to inspect, fine-tune, and deploy them without licensing fees or dependence on a third-party API.

Background and Motivation

Prior to Llama's release, access to state-of-the-art language models was largely gated behind proprietary API agreements. Researchers who wished to study model behaviour, alignment properties, or failure modes had to work through opaque interfaces with no visibility into the underlying parameters. Meta's decision to publish model weights represented a deliberate philosophical stance: that the research community — and society more broadly — benefits from being able to audit and improve AI systems rather than treating them as black boxes.[^2]

The original Llama paper, published in February 2023, described models ranging from 7 billion to 65 billion parameters. The two smaller models were trained on approximately 1.0 trillion tokens and the two larger models on approximately 1.4 trillion tokens, all drawn from publicly available text corpora.[^3] Crucially, the researchers demonstrated that a carefully curated dataset and efficient training procedure could produce a 13-billion-parameter model that matched or exceeded the performance of GPT-3 (175 billion parameters) on several benchmarks, establishing that scale was not the only route to capable models.

Model Generations

Llama 1 (February 2023)

The original Llama release offered models with 7B, 13B, 33B, and 65B parameters. It was initially distributed under a research-only licence, which ruled out commercial deployment. Despite this restriction, the weights were leaked online about a week after the initial release, accelerating community fine-tuning efforts and producing a large ecosystem of derivative models including Alpaca, Vicuna, and WizardLM.

Llama 2 (July 2023)

Meta revised its licensing approach with Llama 2, releasing the models under a community licence that allowed most organisations to use the weights in production applications, subject to an acceptable use policy. The main exception was a clause requiring services with more than 700 million monthly active users to request a separate licence from Meta, a threshold aimed at large competitors rather than typical enterprises. Llama 2 introduced models at 7B, 13B, and 70B parameter sizes, along with fine-tuned chat variants optimised using Reinforcement Learning from Human Feedback (RLHF).

Llama 3 (April 2024)

Llama 3 was pre-trained on a much larger and higher-quality dataset than its predecessor: approximately 15 trillion tokens, roughly seven times the 2 trillion tokens used for Llama 2. It also adopted a new tokeniser that expanded the vocabulary from 32,000 to 128,000 tokens. The 8B and 70B base and instruction-tuned variants demonstrated performance competitive with closed models on standard benchmarks including MMLU, HumanEval, and GSM8K. A 405-billion-parameter variant was subsequently released as part of Llama 3.1 in July 2024, which Meta described as the largest openly available foundation model at that time.

Llama 4 (April 2025)

Llama 4 marked Meta's transition to natively multimodal architectures. The release comprised three models built on a Mixture of Experts (MoE) framework: Scout (17B active / 109B total parameters, 10 million token context window), Maverick (17B active / 400B total parameters, 1 million token context window), and Behemoth (288B active / 2 trillion total parameters, not yet publicly released as of mid-2025). All Llama 4 models were trained on large quantities of unlabelled text, image, and video data spanning 200 languages, giving them broad visual understanding alongside strong language capabilities.[^4]
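As a rough illustration of what the active and total figures above imply, the short Python sketch below computes the share of parameters used for each token. The parameter counts are the ones quoted in this article (Behemoth's "2 trillion" is taken as 2,000 billion); the dictionary and function names are hypothetical and chosen only for this example.

```python
# Active and total parameter counts in billions, as quoted in the text above.
# Names here are illustrative only, not part of any Meta tooling.
LLAMA4_MODELS = {
    "Scout": (17, 109),
    "Maverick": (17, 400),
    "Behemoth": (288, 2000),
}


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that take part in processing one token."""
    return active_b / total_b


if __name__ == "__main__":
    for name, (active, total) in LLAMA4_MODELS.items():
        share = active_fraction(active, total)
        print(f"{name}: {active}B of {total}B parameters active ({share:.2%})")
```

The ratio is only a proxy for per-token compute, since attention layers and any shared components run for every token regardless of routing.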

Architecture and Training

Llama models use a decoder-only transformer architecture with several modifications relative to the original 2017 design: pre-normalisation using RMSNorm (for training stability), rotary positional embeddings (RoPE) for improved generalisation across sequence lengths, and grouped-query attention (GQA) in later versions to reduce memory bandwidth requirements during inference. The Llama 4 generation adopted a Mixture of Experts design, in which each token is routed to a small subset of specialist sub-networks rather than passing through all parameters. This reduces the compute spent on each token while increasing total model capacity.
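Two of the ideas above, RMSNorm and expert routing, can be sketched in a few lines of plain Python. This is a minimal illustration, not Meta's implementation: the softmax-then-top-k gating with renormalisation is one common routing scheme and is an assumption here, and the toy experts, the `k` value, and all function names are hypothetical.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root mean square, then apply a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Select the k highest-scoring experts and renormalise their gate weights."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(x, router_logits, experts, k=1):
    """Run only the selected experts and sum their outputs, weighted by gate."""
    out = [0.0] * len(x)
    for idx, gate in route_top_k(router_logits, k):
        expert_out = experts[idx](x)
        out = [o + gate * v for o, v in zip(out, expert_out)]
    return out


# Toy "experts": simple functions standing in for feed-forward sub-networks.
EXPERTS = [
    lambda v: [2.0 * a for a in v],
    lambda v: [a + 1.0 for a in v],
    lambda v: [-a for a in v],
    lambda v: [a * a for a in v],
]

if __name__ == "__main__":
    hidden = rms_norm([3.0, 4.0], weight=[1.0, 1.0])
    mixed = moe_layer(hidden, router_logits=[0.1, 2.0, -1.0, 1.5], experts=EXPERTS, k=2)
    print(hidden, mixed)
```

With four experts and `k=2`, only half of the expert functions run for a given input, which is the source of the compute saving described above. In a real model the router logits come from a learned linear layer applied to the token's hidden state.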

Safety Infrastructure

Meta distributes a suite of companion tools alongside the Llama weights. Llama Guard is a fine-tuned classifier designed to detect policy-violating content in both prompts and model responses across categories including violence, hate speech, sexual content, and dangerous instructions. Prompt Guard is a separate model that identifies prompt injection attempts, in which adversarial content embedded in external data sources attempts to hijack the model's behaviour. CyberSecEval provides structured benchmarks for evaluating model vulnerability to cybersecurity misuse.

Ecosystem and Derivative Models

The open-weight nature of Llama has catalysed one of the largest AI ecosystems outside of proprietary platforms. Hugging Face hosts thousands of Llama-derived fine-tunes spanning domains including medicine, law, coding, and instruction following. Enterprises have used LoRA and QLoRA techniques to adapt Llama for private data at a fraction of the cost of training from scratch. Deployment frameworks such as Ollama, llama.cpp, and vLLM allow the models to run on consumer hardware or be served at scale on cloud infrastructure.
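To give a sense of why LoRA adaptation is cheap relative to full fine-tuning, the sketch below counts trainable parameters for a single weight matrix and shows how a low-rank update is merged back into the base weight. It follows the standard LoRA formulation (the update is the product of two small matrices B and A, scaled by alpha / rank); the 4096 hidden size, the rank of 8, and all function names are assumptions chosen for illustration, and the quantisation step that distinguishes QLoRA is not shown.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters updated when fine-tuning a d_out x d_in matrix directly."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(w, b, a, alpha: float, rank: int):
    """Return W + (alpha / rank) * (B @ A), the merged weight after training."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    d, r = 4096, 8  # hypothetical hidden size and LoRA rank
    full, lora = full_params(d, d), lora_params(d, d, r)
    print(f"full: {full:,}  lora: {lora:,}  share: {lora / full:.4%}")
```

Under these assumed sizes, the LoRA factors hold 65,536 parameters against 16,777,216 in the full matrix, which is under half a percent of the weights being trained for that layer.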

Malaysian Context — Open-Source AI Adoption

Malaysia's position as a growing AI hub in Southeast Asia has been shaped partly by the availability of open-weight models such as Llama, which lower the barrier to entry for local startups, universities, and government agencies that lack the budget for high-volume proprietary API usage. The MyDigital Blueprint and the Malaysia AI Roadmap both emphasise sovereign AI capability, and open-weight models are a key enabler of that goal — allowing local fine-tuning on Bahasa Malaysia corpora without routing sensitive data through foreign cloud APIs.

Several Malaysian technology companies and research institutions have experimented with Llama-based deployments. Universiti Malaya and Universiti Teknologi Malaysia have used Llama variants in natural language processing research, particularly for Malay language tasks. The MDEC Digital Hub programme has highlighted open-source LLMs as a pathway for SMEs to build AI-powered products without incurring prohibitive API costs.

Telekom Malaysia (TM) and Maxis have explored Llama-based solutions for customer service automation, where on-premises deployment addresses data residency requirements under Malaysia's Personal Data Protection Act (PDPA). Retaining customer conversation data within national borders is a compliance priority for licensed telecommunications operators, and locally hosted Llama instances provide a pathway that cloud-based proprietary APIs cannot easily match.

HRD Corp (formerly HRDC) has funded training programmes in AI model deployment, with content covering open-source LLMs including Llama fine-tuning using LoRA. These programmes are available to Malaysian employers and workers through the SBL-Khas scheme. Malaysia's cost-competitive cloud infrastructure, anchored by data centres operated by Microsoft, AWS, and Google in the Klang Valley and Johor, also provides a favourable environment for running larger Llama variants at scale.

References

[^1] Meta AI. (2025). Llama 4: The Beginning of a New Era of Natively Multimodal AI Innovation. Meta AI Blog.
[^2] Touvron, H., et al. (2023). LLaMA: Open and Efficient Foundation Language Models. arXiv:2302.13971.
[^3] Touvron, H., et al. (2023). LLaMA: Open and Efficient Foundation Language Models. Meta AI Research.
[^4] Meta AI. (2025). Llama 4 Technical Report. Meta Platforms.

Tags: llama, meta-ai, open-source, large-language-model

Type	Large language model family
Developed by	Meta AI (Meta Platforms)
First released	February 2023
Latest version	Llama 4 (April 2025)
Licence	Meta Llama Community Licence (permissive, commercial use allowed)
Related	GPT-4, Mistral, Gemini, DeepSeek