What is AIWiki Malaysia?

AIWiki Malaysia is a free, open AI knowledge base covering artificial intelligence concepts, tools, models, and use cases — written specifically for Malaysian professionals and students. It is maintained by AITG Sdn Bhd, an AI company based in Penang.

Who maintains AIWiki Malaysia?

AIWiki Malaysia is maintained by AITG Sdn Bhd (Registration: 202601016521 (1678618-W)), an AI company headquartered in George Town, Penang, Malaysia. The editorial team continuously updates and expands the knowledge base.

What topics does AIWiki Malaysia cover?

AIWiki Malaysia covers a wide range of AI topics including large language models (LLMs), AI agents, machine learning fundamentals, prompt engineering, AI automation, generative AI tools, Malaysian AI regulations, local vendor landscape, and real-world AI use cases relevant to the Malaysian market.

How do I search for AI topics on AIWiki Malaysia?

You can use the search bar at the top of the site to find articles by keyword or topic. Articles are also organised by category, so you can browse by subject area such as Models, Tools, Concepts, or Use Cases.

Is AIWiki Malaysia available in Bahasa Malaysia?

Yes. AIWiki Malaysia publishes content in both English and Bahasa Malaysia to serve the full breadth of the Malaysian professional and student community. Language availability is indicated on each article page.

How can I submit a topic or suggest an article?

You can suggest topics or submit article ideas by contacting the AIWiki Malaysia team at admin@aiteragrid.com. AITG Sdn Bhd reviews all submissions and publishes content that meets editorial accuracy standards.

Gemini

Gemini is a family of multimodal large language models developed by Google DeepMind, designed to natively process and generate text, code, images, audio, and video across a range of model sizes.

6 min read · Last updated May 2026 · Models

Gemini is a family of multimodal large language models developed by Google DeepMind and first announced in December 2023. The models are designed from the ground up to process and reason across multiple data modalities simultaneously, including text, computer code, images, audio, and video, without relying on separate specialised models for each modality. Gemini succeeded the earlier PaLM 2 model family as Google's flagship AI system and represents the company's primary entry in the competitive large language model landscape alongside OpenAI's GPT series and Anthropic's Claude.

The family spans a range of model sizes designed for different deployment contexts: Ultra for the most demanding reasoning tasks, Pro for enterprise and API use cases, Flash for latency-sensitive applications, and Nano for on-device deployment on mobile hardware.

Architecture and Training

Gemini models are trained as natively multimodal systems. Earlier architectures typically adapted a language model by bolting on separate vision or audio encoders; Gemini instead processes multiple modalities through a unified transformer-based network from the outset. Google DeepMind has described the architecture as building on advances in efficient attention mechanisms and transformer scaling, though precise architectural details have not been fully disclosed.

The models perform strongly on established benchmarks. At its release, Google reported that Gemini Ultra scored 90.0% on the MMLU (Massive Multitask Language Understanding) benchmark, above the 89.8% human-expert baseline, and subsequent versions have achieved leading scores on reasoning, coding, and mathematics evaluations. Gemini 3, released in late 2025, achieved an Elo score of 1501 on the LMArena Leaderboard, placing it among the top models in human preference evaluations.
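To give a rough sense of what a rating gap on a preference leaderboard means, the sketch below applies the standard Elo expected-score formula. It assumes the leaderboard score can be read on the conventional 400-point logistic Elo scale, which may not match the leaderboard's exact fitting procedure. The opponent rating of 1451 is a made-up value for illustration, not a figure from any leaderboard.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected share of head-to-head preferences for model A under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# 1501 is the score quoted in the text; 1451 is a hypothetical opponent rating.
print(round(expected_score(1501, 1451), 3))
```

Under these assumptions, a 50-point lead corresponds to the higher-rated model being preferred in roughly 57% of pairwise comparisons, so even top-of-table gaps can reflect fairly modest differences in preference.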

Context Window and Multimodal Capabilities

One of the most distinctive technical features of Gemini 1.5 and later versions is the long context window. Gemini 1.5 Pro introduced a 1 million token context window, enabling the model to process entire codebases, long documents, and extended video content in a single inference call. Gemini 3 maintains this 1 million token input context, with an output capacity of 64,000 tokens.
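As an illustration of how an application might check whether a set of documents fits within such a window before sending them, here is a minimal sketch. The four-characters-per-token ratio is an assumed rule of thumb, not the real tokenizer, and the function names are hypothetical; a production check would use the API's own token counting.

```python
import math

INPUT_LIMIT_TOKENS = 1_000_000  # input context quoted in the text
CHARS_PER_TOKEN = 4             # assumption: rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Crude token estimate derived from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(documents: list[str], prompt: str = "") -> bool:
    """Return True if the prompt plus all documents fit the input budget."""
    total = estimate_tokens(prompt) + sum(estimate_tokens(d) for d in documents)
    return total <= INPUT_LIMIT_TOKENS


# Two synthetic documents totalling about 750,000 estimated tokens.
docs = ["x" * 2_000_000, "y" * 1_000_000]
print(fits_in_context(docs, prompt="Summarise these files."))
```

If the check fails, the usual fallbacks are to split the input across several calls or to retrieve only the relevant passages.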

Multimodal capabilities include video understanding, where the model can answer questions about the content of a video by reasoning over frames and audio together; audio understanding, where it transcribes and analyses speech and other sounds; and code generation and execution, where Gemini can write, run, and debug code as part of an agentic workflow.
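The sketch below shows how a mixed text-and-image request body might be assembled. It assumes the `contents` / `parts` / `inline_data` field layout used by the public Gemini REST API, which should be checked against the current documentation. The helper name and the placeholder image bytes are illustrative only.

```python
import base64
import json


def build_multimodal_body(question: str, image_bytes: bytes,
                          mime_type: str = "image/png") -> dict:
    """Combine a text question and one image into a single request body."""
    return {
        "contents": [{
            "parts": [
                {"text": question},
                {"inline_data": {
                    "mime_type": mime_type,
                    # Binary data travels as base64 text inside the JSON body.
                    "data": base64.b64encode(image_bytes).decode("ascii"),
                }},
            ]
        }]
    }


# Placeholder bytes stand in for a real image file read from disk.
body = build_multimodal_body("What does this chart show?", b"\x89PNG placeholder bytes")
print(json.dumps(body, indent=2))
```

Both modalities are carried in the same `parts` list, so the model receives them as one prompt rather than as separate requests.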

Model Versions and Access

The Gemini API is accessible through Google AI Studio for developers and through Vertex AI for enterprise deployments. Consumer access is provided through the Gemini web application and through integration into Google Workspace products including Gmail, Docs, and Sheets.
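For developers, a call through the Google AI Studio route is an HTTPS POST to a `generateContent` endpoint. The sketch below prepares such a request with the Python standard library but does not send it. The endpoint path and `x-goog-api-key` header follow the public REST documentation as understood at the time of writing. The model identifier is a placeholder to replace with a current model name, and `YOUR_API_KEY` is a dummy value. Vertex AI uses a different endpoint and authentication scheme that is not shown here.

```python
import json
import urllib.request

API_ROOT = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt: str, api_key: str,
                  model: str = "gemini-2.5-flash") -> urllib.request.Request:
    """Prepare (but do not send) a generateContent request."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


req = build_request("Explain what a context window is.", api_key="YOUR_API_KEY")
print(req.full_url)
# To send it: urllib.request.urlopen(req), then json.loads() the response body.
```

In practice most projects would use Google's official client libraries rather than raw HTTP; the raw form is shown only because it makes the request structure visible.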

Gemini 1.0 was released in three sizes in December 2023. Gemini 1.5, launched in February 2024, introduced the long context window and improved multimodal reasoning. Gemini 2.0, released in late 2024, added native image generation, controllable text-to-speech, and significantly improved agentic capabilities. Gemini 3 and 3.1, the versions current in 2025 and 2026 respectively, further advanced reasoning, instruction following, and tool use for complex multi-step tasks.

Agentic and Tool Use Capabilities

Later Gemini versions were designed with agentic use cases in mind. The models support function calling, enabling them to invoke external tools and APIs as part of multi-step workflows. Gemini 3.1 Pro includes what Google describes as exceptional instruction following and improved tool use, making it suitable for building AI agents that can plan, take actions, and handle complex tasks across multiple steps.
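On the application side, function calling usually comes down to a dispatch step: the model emits a structured request naming a tool and its arguments, and the application runs the matching code and returns the result. The sketch below illustrates only that step. The tool `get_exchange_rate`, its fixed demo rate, and the dictionary shapes are hypothetical stand-ins, and no model or API is contacted.

```python
from typing import Any, Callable


def get_exchange_rate(base: str, quote: str) -> dict:
    """Hypothetical tool: returns a made-up demo rate instead of calling a real service."""
    return {"base": base, "quote": quote, "rate": 4.5}


# Registry mapping tool names (as declared to the model) to Python callables.
TOOLS: dict[str, Callable[..., Any]] = {"get_exchange_rate": get_exchange_rate}


def dispatch(function_call: dict) -> dict:
    """Run the tool the model asked for and wrap the result for the next turn."""
    name = function_call["name"]
    if name not in TOOLS:
        return {"name": name, "response": {"error": f"unknown tool: {name}"}}
    result = TOOLS[name](**function_call.get("args", {}))
    return {"name": name, "response": result}


# Stand-in for a structured call a model might emit.
simulated_call = {"name": "get_exchange_rate", "args": {"base": "USD", "quote": "MYR"}}
print(dispatch(simulated_call))
```

An agent loop repeats this cycle, sending each tool result back to the model until it produces a final answer. Returning an error payload for unknown tools, rather than raising, lets the model recover within the same conversation.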

Google has integrated Gemini into its Workspace suite through the Gemini for Google Workspace programme, providing AI assistance for document writing, email composition, data analysis in spreadsheets, and video meeting summaries.

Competitive Landscape

Gemini competes directly with OpenAI's GPT-4 and GPT-4o, Anthropic's Claude series, and Meta's Llama models. The model family's native multimodality and tight integration with Google's infrastructure and search capabilities are its primary differentiators. The long context window was a significant competitive advantage at its introduction, enabling use cases that required processing large volumes of information in a single call.

Malaysian Context — Gemini in Malaysian Enterprise and Education

Google Cloud has a growing presence in Malaysia, with data centre infrastructure established in the country and partnerships with local enterprises and government agencies. Gemini is accessible to Malaysian developers and enterprises through the Google AI Studio API and through Google Cloud Vertex AI, which is available in the Asia Pacific region.

Several Malaysian government-linked companies and large enterprises have adopted Google Cloud services, and Gemini's enterprise integration through Vertex AI makes it a candidate for deployment in sectors including banking, telecommunications, and public services. Telekom Malaysia (TM), which operates significant cloud infrastructure in the country, has been an active Google Cloud partner.

In Malaysian education, Google Workspace for Education is widely used across schools and universities. Gemini's integration into Workspace tools means Malaysian students and educators have access to AI-assisted writing and research capabilities through familiar interfaces. The Ministry of Education Malaysia's digital transformation initiatives include the expansion of technology tools in classrooms, and AI-assisted productivity tools are part of this trajectory.

Malaysian fintech and banking firms exploring generative AI for customer service, document processing, and compliance reporting have evaluated both Gemini and competing models. The Gemini API's long context window is particularly relevant for financial document analysis, where a model must reason across lengthy regulatory filings, loan applications, or audit reports.

MDEC has recognised Google as a strategic technology partner under the Malaysia Digital initiative, and Google's investments in Malaysian AI talent development — including cloud certification programmes offered through local training providers — have contributed to a growing cohort of Malaysian developers with Gemini API experience.


Tags: gemini, google deepmind, multimodal AI, large language model, google AI

Type	Multimodal large language model family
Developed by	Google DeepMind
Initial release	December 2023
Latest version	Gemini 3.1 Pro (2026)
Access	API (Google AI Studio, Vertex AI), consumer apps
Context window	Up to 1 million tokens
