Hybrid Search

Hybrid search is a retrieval technique that combines sparse keyword-based search (typically BM25) with dense vector semantic search to achieve superior recall and precision over either method alone.

6 min read | Last updated June 2026 | Applications

Hybrid search is an information retrieval paradigm that fuses sparse lexical retrieval (most commonly BM25) with dense semantic retrieval based on vector embeddings. Because the two methods have complementary strengths, a hybrid system typically outperforms either one in isolation: it retains high recall on exact-match queries while also capturing semantic intent that keyword methods miss. By 2025, hybrid search had become the de facto standard architecture for production-grade retrieval-augmented generation (RAG) systems.

Motivation

Two classical retrieval paradigms dominate information retrieval. Sparse retrieval methods represent documents and queries as high-dimensional vectors in vocabulary space, where most dimensions are zero. BM25, the dominant sparse method, scores documents based on term frequency weighted by inverse document frequency, with saturation and length normalisation. Sparse methods excel at matching exact terms, product codes, proper nouns, and technical identifiers. They are fast, interpretable, and require no GPU infrastructure.
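The BM25 ingredients named above (term frequency, inverse document frequency, saturation, and length normalisation) can be sketched in a few lines. This is a minimal illustration rather than a production index: the function name `bm25_scores`, the toy corpus, and the parameter defaults `k1=1.5` and `b=0.75` are assumptions chosen for the example, and the IDF uses the common "+1" smoothed variant.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenised document against a tokenised query with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF ("+1" variant), which stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalisation: longer-than-average documents are penalised.
            norm = 1 - b + b * len(doc) / avg_len
            # Saturation: each repeated occurrence adds less than the previous one.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * norm)
        scores.append(score)
    return scores


# Toy corpus (illustrative only): an exact identifier appears in one document.
docs = [
    "error code e4021 on router firmware".split(),
    "how to reset a home router".split(),
    "billing policy for late payment".split(),
]
scores = bm25_scores("e4021 router".split(), docs)
```

In this toy corpus the first document ranks highest because it contains the rare identifier `e4021`, the second scores lower on the shared term `router` alone, and the third scores zero.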

Dense retrieval methods encode queries and documents as continuous vector embeddings of comparatively low dimension (typically hundreds to a few thousand dimensions, against a vocabulary-sized sparse space), produced by neural bi-encoder models such as sentence transformers. Documents are retrieved by approximate nearest-neighbour search in embedding space. Dense methods capture paraphrase, synonym substitution, and conceptual similarity, returning semantically relevant documents even when they share no vocabulary with the query. However, they can miss exact matches, fail on rare or out-of-vocabulary terms, and require substantial compute for embedding and indexing.
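The core of dense retrieval is a similarity search over vectors. The sketch below uses exact (brute-force) cosine similarity over hand-made 3-dimensional vectors; in a real system the vectors would come from a neural encoder and the search would use an approximate nearest-neighbour index. The document ids, the vectors, and the function names are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def dense_top_k(query_vec, doc_vecs, k=2):
    """Exact nearest-neighbour search: rank document ids by cosine similarity."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hand-made vectors standing in for encoder output (assumption for illustration).
doc_vecs = {
    "reset-router-guide": [0.9, 0.1, 0.0],
    "reboot-modem-steps": [0.6, 0.6, 0.1],
    "late-payment-policy": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.1, 0.0]  # stand-in for an embedded query such as "restart my wifi box"
top = dense_top_k(query_vec, doc_vecs)
```

Note that no token matching is involved: ranking depends only on vector geometry, which is why dense retrieval can surface a document that shares no words with the query.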

Neither method is universally superior. A query for a specific product SKU, legal case citation, or medical ICD code benefits from exact keyword matching. A query asking for documents conceptually related to a theme requires semantic understanding. Real-world retrieval workloads contain both types, and hybrid search addresses this by running both pipelines and merging their results.

Architecture

A hybrid search system operates in three stages.

In the first stage, sparse and dense retrieval run in parallel. The sparse retriever, typically BM25 implemented via Elasticsearch, OpenSearch, or a purpose-built index, scores and ranks documents using lexical overlap with the query. Simultaneously, the dense retriever embeds the query using a neural encoder, queries a vector database or approximate nearest-neighbour index, and returns the top-k semantically similar documents.
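A minimal sketch of the first stage, assuming two hypothetical retriever functions (`sparse_retrieve` and `dense_retrieve`) that stand in for calls to a search engine and a vector index. The point is only the concurrency pattern: both requests are issued at once, so the stage takes roughly as long as the slower retriever rather than the sum of the two.

```python
from concurrent.futures import ThreadPoolExecutor


def sparse_retrieve(query, k):
    # Hypothetical stand-in for a BM25 query against a search engine.
    return ["doc-sku-123", "doc-returns", "doc-shipping"][:k]


def dense_retrieve(query, k):
    # Hypothetical stand-in for embedding the query and searching a vector index.
    return ["doc-returns", "doc-refund-faq", "doc-sku-123"][:k]


def retrieve_both(query, k=3):
    """Run both retrievers concurrently and return their two ranked lists."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        sparse_future = pool.submit(sparse_retrieve, query, k)
        dense_future = pool.submit(dense_retrieve, query, k)
        return sparse_future.result(), dense_future.result()


sparse_hits, dense_hits = retrieve_both("return policy for SKU 123", k=2)
```

Threads are a reasonable fit here because both calls are typically I/O-bound network requests; an async client would serve the same purpose.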

In the second stage, the two ranked lists are merged using a score fusion strategy. Reciprocal Rank Fusion (RRF) is the most widely adopted technique: each document's fused score is the sum of 1/(k + r_i) over every list i in which it appears, where r_i is the document's rank in list i and k is a smoothing constant (commonly 60). Because RRF uses only ranks, it is robust to differences in score distribution between the two lists and requires no calibration of relative weights. Weighted linear score interpolation is an alternative: it combines normalised BM25 and cosine similarity scores using tunable alpha and (1 - alpha) coefficients, at the cost of requiring calibration.
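Both fusion strategies described above can be sketched briefly. The RRF function follows the formula as stated (1-based ranks, k = 60). For the weighted variant, min-max normalisation and the choice to put alpha on the BM25 side are assumptions made for this example; other normalisations and conventions are in use. All function names and scores are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists of document ids; ranks are 1-based."""
    fused = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fusion(bm25, cosine, alpha=0.5):
    """alpha weights the normalised BM25 score, (1 - alpha) the cosine score."""
    sparse, dense = min_max(bm25), min_max(cosine)
    doc_ids = set(sparse) | set(dense)
    fused = {
        d: alpha * sparse.get(d, 0.0) + (1 - alpha) * dense.get(d, 0.0)
        for d in doc_ids
    }
    return sorted(fused, key=fused.get, reverse=True)


# Toy inputs: "b" is ranked well by both retrievers, "c" and "d" by only one.
rrf_order = reciprocal_rank_fusion([["a", "b", "c"], ["b", "d", "a"]])
weighted_order = weighted_fusion(
    {"a": 12.0, "b": 8.0, "c": 2.0},
    {"b": 0.9, "d": 0.8, "a": 0.5},
)
```

In this example both methods place `b` first, since it is the only document ranked in the top two by both retrievers. The weighted variant depends on the raw score scales and on alpha, which is the calibration burden the text refers to.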

The third stage optionally passes the merged candidate set to a reranker: a cross-encoder model that jointly encodes the query and each candidate document to produce more accurate relevance scores. A cross-encoder needs one full transformer forward pass per query-document pair, and self-attention cost grows quadratically with the combined input length, so it is too expensive to run over an entire corpus. Reranking is feasible because it operates only on a small candidate set (typically 20–100 documents).
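The reranking stage has a simple shape: score every (query, candidate) pair, then re-sort. In the sketch below the pairwise scorer is passed in as a function; `token_overlap` is a deliberately crude stand-in (Jaccard overlap of tokens) used only so the example runs without a model, and is not a cross-encoder. All names and texts are hypothetical.

```python
def rerank(query, candidates, score_pair, top_n=3):
    """Re-order candidates by a pairwise relevance score, highest first."""
    scored = [
        (score_pair(query, text), doc_id) for doc_id, text in candidates.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:top_n]]


def token_overlap(query, text):
    # Stand-in scorer (assumption): Jaccard overlap of lowercase tokens.
    # A real reranker would call a cross-encoder model here instead.
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t)


candidates = {
    "d1": "reset router to factory settings",
    "d2": "router billing plans",
    "d3": "how to reset your router",
}
reranked = rerank("how to reset router", candidates, token_overlap, top_n=2)
```

The scorer is called once per candidate, so the cost of this stage scales linearly with the size of the candidate set, which is why that set is kept small.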

Performance

Empirical benchmarks on MS MARCO, TREC Deep Learning, and BEIR datasets generally show that hybrid search outperforms pure BM25 or pure dense retrieval across the majority of query types. Researchers and industry practitioners report improvements in recall at a fixed cutoff on the order of 15–30% over the stronger of the two individual baselines, although the size of the gain varies considerably with the corpus, the embedding model, and the query mix. The benefit is most pronounced on heterogeneous corpora where queries vary widely in specificity and semantic character.
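Recall at a cutoff k, the metric referred to above, is the fraction of the relevant documents that appear in the top k results. The sketch below computes it for three hand-made result lists; the document ids and rankings are invented purely to show the calculation and are not benchmark results.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


# Invented toy data: three relevant documents and three ranked result lists.
relevant = {"d1", "d4", "d7"}
bm25_run = ["d1", "d2", "d3", "d5", "d4"]
dense_run = ["d7", "d6", "d1", "d8", "d9"]
hybrid_run = ["d1", "d7", "d2", "d4", "d6"]

results = {
    name: recall_at_k(run, relevant, k=4)
    for name, run in [("bm25", bm25_run), ("dense", dense_run), ("hybrid", hybrid_run)]
}
```

Comparing retrievers at the same k on the same labelled query set is what makes a reported percentage improvement meaningful; figures measured at different cutoffs are not directly comparable.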

For RAG applications, improved retrieval quality tends to reduce hallucination and improve factual accuracy in generated answers, because the language model receives more relevant context.

Implementation

Major vector databases and search platforms have integrated hybrid search natively. Pinecone, Weaviate, Qdrant, and Chroma all expose hybrid search capabilities that pair a sparse or keyword index with vector indexing, although the mechanics differ between products (for example, whether the sparse side is a built-in BM25 index or a sparse vector representation). Elasticsearch and OpenSearch support both sparse and dense retrieval and offer RRF-based fusion. Azure AI Search, MongoDB Atlas Search, and Google Vertex AI Search offer managed hybrid search as a cloud service. LangChain and LlamaIndex provide abstraction layers that orchestrate hybrid retrieval pipelines across multiple backends.

Trade-offs

Hybrid search adds operational complexity: teams must maintain and synchronise two indices (a sparse inverted index and a dense vector index) and re-embed the whole corpus whenever the embedding model is changed. Latency is higher than pure BM25 because each query must first pass through the neural encoder, which usually calls for GPU acceleration or a hosted embedding service, before the vector index can be searched. Storage costs are elevated because both sparse postings lists and dense float vectors must be stored. For applications where query latency is critical, pre-computation and caching strategies are commonly employed.
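The storage cost of the dense side is easy to estimate with back-of-envelope arithmetic: number of vectors times dimensions times bytes per value. The figures below (one million documents, 768 dimensions, 4-byte float32 values) are assumptions chosen for illustration, and the estimate covers raw vectors only, excluding any overhead from the nearest-neighbour index structure.

```python
def dense_vector_bytes(n_docs, dim, bytes_per_value=4):
    """Raw size of stored embeddings, excluding ANN index overhead."""
    return n_docs * dim * bytes_per_value


# Assumed workload: 1,000,000 documents, 768-dimensional float32 embeddings.
size_bytes = dense_vector_bytes(n_docs=1_000_000, dim=768)
size_gib = size_bytes / 2**30
```

Under these assumptions the raw vectors alone occupy about 2.86 GiB, in addition to the inverted index. Documents split into several chunks multiply the vector count accordingly.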

Malaysian Context — Enterprise Search and RAG Adoption

Hybrid search has emerged as an important architectural pattern for Malaysian enterprises building knowledge management and retrieval systems. As organisations in Malaysia increasingly deploy retrieval-augmented generation to surface information from internal document repositories, the choice of retrieval backend directly affects the quality of AI-generated responses.

Telecommunications providers such as TM (Telekom Malaysia) and Maxis have explored AI-powered internal knowledge bases and customer service automation, where hybrid search enables agents to locate relevant policy documents, technical manuals, and precedent cases using both keyword and semantic queries. Financial institutions including Maybank, Public Bank, and Hong Leong Bank face analogous requirements in compliance documentation retrieval, where BM25 excels at matching regulation codes and hybrid search adds coverage for policy intent queries.

The Malaysian government's AI initiatives under the MyDIGITAL blueprint and the National AI Office (NAIO), established under the Ministry of Digital, include support for public-sector knowledge management. Hybrid search is relevant to government portals that must serve citizens querying for forms, regulations, and services in a mixture of Bahasa Malaysia and English. The Malaysia Digital Economy Corporation (MDEC) has supported digital transformation programmes that include enterprise search modernisation.

Local cloud and AI service providers, including those operating in Cyberjaya under MSC Malaysia status (since rebranded as Malaysia Digital status), have packaged RAG solutions for Malaysian SMEs and government agencies, with hybrid search as a core component. Amazon Web Services (AWS) and Microsoft Azure both operate data centres in Malaysia, making their managed hybrid search services (Amazon OpenSearch Service and Azure AI Search respectively) accessible with low latency for Malaysian deployments. The adoption of hybrid search in Malaysia is therefore constrained less by infrastructure availability than by AI literacy and by the availability of Bahasa Malaysia embedding models capable of powering the dense retrieval component.

References

Robertson, S., & Zaragoza, H. (2009). The Probabilistic Relevance Framework: BM25 and Beyond. Foundations and Trends in Information Retrieval, 3(4), 333-389.
Cormack, G. V., Clarke, C. L. A., & Buettcher, S. (2009). Reciprocal rank fusion outperforms Condorcet and individual rank learning methods. Proceedings of SIGIR 2009.
Karpukhin, V., et al. (2020). Dense Passage Retrieval for Open-Domain Question Answering. Proceedings of EMNLP 2020.
Pinecone. (2025). Hybrid Search Documentation. Pinecone Systems Inc.
Microsoft. (2025). Azure AI Search: Hybrid Retrieval. Microsoft Corporation.

Tags: search, rag, vector-search, bm25, information-retrieval

Type	Information Retrieval Technique
Combines	BM25 (sparse) + vector search (dense)
Fusion method	Reciprocal Rank Fusion (RRF)
Key use	RAG pipelines, enterprise search, e-commerce
Related	BM25, Vector Database, RAG, Semantic Search, Reranking