Milvus

Last updated June 2026 · Companies & Tools

Milvus is an open-source, cloud-native vector database designed for high-performance similarity search over very large collections of high-dimensional vectors. Originally created by the company Zilliz and now developed as a project under the LF AI & Data Foundation, Milvus is distributed under the Apache 2.0 licence. It is one of the most widely adopted vector databases for workloads such as retrieval-augmented generation, semantic search, recommendation systems, image retrieval, and AI-agent memory, where the central operation is finding the nearest vectors to a query embedding.

Purpose and core operation

Modern AI systems convert text, images, audio, and other data into embeddings: numerical vectors that place semantically similar items close together in a high-dimensional space. Milvus stores these vectors and answers approximate nearest neighbour (ANN) queries, returning the items most similar to a query vector. ANN search trades a small amount of recall for a large gain in speed compared with exhaustively comparing the query against every stored vector. Doing this quickly across millions or billions of vectors requires specialised indexing and a system architecture designed for scale, which is the problem Milvus is built to solve.
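To make the core operation concrete, the sketch below shows the exhaustive (exact) search that ANN indexes approximate. It is not Milvus code: the function names and the toy three-dimensional vectors are illustrative assumptions, and cosine similarity is only one of several possible metrics.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_top_k(query, vectors, k):
    """Score every stored vector against the query and keep the k best.

    `vectors` maps an item id to its embedding. The cost grows linearly
    with the number of stored vectors, which is what ANN indexes avoid.
    """
    scored = ((cosine_similarity(query, vec), item_id)
              for item_id, vec in vectors.items())
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


# Toy 3-dimensional "embeddings"; real ones have hundreds of dimensions.
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
nearest = exact_top_k([1.0, 0.0, 0.0], embeddings, k=2)
```

A vector database replaces the full scan in `exact_top_k` with an index lookup that inspects only a small fraction of the stored vectors.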

Architecture

Milvus 2.x was redesigned as a distributed, cloud-native system that separates storage from compute, so each can scale independently. It also disaggregates the data plane from the control plane and is organised into four layers (access, coordination, worker nodes, and storage) that are mutually independent in terms of scaling and disaster recovery.

Key components include a proxy that handles client connections and request routing; query nodes that execute vector search and scalar filtering; data nodes that manage ingestion and segment operations; index nodes that build and maintain indexes; and a set of coordinators (root, data, and query) that manage cluster topology and task scheduling. Search, data insertion, indexing, and compaction are built as parallelisable processes, and the components are designed to be largely stateless so the system can scale horizontally on Kubernetes or public cloud platforms.

Indexing and search features

Milvus supports a range of index types, including HNSW (Hierarchical Navigable Small World) graphs, IVF (inverted file) variants, DiskANN for datasets that exceed available memory, and several quantisation-based and binary indexes. It also offers GPU-accelerated index building and search through NVIDIA CUDA, including the CAGRA graph index from the cuVS library, which can substantially speed up high-throughput workloads.
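As an illustration of what a quantisation index trades away, here is a minimal sketch of 8-bit scalar quantisation, which stores each dimension in one byte instead of a 4-byte float. It is a simplified teaching example, not Milvus's implementation, and the helper names `train_sq8`, `encode`, and `decode` are hypothetical.

```python
def train_sq8(vectors):
    """Learn per-dimension (min, max) ranges from the training vectors."""
    dims = range(len(vectors[0]))
    lows = [min(v[d] for v in vectors) for d in dims]
    highs = [max(v[d] for v in vectors) for d in dims]
    return lows, highs


def encode(vec, lows, highs):
    """Map each float to an integer code in 0..255 (one byte per dimension)."""
    codes = []
    for x, lo, hi in zip(vec, lows, highs):
        span = (hi - lo) or 1.0  # avoid dividing by zero on constant dimensions
        clamped = min(max(x, lo), hi)  # values outside the trained range saturate
        codes.append(round((clamped - lo) / span * 255))
    return bytes(codes)


def decode(codes, lows, highs):
    """Reconstruct an approximate float vector from its byte codes."""
    return [lo + c / 255 * (hi - lo) for c, lo, hi in zip(codes, lows, highs)]


data = [[0.0, -1.0], [0.5, 0.0], [1.0, 1.0]]
lows, highs = train_sq8(data)
codes = encode(data[1], lows, highs)
approx = decode(codes, lows, highs)
```

The reconstructed vector differs from the original by at most about half a quantisation step per dimension, which is the source of the small accuracy loss that such indexes accept in exchange for lower memory use.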

A notable capability is hybrid search, which combines vector similarity with scalar field filtering, so a query can retrieve the most similar documents subject to conditions on metadata. Milvus supports top-K and range ANN search, dense and sparse vectors, multi-vector search, result grouping, and multi-tenancy, making it suitable for complex production retrieval scenarios.
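The two query shapes described above, filtered top-K and range search, can be sketched over an in-memory list. This is a conceptual illustration only: the row layout, the `year` field, and the function names are assumptions. A real engine would consult an index rather than scan every row, and need not filter strictly before ranking.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def filtered_top_k(query, rows, k, predicate):
    """Apply the scalar filter first, then rank the surviving rows by distance."""
    candidates = [row for row in rows if predicate(row)]
    candidates.sort(key=lambda row: l2_distance(query, row["vector"]))
    return [row["id"] for row in candidates[:k]]


def range_search(query, rows, radius):
    """Return every row whose vector lies within `radius` of the query."""
    return [row["id"] for row in rows
            if l2_distance(query, row["vector"]) <= radius]


rows = [
    {"id": 1, "vector": [0.0, 0.0], "year": 2019},
    {"id": 2, "vector": [0.1, 0.0], "year": 2023},
    {"id": 3, "vector": [0.2, 0.1], "year": 2024},
    {"id": 4, "vector": [5.0, 5.0], "year": 2024},
]
recent = filtered_top_k([0.0, 0.0], rows, k=2,
                        predicate=lambda r: r["year"] >= 2023)
nearby = range_search([0.0, 0.0], rows, radius=0.5)
```

Here the closest row overall (id 1) is excluded from `recent` because it fails the metadata condition, while `nearby` returns a variable number of results determined by the radius rather than a fixed K.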

Deployment options

Milvus provides three deployment models behind a unified application programming interface. Milvus Lite is an embedded, lightweight option for prototyping; Milvus Standalone targets testing and small-scale production; and Milvus Distributed serves large-scale production across a cluster. Official software development kits are available for Python, Java, Node.js, and Go, and Zilliz offers a managed cloud service for teams that prefer not to operate the database themselves.
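One way to picture the unified API is that client code stays the same and only the connection target changes. The sketch below assumes the Python SDK (`pymilvus`) and its `MilvusClient` class. The file path and the cluster hostname are hypothetical placeholders, and 19530 is the default Milvus port.

```python
# Illustrative connection targets; adjust paths, hosts, and ports to your setup.
DEPLOYMENT_URIS = {
    # Milvus Lite: a local file, with the database running in-process.
    "lite": "./milvus_demo.db",
    # Milvus Standalone: a single server on the default port.
    "standalone": "http://localhost:19530",
    # Milvus Distributed: hypothetical cluster endpoint.
    "distributed": "http://milvus.example.internal:19530",
}


def connect(deployment):
    """Return a client for the chosen deployment model.

    pymilvus is imported lazily so this module loads even where the
    package is not installed.
    """
    from pymilvus import MilvusClient
    return MilvusClient(uri=DEPLOYMENT_URIS[deployment])
```

With `pymilvus` installed, `client = connect("lite")` would give a client for prototyping, and switching the argument to `"standalone"` or `"distributed"` would leave later calls on the client unchanged.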

| Aspect | Milvus |
|--------|--------|
| Indexing | HNSW, DiskANN, IVF, quantisation, GPU CAGRA |
| Hybrid search | Vector plus scalar filtering |
| Deployment | Lite, Standalone, Distributed |
| SDKs | Python, Java, Node.js, Go |
| Licence | Apache 2.0 |
