Chroma
An open-source vector database designed for embedding-based applications, optimised for developer ergonomics and increasingly for large-scale serverless retrieval through a 2025 Rust-core rewrite.
Chroma is an open-source vector database developed by Chroma Inc. and first released in October 2022. It stores embeddings, the documents they were derived from, and arbitrary metadata, and provides similarity search APIs designed for retrieval-augmented generation, semantic search, recommendations, and other embedding-driven applications. Distributed under the Apache 2.0 licence, Chroma is widely used in prototypes built with LangChain and LlamaIndex and has, since a 2025 Rust-core rewrite, become viable for larger production deployments.
Architecture
Chroma exposes a Python and JavaScript client API organised around collections, which group embeddings, source documents, and metadata. Each collection is associated with an embedding function: Chroma can either compute embeddings itself using built-in integrations (OpenAI, Cohere, sentence-transformers, Hugging Face, Voyage, Jina, and others) or accept pre-computed vectors. Queries return the nearest neighbours ranked by cosine, L2, or inner-product distance, and can optionally be restricted by metadata filters written in a where-clause syntax.
Earlier versions of Chroma ran as an in-process Python library backed by SQLite and DuckDB, optimised for local development and small workloads. The 2025 Rust rewrite kept the single-file ergonomics but added a multithreaded core that is no longer limited by Python's Global Interpreter Lock. The company reports the new core as roughly four times faster on writes and queries than the previous Python implementation.
Chroma Cloud and serverless
Chroma Cloud is a managed serverless deployment introduced in 2025 with an architecture that separates query nodes from compactor nodes, using cloud object storage as the shared persistence layer. Query nodes serve indices directly from object storage while compactor nodes build and update them asynchronously, which lowers operational cost compared with SSD-replicated clusters at the expense of slightly higher cold-start latency. Customer data can be protected with customer-managed encryption keys and AWS PrivateLink connectivity.
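The split between query nodes and compactor nodes can be pictured with a toy in-memory model. This is only a sketch under simplifying assumptions and not Chroma Cloud's implementation: the ObjectStore, Compactor, and QueryNode classes, the key layout, and the brute-force search are all hypothetical, and a plain dictionary stands in for cloud object storage.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Stand-in for cloud object storage: a key-value map of blobs."""
    blobs: dict = field(default_factory=dict)

    def put(self, key, value):
        self.blobs[key] = value

    def get(self, key):
        return self.blobs[key]


class Compactor:
    """Folds buffered writes into a new index version, off the query path."""

    def __init__(self, store):
        self.store = store
        self.pending = []
        self.version = 0

    def write(self, doc_id, vector):
        self.pending.append((doc_id, vector))

    def compact(self):
        previous = self.store.get(f"index/v{self.version}") if self.version else {}
        merged = {**previous, **dict(self.pending)}
        self.version += 1
        self.store.put(f"index/v{self.version}", merged)
        self.store.put("index/latest", self.version)  # publish the new version
        self.pending.clear()


class QueryNode:
    """Serves reads from the latest published index, caching it locally."""

    def __init__(self, store):
        self.store = store
        self.cached_version = None
        self.cached_index = {}
        self.cold_loads = 0

    def query(self, vector, k=1):
        latest = self.store.get("index/latest")
        if latest != self.cached_version:
            # Cold path: fetch the index from shared storage before answering.
            self.cached_index = self.store.get(f"index/v{latest}")
            self.cached_version = latest
            self.cold_loads += 1

        def squared_distance(item):
            return sum((a - b) ** 2 for a, b in zip(item[1], vector))

        ranked = sorted(self.cached_index.items(), key=squared_distance)
        return [doc_id for doc_id, _ in ranked[:k]]


store = ObjectStore()
compactor = Compactor(store)
node = QueryNode(store)

compactor.write("a", [0.0, 0.0])
compactor.write("b", [1.0, 1.0])
compactor.compact()

first = node.query([0.9, 0.9])           # cold load, then answer
second = node.query([0.1, 0.0])          # served from the cached index
compactor.write("c", [0.2, 0.0])         # buffered, not yet visible to queries
before_compaction = node.query([0.2, 0.0])
compactor.compact()
after_compaction = node.query([0.2, 0.0])
print(first, second, before_compaction, after_compaction, node.cold_loads)
```

In this model a write only becomes visible once the compactor publishes a new version, and the first query after each publication pays the cost of reloading from shared storage, which loosely mirrors the cold-start trade-off described above.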
Search features
Chroma supports dense vector search, BM25 lexical search, and SPLADE sparse vector search, allowing applications to combine semantic and keyword retrieval in a single query. Recent additions include regular-expression search over document text, a GroupBy operator that returns top results per metadata bucket, and richer metadata types including arrays of strings, numbers, and booleans. The simple collection model and zero-configuration local mode make Chroma a popular default for developers prototyping retrieval-augmented generation pipelines.
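The text does not say how dense and lexical results are merged. One widely used approach is reciprocal rank fusion (RRF), sketched below in plain Python as an illustration only and not as a description of Chroma's internals. The function name, the constant k = 60, and the document ids are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked id lists: each list adds 1 / (k + rank) to an id's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a dense (semantic) and a lexical (BM25) search.
dense_ranking = ["d1", "d2", "d3"]
lexical_ranking = ["d3", "d1", "d4"]

fused = reciprocal_rank_fusion([dense_ranking, lexical_ranking])
print(fused)
```

Documents that appear in both lists rise to the top because they accumulate score from each ranking, while documents found by only one retriever are kept but ranked lower.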
Comparison with alternatives
| Database | First release | Licence | Strength |
| --- | --- | --- | --- |
| Chroma | 2022 | Apache 2.0 | Developer experience, embedded mode |
| Pinecone | 2021 | Commercial SaaS | Fully managed, large index scale |
| Weaviate | 2019 | BSD-3 | Hybrid search, GraphQL API |
| Qdrant | 2021 | Apache 2.0 | Quantisation, advanced filtering |
Chroma is typically the right choice for rapid development and workloads up to roughly ten million vectors, while Pinecone, Weaviate, and Qdrant are usually preferred for very large indexes, advanced security and access control, or specialised quantisation needs.
Ecosystem integration
Chroma is a first-class retriever in LangChain, LlamaIndex, Haystack, and many agent frameworks, and is included as a default option in the OpenAI and Anthropic cookbooks for retrieval-augmented generation. It is also a common backing store for local AI assistants packaged with desktop applications because the embedded mode requires no separate server process.
References
- Chroma Inc. (2025). Chroma documentation. trychroma.com.
- Adyog Blog (2025). Chroma Vector Database: The Open-Source Foundation for AI Search.
- Firecrawl (2026). Best Vector Databases in 2026: A Complete Comparison Guide. firecrawl.dev.
- Airbyte (2025). Chroma DB vs Qdrant: Key Differences. airbyte.com.