What Sets Them Apart
Vald ships as a set of loosely coupled Kubernetes components — LB Gateway, Discoverer, Agent, and Index Manager — each deployed via Helm and scaled independently. Qdrant ships as a single binary that handles everything, with optional sharding and replication for production scale. The architectural difference is fundamental: Vald assumes Kubernetes from the start; Qdrant assumes you want to get to search fast and grow into distributed deployment later.
Vald and Qdrant at a Glance
Vald is a CNCF Landscape project maintained by LY Corporation, the Japanese tech company behind LINE and Yahoo! Japan. It uses the NGT (Neighborhood Graph and Tree) algorithm — developed internally at Yahoo! Japan — which ranks among the fastest ANN algorithms in published benchmark comparisons. Vald supports automatic vector indexing, incremental index backup to object storage, and horizontal scaling without downtime, making it a strong fit for teams already deep in cloud-native Kubernetes operations.
Qdrant is a Rust-based vector similarity search engine with rich filtering, payload indexing, and hybrid search out of the box. It offers client libraries for Python, Go, Rust, TypeScript, Java, and C#, and supports both dense and sparse vectors (for hybrid BM25 + semantic search). Qdrant Cloud provides a fully managed option; the on-prem path is straightforward with Docker or a single binary.
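To make the hybrid dense-plus-sparse idea concrete, here is a minimal pure-Python sketch of rank fusion: a keyword (BM25-style) ranking and a semantic (cosine-similarity) ranking are merged with reciprocal rank fusion. This is an illustration of the concept only — Qdrant performs this kind of fusion server-side over its own indexes, and the document IDs and rankings below are invented.

```python
# Sketch of hybrid retrieval: fuse a sparse (keyword) ranking and a dense
# (semantic) ranking with reciprocal rank fusion (RRF).

def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of doc ids into one fused ranking.

    Each document earns 1 / (k + rank + 1) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

sparse_ranking = ["doc3", "doc1", "doc7"]   # e.g. BM25 order
dense_ranking = ["doc1", "doc5", "doc3"]    # e.g. cosine-similarity order
fused = rrf_fuse([sparse_ranking, dense_ranking])
print(fused[0])  # doc1 scores well in both lists, so it wins
```

The tuning constant k dampens the influence of any single retriever's top hit; 60 is a conventional default.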
Both projects are open source and production-ready, but they target different operator profiles. Vald expects a team comfortable writing Helm values and Kubernetes manifests; Qdrant can be running locally in seconds with docker run.
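The "running in seconds" claim is literal: Qdrant's published Docker image starts a single-node instance with its REST API on port 6333. A minimal local deployment fragment looks like this (the volume mount path on the host is your choice):

```shell
# Pull and run a single-node Qdrant, persisting data to ./qdrant_storage
docker run -p 6333:6333 \
    -v "$(pwd)/qdrant_storage:/qdrant/storage" \
    qdrant/qdrant
```

A comparable Vald installation involves adding the Helm repository and supplying a values file describing the Agent, Gateway, and Index Manager topology — a larger but more tunable surface.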
Indexing Strategy and Scale Characteristics
Vald's NGT-based indexing is graph-based and known for strong recall at high query-per-second rates. Its distributed architecture splits the index across multiple Agent pods, each responsible for a shard; the LB Gateway routes and aggregates results. This means billion-scale deployments are a native design goal, not an afterthought. Index backup is handled automatically to object storage, enabling recovery without full re-indexing.
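The scatter-gather query path described above can be sketched in a few lines: each Agent holds a shard of the vectors, and the gateway fans the query out and merges the per-shard top-k results. This is a pure-Python brute-force illustration with squared L2 distance — the real components speak gRPC and run NGT graph search per shard — and all IDs and data are invented.

```python
import heapq

def shard_search(shard, query, k):
    """Brute-force top-k (distance, id) pairs within one shard (one Agent)."""
    dists = [
        (sum((a - b) ** 2 for a, b in zip(vec, query)), vec_id)
        for vec_id, vec in shard.items()
    ]
    return heapq.nsmallest(k, dists)

def gateway_search(shards, query, k):
    """LB Gateway role: fan out to every shard, then merge partial results."""
    partials = [shard_search(shard, query, k) for shard in shards]
    return heapq.nsmallest(k, heapq.merge(*partials))

shards = [
    {"a": (0.0, 0.0), "b": (1.0, 1.0)},   # Agent 1's shard
    {"c": (0.1, 0.1), "d": (5.0, 5.0)},   # Agent 2's shard
]
top = gateway_search(shards, (0.0, 0.0), k=2)
print([vec_id for _, vec_id in top])  # ['a', 'c']
```

Because each shard returns at most k candidates, the gateway's merge cost grows with the shard count, not the corpus size — which is why adding Agent pods is the scaling lever.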
Qdrant uses HNSW as its primary index, with configurable m and ef_construction parameters for the recall-speed tradeoff. It supports on-disk indexing (mmap) for memory-constrained environments, and its quantization support (scalar, product, binary) makes it practical for cost-sensitive deployments. Payload indexes on arbitrary JSON fields enable sub-millisecond pre-filtering before vector search, a capability that sees heavy use in production RAG pipelines.
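Payload pre-filtering is easiest to see in miniature: restrict candidates with a metadata predicate first, then score only the survivors. Qdrant evaluates the predicate against payload indexes rather than scanning in Python; the point structure, field names, and data below are illustrative only.

```python
import math

# Toy "collection": each point carries a vector and a JSON-like payload.
points = [
    {"id": 1, "vector": [1.0, 0.0], "payload": {"lang": "en"}},
    {"id": 2, "vector": [0.9, 0.1], "payload": {"lang": "de"}},
    {"id": 3, "vector": [0.0, 1.0], "payload": {"lang": "en"}},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def filtered_search(points, query, predicate, k=1):
    """Pre-filter on payload, then rank the survivors by cosine similarity."""
    candidates = [p for p in points if predicate(p["payload"])]
    candidates.sort(key=lambda p: cosine(p["vector"], query), reverse=True)
    return [p["id"] for p in candidates[:k]]

# Only English documents compete: id 2 is excluded despite high similarity.
print(filtered_search(points, [1.0, 0.0], lambda pl: pl["lang"] == "en"))  # [1]
```

Pre-filtering like this is what keeps filtered vector queries fast in RAG pipelines: the expensive similarity scoring never touches points the predicate already ruled out.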
For teams running in a managed Kubernetes environment with dedicated ML ops resources, Vald's component-level scaling is a genuine advantage — you can scale the indexing layer independently of the serving layer. For teams without a platform team to manage Helm releases and pod lifecycle, Qdrant's single-binary model removes a significant operational surface area.