Infinity is the database engine behind InfiniFlow's RAGFlow product, designed from the ground up for retrieval-augmented generation rather than retrofitted from a transactional or search-only system. It stores dense embeddings, sparse vectors (BM25 and SPLADE-style), tensors for late-interaction reranking like ColBERT, and structured filters in one engine, so a single query can blend lexical recall, semantic similarity, and reranking without shuffling data across pgvector, Elasticsearch, and a vector DB.
The engine is written in C++ and exposes a Python SDK plus an HTTP API. It supports hybrid search out of the box: developers can issue a query that combines dense kNN, BM25, and tensor reranking with a fusion strategy like RRF or weighted scoring. This matches how production RAG pipelines actually look in 2026 — the era of pure dense-only retrieval ended once teams started measuring recall against real document corpora and discovered hybrid almost always wins.
Operationally, Infinity targets self-hosted and air-gapped deployments. It runs in a single binary or Docker container, scales horizontally via sharding, and keeps the data layout deliberately simple so backups and snapshots remain straightforward. The 4,400+ stars on GitHub, Apache-2.0 license, and tight integration with RAGFlow give it real production traction inside Chinese enterprises and a growing global RAG community looking for a Postgres alternative that does not pretend RAG is just vector search.
