pg_textsearch

BM25 full-text search extension for PostgreSQL

Open Source

pg_textsearch is a PostgreSQL extension from Timescale that adds BM25 relevance-ranked full-text search directly inside Postgres. BM25 is the default ranking function in Elasticsearch and Lucene, so the extension brings search-engine-style relevance ranking to Postgres without a separate search cluster. This is particularly useful for developers building RAG pipelines on PostgreSQL who want keyword relevance ranking alongside pgvector's semantic similarity search.

pg_textsearch is a PostgreSQL extension developed by Timescale that brings BM25 relevance-ranked full-text search directly into Postgres. PostgreSQL's built-in tsvector search ranks results with ts_rank, which scores by term frequency within each document and does not use corpus-wide statistics such as inverse document frequency, so ranking quality tends to suffer on large, varied corpora. pg_trgm, a contrib extension, offers trigram similarity matching rather than relevance ranking. pg_textsearch adds BM25, the default ranking function in Elasticsearch and Apache Lucene, as a native Postgres operator, giving search-engine-style relevance without deploying and maintaining separate search infrastructure.
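To make the ranking idea concrete, here is a minimal sketch of textbook Okapi BM25 in plain Python. It is not pg_textsearch's implementation: the whitespace tokenizer, the Lucene-style IDF variant, the parameter values k1=1.2 and b=0.75, and the sample documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in `docs` against `query` with Okapi BM25.

    k1 and b are commonly cited defaults, assumed here for illustration.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Lucene-style IDF: rare terms weigh more, never negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates, and long documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical sample corpus.
docs = [
    "postgres full text search",
    "postgres postgres postgres replication and backup notes for operators",
    "cooking with cast iron",
]
scores = bm25_scores("postgres search", docs)
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranked)
```

The short document that matches both query terms ranks first, even though the second document repeats "postgres" three times. Term-frequency saturation and length normalization are what keep repetition in a long document from dominating the score.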

The extension introduces an operator syntax: write ORDER BY content <@> 'search terms' to get BM25-ranked results. Block-Max WAND optimization is used to keep top-k queries efficient over large tables, and the implementation integrates with PostgreSQL's query planner. For developers building RAG pipelines on PostgreSQL who already use pgvector for semantic similarity search, pg_textsearch adds the keyword-based relevance layer that makes hybrid search possible within a single database, removing the need to synchronize data between Postgres and Elasticsearch.
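The text does not say how keyword and vector results are merged in a hybrid setup. One common approach is reciprocal rank fusion (RRF), sketched below as an assumption, not as something pg_textsearch or pgvector provides. The function name, the constant k=60, and the document IDs are hypothetical; in practice the two input lists would come from a BM25-ordered query and a vector-distance-ordered query.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into a single ranking.

    Each ID earns 1 / (k + rank) from every list it appears in.
    k=60 is a commonly used constant, assumed here.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical top-3 results from each retrieval path.
keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. BM25 order
vector_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. embedding-distance order

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

RRF uses only rank positions, so BM25 scores and vector distances never need to be put on a common scale. Documents that appear in both lists rise to the top.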

Timescale, a well-funded infrastructure company, released the extension under the PostgreSQL license. It has drawn rapid attention, with over 3,500 GitHub stars and a Hacker News launch that reached 180 points. Combining pgvector for vector similarity with pg_textsearch for BM25 ranking gives PostgreSQL both halves of a hybrid search stack and positions it as an alternative to purpose-built search engines, an increasingly attractive option as teams look to reduce operational complexity by consolidating on fewer database systems.

Pricing

Free and open source (PostgreSQL License). Works with any PostgreSQL 14+ installation.

Platforms

PostgreSQL extension. Works on any platform where PostgreSQL runs. Compatible with pgvector for hybrid search.

Related Tools

Supabase MCP

MCP server for connecting AI assistants to Supabase projects

Supabase MCP is Supabase's Apache-2.0 server for connecting AI assistants to Supabase projects. It can expose database, configuration, and project-management workflows to MCP clients such as Cursor, Claude, and Windsurf, while the official docs emphasize permission and security review before production use, SQL changes, or high-privilege database access.

Open Source · Telemetry

pgvectorscale

DiskANN-powered vector search extension for PostgreSQL

pgvectorscale is an open-source PostgreSQL extension from Timescale that complements pgvector with DiskANN-based approximate vector search. It is useful for teams that want faster embedding retrieval while keeping vectors, filters, and application data inside the Postgres ecosystem instead of adopting a separate hosted vector database.

Open Source

Ardent

Database branching for coding agents

Ardent is a Postgres database branching platform built for coding-agent workflows. It creates isolated database copies in seconds so Claude Code, Codex, Cursor, or human developers can test migrations, clean data, reproduce bugs, and run risky experiments without touching production. The strongest fit is teams already using Postgres who need agent-safe dev/test databases rather than another generic serverless database.

freemium

VectorChord

High-recall Postgres vector search at billion scale

VectorChord is a Postgres extension from TensorChord that brings high-recall vector search to PostgreSQL. As the successor to pgvecto.rs, it combines IVF indexes with RaBitQ quantization and targets billion-vector workloads while keeping all data inside a single Postgres database: no separate vector store, no two-system sync, and no rewrites when the workload grows.

Open Source

Infinity

AI-native database for hybrid RAG retrieval

Infinity is an AI-native database from InfiniFlow that unifies dense vectors, sparse vectors, tensors, and full-text search in a single engine. Built for retrieval-augmented generation (RAG) at scale, it powers hybrid search workflows where lexical matching, semantic similarity, and reranking all happen against one storage layer instead of four loosely coupled services.

Open Source

Guidance

Constrained generation that guarantees valid LLM outputs every time

Guidance is Microsoft's structured generation library that enforces output constraints directly within LLM decoding. It supports JSON schemas, regex patterns, grammars, and interleaved generation-and-control flow to constrain outputs from compatible models. It works with local models via llama.cpp and Transformers, and with remote APIs including OpenAI and Anthropic, reducing the need for retry loops and post-processing in structured data extraction.

Free · Open Source
