ParadeDB vs pg_textsearch — Feature-Rich Postgres Search or Fast BM25 at Scale

ParadeDB and pg_textsearch both keep search inside Postgres. ParadeDB is broader for facets, phrase queries, joins, and analytics; Timescale’s pg_textsearch is a custom bm25 index access method, not GIN/tsvector, and its March 2026 benchmarks beat ParadeDB on MS MARCO query latency and throughput.

Analyzed by Raşit Akyol on April 3, 2026

What Sets Them Apart

Both ParadeDB and pg_textsearch keep search inside PostgreSQL, but the original article incorrectly framed pg_textsearch as PostgreSQL’s built-in full-text search layered on GIN and tsvector. Timescale’s README shows a separate extension loaded through shared_preload_libraries, and its SQL install script creates a custom bm25 index access method. The correct split is not native GIN versus Rust search; it is ParadeDB’s broader search feature surface versus pg_textsearch’s purpose-built BM25 index, Block-Max WAND query path, and benchmarked latency focus.

ParadeDB and pg_textsearch at a Glance

BM25 ranking is no longer a ParadeDB-only advantage. pg_textsearch’s public README describes configurable BM25 parameters, the ORDER BY content <@> 'search terms' query form, and CREATE INDEX ... USING bm25 rather than CREATE INDEX ... USING gin. ParadeDB’s pg_search also targets Elasticsearch-style relevance, but the comparison should treat both products as search-grade Postgres extensions instead of contrasting full BM25 against PostgreSQL ts_rank.
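
To make the "configurable BM25 parameters" concrete, here is a minimal sketch of the textbook Okapi BM25 formula. It is not the code either extension runs. The function name, the whitespace tokenizer, the sample documents, and the commonly used defaults k1=1.2 and b=0.75 are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with textbook Okapi BM25.

    k1 controls how quickly repeated terms stop adding score;
    b controls how strongly long documents are penalized (0 disables it).
    """
    # Naive whitespace tokenizer: an assumption, not a real analyzer.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Hypothetical sample documents.
docs = [
    "postgres full text search",
    "search search search inside postgres with a custom index",
    "columnar analytics",
]
scores = bm25_scores("postgres search", docs)
```

Raising k1 lets repeated terms keep adding score for longer, and lowering b toward 0 reduces the penalty on long documents. Those are the two knobs a configurable BM25 implementation usually exposes.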

The feature surface still favors ParadeDB when an application needs phrase queries, highlighting, configurable tokenizers and token filters, filtering, facets, aggregates, joins, or the columnar/analytics layer described in ParadeDB’s own README. Its Tantivy-backed index stores term positions by default, which is why the Timescale benchmark notes that ParadeDB can support phrase queries such as “quick brown fox.” That is a capability gap, not a performance claim.

pg_textsearch is narrower, but it is not zero-install, built-in PostgreSQL search. It requires installing the extension, adding pg_textsearch to shared_preload_libraries, creating the extension in the database, and building a bm25 index. In return for that setup, it offers a Postgres-native extension workflow, expression indexes over JSONB or transformed text, partial indexes for scoped search, partition support, and parallel index builds for large tables.

Feature Breadth, BM25 Ranking, and Hybrid Claims

Hybrid/vector search should be described carefully. ParadeDB’s current README marks vector search and hybrid search as “coming soon,” while its search stack is already strong for BM25, phrase search, facets, filtering, joins, and analytics. pg_textsearch does not provide vector search itself; teams that need semantic retrieval typically pair it with pgvector, pgvectorscale, pgai, or a separate vector system rather than expecting pg_textsearch to solve hybrid ranking alone.
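
As a rough illustration of what "hybrid ranking" means when a BM25 extension is paired with a separate vector index, the sketch below merges two ranked result lists with reciprocal rank fusion. This technique is an assumption brought in for illustration; the source does not say that either extension, or any of the tools named above, uses it. The function name, the constant k=60, and the document ids are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list, best first.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda doc_id: fused[doc_id], reverse=True)


bm25_ids = [7, 3, 9]    # hypothetical ids returned by a BM25 query
vector_ids = [3, 5, 7]  # hypothetical ids returned by a vector similarity query
fused_ids = reciprocal_rank_fusion([bm25_ids, vector_ids])
```

Documents 3 and 7 appear in both lists, so they outrank 5 and 9, which each appear in only one. The fusion happens in application code or in a SQL query that joins both result sets, outside the BM25 index itself.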

Index maintenance follows different trade-offs than the original text claimed. pg_textsearch is not a GIN index with fastupdate behavior; the extension defines a bm25 index access method and operator class. Timescale’s 1.3.1 SQL install file contains the statement CREATE ACCESS METHOD bm25 TYPE INDEX HANDLER tp_handler, while ParadeDB uses its own Tantivy-backed index structures. Write amplification, segment merging, phrase support, and index size therefore have to be evaluated from each extension’s own index design, not from PostgreSQL GIN defaults.

Language analysis is one place where pg_textsearch intentionally reuses PostgreSQL strengths. The README says it works with PostgreSQL text search configurations such as English, French, and German, while also supporting expression indexes and multi-column search. ParadeDB provides a richer tokenizer and filter configuration surface for teams that want search-engine-style analysis controls. The right choice depends on whether you prefer Postgres text configuration compatibility or deeper search analyzer tuning.

Indexing, Language Analysis, and Benchmark Evidence

The benchmark story should be reversed from the original article. Timescale’s public pg_textsearch vs ParadeDB comparison, attributed to the pg_textsearch benchmark dashboard and widely shared by Todd J. Green, reports 3.1x higher overall query throughput for pg_textsearch on the 8.8M-passage MS MARCO v1 run, with faster p50 latency in every query-length bucket from 1 token through 8+ tokens. ParadeDB still built the index faster in that run (140.1 seconds versus 233.5 seconds) and retained its phrase-query and broader feature advantages.
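
The v1 build times are quoted in seconds while the query results are quoted as ratios. The short calculation below converts the build times into the same "x faster" form. It uses only the two figures cited above; the variable names are illustrative.

```python
# Index build times in seconds, as quoted for the MS MARCO v1 run.
paradedb_build_s = 140.1
pg_textsearch_build_s = 233.5

# How many times longer pg_textsearch took to build its index.
build_ratio = pg_textsearch_build_s / paradedb_build_s
extra_seconds = pg_textsearch_build_s - paradedb_build_s

print(f"ParadeDB built the index {build_ratio:.2f}x faster")   # 1.67x
print(f"pg_textsearch took {extra_seconds:.1f} s longer")      # 93.4 s
```

On this run ParadeDB's build advantage is roughly 1.67x, about a minute and a half of wall-clock time, against pg_textsearch's reported 3.1x query-throughput advantage.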

At larger scale, the same source reports pg_textsearch ahead on query performance rather than degrading behind ParadeDB. On the 138M-passage MS MARCO v2 experiment, pg_textsearch shows 2.3x faster weighted p50 query latency and 4.7x higher concurrent throughput with 16 clients, while ParadeDB builds the index 1.9x faster and keeps better p95 latency on some longer-query buckets. That is a nuanced trade-off: pg_textsearch appears stronger for BM25 query latency and concurrent throughput at scale; ParadeDB remains stronger for index build speed and richer search features.

The Bottom Line

The practical conclusion changes accordingly. Choose pg_textsearch when the priority is BM25-ranked Postgres search with strong latency, smaller index footprint in the cited benchmarks, native Postgres extension semantics, and compatibility with PostgreSQL text configurations. Choose ParadeDB when facets, phrase queries, highlighting, joins, aggregates, or an Elasticsearch-like feature set matter more than the latest pg_textsearch benchmark results. The page should not call pg_textsearch a GIN/tsvector wrapper, and it should not claim ParadeDB is the faster option at scale without qualifying which benchmark and workload it means.

Quick Comparison

| Feature | ParadeDB | pg_textsearch |
| --- | --- | --- |
| Pricing | Free community; managed cloud coming soon | Free and open source (PostgreSQL License). Works with any PostgreSQL 14+ installation. |
| Platforms | PostgreSQL extension on any Postgres platform | PostgreSQL extension. Works on any platform where PostgreSQL runs. Compatible with pgvector for hybrid search. |
| Open Source | Yes | Yes |
| Telemetry | Clean | Clean |
| Description | ParadeDB brings Elasticsearch-quality full-text search, BM25 ranking, and hybrid vector-keyword search directly into PostgreSQL as native extensions. Backed by a 12 million dollar Series A with over 500,000 Docker deployments, it eliminates the overhead of running separate search infrastructure. Teams get powerful search within their existing Postgres stack without managing additional clusters. | pg_textsearch is a PostgreSQL extension from Timescale that adds BM25 relevance-ranked full-text search directly inside Postgres. Using the same ranking algorithm as Elasticsearch and Lucene, it provides search-engine quality results without requiring a separate search cluster, which is particularly valuable for developers building RAG pipelines on PostgreSQL who want semantic-quality ranking alongside pgvector. |