# rag

16 tools tagged

Headroom

Context compression for LLM apps and coding agents

Headroom is an Apache-2.0 context compression layer for LLM apps and coding agents. It compresses tool output, logs, files, RAG chunks, and agent history through a local library, proxy, wrapper, or MCP server, with retrieval hooks for bringing originals back when needed. Treat its savings numbers as Headroom-reported benchmarks, not independent aicoolies measurements.

open-source · Open Source · Telemetry

Deep Lake

AI data runtime for multimodal datasets and vector search

Deep Lake is an open-source AI data runtime from Activeloop for storing, versioning, and querying multimodal data and embeddings. It fits teams building RAG, training, evaluation, or dataset-heavy agent workflows that need a bridge between vector search, structured metadata, and large image, text, audio, or video collections.

open-source · Open Source

SeekDB

AI-native state store with hybrid vector and full-text search

SeekDB is an open-source AI-native state store from the OceanBase ecosystem that combines MySQL-compatible data access with hybrid vector and full-text retrieval. It targets agent and AI application teams that need embedded or server deployment, copy-on-write style sandboxes, and searchable state without gluing together several separate storage layers.

open-source · Open Source

PageIndex

Vectorless, reasoning-based RAG that reads documents like a human expert — no vector DB, no chunking.

PageIndex is a vectorless, reasoning-based RAG system that builds hierarchical tree indexes from long documents and uses LLMs to navigate them like a human expert would. Instead of chunking text and comparing embeddings, it constructs a table-of-contents-style structure and reasons its way to the right sections — no vector database required. Available as an open-source Python package, cloud API, MCP server, and chat platform.

freemium
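The tree navigation that PageIndex describes can be illustrated with a minimal sketch. Everything here is hypothetical: the `Node` structure, the `navigate` function, and the word-overlap `relevance` score (a stand-in for the LLM judgment the real system would make) are illustrative assumptions, not the PageIndex API.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry in a table-of-contents-style tree (hypothetical structure)."""
    title: str
    summary: str
    children: list["Node"] = field(default_factory=list)


def relevance(query: str, node: Node) -> int:
    """Stand-in for an LLM judgment: count query words present in the node text."""
    node_words = set(re.findall(r"[a-z]+", f"{node.title} {node.summary}".lower()))
    query_words = set(re.findall(r"[a-z]+", query.lower()))
    return len(query_words & node_words)


def navigate(root: Node, query: str) -> list[str]:
    """Descend from the root, choosing the most relevant child at each level."""
    path, node = [root.title], root
    while node.children:
        node = max(node.children, key=lambda child: relevance(query, child))
        path.append(node.title)
    return path


report = Node("Annual Report", "company overview", [
    Node("Financials", "revenue, costs and cash flow", [
        Node("Revenue by Region", "revenue split across regions"),
        Node("Operating Costs", "costs of staff and infrastructure"),
    ]),
    Node("Risk Factors", "legal and market risk", [
        Node("Litigation", "pending legal cases"),
    ]),
])

path = navigate(report, "revenue in each region")
print(" > ".join(path))
```

The point of the sketch is the control flow: no chunks or embeddings are compared, and the answer is a path through the document's own structure.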

VectorChord

High-recall Postgres vector search at billion scale

VectorChord is a Postgres extension from the tensorchord/VectorChord project that brings high-recall vector search to PostgreSQL. As the successor to pgvecto.rs, it combines IVF indexes with RaBitQ quantization and targets performance comparable to managed vector databases such as Pinecone at billion-vector scale. All data stays inside a single Postgres database, so there is no separate vector store to run, no two-system sync, and no rewrite when the workload grows.

open-source · Open Source
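To make the IVF part of the VectorChord description concrete, here is a minimal pure-Python sketch of an inverted-file index. It is not VectorChord code: the function names, the hand-picked centroids, and the `nprobe` parameter name are assumptions for illustration, and RaBitQ quantization is left out entirely.

```python
import math


def build_ivf(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))
        lists[nearest].append(vid)
    return lists


def search(query, vectors, centroids, lists, nprobe=1, k=2):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))[:nprobe]
    candidates = [vid for i in probe for vid in lists[i]]
    return sorted(candidates, key=lambda vid: math.dist(query, vectors[vid]))[:k]


vectors = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (0.1, 0.3)]
# Centroids are fixed by hand here; a real index would learn them, e.g. with k-means.
centroids = [(0.0, 0.0), (5.0, 5.0)]

lists = build_ivf(vectors, centroids)
result = search((0.1, 0.1), vectors, centroids, lists, nprobe=1, k=2)
print(result)
```

Raising `nprobe` scans more lists, which trades speed for recall; that trade-off is the main tuning knob in IVF-style indexes generally.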

Infinity

AI-native database for hybrid RAG retrieval

Infinity is an AI-native database from InfiniFlow that unifies dense vectors, sparse vectors, tensors, and full-text search in a single engine. Built for retrieval-augmented generation (RAG) at scale, it powers hybrid search workflows where lexical matching, semantic similarity, and reranking all happen against one storage layer instead of four loosely coupled services.

open-source · Open Source
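The hybrid search idea in the Infinity entry, merging a lexical ranking with a semantic one, can be sketched with reciprocal rank fusion (RRF). RRF is one common fusion method chosen here for illustration; the listing does not say which fusion Infinity uses, and the function name and document ids below are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked id lists; each hit contributes 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


full_text_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from a lexical search
dense_hits = ["doc_b", "doc_d", "doc_a"]      # e.g. from a vector search

fused = reciprocal_rank_fusion([full_text_hits, dense_hits])
print(fused)
```

Documents that appear in both lists rise to the top, which is why `doc_b` and `doc_a` outrank the single-list hits.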

Rig

Build modular, scalable LLM applications in Rust

Rig is an open-source Rust library for building scalable, modular, and ergonomic LLM-powered applications. It unifies 20+ model providers (OpenAI, Anthropic, Mistral, DeepSeek, Ollama, and more) and 10+ vector stores behind one trait-based interface. It supports completion and embedding workflows, multi-turn streaming, and transcription, audio, and image generation, with GenAI Semantic Convention compatibility and a WASM-ready core library, which makes it a fit for Rust teams building agent infrastructure.

free · Open Source

sqlite-vec

Vector search extension for SQLite that runs anywhere

sqlite-vec is a lightweight vector search extension for SQLite written in pure C with zero dependencies. It brings nearest-neighbor search capabilities directly into SQLite databases, enabling AI applications to store and query embeddings without running a separate vector database. The extension works everywhere SQLite runs including Linux, macOS, Windows, WebAssembly in browsers, and even Raspberry Pi devices. Sponsored by Mozilla Builders, Fly.io, and Turso.

free · Open Source
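The idea behind sqlite-vec, nearest-neighbor queries running inside SQLite itself, can be sketched with only the Python standard library. This sketch deliberately does not load sqlite-vec or use its SQL syntax: the `items` table, the JSON-encoded embeddings, and the `l2_distance` function are assumptions, and the scan is brute force rather than what the extension does internally.

```python
import json
import math
import sqlite3


def l2_distance(a_json, b_json):
    """Euclidean distance between two JSON-encoded vectors."""
    return math.dist(json.loads(a_json), json.loads(b_json))


conn = sqlite3.connect(":memory:")
# Register the Python function so it can be called from SQL.
conn.create_function("l2_distance", 2, l2_distance)
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT, embedding TEXT)")
conn.executemany(
    "INSERT INTO items (label, embedding) VALUES (?, ?)",
    [
        ("cat", json.dumps([0.9, 0.1, 0.0])),
        ("dog", json.dumps([0.8, 0.2, 0.1])),
        ("car", json.dumps([0.0, 0.1, 0.9])),
    ],
)

query = json.dumps([1.0, 0.0, 0.0])
rows = conn.execute(
    "SELECT label, l2_distance(embedding, ?) AS distance "
    "FROM items ORDER BY distance LIMIT 2",
    (query,),
).fetchall()
nearest = [label for label, _ in rows]
print(nearest)
```

The embeddings live next to ordinary columns in one database file, which is the property the entry highlights: no separate vector database to run.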

Hindsight

Agent memory system that learns, not just remembers

Hindsight is an agent memory system that enables AI agents to learn from experience rather than just store conversations. It organizes memories into three biomimetic categories: World knowledge for facts, Experiences for agent events, and Mental Models for learned understanding. The system provides retain, recall, and reflect operations backed by a temporal knowledge graph with parallel retrieval strategies including semantic, keyword, graph traversal, and temporal search.

freemium · Open Source
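The retain, recall, and reflect operations named in the Hindsight entry can be pictured with a toy in-memory store. This is purely a hypothetical sketch: the class and method signatures are invented for illustration, keyword overlap stands in for the semantic, graph, and temporal retrieval the real system describes, and there is no knowledge graph here.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    WORLD = "world"                # facts
    EXPERIENCE = "experience"      # agent events
    MENTAL_MODEL = "mental_model"  # learned understanding


@dataclass
class Memory:
    kind: Kind
    text: str
    step: int  # logical timestamp, standing in for real temporal data


class MemoryStore:
    def __init__(self):
        self.memories = []
        self.step = 0

    def retain(self, kind, text):
        """Store a memory with an increasing logical timestamp."""
        self.step += 1
        self.memories.append(Memory(kind, text, self.step))

    def recall(self, query, limit=2):
        """Rank memories by keyword overlap, then by recency."""
        words = set(query.lower().split())

        def score(memory):
            return (len(words & set(memory.text.lower().split())), memory.step)

        hits = [m for m in self.memories if score(m)[0] > 0]
        return sorted(hits, key=score, reverse=True)[:limit]

    def reflect(self, topic):
        """Condense related experiences into a new mental-model memory."""
        related = self.recall(topic, limit=len(self.memories))
        experiences = [m for m in related if m.kind is Kind.EXPERIENCE]
        if not experiences:
            return None
        summary = f"{topic}: seen in {len(experiences)} experience(s)"
        self.retain(Kind.MENTAL_MODEL, summary)
        return summary


store = MemoryStore()
store.retain(Kind.WORLD, "deploys run on fridays")
store.retain(Kind.EXPERIENCE, "friday deploy failed after migration")
store.retain(Kind.EXPERIENCE, "deploy succeeded after rollback")

recalled = [m.text for m in store.recall("deploy failed")]
summary = store.reflect("deploy")
print(recalled, summary)
```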

WeKnora

Enterprise RAG framework by Tencent

WeKnora is a Tencent-developed, LLM-powered knowledge management and Q&A framework for enterprise document understanding and semantic retrieval. It supports 10+ document formats, including PDF, Word, Excel, and images, and integrates with IM platforms such as WeCom, Feishu, Slack, and Telegram. It offers a Quick Q&A mode built on RAG pipelines and an Intelligent Reasoning mode that uses ReAct agents for complex multi-step reasoning across organizational knowledge bases.

open-source · Open Source

RAG-Anything

All-in-one multimodal RAG framework

RAG-Anything is an all-in-one multimodal RAG framework from the University of Hong Kong that processes text, images, tables, and equations through a unified pipeline built on LightRAG. It constructs multi-modal knowledge graphs by extracting multimodal entities and establishing cross-modal relationships. The VLM-Enhanced Query mode integrates visual content into large language models for deeper document understanding beyond plain text retrieval.

open-source · Open Source

GitNexus

Graph RAG code knowledge graph for repository exploration

GitNexus is a code knowledge graph and Graph RAG app for exploring repository structure before humans or AI coding agents make changes. Source checks at the time of writing support the graph/RAG and local-server descriptions, but not firm claims about local versus server architecture, 14-language support, MCP, pricing, or licensing. Evaluate privacy, scale, and integrations directly.

freemium

QMD

On-device hybrid search engine for your docs and notes

QMD is an on-device search engine built by Tobi Lütke (Shopify CEO) that indexes markdown notes, meeting transcripts, and documentation locally. It combines BM25 full-text search, vector semantic search, and LLM-powered re-ranking into a single hybrid pipeline. Ships with a built-in MCP server for seamless integration with Claude Code, Cursor, and other AI editors. All processing happens on your machine via node-llama-cpp with GGUF models — zero cloud dependency.

free
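The BM25 stage of the hybrid pipeline described for QMD can be sketched in a few lines. This is a generic textbook-style BM25 scorer, not QMD's implementation: the function name, the whitespace tokenizer, and the `k1` and `b` defaults are assumptions, and the vector search and LLM re-ranking stages are omitted.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a basic BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


notes = [
    "meeting notes about the search roadmap",
    "grocery list for the weekend",
    "search index design and search ranking notes",
]

scores = bm25_scores("search ranking", notes)
best = max(range(len(notes)), key=scores.__getitem__)
print(notes[best])
```

In a hybrid setup, a lexical ranking like this would be combined with a vector-similarity ranking before any re-ranking step.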

Kreuzberg

Polyglot document intelligence framework with Rust core

Kreuzberg is a polyglot document intelligence framework with a high-performance Rust core that extracts text, metadata, images, and structured data from 91+ file formats. It is available for Python, Ruby, Java, Go, PHP, C#, and TypeScript, and also ships as a CLI, REST API, and MCP server. It offers multiple OCR backends (Tesseract, EasyOCR, PaddleOCR), table extraction that preserves structure, and native async support.

open-source · Open Source

Airweave

Context retrieval layer for AI agents and RAG

Airweave is an open-source context retrieval platform that connects AI agents and RAG systems to 50+ apps and databases through a unified search interface. It continuously syncs data from sources like Notion, Slack, GitHub, and databases, making it searchable through LLM-friendly APIs. Airweave includes Python and TypeScript SDKs, MCP support, and a CLI for managing data connections.

freemium · Open Source

OpenAI Assistants API

Thread-based AI assistant API with tools and file support

The OpenAI Assistants API is OpenAI's platform API for building stateful AI assistants. It manages conversation threads and supports function calling, code interpreter, and file search (RAG) out of the box. Pricing is usage-based, and thread state and tool orchestration are handled by the platform rather than by application code.

api-usage-based