
RAGFlow

Deep document understanding RAG engine


RAGFlow is an open-source RAG engine with 76K+ GitHub stars that provides deep document understanding for building knowledge-based AI applications. It optimizes chunking for 20+ document types, including PDFs, Word documents, presentations, and images, using layout-aware parsing. It features template-based chunking strategies, citations with source references, multi-recall retrieval that combines keyword and semantic search, and a visual knowledge base management interface with drag-and-drop document upload.

RAGFlow is a deep document understanding RAG engine with 76K+ GitHub stars, focused on extracting maximum knowledge from complex documents. Unlike basic RAG tools that treat documents as plain text, RAGFlow uses layout-aware parsing that understands tables, figures, headers, and document structure.

RAGFlow supports 20+ document types, including PDFs, Word, Excel, PowerPoint, images, and web pages. Template-based chunking strategies tailor extraction to the document type: technical papers, financial reports, and legal documents each get specialized parsing.
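To make the idea of template-based chunking concrete, here is a minimal Python sketch that picks a chunking function by template name. It is not RAGFlow's parser or API: the template names ("general", "paper"), the function names, and the heading rule are all hypothetical, and real layout-aware parsing works on document structure rather than plain text.

```python
import re

def chunk_by_paragraph(text):
    """Hypothetical 'general' template: one chunk per blank-line-separated paragraph."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

def chunk_by_heading(text):
    """Hypothetical 'paper' template: start a new chunk at each numbered heading line.

    The heading rule (a line starting with a number such as "2" or "2.1") is a
    deliberate simplification and will misfire on lines like "2024 was ...".
    """
    chunks, current = [], []
    for line in text.splitlines():
        if re.match(r"^\d+(\.\d+)*\s+\S", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]

# Illustrative registry mapping a template name to its chunking strategy.
TEMPLATES = {"general": chunk_by_paragraph, "paper": chunk_by_heading}

def chunk_document(text, template="general"):
    """Dispatch to the chunker registered for the given template name."""
    try:
        return TEMPLATES[template](text)
    except KeyError:
        raise ValueError(f"unknown template: {template}") from None

sample = "1 Intro\nHello.\n\nMore intro.\n2 Methods\nWe did things."
paper_chunks = chunk_document(sample, "paper")
general_chunks = chunk_document(sample, "general")
```

The same text yields different chunk boundaries under each template, which is the point of choosing a strategy per document type.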

Multi-recall retrieval combines keyword search (BM25) and semantic vector search for higher-quality results. Retrieved chunks include citation references back to source documents, enabling verifiable AI-generated answers.
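One common way to merge a keyword ranking and a vector ranking is reciprocal rank fusion (RRF). The sketch below is only an illustration of that general technique: this page does not say which fusion method RAGFlow uses, and the chunk IDs and the function name are hypothetical.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))

keyword_hits = ["c3", "c1", "c7"]   # hypothetical BM25 ranking
semantic_hits = ["c1", "c4", "c3"]  # hypothetical vector-search ranking
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)  # ['c1', 'c3', 'c4', 'c7']
```

Chunks that rank well in both lists (here c1 and c3) rise to the top, while chunks found by only one retriever are still kept, which is the practical benefit of combining recall paths.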

The visual interface provides knowledge base management with drag-and-drop upload, chunk preview and editing, conversation testing, and API endpoints for integration. RAGFlow runs as a Docker stack with its own embedding and reranking models.

Pricing

Free and open-source

Platforms

Docker, Self-hosted, API


Related Tools


Hermes Agent

Top Pick

Open-source AI agent framework with persistent memory, reusable skills, tools, and messaging gateways

Hermes Agent is an open-source AI agent framework with persistent memory, reusable skills, 40+ tools, cron jobs, and messaging gateways.

Open Source

Accomplish Coworker

Open-source desktop AI coworker for browsing and code execution.

Accomplish Coworker is an MIT-licensed open-source AI coworker that runs on the desktop, combining computer-use style browsing with code execution so agents can research, implement, run, and debug workflows in one local environment.

Open Source, Telemetry

Headroom

Context compression for LLM apps and coding agents

Headroom is an Apache-2.0 context compression layer for LLM apps and coding agents. It compresses tool output, logs, files, RAG chunks, and agent history through a local library, proxy, wrapper, or MCP server, with retrieval hooks for bringing originals back when needed. Treat its savings numbers as Headroom-reported benchmarks, not independent aicoolies measurements.

Open Source, Telemetry

Codebase Memory MCP

Codebase knowledge graph MCP server for AI coding agents

Codebase Memory MCP is an MIT-licensed MCP server that turns a repository into a persistent code knowledge graph for AI coding agents. It gives Claude Code, Cursor, Codex-style agents, and other MCP clients structural queries for functions, classes, call chains, routes, and architecture, helping them explore large projects without repeatedly rereading files or relying only on broad search.

Open Source, Telemetry

BeeAI Framework

Python and TypeScript framework for production multi-agent systems

BeeAI Framework is an Apache-2.0 toolkit for building production-ready AI agents and multi-agent systems in Python and TypeScript. Its docs cover agents, tools, RAG, memory, workflows, backend providers, serving, and A2A/MCP integration surfaces, making it a vendor-neutral option for teams comparing LangGraph, CrewAI, Mastra, and related agent runtimes.

Open Source, Telemetry

Klavis AI

MCP integration platform for agent tool use at scale

Klavis AI is an Apache-2.0 MCP integration platform for teams connecting AI agents to external SaaS tools and APIs. The public repo and official docs position it as infrastructure for reliable tool access at scale, so it fits teams that want reusable MCP connectors without treating every integration as a one-off script or custom OAuth maintenance project.

Open Source, Telemetry
