RAG-Anything

All-in-one multimodal RAG framework

Open Source

RAG-Anything is an all-in-one multimodal RAG framework from the University of Hong Kong that processes text, images, tables, and equations through a unified pipeline built on LightRAG. It constructs multimodal knowledge graphs by extracting entities from each modality and linking them through cross-modal relationships. Its VLM-Enhanced Query mode passes visual content to a vision-language model alongside the text, so answers can draw on more than plain-text retrieval.

RAG-Anything builds on the LightRAG project to deliver a multimodal Retrieval-Augmented Generation (RAG) system. Traditional RAG pipelines handle text well but struggle with the images, tables, equations, and diagrams embedded in real-world documents. RAG-Anything removes the need for multiple specialized extraction tools by parsing every content modality through one pipeline that preserves cross-modal relationships and hierarchical document structure.

At its core, the framework constructs multimodal knowledge graphs that capture entities and relationships across text and visual content. When a document includes images or charts, the VLM-Enhanced Query mode feeds them directly into a vision-language model alongside the textual context, enabling answers that draw on both visual and textual information. Processing happens in stages: each content type first receives format-appropriate, context-aware extraction, and the results are then merged into the unified graph.
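The staged flow described above (per-modality extraction, then merging into one graph with cross-modal links) can be illustrated with a toy sketch. This is not RAG-Anything's API: the `Graph` class, the `extract_*` helpers, the `build_graph` function, and the `refers_to` relation are all hypothetical names chosen for illustration, and the linking rule (attach each non-text item to the nearest preceding text block) is an assumed simplification.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Toy knowledge graph: nodes keyed by id, edges as (src, relation, dst)."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node_id, modality, summary):
        self.nodes[node_id] = {"modality": modality, "summary": summary}

    def add_edge(self, src, relation, dst):
        self.edges.append((src, relation, dst))


# Hypothetical extractors. A real system would call a document parser,
# OCR, or a vision-language model here instead of formatting strings.
def extract_text(item):
    return item["content"].strip()


def extract_table(item):
    header = item["content"][0]
    return "table with columns: " + ", ".join(header)


def extract_image(item):
    return "image captioned: " + item["caption"]


def extract_equation(item):
    return "equation: " + item["content"]


EXTRACTORS = {
    "text": extract_text,
    "table": extract_table,
    "image": extract_image,
    "equation": extract_equation,
}


def build_graph(items):
    """Route each item to its extractor, then merge everything into one graph."""
    graph = Graph()
    last_text_id = None
    for index, item in enumerate(items):
        modality = item["type"]
        node_id = f"{modality}-{index}"
        graph.add_node(node_id, modality, EXTRACTORS[modality](item))
        if modality == "text":
            last_text_id = node_id
        elif last_text_id is not None:
            # Assumed linking rule: tie non-text content to the nearest
            # preceding text block to form a cross-modal edge.
            graph.add_edge(last_text_id, "refers_to", node_id)
    return graph


def neighbors_by_modality(graph, node_id, modality):
    """Return ids of nodes of the given modality linked from node_id."""
    return [
        dst
        for src, _, dst in graph.edges
        if src == node_id and graph.nodes[dst]["modality"] == modality
    ]


document = [
    {"type": "text", "content": " Quarterly revenue grew. "},
    {"type": "table", "content": [["quarter", "revenue"], ["Q1", "10"]]},
    {"type": "image", "caption": "revenue bar chart"},
    {"type": "equation", "content": "g = (r1 - r0) / r0"},
]
graph = build_graph(document)
print(neighbors_by_modality(graph, "text-0", "image"))
```

At query time, a lookup such as `neighbors_by_modality` is the kind of step that would let a retriever collect the images attached to a matching text passage before handing both to a vision-language model.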

Developed by the HKUDS research group at the University of Hong Kong, RAG-Anything is released under the MIT license and can be installed with pip. It is designed to fit into existing LLM orchestration workflows and supports customizable retrieval strategies. The project has gained community traction, reflecting demand for RAG systems that handle the multimodal content of enterprise documents rather than text alone.

Pricing

Free and open source under the MIT license

Platforms

Python library, pip installable

Related Tools

Hermes Agent

Top Pick

Open-source AI agent framework with persistent memory, reusable skills, tools, and messaging gateways

Hermes Agent is an open-source AI agent framework with persistent memory, reusable skills, 40+ tools, cron jobs, and messaging gateways.

Open Source

Re_gent

Version control for AI coding-agent actions

Re_gent is an open-source version-control layer for AI coding-agent activity. Instead of only reviewing the final Git diff, it records what the agent attempted, changed, and executed along the way so teams can trace, undo, and govern autonomous coding work. It fits Claude Code, Codex, Cursor, and multi-agent teams that need an audit trail between prompt and pull request.

Open Source

agentmemory

Persistent memory layer for AI coding agents — keeps Claude Code, Codex, Cursor, and any MCP agent in context across sessions

agentmemory is an open-source MCP server that gives AI coding agents persistent, cross-session memory. Built on hybrid vector-graph search, it achieves 95.2% recall on the LongMemEval-S benchmark while using up to 92% fewer context tokens than naive context injection. Works out of the box with Claude Code, Codex, Cursor, Windsurf, Cline, OpenCode, Kilo Code, Hermes, and any MCP client through 51 MCP tools plus 12 hooks and 4 skills.

Open Source

fast-agent

MCP, ACP and Skills support for building production coding agents — interactive or automated.

fast-agent is an Apache-licensed Python framework for building and running LLM agents with full MCP (Model Context Protocol) and ACP support. It ships with an interactive shell mode, Skills management, and multi-model routing — making it a practical platform for coding agents, workflow automation, and agent evaluation across Claude, Codex, HuggingFace, and local models.

Open Source

Omnara

Command center for Claude Code and Codex — monitor, steer, and voice-control your AI agents from any device.

Omnara is a command center for AI coding agents, letting you run, monitor, and steer Claude Code and Codex sessions from your phone, web browser, Apple Watch, or any device while the agent runs on your machine. Sessions migrate to the cloud when your laptop goes offline, and the voice-first interface lets you guide your agent hands-free. Built by a YC S25 team and available with a free tier plus paid plans across desktop, web, and mobile clients.

freemium

PageIndex

Vectorless, reasoning-based RAG that reads documents like a human expert — no vector DB, no chunking.

PageIndex is a vectorless, reasoning-based RAG system that builds hierarchical tree indexes from long documents and uses LLMs to navigate them like a human expert would. Instead of chunking text and comparing embeddings, it constructs a table-of-contents-style structure and reasons its way to the right sections — no vector database required. Available as an open-source Python package, cloud API, MCP server, and chat platform.

freemium