LightRAG is a retrieval-augmented generation framework from HKU Data Science that combines knowledge-graph structures with vector retrieval. Published around the EMNLP 2025 paper with roughly 36K+ GitHub stars at write time, it addresses a core limitation of ordinary RAG systems: flat vector chunks often miss the relationships between entities, systems, people, and concepts.
The framework extracts entities and relationships from documents, stores them alongside vector representations, and supports multiple retrieval modes for local graph traversal, global graph context, hybrid retrieval, and broader mixed strategies. It also supports incremental updates so new documents can be added without rebuilding an entire knowledge graph, which is important for living knowledge bases.
LightRAG is MIT licensed and works with a broad set of storage backends including PostgreSQL, MongoDB, Neo4j, Milvus, Qdrant, ChromaDB, Faiss, and related options. The ecosystem includes RAG-Anything for multimodal documents, tables, formulas, and images. Extraction quality depends on the selected LLM and corpus complexity, so production teams should validate graph quality instead of assuming a fixed model-size rule.