Airweave

Context retrieval layer for AI agents and RAG

Freemium · Open Source

Airweave is an open-source context retrieval platform that connects AI agents and RAG systems to 50+ apps and databases through a unified search interface. It continuously syncs data from sources like Notion, Slack, GitHub, and databases, making it searchable through LLM-friendly APIs. Airweave includes Python and TypeScript SDKs, MCP support, and a CLI for managing data connections.

Airweave solves the data connectivity problem that every RAG system and AI agent faces: getting up-to-date information from the dozens of tools and databases where an organization's knowledge actually lives. Instead of building custom integrations for each data source, Airweave provides a single platform that connects to over 50 services including Notion, Slack, GitHub, Jira, Google Drive, Confluence, and various databases, then continuously syncs and indexes that data for retrieval.

The platform handles the complete pipeline from data extraction through chunking, embedding, and indexing, exposing the results through a unified search API that AI agents can query naturally. MCP server support means coding agents like Claude Code and Cursor can access organizational knowledge directly. The sync engine runs incrementally, updating only changed data to minimize compute and API costs. Developers configure connections through a web dashboard, CLI, or programmatically via Python and TypeScript SDKs.
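The incremental sync idea described above can be illustrated with a small, generic sketch. This is not Airweave's code or SDK: the names `SyncState`, `content_hash`, `chunk`, and `incremental_sync` are hypothetical, the "index" is a plain dictionary, and the embedding step is left out. It only shows one common way to skip unchanged documents, assuming change detection by content hash.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SyncState:
    # Hypothetical bookkeeping: document ID -> hash of the content last indexed.
    seen: dict[str, str] = field(default_factory=dict)


def content_hash(text: str) -> str:
    """Return a stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def chunk(text: str, size: int = 200) -> list[str]:
    """Split text into fixed-size character chunks (a deliberately naive strategy)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def incremental_sync(
    source: dict[str, str],
    state: SyncState,
    index: dict[str, list[str]],
) -> list[str]:
    """Re-index only new or changed documents and return their IDs."""
    changed = []
    for doc_id, text in source.items():
        digest = content_hash(text)
        if state.seen.get(doc_id) == digest:
            continue  # unchanged since the last sync, so no work is done
        index[doc_id] = chunk(text)  # a real pipeline would also embed each chunk
        state.seen[doc_id] = digest
        changed.append(doc_id)

    # Drop documents that no longer exist in the source.
    for doc_id in list(state.seen):
        if doc_id not in source:
            del state.seen[doc_id]
            index.pop(doc_id, None)
    return changed
```

Running the function twice over the same source does no re-indexing on the second pass, which is where the savings in compute and upstream API calls would come from in a design like this.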

Backed by Y Combinator X25 with over 6,200 GitHub stars and an MIT license, Airweave has gained traction among teams building AI products that need access to real-world business data. The project maintains an aggressive release cadence with 457 releases and nearly 5,000 commits. For organizations implementing RAG or building AI agents that need to answer questions about internal data, Airweave provides the data plumbing that eliminates months of custom integration work.

Pricing

Free open source under MIT — cloud plans available

Platforms

Docker, self-hosted — Python and TypeScript SDKs

Related Tools

Deep Lake

AI data runtime for multimodal datasets and vector search

Deep Lake is an open-source AI data runtime from Activeloop for storing, versioning, and querying multimodal data and embeddings. It fits teams building RAG, training, evaluation, or dataset-heavy agent workflows that need a bridge between vector search, structured metadata, and large image, text, audio, or video collections.

Open Source
SeekDB

AI-native state store with hybrid vector and full-text search

SeekDB is an open-source AI-native state store from the OceanBase ecosystem that combines MySQL-compatible data access with hybrid vector and full-text retrieval. It targets agent and AI application teams that need embedded or server deployment, copy-on-write style sandboxes, and searchable state without gluing together several separate storage layers.

Open Source
Marqo

Embedding-first search and discovery engine for AI-powered product experiences.

Marqo is an open-source tensor search engine that combines embedding generation and vector search in a single API, removing the need to manage separate embedding pipelines and vector databases. Built for product discovery and multi-modal search, it lets teams index text, images, and structured data together, returning ranked results based on semantic similarity rather than keyword overlap.

freemium
VectorChord

High-recall Postgres vector search at billion scale

VectorChord is a Postgres extension from TensorChord that brings high-recall vector search to PostgreSQL. As the successor to pgvecto.rs, it combines IVF indexes with RaBitQ quantization to deliver Pinecone-class performance at billion-vector scale while keeping all data inside a single Postgres database: no separate vector store, no two-system sync, and no rewrites when the workload grows.

Open Source
Infinity

AI-native database for hybrid RAG retrieval

Infinity is an AI-native database from InfiniFlow that unifies dense vectors, sparse vectors, tensors, and full-text search in a single engine. Built for retrieval-augmented generation (RAG) at scale, it powers hybrid search workflows where lexical matching, semantic similarity, and reranking all happen against one storage layer instead of four loosely coupled services.

Open Source
Magika

AI-powered file-type detection at Google scale

Magika is an open-source, AI-powered file-type detection tool from Google. It uses a custom deep-learning model of only a few megabytes to identify more than 200 binary and textual content types in milliseconds, even on a single CPU. Magika ships as a CLI, Python package, JavaScript/TypeScript library, and an ONNX model, achieves around 99% accuracy on its test set, and is already used at Google scale across Gmail, Drive, and Safe Browsing as well as by VirusTotal and abuse.ch.

Free · Open Source