Docling

Get your documents ready for gen AI

Open Source
Docling is an open-source document processing toolkit by IBM Research that converts complex documents into structured formats optimized for generative AI applications. It parses PDF, DOCX, PPTX, XLSX, HTML, images, audio, and LaTeX, and its advanced PDF understanding covers layout analysis, reading order detection, and table structure recognition. Docling exports to Markdown, HTML, JSON, and DocTags, and integrates natively with LangChain, LlamaIndex, and other AI frameworks for RAG workflows.

Docling was developed by IBM Research Zurich and is now hosted under the Linux Foundation's AI and Data Foundation. It converts unstructured documents into structured, machine-readable formats that large language models and foundation models can consume directly. With over 56,000 GitHub stars and more than 100 releases, Docling has become one of the most popular open-source document intelligence tools, and developers frequently praise its output quality relative to other solutions.

The toolkit's PDF understanding goes beyond simple OCR: computer vision models recognize and categorize the visual elements on each page, handling page layout, reading order, table structure, code blocks, formulas, and image classification. Supported input formats include PDF, DOCX, PPTX, XLSX, HTML, images, audio files, LaTeX, and plain text. Output can be exported as Markdown, HTML, JSON, WebVTT, or DocTags, Docling's own markup format designed for LLM readability. The companion Granite-Docling vision-language model performs end-to-end document conversion in a single pass with just 258 million parameters.

Docling offers a command-line interface and a Python API, and it is lightweight enough to run on a standard laptop, with Apple Silicon acceleration via MLX. It integrates with LangChain, LlamaIndex, and other popular AI frameworks for retrieval-augmented generation and question-answering applications. For enterprise deployments, the Docling OpenShift Operator enables large-scale document ingestion on Kubernetes clusters, using Ray Data for distributed processing. An MCP server component lets AI agents call Docling's conversion capabilities directly within agentic workflows.
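
As a rough illustration of the Python API mentioned above, the sketch below wraps a single conversion in a small helper. `DocumentConverter`, `convert`, and `export_to_markdown` are the calls Docling documents for basic usage; the helper name `convert_to_markdown`, its optional `converter` parameter, and the file name in the usage note are assumptions made for this example, not part of Docling.

```python
def convert_to_markdown(source, converter=None):
    """Convert one document (local path or URL) to a Markdown string.

    `converter` is a hypothetical injection point added for this sketch so the
    helper can be tried with a stand-in object; by default a Docling
    DocumentConverter is created.
    """
    if converter is None:
        # Imported lazily so the module loads even where Docling is not installed.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()

    # convert() returns a result whose `document` holds the parsed structure.
    result = converter.convert(source)
    return result.document.export_to_markdown()
```

With Docling installed, `print(convert_to_markdown("report.pdf"))` would print the Markdown for a local file; `report.pdf` is a placeholder name. The resulting string can then be chunked and indexed by whichever RAG framework is in use.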

Pricing

Free and open source under the MIT license

Platforms

Python, CLI, Docker, Kubernetes, Apple Silicon MLX support

Related Tools

Codebase Memory MCP

Codebase knowledge graph MCP server for AI coding agents

Codebase Memory MCP is an MIT-licensed MCP server that turns a repository into a persistent code knowledge graph for AI coding agents. It gives Claude Code, Cursor, Codex-style agents, and other MCP clients structural queries for functions, classes, call chains, routes, and architecture, helping them explore large projects without repeatedly rereading files or relying only on broad search.

Open Source, Telemetry
Unabyss

MCP-native personal context vault for keeping AI agents aligned with your work, voice, and projects.

Unabyss is a personal context headquarters for AI agents. It syncs sources such as email, Slack, Notion, Drive, meetings, and professional profiles into structured context files that can be served to MCP-capable clients. The strongest angle is not generic note taking; it is permissioned, reusable context for Claude, Cursor, custom agents, and other tools that otherwise need the same background explained repeatedly.

Freemium, Telemetry
tbls

CI-friendly database documentation generator

tbls is an open-source database documentation tool that automatically generates schema documentation in Markdown, with built-in linting to enforce documentation standards and coverage metrics for tables and columns. It supports 13+ databases including PostgreSQL, MySQL, BigQuery, Snowflake, MongoDB, and ClickHouse. Designed for CI integration with GitHub Actions support, tbls runs schema diff detection and documentation enforcement as part of automated pipelines.

Open Source

Context Engineering Intro

Context engineering patterns for AI coding assistants

Context Engineering Intro is an open-source repository by Cole Medin providing structured context engineering patterns for AI coding assistants. Built around Claude Code, it includes .claude command files, PRP templates, and the WISC framework for managing AI context in coding sessions. The repo shows how to structure project context and rules so AI assistants produce reliable, architecture-aware code. With 13K+ GitHub stars, it is a go-to reference for context-first AI coding.

Open Source
Quarkdown

Programmable Markdown typesetting for docs, books, and slides

Quarkdown is a Turing-complete Markdown typesetting system that compiles a single source into print-ready books, academic papers, knowledge bases, or interactive presentations. It extends Markdown with a built-in scripting language featuring functions, variables, and a standard library for full document control. It supports HTML, PDF, and plain text output, with live preview and real-time reloading during authoring.

free

QMD

On-device hybrid search engine for your docs and notes

QMD is an on-device search engine built by Tobi Lütke (Shopify CEO) that indexes markdown notes, meeting transcripts, and documentation locally. It combines BM25 full-text search, vector semantic search, and LLM-powered re-ranking into a single hybrid pipeline. It ships with a built-in MCP server for integration with Claude Code, Cursor, and other AI editors. All processing happens on your machine via node-llama-cpp with GGUF models, with no cloud dependency.

free