# local-first
31 tools tagged
Showing 24 of 31 tools
Accomplish Coworker
Open-source desktop AI coworker for browsing and code execution.
Accomplish Coworker is an MIT-licensed open-source AI coworker that runs on the desktop, combining computer-use style browsing with code execution so agents can research, implement, run, and debug workflows in one local environment.
Safari MCP Server
Apple's Safari-native MCP server for web debugging agents
Safari MCP Server is Apple's safaridriver-based MCP server in Safari Technology Preview, giving compatible coding agents local access to Safari page content, console logs, network requests, screenshots, JavaScript evaluation, interactions, viewport controls, and accessibility/performance checks.
Headroom
Context compression for LLM apps and coding agents
Headroom is an Apache-2.0 context compression layer for LLM apps and coding agents. It compresses tool output, logs, files, RAG chunks, and agent history through a local library, proxy, wrapper, or MCP server, with retrieval hooks for bringing originals back when needed. Treat its savings numbers as Headroom-reported benchmarks, not independent aicoolies measurements.
Windows-MCP
MCP server for controlling Windows desktops through UIAutomation
Windows-MCP is an open-source MCP server for giving AI agents structured access to Windows desktop automation. It focuses on UIAutomation, snapshots, input control, and Windows-specific app workflows, making it different from general filesystem or shell MCP servers.
BrowserOS
Open-source agentic browser that runs local AI agents in your browsing workflow.
BrowserOS is a privacy-first, open-source agentic browser for running AI assistants locally inside real browsing sessions instead of handing every task to a remote cloud browser.
OpenHuman
Local-first personal AI agent with memory trees, desktop integrations, and private workspace context.
OpenHuman is an open-source, local-first personal AI agent from TinyHumans. It combines a desktop app, persistent memory trees, Obsidian-compatible storage, OAuth integrations, and local model support into a private assistant harness. It is most interesting for users who want agentic workflows and long-term memory without handing every context detail to a fully cloud-hosted assistant.
Grok Build
xAI's terminal coding agent with parallel subagents and worktree-aware automation
Grok Build is xAI's terminal-first coding agent for planning, editing, testing, and reviewing code from a local CLI. The early beta exposes subagent controls, worktree mode, headless JSON output, best-of-N parallel attempts, sandbox profiles, and experimental memory. It fits developers comparing Claude Code, Codex, and Gemini CLI for local agentic workflows with deeper parallel execution.
WOZCODE
Cut Claude Code token costs by up to 50% with a local plugin that never uploads your code.
WOZCODE is a Claude Code plugin that reduces token consumption by 25–55% using smarter context reads, batched file edits, AST truncation, and Haiku subagents. It installs in seconds with two CLI commands, runs entirely locally with no code upload, and requires no account sign-up. Developers report finishing the same tasks in fewer tokens without changing their existing editor or workflow.
exo
Run frontier AI models across a cluster of everyday devices
exo turns multiple local machines into a unified AI compute cluster for models that exceed a single device's memory. It automatically discovers devices, uses topology-aware auto parallelism to split work across available resources, and supports RDMA over Thunderbolt 5 for co-located clusters or standard networking for looser setups. The project exposes OpenAI Chat Completions, Claude Messages, OpenAI Responses, and Ollama-compatible APIs plus a dashboard for cluster management.
Lemonade
AMD's open-source local LLM server with GPU and NPU acceleration
Lemonade is AMD's open-source local AI serving platform for LLMs, image generation, speech recognition, and text-to-speech on your own hardware. Built in lightweight C++, it can detect CPU, GPU, and NPU backends and is extra optimized for Ryzen AI, Radeon, and Strix Halo PCs. Lemonade exposes OpenAI, Anthropic, and Ollama-compatible APIs, ships with a desktop model manager, and supports source-confirmed GGUF, FLM, and ONNX models across Windows, Linux, macOS, and Docker.
Claude-Mem
Persistent memory plugin for Claude Code with automatic context injection
Claude-Mem is a persistent memory plugin for Claude Code with 44,000+ GitHub stars that captures session context and injects it into future sessions. It features progressive disclosure with token cost visibility, automatic compression, and privacy controls with private tags to manage what gets remembered across coding sessions.
Memvid
Single-file memory layer replacing complex RAG for AI agents
Memvid is an open-source single-file memory system for AI agents with 13,700+ GitHub stars. It replaces complex RAG infrastructure with instant retrieval from portable .mv2 files, claiming 35% accuracy improvement over state-of-the-art on LoCoMo benchmarks with 0.025ms P50 latency. Available for Python, Node.js, Rust, and CLI.
Dyad
Local open-source AI app builder running entirely on your machine
Dyad is a local-first, open-source AI app builder with 20,000+ GitHub stars that provides a Lovable and Bolt.new alternative running entirely on your machine. It supports React and Next.js frameworks, integrates with Ollama for fully offline AI generation, and works cross-platform on macOS, Windows, and Linux with both cloud and local LLM providers.
FastGPT
No-code knowledge base platform with visual AI workflow and built-in RAG
FastGPT is an open-source no-code AI knowledge base platform with 27,000+ GitHub stars and 500,000+ users worldwide. It combines visual workflow orchestration, built-in RAG pipelines, QA-pair extraction, and API-aligned completions into a single deployable stack that runs on just 2GB RAM via Docker one-liner deployment.
Unsloth
2x faster LLM fine-tuning with 70% less VRAM on a single GPU
Unsloth is an open-source framework for fine-tuning large language models up to 2x faster while using 70% less VRAM. Built with custom Triton kernels, it supports 500+ model architectures including Llama 4, Qwen 3, and DeepSeek on consumer NVIDIA GPUs. Unsloth Studio adds a no-code web UI for dataset creation, training observability, model comparison, and GGUF export for Ollama and vLLM deployment.
Lume
macOS and Linux VM runtime for AI agents on Apple Silicon
Lume is an open-source CLI for creating and managing macOS and Linux virtual machines on Apple Silicon, built specifically for AI agent sandboxing, CI/CD pipelines, and desktop automation. Using Apple's native Virtualization.Framework for near-native performance, it provides the missing isolation layer for running coding agents safely — so an accidental destructive command doesn't affect your host machine.
LightRAG
Knowledge graph-powered RAG framework from HKU
LightRAG is a research-backed RAG framework from Hong Kong University that combines knowledge graph structures with vector search for more contextual retrieval. Published at EMNLP 2025, it extracts entities and relationships from documents to build a structured knowledge graph, then uses dual-level retrieval across both graph and vector representations with five query modes: naive, local, global, hybrid, and mix.
Beszel
Lightweight server monitoring with Docker stats and alerts
Beszel is a lightweight, self-hosted server monitoring platform built in Go that tracks CPU, memory, disk, network, GPU, temperature, and Docker container metrics with historical data visualization and configurable alerts. Its simple hub-and-agent architecture deploys in minutes and consumes minimal resources compared to traditional monitoring stacks like Prometheus and Grafana.
OpenClaw
Open-source personal AI agent for messaging apps
OpenClaw is a free, open-source AI agent framework that turns any LLM into an autonomous personal assistant accessible through messaging apps like WhatsApp, Telegram, Discord, and Signal. Running entirely on your local machine via a Node.js gateway, it connects AI models to system tools, browsers, files, and APIs for multi-step task execution with persistent memory across sessions.
Microsandbox
Local microVM sandboxes for AI agent code execution
Microsandbox provides hardware-level isolated sandboxes for AI agents to execute code safely on local machines. Using libkrun microVMs and a 320ms bare-metal Linux/KVM homepage benchmark, it offers stronger isolation than Docker containers while staying lightweight enough for dev workstations. OCI-compatible with Python and Node.js runtimes. Apache-2.0 licensed with 6.6K+ GitHub stars.
Hyprnote
Local-first AI notepad for meetings and voice notes
Hyprnote is a local-first AI notepad designed for capturing and processing meeting notes and voice recordings. It runs entirely on-device for privacy, transcribes audio using local models, and generates structured summaries, action items, and follow-ups. Built with Rust and Tauri for native desktop performance. Over 8,000 GitHub stars with strong privacy-focused community adoption.
Khoj
Open-source AI second brain with deep research and RAG
Khoj is an open-source personal AI app that serves as a self-hostable second brain. It connects to your documents — PDFs, Markdown, Notion, Word — and uses RAG to answer questions grounded in your knowledge base. Supports any local or cloud LLM including Llama, Claude, GPT, and Gemini. Features custom agents, scheduled automations, deep research mode, semantic search, and Obsidian, Emacs, and WhatsApp integrations. Over 33,000 GitHub stars, YC-backed.
Open WebUI
Self-hosted AI platform with ChatGPT-like interface for local and cloud LLMs.
Extensible, self-hosted AI platform with 290M+ Docker pulls and 124K+ GitHub stars. Supports Ollama, OpenAI-compatible APIs, and any Chat Completions backend. Features built-in RAG, multi-user RBAC, voice/video calls, Python function workspace, model builder, and web browsing. Runs entirely offline with enterprise features including SSO and audit logging.
LM Studio
Run local LLMs with an intuitive desktop GUI and OpenAI-compatible API server.
Free desktop application by Element Labs for discovering, downloading, and running open-source LLMs locally. Features a curated Hugging Face model browser, side-by-side model comparison, parameter tuning, and an OpenAI-compatible API server on localhost:1234. Powered by llama.cpp with Metal acceleration for Apple Silicon.