# local-first

31 tools tagged

Showing 24 of 31 tools

Accomplish Coworker

Open-source desktop AI coworker for browsing and code execution.

Accomplish Coworker is an MIT-licensed open-source AI coworker that runs on the desktop, combining computer-use style browsing with code execution so agents can research, implement, run, and debug workflows in one local environment.

open-sourceOpen SourceTelemetry

Safari MCP Server

Apple's Safari-native MCP server for web debugging agents

Safari MCP Server is Apple's safaridriver-based MCP server in Safari Technology Preview, giving compatible coding agents local access to Safari page content, console logs, network requests, screenshots, JavaScript evaluation, interactions, viewport controls, and accessibility/performance checks.

freeTelemetry

Headroom

Context compression for LLM apps and coding agents

Headroom is an Apache-2.0 context compression layer for LLM apps and coding agents. It compresses tool output, logs, files, RAG chunks, and agent history through a local library, proxy, wrapper, or MCP server, with retrieval hooks for bringing originals back when needed. Treat its savings numbers as Headroom-reported benchmarks, not independent aicoolies measurements.

open-sourceOpen SourceTelemetry

Windows-MCP

MCP server for controlling Windows desktops through UIAutomation

Windows-MCP is an open-source MCP server for giving AI agents structured access to Windows desktop automation. It focuses on UIAutomation, snapshots, input control, and Windows-specific app workflows, making it different from general filesystem or shell MCP servers.

open-sourceOpen Source

BrowserOS

Open-source agentic browser that runs local AI agents in your browsing workflow.

BrowserOS is a privacy-first, open-source agentic browser for running AI assistants locally inside real browsing sessions instead of handing every task to a remote cloud browser.

open-sourceOpen Source

OpenHuman

Local-first personal AI agent with memory trees, desktop integrations, and private workspace context.

OpenHuman is an open-source, local-first personal AI agent from TinyHumans. It combines a desktop app, persistent memory trees, Obsidian-compatible storage, OAuth integrations, and local model support into a private assistant harness. It is most interesting for users who want agentic workflows and long-term memory without handing every context detail to a fully cloud-hosted assistant.

open-sourceOpen SourceTelemetry

Grok Build

xAI's terminal coding agent with parallel subagents and worktree-aware automation

Grok Build is xAI's terminal-first coding agent for planning, editing, testing, and reviewing code from a local CLI. The early beta exposes subagent controls, worktree mode, headless JSON output, best-of-N parallel attempts, sandbox profiles, and experimental memory. It fits developers comparing Claude Code, Codex, and Gemini CLI for local agentic workflows with deeper parallel execution.

paid

WOZCODE

Cut Claude Code token costs by up to 50% with a local plugin that never uploads your code.

WOZCODE is a Claude Code plugin that reduces token consumption by 25–55% using smarter context reads, batched file edits, AST truncation, and Haiku subagents. It installs in seconds with two CLI commands, runs entirely locally with no code upload, and requires no account sign-up. Developers report finishing the same tasks in fewer tokens without changing their existing editor or workflow.

freemium

exo

Run frontier AI models across a cluster of everyday devices

exo turns multiple local machines into a unified AI compute cluster for models that exceed a single device's memory. It automatically discovers devices, uses topology-aware auto parallelism to split work across available resources, and supports RDMA over Thunderbolt 5 for co-located clusters or standard networking for looser setups. The project exposes OpenAI Chat Completions, Claude Messages, OpenAI Responses, and Ollama-compatible APIs plus a dashboard for cluster management.

open-sourceOpen Source

Lemonade

AMD's open-source local LLM server with GPU and NPU acceleration

Lemonade is AMD's open-source local AI serving platform for LLMs, image generation, speech recognition, and text-to-speech on your own hardware. Built in lightweight C++, it can detect CPU, GPU, and NPU backends and is extra optimized for Ryzen AI, Radeon, and Strix Halo PCs. Lemonade exposes OpenAI, Anthropic, and Ollama-compatible APIs, ships with a desktop model manager, and supports source-confirmed GGUF, FLM, and ONNX models across Windows, Linux, macOS, and Docker.

open-sourceOpen Source

Claude-Mem

Persistent memory plugin for Claude Code with automatic context injection

Claude-Mem is a persistent memory plugin for Claude Code with 44,000+ GitHub stars that captures session context and injects it into future sessions. It features progressive disclosure with token cost visibility, automatic compression, and privacy controls with private tags to manage what gets remembered across coding sessions.

open-sourceOpen Source

Memvid

Single-file memory layer replacing complex RAG for AI agents

Memvid is an open-source single-file memory system for AI agents with 13,700+ GitHub stars. It replaces complex RAG infrastructure with instant retrieval from portable .mv2 files, claiming 35% accuracy improvement over state-of-the-art on LoCoMo benchmarks with 0.025ms P50 latency. Available for Python, Node.js, Rust, and CLI.

open-sourceOpen Source

Dyad

Local open-source AI app builder running entirely on your machine

Dyad is a local-first, open-source AI app builder with 20,000+ GitHub stars that provides a Lovable and Bolt.new alternative running entirely on your machine. It supports React and Next.js frameworks, integrates with Ollama for fully offline AI generation, and works cross-platform on macOS, Windows, and Linux with both cloud and local LLM providers.

open-sourceOpen Source

FastGPT

No-code knowledge base platform with visual AI workflow and built-in RAG

FastGPT is an open-source no-code AI knowledge base platform with 27,000+ GitHub stars and 500,000+ users worldwide. It combines visual workflow orchestration, built-in RAG pipelines, QA-pair extraction, and API-aligned completions into a single deployable stack that runs on just 2GB RAM via Docker one-liner deployment.

free

Unsloth

2x faster LLM fine-tuning with 70% less VRAM on a single GPU

Unsloth is an open-source framework for fine-tuning large language models up to 2x faster while using 70% less VRAM. Built with custom Triton kernels, it supports 500+ model architectures including Llama 4, Qwen 3, and DeepSeek on consumer NVIDIA GPUs. Unsloth Studio adds a no-code web UI for dataset creation, training observability, model comparison, and GGUF export for Ollama and vLLM deployment.

open-sourceOpen Source

Lume

macOS and Linux VM runtime for AI agents on Apple Silicon

Lume is an open-source CLI for creating and managing macOS and Linux virtual machines on Apple Silicon, built specifically for AI agent sandboxing, CI/CD pipelines, and desktop automation. Using Apple's native Virtualization.Framework for near-native performance, it provides the missing isolation layer for running coding agents safely — so an accidental destructive command doesn't affect your host machine.

open-sourceOpen Source

LightRAG

Knowledge graph-powered RAG framework from HKU

LightRAG is a research-backed RAG framework from Hong Kong University that combines knowledge graph structures with vector search for more contextual retrieval. Published at EMNLP 2025, it extracts entities and relationships from documents to build a structured knowledge graph, then uses dual-level retrieval across both graph and vector representations with five query modes: naive, local, global, hybrid, and mix.

open-sourceOpen Source

Beszel

Lightweight server monitoring with Docker stats and alerts

Beszel is a lightweight, self-hosted server monitoring platform built in Go that tracks CPU, memory, disk, network, GPU, temperature, and Docker container metrics with historical data visualization and configurable alerts. Its simple hub-and-agent architecture deploys in minutes and consumes minimal resources compared to traditional monitoring stacks like Prometheus and Grafana.

open-sourceOpen Source

OpenClaw

Open-source personal AI agent for messaging apps

OpenClaw is a free, open-source AI agent framework that turns any LLM into an autonomous personal assistant accessible through messaging apps like WhatsApp, Telegram, Discord, and Signal. Running entirely on your local machine via a Node.js gateway, it connects AI models to system tools, browsers, files, and APIs for multi-step task execution with persistent memory across sessions.

open-sourceOpen Source

Microsandbox

Local microVM sandboxes for AI agent code execution

Microsandbox provides hardware-level isolated sandboxes for AI agents to execute code safely on local machines. Using libkrun microVMs and a 320ms bare-metal Linux/KVM homepage benchmark, it offers stronger isolation than Docker containers while staying lightweight enough for dev workstations. OCI-compatible with Python and Node.js runtimes. Apache-2.0 licensed with 6.6K+ GitHub stars.

open-sourceOpen Source

Hyprnote

Local-first AI notepad for meetings and voice notes

Hyprnote is a local-first AI notepad designed for capturing and processing meeting notes and voice recordings. It runs entirely on-device for privacy, transcribes audio using local models, and generates structured summaries, action items, and follow-ups. Built with Rust and Tauri for native desktop performance. Over 8,000 GitHub stars with strong privacy-focused community adoption.

open-sourceOpen Source

Khoj

Open-source AI second brain with deep research and RAG

Khoj is an open-source personal AI app that serves as a self-hostable second brain. It connects to your documents — PDFs, Markdown, Notion, Word — and uses RAG to answer questions grounded in your knowledge base. Supports any local or cloud LLM including Llama, Claude, GPT, and Gemini. Features custom agents, scheduled automations, deep research mode, semantic search, and Obsidian, Emacs, and WhatsApp integrations. Over 33,000 GitHub stars, YC-backed.

freemiumOpen Source

Open WebUI

Self-hosted AI platform with ChatGPT-like interface for local and cloud LLMs.

Extensible, self-hosted AI platform with 290M+ Docker pulls and 124K+ GitHub stars. Supports Ollama, OpenAI-compatible APIs, and any Chat Completions backend. Features built-in RAG, multi-user RBAC, voice/video calls, Python function workspace, model builder, and web browsing. Runs entirely offline with enterprise features including SSO and audit logging.

free

LM Studio

Run local LLMs with an intuitive desktop GUI and OpenAI-compatible API server.

Free desktop application by Element Labs for discovering, downloading, and running open-source LLMs locally. Features a curated Hugging Face model browser, side-by-side model comparison, parameter tuning, and an OpenAI-compatible API server on localhost:1234. Powered by llama.cpp with Metal acceleration for Apple Silicon.

free