11 tools tagged
On-device AI inference engine for mobile and wearable applications
Cactus is a YC-backed open-source inference engine built specifically for running LLMs, vision models, and embeddings on smartphones, tablets, and wearables. It provides native SDKs for iOS, Android, Flutter, and React Native with optimized execution paths for ARM CPUs and the Apple Neural Engine. The project claims the fastest inference speeds on ARM processors and up to 10x lower RAM usage than generic runtimes, enabling privacy-first AI applications that run entirely on-device.
Microsoft's framework for running 1-bit large language models on consumer CPUs
BitNet is Microsoft's official inference framework for 1-bit quantized large language models that enables running models with up to 100 billion parameters on standard consumer CPUs without requiring a GPU. By leveraging extreme quantization where weights use only 1.58 bits on average, BitNet achieves dramatic reductions in memory footprint and computational cost while maintaining competitive output quality for many practical use cases.
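The "1.58 bits" figure comes from BitNet's b1.58 variant, where each weight takes one of three values {-1, 0, +1}: a ternary symbol carries log2(3) ≈ 1.58 bits of information. A quick back-of-the-envelope check of that figure and the ideal memory saving versus FP16 weights:

```python
import math

# A ternary weight {-1, 0, +1} carries log2(3) bits of information.
bits_per_weight = math.log2(3)

# Rough weight-memory ratio versus 16-bit floating point
# (ignores packing overhead and non-weight tensors).
fp16_bits = 16
compression = fp16_bits / bits_per_weight

print(round(bits_per_weight, 2))  # ~1.58 bits per weight
print(round(compression, 1))      # ~10.1x smaller in the ideal case
```

Real-world savings are somewhat lower because activations, embeddings, and packing overhead are not ternary.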
AMD's open-source local LLM server with GPU and NPU acceleration
Lemonade is AMD's open-source local AI serving platform that runs LLMs, image generation, speech recognition, and text-to-speech directly on your hardware. Built in lightweight C++, it automatically detects and configures optimal CPU, GPU, and NPU backends. Lemonade exposes an OpenAI-compatible API so existing applications work without code changes, and ships with a desktop app for model management and testing. Supports GGUF, ONNX, and SafeTensors across Windows, Linux, macOS, and Docker.
Local-first AI notepad for meetings and voice notes
Hyprnote is a local-first AI notepad designed for capturing and processing meeting notes and voice recordings. It runs entirely on-device for privacy, transcribes audio using local models, and generates structured summaries, action items, and follow-ups. Built with Rust and Tauri for native desktop performance. Over 8,000 GitHub stars with strong privacy-focused community adoption.
Self-hosted AI platform with ChatGPT-like interface for local and cloud LLMs
Extensible, self-hosted AI platform with 290M+ Docker pulls and 124K+ GitHub stars. Supports Ollama, OpenAI-compatible APIs, and any Chat Completions backend. Features built-in RAG, multi-user RBAC, voice/video calls, Python function workspace, model builder, and web browsing. Runs entirely offline with enterprise features including SSO and audit logging.
Run local LLMs with an intuitive desktop GUI and OpenAI-compatible API server
Free desktop application by Element Labs for discovering, downloading, and running open-source LLMs locally. Features a curated Hugging Face model browser, side-by-side model comparison, parameter tuning, and an OpenAI-compatible API server on localhost:1234. Powered by llama.cpp with Metal acceleration for Apple Silicon.
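Because the local server speaks the OpenAI Chat Completions wire format, any HTTP client can talk to it. A minimal sketch of the request it expects, using only the standard library; the model name "local-model" is a placeholder for whatever model you have loaded:

```python
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # the local server's default address

def build_chat_request(prompt: str, model: str = "local-model") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat-completion request."""
    body = json.dumps({
        "model": model,  # placeholder name; substitute your loaded model
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("Say hello in one word.")
# Send with urllib.request.urlopen(req) once the server is running.
```

The same request shape works against any of the OpenAI-compatible servers listed here, with only the base URL changed.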
Extensible code editor with a vast plugin marketplace
Microsoft's free, extensible code editor, consistently the most-used editor in developer surveys. Rich extension marketplace, integrated terminal, Git support, and IntelliSense code completion. Supports virtually every programming language and framework. The default editor for millions of developers worldwide, balancing lightweight performance with full-featured development capabilities through its massive plugin ecosystem.
Professional IDE suite for every major language
Full-featured IDE suite from JetBrains covering every major programming language — IntelliJ IDEA for Java/Kotlin, PyCharm for Python, WebStorm for JavaScript, and more. Known for deep code intelligence, powerful refactoring, and built-in database tools. The professional choice for developers who need comprehensive language-specific tooling with AI assistance integrated directly into their workflow.
Hyperextensible Vim-based text editor
Modern fork of Vim with Lua-based plugin architecture and built-in LSP support. Lightweight yet endlessly extensible, it serves as the foundation for AI-enhanced development workflows through plugins like Avante, Codecompanion, and Copilot.lua. With 85k+ GitHub stars, it's the terminal editor of choice for developers who value speed, customization, and keyboard-driven efficiency.
Run LLMs locally with one command
Tool for running large language models locally on your machine with a simple CLI interface. Download and run Llama 3, Mistral, Gemma, Phi, Code Llama, and dozens of other open-source models with a single command. Features model management, GPU acceleration (NVIDIA/AMD/Apple Silicon), OpenAI-compatible API server, Modelfile for customization, and multi-model switching. Ideal for offline AI development, privacy-sensitive use cases, and local testing. 120K+ GitHub stars.
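A Modelfile uses a Dockerfile-like syntax to define a customized model variant; a minimal sketch, where the base model, temperature, and system prompt are illustrative:

```
FROM llama3
PARAMETER temperature 0.7
SYSTEM """You are a concise assistant."""
```

Build and run it with `ollama create concise -f Modelfile` followed by `ollama run concise`.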
A fast, cross-platform terminal
Self-described 'fastest terminal emulator in existence': a GPU-accelerated, cross-platform terminal written in Rust, focused on simplicity and performance. No tabs, splits, or built-in multiplexer; it is designed to pair with tmux or Zellij. Configured via TOML (YAML before version 0.13) with a minimal feature set that prioritizes speed above all else. Supports true color, Vi mode, regex search, and clickable URLs. Available on macOS, Linux, Windows, and BSD. 57K+ GitHub stars.
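Configuration lives in a single TOML file, typically `~/.config/alacritty/alacritty.toml`; a minimal sketch with illustrative values:

```toml
[font]
size = 13.0

[window]
opacity = 0.95
padding = { x = 8, y = 8 }
```

Live config reload is enabled by default, so edits take effect without restarting the terminal.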