aicoolies logo

# llm

90 tools tagged

Showing 24 of 90 tools

KubeAI

Kubernetes operator for serving AI inference workloads

KubeAI is an Apache-2.0 Kubernetes operator for deploying and scaling AI inference workloads, including LLMs, embeddings, reranking, and speech-to-text. It gives platform teams OpenAI-compatible endpoints, model proxy/controller primitives, model caching, scale-from-zero behavior, and cluster-native resource management for self-hosted inference on Kubernetes.

open-sourceOpen Source
agentmemory logo

agentmemory

Persistent memory layer for AI coding agents — keeps Claude Code, Codex, Cursor, and any MCP agent in context across sessions

agentmemory is an open-source MCP server that gives AI coding agents persistent, cross-session memory. Built on hybrid vector-graph search, it achieves 95.2% recall on the LongMemEval-S benchmark while using up to 92% fewer context tokens than naive context injection. Works out of the box with Claude Code, Codex, Cursor, Windsurf, Cline, OpenCode, Kilo Code, Hermes, and any MCP client through 51 MCP tools plus 12 hooks and 4 skills.

open-sourceOpen Source
Judgeval logo

Judgeval

Open-source post-building layer for agents — tracing, evals, and online monitoring

Judgeval is the open-source post-building layer for AI agents from Judgment Labs, providing OpenTelemetry-based tracing, hosted and custom evaluation scorers, and online behavior monitoring for LLM-powered applications. Instrument any function with a single decorator, score live production traffic against faithfulness and instruction-adherence checks, and feed real-world failures back into reinforcement learning or supervised fine-tuning loops.

open-sourceOpen Source
TraceRoot logo

TraceRoot

Open-source observability and self-healing layer for AI agents

TraceRoot is a YC S25-backed open-source observability platform purpose-built for AI agents and LLM apps. It combines OpenTelemetry-compatible tracing with an agentic debugging runtime that reads your source code, correlates failures with recent commits, and proposes fix PRs automatically. BYOK support spans seven LLM providers; the entire stack runs self-hosted via Docker Compose, with TraceRoot Cloud available for managed deployments.

open-sourceOpen Source
GraphBit logo

GraphBit

Rust-native multi-agent orchestration for production

GraphBit is a Rust-native, multi-agent orchestration framework built for production. It targets the gap between Python-first frameworks like LangGraph and the operational expectations of enterprise systems — predictable memory, low latency, deterministic concurrency, and the ability to embed an agent runtime in services that already run Rust without dragging in a Python interpreter.

open-sourceOpen Source
OpenSRE logo

OpenSRE

Open-source toolkit for building AI SRE incident response agents

OpenSRE is Tracer Cloud’s open-source public-alpha Python toolkit for building AI SRE agents that investigate and respond to production incidents. It ships 60+ tools across observability, databases, incident management, communications, deployment and protocol integrations, plus simulation/evaluation workflows for benchmarking agent accuracy before live pager use.

open-sourceOpen Source
Evolver logo

Evolver

Self-evolution engine for AI agents with auditable updates

Evolver is an open-source self-evolution engine for AI agents that turns run logs into auditable, reviewable updates via its Genome Evolution Protocol. Instead of ad hoc prompt tweaking, teams collect traces and Evolver proposes versioned diffs to prompts, tools and workflows that engineers can approve, reject or roll back like code.

open-sourceOpen Source
genericagent logo

GenericAgent

Self-evolving local computer agent with a reusable skill tree

GenericAgent is a minimal, self-evolving autonomous agent from a 3.3K-line seed and ~3K core loop that gives LLMs system-level control of a local computer. It writes files, runs shell commands, browses the web, and uses keyboard/mouse/screen/mobile tools, while skill crystallization saves successful runs into a reusable skill tree that cuts token cost on repeats.

open-sourceOpen Source
Open SWE logo

Open SWE

Open-source async coding agent you can run in your own sandbox

Open-source framework from LangChain AI for building your organization's internal coding agent — the same pattern Stripe's Minions, Ramp's Inspect, and Coinbase's Cloudbot follow. Built on LangGraph and Deep Agents, Open SWE runs each task in an isolated cloud sandbox (Modal, Daytona, Runloop, or LangSmith), invokes from Slack, Linear, or GitHub, orchestrates subagents, and opens pull requests autonomously — customizable end-to-end for your codebase and conventions.

freeOpen Source
Rig logo

Rig

Build modular, scalable LLM applications in Rust

Open-source Rust library for building scalable, modular, and ergonomic LLM-powered applications. Rig unifies 20+ model providers (OpenAI, Anthropic, Mistral, DeepSeek, Ollama, and more) and 10+ vector stores behind one trait-based interface, supports completion and embedding workflows, multi-turn streaming, and transcription/audio/image generation, with full GenAI Semantic Convention compatibility and WASM-ready core library — production agentic infra for Rust teams.

freeOpen Source
Open Agents logo

Open Agents

Fork, customize, and ship AI agents on Vercel in minutes

Open Agents is a Vercel Labs open-source template for building and deploying cloud-hosted AI agents. It provides a production-ready Next.js starter with built-in tool use, streaming responses, multi-model support, and deployment on Vercel infrastructure. Developers can fork, customize agent behavior and tools, then ship agent-backed apps in minutes with automatic scaling and edge routing.

freeOpen Source
Guidance logo

Guidance

Constrained generation that guarantees valid LLM outputs every time

Guidance is Microsoft's structured generation library that enforces output constraints directly within LLM decoding. It supports JSON schemas, regex patterns, grammars, and interleaved generation-and-control flow to guarantee valid outputs from any compatible model. Works with local models via llama.cpp, Transformers, and remote APIs including OpenAI and Anthropic. Eliminates retry loops and post-processing for structured data extraction.

freeOpen Source
Chatbox logo

Chatbox

One desktop app for every LLM — private, cross-platform, extensible

Chatbox is a cross-platform desktop AI client supporting OpenAI, Claude, Gemini, DeepSeek, and local models via Ollama. All chat data stays on-device, making it ideal for privacy-conscious developers. Features include document analysis, code assistance with syntax highlighting, image generation, web search, and a local knowledge base for private Q&A. Available on Windows, macOS, Linux, Android, iOS, and web.

freemiumOpen Source
TaxHacker logo

TaxHacker

Self-hosted AI accounting for freelancers and small teams

TaxHacker is an open-source, self-hosted AI accounting app that automatically extracts financial data from receipts, invoices, and bank statements using LLMs. It supports 170+ currencies and 14 cryptocurrencies with historical exchange rate conversion, multi-project accounting, and custom AI extraction fields. Works with OpenAI, Gemini, Mistral, or local models via Ollama—deploy with Docker and keep all financial data under your control.

open-sourceOpen Source
Google AI Edge Gallery logo

Google AI Edge Gallery

Run open-source LLMs on your phone, fully offline and private

Google AI Edge Gallery is an open-source mobile app that lets you download and run large language models like Gemma directly on Android and iOS devices with zero cloud dependency. Built on MediaPipe and LiteRT, it features AI chat with reasoning mode, multimodal image analysis, real-time audio transcription, and autonomous agent skills—all running entirely on-device for complete privacy. A reference implementation for developers building offline-first AI experiences.

open-sourceOpen Source

AI Scientist v2

Autonomous scientific discovery via agentic tree search

AI Scientist v2 is Sakana AI's open-source system for fully autonomous scientific research using LLM-powered agentic tree search. It generates hypotheses, designs experiments, writes and executes code, analyzes results, and produces publishable manuscripts without human intervention. The system uses progressive exploration with backtracking to navigate the research space efficiently.

open-sourceOpen Source

gemma.cpp

Lightweight C++ inference for Google Gemma models

gemma.cpp is Google's standalone C++ inference engine built specifically for running Gemma language models without Python or CUDA dependencies. It provides optimized CPU inference using SIMD instructions and Highway library, supports Gemma 2 and Gemma 3 models, and runs on x86 and ARM architectures. Designed for embedded systems, edge devices, and server deployments needing minimal overhead.

open-sourceOpen Source

Microsoft Agent Framework

Unified Python/.NET framework for multi-agent AI

Microsoft Agent Framework is Microsoft's official unified SDK for building multi-agent AI workflows in Python and .NET. It consolidates Semantic Kernel and AutoGen into a single framework with MCP tool integration, graph-based workflows, human-in-the-loop patterns, and multi-agent orchestration. The framework reached Release Candidate status in February 2026 and is Microsoft's recommended path for production agent development.

open-sourceOpen Source

DeepSeek Coder

State-of-the-art open-source code language models

DeepSeek Coder is a family of open-source code language models trained from scratch on 2 trillion tokens of code and natural language data. Available in sizes from 1B to 33B parameters, these models support 80+ programming languages with 16K context windows and fill-in-the-blank capabilities. DeepSeek Coder outperforms CodeLlama-34B on HumanEval and MBPP benchmarks while being commercially licensable under MIT.

open-sourceOpen Source

One API

OpenAI API management gateway for 100+ LLM providers

One API is a self-hosted LLM API gateway that provides a unified OpenAI-compatible interface for managing multiple model providers including OpenAI, Azure, Anthropic, Google, and dozens of Chinese providers. It handles load balancing, quota management, rate limiting, token tracking, and channel-based routing through a web dashboard. Widely adopted in the Chinese developer ecosystem with over 18,000 GitHub stars.

open-sourceOpen Source

chatgpt-on-wechat

AI chatbot framework for WeChat with multi-model and plugin support

chatgpt-on-wechat is an open-source framework for deploying AI chatbots on WeChat, the dominant messaging platform in China. It supports OpenAI, Claude, Gemini, Qwen, and local models through a plugin architecture. Features group chat management, image generation, voice messages, and knowledge base integration. Over 42,700 GitHub stars reflecting massive adoption in the Chinese developer community.

open-sourceOpen Source
Sonatype Lifecycle logo

Sonatype Lifecycle

Enterprise software composition analysis for supply chain security

Sonatype Lifecycle is an enterprise software composition analysis platform that identifies vulnerabilities, license risks, and quality issues in open-source dependencies throughout the development lifecycle. It integrates with IDEs, CI/CD pipelines, and artifact repositories to block risky components before they enter the codebase. Backed by the largest vulnerability database with proprietary research beyond public CVE data.

paid
OpenBao logo

OpenBao

Linux Foundation fork of HashiCorp Vault for secrets management

OpenBao is the Linux Foundation's community-driven fork of HashiCorp Vault created after Vault's license change from open-source to BSL. It provides secrets management, encryption as a service, dynamic credentials, and PKI certificate management. Maintains API compatibility with Vault while developing under truly open-source governance with over 5,700 GitHub stars.

open-sourceOpen Source
Escape logo

Escape

AI-powered DAST platform specializing in API and GraphQL security

Escape is an AI-powered dynamic application security testing platform focused on API security including REST, GraphQL, and gRPC endpoints. It automatically discovers and tests API endpoints for vulnerabilities without requiring source code access. Features business logic testing that goes beyond OWASP patterns, CI/CD integration for shift-left security, and detailed remediation guidance for developers.

freemium