Cost-effective AI inference platform with 86+ models from $0.02/M tokens
DeepInfra is an AI inference platform offering 86+ LLMs with pricing starting at $0.02 per million tokens. Backed by $20.6M in funding including an $18M Series A from Felicis Ventures, it provides OpenAI-compatible endpoints for models including DeepSeek, Llama, and Mistral with pay-as-you-go pricing.
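Because the endpoints are OpenAI-compatible, calling DeepInfra looks like calling OpenAI with a different base URL. The sketch below builds such a request with only the standard library and does not send it; the base URL follows DeepInfra's documented OpenAI-compatible path and the model id is a hypothetical example, so check the current model catalog before use.

```python
import json
import os
import urllib.request

# Assumed OpenAI-compatible base URL; verify against DeepInfra's docs.
BASE_URL = "https://api.deepinfra.com/v1/openai"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request (constructed, not sent)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('DEEPINFRA_API_KEY', '')}",
        },
        method="POST",
    )

# Hypothetical model id for illustration.
req = build_chat_request("meta-llama/Meta-Llama-3-8B-Instruct", "Hello!")
# urllib.request.urlopen(req) would send it; the response mirrors OpenAI's schema.
```

Any OpenAI SDK can be pointed at the same base URL, which is what makes provider switching a one-line change.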
Conversational data analysis with natural language queries over databases
PandasAI enables natural-language queries against databases, data lakes, CSVs, and parquet files using LLMs and RAG pipelines. With 23,400+ GitHub stars, it bridges the gap between database tools and AI by letting developers and analysts interact with data conversationally, supporting SQL databases such as PostgreSQL alongside CSV, parquet, and other file formats.
2x faster LLM fine-tuning with 70% less VRAM on a single GPU
Unsloth is an open-source framework for fine-tuning large language models up to 2x faster while using 70% less VRAM. Built with custom Triton kernels, it supports 500+ model architectures including Llama 4, Qwen 3, and DeepSeek on consumer NVIDIA GPUs. Unsloth Studio adds a no-code web UI for dataset creation, training observability, model comparison, and GGUF export for Ollama and vLLM deployment.
LLM-powered web scraping with graph-based extraction pipelines
ScrapeGraphAI is a Python library that uses LLMs and graph-based logic to build automated, self-healing web scraping pipelines. Developers describe desired data in natural language and ScrapeGraphAI constructs a processing graph that extracts structured information from any website. It supports multiple LLM providers, achieves 96%+ accuracy on semantic extraction benchmarks, adapts to layout changes automatically, and has earned over 20,000 GitHub stars.
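A typical pipeline is a prompt, a source URL, and a provider config. The sketch below only builds the configuration so it runs offline; the `SmartScraperGraph` import shown in the comments follows the project's documented interface, but the model id and config keys should be verified against the current docs.

```python
# Illustrative ScrapeGraphAI configuration; key names follow the project's
# documented SmartScraperGraph interface, model id is a hypothetical example.
graph_config = {
    "llm": {
        "api_key": "YOUR_PROVIDER_KEY",   # any supported provider works
        "model": "openai/gpt-4o-mini",    # hypothetical model id
    },
    "verbose": False,
}

prompt = "List every article title and author on the page"
source = "https://example.com/blog"

# With the library installed, the pipeline is roughly:
# from scrapegraphai.graphs import SmartScraperGraph
# graph = SmartScraperGraph(prompt=prompt, source=source, config=graph_config)
# result = graph.run()  # structured dict matching the natural-language request
```

The natural-language prompt replaces hand-written CSS/XPath selectors, which is why the pipeline survives layout changes.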
Open-source LLM gateway with built-in optimization and A/B testing
TensorZero is an open-source LLMOps platform written in Rust that unifies an LLM gateway, observability, prompt optimization, and A/B experimentation in a single binary. It routes requests across providers with sub-millisecond P99 latency at 10K+ QPS while capturing structured data for continuous improvement, and supports dynamic in-context learning, fine-tuning workflows, and production feedback loops. Backed by $7.3M in seed funding, it has earned 11K+ GitHub stars.
Governed analytics with a unified semantic layer
Querio turns plain English into SQL queries with a focus on governed analytics, connecting to live data warehouses and enforcing a unified semantic layer across the organization. It ensures that AI-generated queries remain consistent with business logic definitions, preventing the common problem where different teams get different answers to the same question from the same data.
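The value of a unified semantic layer can be shown in a few lines: a metric is defined once, and every query that references it compiles to the same SQL. This toy illustration is generic, not Querio's implementation.

```python
# A toy semantic layer: business metrics defined once, so every team that
# asks for "active_users" gets identical SQL. Generic illustration only.
SEMANTIC_LAYER = {
    "active_users": {
        "sql": "COUNT(DISTINCT user_id)",
        "table": "events",
        "filter": "event_type = 'session_start'",
    },
}

def compile_metric(metric: str) -> str:
    """Expand a governed metric into its single canonical SQL definition."""
    m = SEMANTIC_LAYER[metric]
    return f"SELECT {m['sql']} FROM {m['table']} WHERE {m['filter']}"

print(compile_metric("active_users"))
```

Without this indirection, each AI-generated query re-derives the metric from raw tables, which is exactly how two teams end up with two different "active user" counts.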
Natural language scripting for LLM-system interaction
GPTScript is an Apache 2.0 licensed framework with 3,300+ GitHub stars that enables natural language scripting where LLMs interact with local systems, APIs, and tools through simple prompt definitions. It supports multiple model providers including OpenAI-compatible APIs and local models, providing a lightweight approach to building AI agents that can execute CLI commands, call APIs, and process files.
Chat with your database in natural language
AskYourDatabase lets users chat with their databases using natural language, automatically generating SQL queries and visualizations from conversational questions. It autonomously creates dashboards and charts from queried data, serving as a self-serve reporting tool that reduces the business intelligence workload on engineering teams by letting non-technical users access data directly.
Evaluation-first LLM and agent observability
Confident AI is an evaluation-first observability platform that scores every trace and span with 50+ metrics, alerting on quality drops in LLM and agent applications. It goes beyond traditional APM by treating evaluation as core observability, providing actionable insights that help teams understand not just whether their AI applications are running but whether they are producing correct and useful outputs.
Open-source AI-powered log analysis by Salesforce
LogAI is an open-source log analysis platform by Salesforce Research that uses deep learning to detect anomalies in large-scale system logs. It provides research-backed autonomous log troubleshooting capabilities, applying ML models to identify patterns, cluster log events, and surface anomalies that would be invisible in manual log review across high-volume production environments.
DRM and IP protection for AI model weights
RefortifAI is a Y Combinator P2026 batch company that provides DRM and intellectual property protection for AI models by obfuscating model weights so they only run inside a hardened runtime. It solves the critical problem of model weight protection for companies distributing custom LLMs to untrusted environments, preventing IP theft while maintaining inference performance.
AI reviewer that catches hallucinations in generated code
Codoki is a specialized AI code reviewer focused on catching hallucinations in code generated by autonomous agents like Devin and Claude Code. It validates that AI-proposed code actually functions according to provided requirements, serving as a critical safety layer for teams where AI agents generate a significant portion of the codebase and human review capacity cannot keep pace with generation speed.
Open-source RAG-based text-to-SQL engine
Vanna AI is an open-source MIT-licensed text-to-SQL framework with 6,000+ GitHub stars that uses RAG to generate accurate SQL queries from natural language. It learns and improves over time by indexing your specific database schema, documentation, and query history, ensuring no vendor lock-in while delivering highly accurate schema-specific query generation across any database.
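The retrieval step behind RAG-based text-to-SQL can be sketched as follows. Vanna's actual retrieval uses vector embeddings over trained schema, documentation, and query history; the keyword-overlap scorer below is only a stand-in for that step, showing why the LLM prompt ends up containing just the relevant slice of the schema.

```python
# Minimal sketch of the RAG step in text-to-SQL: pick the schema snippets
# most relevant to the question, then prompt the LLM with only those.
DDL_SNIPPETS = [
    "CREATE TABLE orders (id INT, customer_id INT, total DECIMAL)",
    "CREATE TABLE customers (id INT, name TEXT, country TEXT)",
    "CREATE TABLE products (id INT, sku TEXT, price DECIMAL)",
]

def retrieve(question: str, snippets: list[str], k: int = 1) -> list[str]:
    """Rank snippets by token overlap with the question (embedding stand-in)."""
    q = set(question.lower().split())
    scored = sorted(snippets, key=lambda s: -len(q & set(s.lower().split())))
    return scored[:k]

context = retrieve("total orders per customer_id", DDL_SNIPPETS, k=2)
prompt = (
    "Schema:\n" + "\n".join(context)
    + "\nWrite SQL for: total orders per customer_id"
)
```

Because the index is built from your own schema and past queries, accuracy improves as the training store grows, which is the "learns over time" claim above.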
AI tables inside your database with SQL
MindsDB is an open-source platform with 20,000+ GitHub stars that lets developers combine machine learning predictions with standard SQL queries by federating data across 200+ sources. It enables creating AI models as virtual database tables, querying predictions with familiar SELECT statements, and building real-time ML pipelines without leaving the database workflow that teams already know.
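MindsDB's core pattern is two SQL statements: create a model as a virtual table, then SELECT predictions from it. Sketched below as Python strings; the syntax follows the documented CREATE MODEL / PREDICT form and the `home_rentals` example dataset, but exact syntax varies by version, so check the docs.

```python
# Train a model as a virtual table inside the database...
create_model = """
CREATE MODEL mindsdb.rental_price_model
FROM example_db (SELECT * FROM home_rentals)
PREDICT rental_price;
"""

# ...then query predictions with an ordinary SELECT.
query_prediction = """
SELECT rental_price
FROM mindsdb.rental_price_model
WHERE sqft = 900 AND location = 'downtown';
"""
# Both statements run through any MySQL-protocol client connected to MindsDB.
```

The point of the design is that prediction consumers never leave SQL: the model behaves like any other table in a join or a WHERE clause.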
OpenTelemetry-native LLM observability instrumentation
OpenLLMetry by Traceloop is an open-source instrumentation library with 7,000+ GitHub stars that adds OpenTelemetry-native tracing to LLM and AI agent applications. It captures detailed traces of model calls including latency, token usage, costs, and error rates, exporting data to any OpenTelemetry-compatible backend like Grafana, Datadog, or Jaeger for vendor-neutral AI observability.
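What such instrumentation records per call can be shown with a hand-rolled decorator: latency, token counts, and errors captured as span attributes. OpenLLMetry does this automatically through OpenTelemetry with no decorator needed; this toy version only illustrates the shape of the data.

```python
import time
from functools import wraps

# Collected "spans" — a stand-in for an OpenTelemetry exporter backend.
SPANS: list[dict] = []

def traced(fn):
    """Record latency, token usage, and errors for each wrapped call."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        span = {"name": fn.__name__, "error": None}
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
            span["prompt_tokens"] = result.get("prompt_tokens", 0)
            span["completion_tokens"] = result.get("completion_tokens", 0)
            return result
        except Exception as exc:
            span["error"] = repr(exc)
            raise
        finally:
            span["latency_ms"] = (time.perf_counter() - start) * 1000
            SPANS.append(span)
    return wrapper

@traced
def fake_llm_call(prompt: str) -> dict:
    # Stub model response with usage numbers, as a real provider would return.
    return {"text": "ok", "prompt_tokens": 12, "completion_tokens": 3}

fake_llm_call("hello")
```

Exporting these spans in OpenTelemetry format is what keeps the backend choice (Grafana, Datadog, Jaeger) vendor-neutral.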
Open-source ML and LLM monitoring with 100+ metrics
Evidently AI is an open-source platform with 100+ pre-built metrics for monitoring data quality, model performance, and data drift in AI/ML pipelines. Available under Apache 2.0 with a cloud version, it helps teams detect when production data shifts away from training distributions, LLM output quality degrades, or feature pipelines introduce anomalies that silently degrade model accuracy.
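The idea behind a drift metric can be shown with the population stability index (PSI) over categorical frequencies. Evidently ships far richer statistical tests than this; the sketch only demonstrates what "production data shifting away from the reference distribution" means numerically.

```python
import math

def psi(reference: dict[str, float], current: dict[str, float]) -> float:
    """Population stability index between two categorical distributions.

    Assumes both dicts share the same keys and contain no zero probabilities.
    """
    return sum(
        (current[k] - reference[k]) * math.log(current[k] / reference[k])
        for k in reference
    )

ref = {"US": 0.5, "EU": 0.3, "APAC": 0.2}   # training-time traffic mix
cur = {"US": 0.2, "EU": 0.3, "APAC": 0.5}   # production traffic mix
score = psi(ref, cur)
# A common rule of thumb: PSI > 0.2 signals significant drift.
```

A monitoring platform computes scores like this continuously per feature and alerts when thresholds are crossed, before accuracy visibly degrades.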
Multi-provider AI coding assistant with BYOK model access
CodeGPT is an AI coding assistant for VS Code and JetBrains IDEs that connects to multiple AI providers including OpenAI, Anthropic, Google, Mistral, and local models via Ollama using your own API keys. It offers code generation, explanation, refactoring, documentation writing, bug detection, and an agent marketplace with pre-built assistants for common tasks. The BYOK approach gives developers full cost control, no rate limits, and complete data ownership over their AI interactions.
Lightweight eval library for LLM applications
OpenEvals is a lightweight evaluation library from the LangChain team for testing LLM application quality using LLM-as-judge patterns. It provides pre-built prompt sets and evaluation functions that score model outputs against criteria like accuracy, relevance, coherence, and safety without requiring complex infrastructure. Available as both Python and JavaScript packages, OpenEvals complements OpenAI Evals with a simpler, framework-agnostic approach to quality measurement in agentic workflows.
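The LLM-as-judge pattern that libraries like OpenEvals package up is a judge prompt template scored by a model. The judge model is stubbed below so the example runs offline; the function names are illustrative, not OpenEvals' actual API.

```python
# A judge prompt template: the evaluated answer is graded by a second model.
JUDGE_PROMPT = (
    "Score the answer 0 or 1 for correctness.\n"
    "Question: {question}\nAnswer: {answer}\nScore:"
)

def stub_judge_model(prompt: str) -> str:
    # Stand-in for a real model call: accepts anything mentioning "Paris".
    return "1" if "Paris" in prompt else "0"

def evaluate(question: str, answer: str, judge=stub_judge_model) -> int:
    """Run the judge over one (question, answer) pair and parse its score."""
    return int(judge(JUDGE_PROMPT.format(question=question, answer=answer)))

assert evaluate("Capital of France?", "Paris") == 1
assert evaluate("Capital of France?", "Lyon") == 0
```

What an eval library adds over this sketch is a battle-tested prompt set, calibrated criteria (accuracy, relevance, coherence, safety), and batch runners.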
Input and output security scanners for LLM applications
LLM Guard is an open-source security toolkit by Protect AI that provides 15 input scanners and 20 output scanners to protect LLM applications from prompt injection, PII leakage, toxic content, secrets exposure, and data exfiltration. Each scanner is modular and independent — pick the ones you need, configure thresholds, and chain them into a pipeline. The library works with any LLM and has been downloaded over 2.5 million times. MIT licensed, Python 3.9+.
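The scanner-pipeline pattern can be sketched with two toy scanners. The three-value return (sanitized text, validity flag, risk score) mirrors the interface LLM Guard's scanners document; the scanners themselves are illustrative stand-ins for its ML-backed ones.

```python
import re

def email_pii_scanner(text: str):
    """Redact email addresses; flag the input if anything was redacted."""
    cleaned = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[REDACTED]", text)
    changed = cleaned != text
    return cleaned, not changed, 1.0 if changed else 0.0

def injection_scanner(text: str):
    """Naive prompt-injection check on a known attack phrase."""
    risky = "ignore previous instructions" in text.lower()
    return text, not risky, 1.0 if risky else 0.0

def scan_input(text: str, scanners) -> tuple[str, bool]:
    """Chain scanners: each may rewrite the text and/or invalidate it."""
    ok = True
    for scanner in scanners:
        text, valid, _risk = scanner(text)
        ok = ok and valid
    return text, ok

sanitized, ok = scan_input(
    "Contact me at a@b.com", [email_pii_scanner, injection_scanner]
)
```

Output scanners work the same way on model responses, which is how one pipeline covers both prompt injection on the way in and PII leakage on the way out.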
Framework for evaluating LLM and agent performance
OpenAI Evals is an open-source framework and benchmark registry for evaluating LLM performance on custom tasks. It provides infrastructure for writing evaluation prompts, running them against models, and recording results in a structured format for comparison. The hosted Evals API on the OpenAI platform adds managed run tracking, dataset management, and programmatic access to evaluation pipelines. With 17,700+ GitHub stars, it serves as a foundation for systematic LLM quality measurement.
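An eval boils down to samples plus a grading rule. The sketch below shows the shape of a chat-style sample (one JSONL line per case) and a simple match-style grader with a stubbed completion; it approximates the framework's basic match behavior rather than reproducing its exact implementation.

```python
# One evaluation sample: chat input paired with an ideal answer.
sample = {
    "input": [{"role": "user", "content": "2 + 2 = ?"}],
    "ideal": "4",
}

def match(completion: str, ideal: str) -> bool:
    """Approximation of a basic match grader: ideal answer leads the output."""
    return completion.strip().startswith(ideal)

stub_completion = "4"  # stand-in for a real model response
passed = match(stub_completion, sample["ideal"])
```

A registry of such samples and graders, run against multiple models with results logged in a structured format, is what turns ad-hoc spot checks into systematic quality measurement.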
Observability and lifecycle management for AI agents
AgentOps is an observability platform for monitoring, debugging, and managing the lifecycle of AI agents. It provides session replays with time-travel debugging, detailed event timelines showing every tool call and LLM interaction, cost tracking per session, and anomaly detection for agents stuck in loops or making unexpected decisions. AgentOps integrates with CrewAI, LangChain, AutoGen, and other frameworks through a lightweight SDK that requires just two lines of code to instrument.
Validate and structure LLM outputs with composable Guards
Guardrails AI is an open-source Python and JavaScript framework for validating and structuring LLM outputs using composable Guards built from a Hub of pre-built validators. It handles structured data extraction with Pydantic models, content safety checks including toxicity, PII detection, competitor mentions, and bias filtering, plus automatic re-prompting when validation fails. The Guardrails Hub offers dozens of validators from regex matching to hallucination detection via LLM judges.
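The validate-and-re-prompt loop at the heart of this approach can be sketched with a toy validator and a stubbed model. Names here are illustrative, not Guardrails' actual API; its real validators come from the Hub.

```python
import json

def valid_json_list(output: str) -> bool:
    """Toy validator: the output must parse as a JSON list."""
    try:
        return isinstance(json.loads(output), list)
    except ValueError:
        return False

def guarded_call(llm, prompt: str, validator, max_retries: int = 2) -> str:
    """Call the model, validate, and re-prompt on failure up to a limit."""
    for attempt in range(max_retries + 1):
        output = llm(prompt, attempt)
        if validator(output):
            return output
        prompt += "\nYour last answer failed validation; return a JSON list."
    raise ValueError("validation failed after retries")

def flaky_llm(prompt: str, attempt: int) -> str:
    # Stub model: fails once, then complies.
    return "not json" if attempt == 0 else '["a", "b"]'

result = guarded_call(flaky_llm, "List two letters as JSON", valid_json_list)
```

Chaining many validators into one Guard, plus structured extraction via Pydantic schemas, is what the framework layers on top of this loop.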
Build and evaluate LLM apps end-to-end
Prompt Flow is Microsoft's open-source development suite for building, testing, evaluating, and deploying LLM-based applications end-to-end. It links LLM calls, prompts, Python code, and other tools into executable flows defined in YAML, with a VS Code extension providing a visual flow designer. The tool supports tracing LLM interactions for debugging, running batch evaluations with quality metrics against larger datasets, and integrating tests into CI/CD pipelines before production deployment.
Self-hosted AI platform with ChatGPT-like interface for local and cloud LLMs
An extensible, self-hosted AI platform with 290M+ Docker pulls and 124K+ GitHub stars, it supports Ollama, OpenAI-compatible APIs, and any Chat Completions backend. Features include built-in RAG, multi-user RBAC, voice/video calls, a Python function workspace, a model builder, and web browsing. It runs entirely offline, with enterprise features including SSO and audit logging.