28 tools tagged
OpenAI's open-source speech recognition model for any language
Whisper is OpenAI's open-source automatic speech recognition model trained on 680,000 hours of multilingual audio data. It supports transcription and translation across 99 languages with robust handling of accents, background noise, and technical vocabulary. Available in multiple model sizes from tiny (39M) to large (1.5B parameters) for balancing accuracy and speed.
Build and share ML web apps in Python with a few lines of code
Gradio is an open-source Python library for creating interactive web interfaces for machine learning models. It supports any data type including images, audio, video, 3D objects, and dataframes. Apps deploy for free on Hugging Face Spaces with auto-scaling, or can be shared via temporary public links. Gradio 5 adds server-side rendering, WebRTC streaming, and an AI Playground for generating apps.
Distributed AI compute engine for scaling Python and ML workloads
Ray is an open-source distributed computing framework built for scaling AI and Python applications from a laptop to thousands of GPUs. It provides libraries for distributed training, hyperparameter tuning, model serving, reinforcement learning, and data processing under a single unified API. Used by OpenAI to train ChatGPT, as well as by Uber, Shopify, and Instacart. Maintained by Anyscale and part of the PyTorch Foundation.
Unified framework for fine-tuning 100+ large language models
LLaMA-Factory is an open-source toolkit providing a unified interface for fine-tuning over 100 LLMs and vision-language models. It supports SFT, RLHF with PPO and DPO, LoRA and QLoRA for memory-efficient training, and continuous pre-training. The LLaMA Board web UI enables no-code configuration, while CLI and YAML workflows serve advanced users. Integrates with Hugging Face, ModelScope, vLLM, and SGLang for model deployment.
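The YAML workflow mentioned above is a recipe file passed to the CLI; a minimal LoRA SFT sketch modeled on the project's bundled examples — treat the exact model, dataset, and key names as assumptions to be checked against the current docs:

```yaml
# Minimal LoRA SFT recipe; model and dataset names are illustrative.
model_name_or_path: meta-llama/Llama-3-8B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: alpaca_en_demo
template: llama3
output_dir: saves/llama3-8b-lora
per_device_train_batch_size: 1
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

Run it with `llamafactory-cli train llama3_lora_sft.yaml`, or launch the no-code LLaMA Board UI with `llamafactory-cli webui`.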
On-device AI inference engine for mobile and wearable applications
Cactus is a YC-backed open-source inference engine built specifically for running LLMs, vision models, and embeddings on smartphones, tablets, and wearable devices. It provides native SDKs for iOS, Android, Flutter, and React Native with optimized ARM CPU and Apple NPU execution paths. The project claims the fastest inference speeds on ARM processors and roughly 10x lower RAM usage than generic runtimes, enabling privacy-first AI applications that run entirely on-device.
Python toolkit for assessing and mitigating ML model fairness issues
Fairlearn is a Microsoft-backed open-source Python toolkit that helps developers assess and improve the fairness of machine learning models. It provides metrics for measuring disparity across groups defined by sensitive features, mitigation algorithms that reduce unfairness while maintaining model performance, and an interactive visualization dashboard for exploring fairness-accuracy trade-offs. Integrated with scikit-learn and Azure ML's Responsible AI dashboard.
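To make "disparity across groups" concrete, here is a hand-rolled sketch of demographic parity difference — the gap between groups in the rate of positive predictions — which Fairlearn exposes as `fairlearn.metrics.demographic_parity_difference`; the data below is synthetic:

```python
# Hand-rolled demographic parity difference on synthetic predictions.
def selection_rate(preds):
    # Fraction of positive (1) predictions.
    return sum(preds) / len(preds)

def demographic_parity_difference(y_pred, groups):
    by_group = {}
    for p, g in zip(y_pred, groups):
        by_group.setdefault(g, []).append(p)
    rates = [selection_rate(ps) for ps in by_group.values()]
    # 0.0 means all groups are selected at the same rate.
    return max(rates) - min(rates)

y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(demographic_parity_difference(y_pred, groups))  # 0.5 (75% vs 25%)
```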
Microsoft's framework for running 1-bit large language models on consumer CPUs
BitNet is Microsoft's official inference framework for 1-bit quantized large language models that enables running models with up to 100 billion parameters on standard consumer CPUs without requiring a GPU. By leveraging extreme quantization where weights use only 1.58 bits on average, BitNet achieves dramatic reductions in memory footprint and computational cost while maintaining competitive output quality for many practical use cases.
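The 1.58-bit figure comes from ternary weights: each weight takes one of three values {-1, 0, +1}, which carries log2(3) ≈ 1.585 bits of information. A back-of-envelope sketch of the resulting memory savings for a 100B-parameter model:

```python
# Back-of-envelope memory comparison: FP16 vs ternary (1.58-bit) weights.
import math

bits_per_weight = math.log2(3)   # three possible values -> ~1.585 bits each
params = 100e9                   # a 100B-parameter model

fp16_gb = params * 16 / 8 / 1e9            # 16 bits per weight
ternary_gb = params * bits_per_weight / 8 / 1e9

print(round(bits_per_weight, 3))  # 1.585
print(round(fp16_gb))             # 200 (GB)
print(round(ternary_gb, 1))       # 19.8 (GB, weights alone)
```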
Multi-agent software company simulation for automated development
ChatDev simulates an entire virtual software company through multi-agent collaboration where LLM-powered roles including CEO, CTO, programmer, tester, and designer work together to produce complete software. With 32,000+ GitHub stars and a NeurIPS 2025 accepted paper, it offers a novel approach to automated software development through role-based agent orchestration.
State-of-the-art OCR toolkit supporting 100+ languages from Baidu
PaddleOCR is an open-source OCR toolkit from Baidu's PaddlePaddle ecosystem with over 73,000 GitHub stars. It provides ultra-lightweight and high-accuracy text detection and recognition for 100+ languages including CJK, Arabic, and Indic scripts. The toolkit offers pre-trained models, easy deployment via pip, and server/edge inference options for document digitization workflows.
2x faster LLM fine-tuning with 70% less VRAM on a single GPU
Unsloth is an open-source framework for fine-tuning large language models up to 2x faster while using 70% less VRAM. Built with custom Triton kernels, it supports 500+ model architectures including Llama 4, Qwen 3, and DeepSeek on consumer NVIDIA GPUs. Unsloth Studio adds a no-code web UI for dataset creation, training observability, model comparison, and GGUF export for Ollama and vLLM deployment.
Google's pretrained foundation model for zero-shot time-series forecasting
TimesFM is a pretrained time-series foundation model from Google Research that performs zero-shot forecasting on diverse datasets without task-specific training. It handles univariate and multivariate time series across domains including finance, logistics, energy, and infrastructure monitoring with accuracy competitive against traditional statistical methods like ARIMA and Prophet.
Fully managed RAG-as-a-Service platform for enterprise AI applications
Ragie is a managed retrieval-augmented generation platform that handles document ingestion, indexing, and retrieval so developers can build grounded AI applications without managing vector databases or chunking pipelines. It connects to Google Drive, Notion, Slack, Confluence, and other enterprise data sources with simple APIs for hybrid search and entity extraction.
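A stdlib-only sketch of a retrieval call; the endpoint path and payload field follow Ragie's public docs at the time of writing, but treat both as assumptions and check the current API reference:

```python
# Sketch of a Ragie retrieval request using only the standard library.
# Endpoint path and payload shape are assumptions based on public docs.
import json
import urllib.request

API_URL = "https://api.ragie.ai/retrievals"

def build_request(query: str, api_key: str) -> urllib.request.Request:
    payload = json.dumps({"query": query}).encode()
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("What is our refund policy?", "YOUR_API_KEY")
# with urllib.request.urlopen(req) as resp:   # returns scored chunks as JSON
#     print(json.load(resp))
```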
Google's official Agent Development Kit for building AI agents in Go
adk-go is Google's official Agent Development Kit for the Go programming language, providing the tools and abstractions needed to build production AI agents. It supports tool calling, multi-turn conversations, structured outputs, and integration with Google's Gemini models. With 7,300 GitHub stars and Apache 2.0 license, it brings first-class AI agent development capabilities to the Go ecosystem.
First commercially viable 1-bit LLMs that are 14x smaller and 8x faster
PrismML Bonsai delivers the first commercially viable 1-bit large language models with 8B, 4B, and 1.7B parameter variants. The 8B model runs in just 1GB of RAM versus 16GB for standard FP16 models, achieving 44 tokens per second on an iPhone. Backed by $16.25M from Khosla Ventures and released under Apache 2.0, Bonsai makes capable LLMs practical for edge devices and resource-constrained environments.
ACP skill definitions giving coding agents HuggingFace ML superpowers
Hugging Face Skills is the official collection of ACP skill definitions that give AI coding agents access to HuggingFace ML capabilities. The 13 skills cover LLM fine-tuning with TRL, vision model training, dataset management, model evaluation, and cloud job submission on HF infrastructure. Compatible with Claude Code, Codex, Gemini CLI, and Cursor via a single npx command.
SQL-native memory infrastructure for AI agents and applications
Memori is an AI memory engine that provides persistent, queryable memory for agents and applications using SQL-native storage. It stores structured memories with semantic search, temporal awareness, and relationship tracking, enabling AI systems to remember user preferences, past interactions, and contextual facts across sessions. With 12,900 GitHub stars, it offers a database-native approach to the agent memory problem.
Production-grade reinforcement learning framework for LLM training
verl is an open-source reinforcement learning framework designed specifically for training and aligning large language models. Built for production use with support for distributed training across multiple GPUs and nodes, it implements RLHF, DPO, and other alignment algorithms that make LLMs follow instructions, avoid harmful outputs, and generate higher-quality responses. Over 580 contributors and 20,000 GitHub stars signal strong adoption.
Free cloud-hosted Jupyter notebooks with GPU access
Google Colab is a free cloud-hosted Jupyter notebook environment providing access to GPUs and TPUs for machine learning, data analysis, and education. Requires no setup — notebooks run in the browser with pre-installed ML libraries including TensorFlow, PyTorch, scikit-learn, and pandas. Features Google Drive integration for persistent storage, collaborative editing, and sharing. Free tier includes limited GPU access; Colab Pro provides faster GPUs, more memory, and longer runtimes.
ML model serving and deployment framework
BentoML is an open-source framework with 7K+ GitHub stars for packaging, deploying, and serving ML models as production-ready APIs. Bundles models, preprocessing, and serving logic into portable Bento archives with auto-generated REST/gRPC endpoints. Features adaptive batching for throughput optimization, GPU scheduling, multi-model inference pipelines, and containerization. Supports all major ML frameworks including PyTorch, TensorFlow, scikit-learn, and Hugging Face Transformers.
Hugging Face's lightweight agent framework
smolagents is Hugging Face's lightweight agent framework for building AI agents that can use tools, write and execute code, and collaborate in multi-agent setups. Designed for simplicity with minimal abstractions — agents are just LLMs that write Python code to orchestrate tool calls rather than using JSON-based function calling. Supports any LLM provider, integrates with Hugging Face Hub for sharing tools and agents, and runs with as few as 1,000 lines of core library code.
Interactive computing notebooks for data science
Jupyter is the open-source interactive computing platform providing notebook interfaces for data science, ML, scientific computing, and education. Notebooks combine live code, equations, visualizations, and narrative text. Supports 40+ languages via kernels including Python, R, Julia, and Scala. JupyterLab provides a modern IDE-like interface. JupyterHub enables multi-user deployments. The standard tool for computational research and data exploration worldwide.
Open-source AI observability for models and data pipelines
WhyLabs is an AI observability platform for monitoring ML models, LLM apps, and data pipelines — now fully open-sourced. Built on whylogs for privacy-preserving data logging and LangKit for LLM monitoring. Provides continuous drift detection, data quality monitoring, anomaly alerting, and LLM security including prompt injection and hallucination detection. Processes 100% of data without sampling across tabular, image, text, and embedding types. Incubated at Allen Institute for AI.
Open-source embedding database — the AI-native way to store and query embeddings
Chroma is an open-source embedding database designed for simplicity and developer experience. Runs in-memory, as a Python library, or as a client-server deployment. Popular for prototyping RAG applications, local development, and lightweight vector search. Integrates natively with LangChain, LlamaIndex, and OpenAI.
Fully managed vector database built for AI applications at production scale
Pinecone is the leading managed vector database designed for high-performance similarity search at scale. Purpose-built for AI applications including RAG, recommendation systems, and semantic search. Offers serverless and pod-based architectures with automatic scaling, filtering, and namespacing. No infrastructure management required.