Microsoft's zero-code-change RL trainer for AI agents
Agent Lightning is Microsoft Research's open-source framework that makes AI agents trainable through reinforcement learning with virtually zero code changes. It supports RL, Automatic Prompt Optimization, and Supervised Fine-tuning across any agent framework, including LangChain, OpenAI Agents SDK, AutoGen, and CrewAI. It has 14K+ GitHub stars and ranks among Microsoft's top 50 most-starred projects.
Autonomous scientific discovery via agentic tree search
AI Scientist v2 is Sakana AI's open-source system for fully autonomous scientific research using LLM-powered agentic tree search. It generates hypotheses, designs experiments, writes and executes code, analyzes results, and produces publishable manuscripts without human intervention. The system uses progressive exploration with backtracking to navigate the research space efficiently.
State-of-the-art open-source code language models
DeepSeek Coder is a family of open-source code language models trained from scratch on 2 trillion tokens of code and natural language data. Available in sizes from 1B to 33B parameters, these models support 80+ programming languages with 16K context windows and fill-in-the-blank capabilities. DeepSeek Coder outperforms CodeLlama-34B on HumanEval and MBPP benchmarks. The code is MIT-licensed, and the model weights are released under a separate DeepSeek model license that permits commercial use.
Modern data pipeline orchestration with built-in AI
Mage AI is an open-source data pipeline orchestration tool positioned as a modern alternative to Apache Airflow. It provides a visual pipeline editor, native AI integrations for generating pipeline code, real-time streaming support, and built-in data quality checks. Mage handles batch and streaming workloads with a developer-friendly notebook-style interface and deploys to any cloud provider.
On-device ML solutions for mobile and edge AI
MediaPipe is Google's open-source framework for building on-device machine learning pipelines across mobile, web, desktop, and edge platforms. It provides pre-built solutions for face detection, hand tracking, pose estimation, object detection, image classification, text classification, and on-device LLM inference. MediaPipe runs entirely locally without cloud dependencies, supporting Android, iOS, Python, and web browsers.
Deep learning optimization for distributed training
DeepSpeed is Microsoft's open-source deep learning optimization library that makes distributed training and inference easy, efficient, and effective. Its ZeRO optimizer eliminates memory redundancies across data-parallel processes, enabling training of models with trillions of parameters. DeepSpeed supports 3D parallelism combining data, pipeline, and tensor parallelism, along with mixed precision training, gradient checkpointing, and CPU/NVMe offloading for memory-constrained environments.
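To make the ZeRO memory claim concrete, here is a back-of-envelope sketch of per-GPU model-state memory at each ZeRO stage. It assumes mixed-precision Adam accounting (2 bytes per parameter for fp16 weights, 2 for fp16 gradients, and roughly 12 for optimizer state); the function name and the 12-byte default are illustrative assumptions, not part of the DeepSpeed API, and activations and buffers are ignored.

```python
def zero_model_state_gb(params: float, gpus: int, stage: int,
                        optimizer_bytes: int = 12) -> float:
    """Approximate per-GPU model-state memory in GB (1 GB = 1e9 bytes).

    Assumption: fp16 weights and gradients (2 bytes each per parameter)
    plus `optimizer_bytes` of optimizer state per parameter.
    """
    weights = 2 * params
    grads = 2 * params
    optim = optimizer_bytes * params

    if stage == 0:    # plain data parallelism: everything replicated
        total = weights + grads + optim
    elif stage == 1:  # optimizer state partitioned across GPUs
        total = weights + grads + optim / gpus
    elif stage == 2:  # optimizer state and gradients partitioned
        total = weights + (grads + optim) / gpus
    elif stage == 3:  # weights partitioned as well
        total = (weights + grads + optim) / gpus
    else:
        raise ValueError("stage must be 0, 1, 2, or 3")
    return total / 1e9


if __name__ == "__main__":
    # Hypothetical example: a 7.5B-parameter model on 64 GPUs.
    for stage in range(4):
        print(f"stage {stage}: {zero_model_state_gb(7.5e9, 64, stage):.1f} GB per GPU")
```

Under these assumptions the replicated baseline needs about 120 GB per GPU for a 7.5B-parameter model, while full partitioning across 64 GPUs brings the model states under 2 GB each.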
OpenAI's open-source speech recognition model for any language
Whisper is OpenAI's open-source automatic speech recognition model trained on 680,000 hours of multilingual audio data. It supports transcription in 99 languages and translation of speech into English, with robust handling of accents, background noise, and technical vocabulary. Available in multiple model sizes from tiny (39M) to large (1.5B parameters) for balancing accuracy and speed.
Build and share ML web apps in Python with a few lines of code
Gradio is an open-source Python library for creating interactive web interfaces for machine learning models. It supports any data type including images, audio, video, 3D objects, and dataframes. Apps deploy for free on Hugging Face Spaces with auto-scaling, or can be shared via temporary public links. Gradio 5 adds server-side rendering, WebRTC streaming, and an AI Playground for generating apps.
Distributed AI compute engine for scaling Python and ML workloads
Ray is an open-source distributed computing framework built for scaling AI and Python applications from a laptop to thousands of GPUs. It provides libraries for distributed training, hyperparameter tuning, model serving, reinforcement learning, and data processing under a single unified API. Used by OpenAI for ChatGPT training, Uber, Shopify, and Instacart. Maintained by Anyscale and part of the PyTorch Foundation.
Unified framework for fine-tuning 100+ large language models
LLaMA-Factory is an open-source toolkit providing a unified interface for fine-tuning over 100 LLMs and vision-language models. It supports SFT, RLHF with PPO and DPO, LoRA and QLoRA for memory-efficient training, and continuous pre-training. The LLaMA Board web UI enables no-code configuration, while CLI and YAML workflows serve advanced users. Integrates with Hugging Face, ModelScope, vLLM, and SGLang for model deployment.
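Why LoRA counts as memory-efficient can be seen with a little arithmetic. The sketch below is plain Python, not LLaMA-Factory code: it compares the trainable parameters of one full weight matrix against a rank-r LoRA adapter for the same matrix. The 4096 x 4096 shape and rank 8 are hypothetical values chosen for illustration.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a LoRA adapter: two low-rank factors,
    one of shape (rank, d_in) and one of shape (d_out, rank)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    d_in, d_out, rank = 4096, 4096, 8  # hypothetical projection layer
    full = full_params(d_in, d_out)
    lora = lora_params(d_in, d_out, rank)
    print(f"full fine-tuning: {full:,} trainable parameters")
    print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
    print(f"fraction trained: {lora / full:.4%}")
```

QLoRA applies the same adapter idea on top of a quantized, frozen base model, so the saving in trainable parameters is the same while the frozen weights also take less memory.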
On-device AI inference engine for mobile and wearable applications
Cactus is a YC-backed low-latency AI engine for mobile and wearable devices that runs LLMs, transcription, embedding, and TTS models locally. It achieves 16-20 tok/sec on older devices and 70+ tok/sec on flagships with ARM SIMD kernels optimized for Snapdragon, Apple, and MediaTek processors. Supports Qwen, Gemma, Llama, DeepSeek with Flutter, React Native, and Kotlin SDKs.
Python toolkit for assessing and mitigating ML model fairness issues
Fairlearn is a Microsoft-backed open-source Python toolkit that helps developers assess and improve the fairness of machine learning models. It provides metrics for measuring disparity across groups defined by sensitive features, mitigation algorithms that reduce unfairness while maintaining model performance, and an interactive visualization dashboard for exploring fairness-accuracy trade-offs. Integrated with scikit-learn and Azure ML's Responsible AI dashboard.
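As an illustration of what a group-disparity metric measures, here is a from-scratch sketch of demographic parity difference: the gap between the highest and lowest rate of positive predictions across groups. Fairlearn ships a metric with this name in `fairlearn.metrics`; the implementation below is a simplified standard-library stand-in with made-up data, not the library's code.

```python
from collections import defaultdict


def selection_rates(y_pred, sensitive):
    """Fraction of positive (== 1) predictions within each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for pred, group in zip(y_pred, sensitive):
        counts[group][0] += int(pred == 1)
        counts[group][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


def demographic_parity_difference(y_pred, sensitive):
    """Largest gap in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(y_pred, sensitive)
    return max(rates.values()) - min(rates.values())


if __name__ == "__main__":
    # Made-up predictions for two hypothetical groups, "a" and "b".
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(selection_rates(y_pred, groups))
    print(demographic_parity_difference(y_pred, groups))
```

A value near zero means every group receives positive predictions at a similar rate; mitigation algorithms try to shrink this gap without giving up too much accuracy.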
Microsoft's framework for running 1-bit large language models on consumer CPUs
BitNet is Microsoft's official inference framework for 1-bit quantized large language models that enables running models with up to 100 billion parameters on standard consumer CPUs without requiring a GPU. By restricting each weight to one of three values (-1, 0, or +1), which works out to about 1.58 bits per weight, BitNet achieves dramatic reductions in memory footprint and computational cost while maintaining competitive output quality for many practical use cases.
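The 1.58-bit figure is the information content of a three-valued weight, log2(3). The sketch below shows one commonly described way to obtain such weights: scale by the mean absolute value, then round and clip to -1, 0, or +1. Treat the function name and the exact scheme as illustrative assumptions; this is not the framework's own kernel code.

```python
import math


def ternary_quantize(weights, eps=1e-8):
    """Map float weights to {-1, 0, +1} using absmean scaling.

    Returns the ternary values and the scale needed to approximately
    reconstruct the originals (w is roughly q * scale).
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


if __name__ == "__main__":
    weights = [0.9, -0.05, 0.4, -1.2, 0.0, 0.3]  # made-up example values
    q, scale = ternary_quantize(weights)
    print("ternary weights:", q)
    print("scale:", scale)
    print("bits per ternary weight:", math.log2(3))
```

Because the quantized weights are only -1, 0, or +1, a matrix-vector product reduces to additions and subtractions, which is part of why this style of model is practical on CPUs.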
Multi-agent software company simulation for automated development
ChatDev simulates an entire virtual software company through multi-agent collaboration where LLM-powered roles including CEO, CTO, programmer, tester, and designer work together to produce complete software. With 32,000+ GitHub stars and a NeurIPS 2025 accepted paper, it offers a novel approach to automated software development through role-based agent orchestration.
State-of-the-art OCR toolkit supporting 100+ languages from Baidu
PaddleOCR is an open-source OCR toolkit from Baidu's PaddlePaddle ecosystem with over 73,000 GitHub stars. It provides ultra-lightweight and high-accuracy text detection and recognition for 100+ languages including CJK, Arabic, and Indic scripts. The toolkit offers pre-trained models, easy deployment via pip, and server/edge inference options for document digitization workflows.
2x faster LLM fine-tuning with 70% less VRAM on a single GPU
Unsloth is an open-source framework for fine-tuning large language models up to 2x faster while using 70% less VRAM. Built with custom Triton kernels, it supports 500+ model architectures including Llama 4, Qwen 3, and DeepSeek on consumer NVIDIA GPUs. Unsloth Studio adds a no-code web UI for dataset creation, training observability, model comparison, and GGUF export for Ollama and vLLM deployment.
Google's pretrained foundation model for zero-shot time-series forecasting
TimesFM is a pretrained time-series foundation model from Google Research that performs zero-shot forecasting on diverse datasets without task-specific training. It handles univariate and multivariate time series across domains including finance, logistics, energy, and infrastructure monitoring with accuracy competitive against traditional statistical methods like ARIMA and Prophet.
Fully managed RAG-as-a-Service platform for enterprise AI applications
Ragie is a managed retrieval-augmented generation platform that handles document ingestion, indexing, and retrieval so developers can build grounded AI applications without managing vector databases or chunking pipelines. It connects to Google Drive, Notion, Slack, Confluence, and other enterprise data sources with simple APIs for hybrid search and entity extraction.
First commercially viable 1-bit LLMs that are 14x smaller and 8x faster
PrismML Bonsai delivers the first commercially viable 1-bit large language models with 8B, 4B, and 1.7B parameter variants. The 8B model runs in just 1GB of RAM versus 16GB for standard FP16 models, achieving 44 tokens per second on iPhone. Backed by $16.25M from Khosla Ventures and released under Apache 2.0, Bonsai makes capable LLMs practical for edge devices and resource-constrained environments.
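The 1GB versus 16GB comparison follows from simple arithmetic on weight storage. The sketch below is an idealized estimate that counts only the weights at a given bit width; real model files also carry scales, embeddings, and other overhead, so published ratios can differ from this figure. The function name is illustrative.

```python
def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Idealized weight storage in GB (1 GB = 1e9 bytes), ignoring overhead."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    params = 8e9  # the 8B variant
    fp16 = weight_memory_gb(params, 16)
    one_bit = weight_memory_gb(params, 1)
    print(f"FP16:  {fp16:.1f} GB")
    print(f"1-bit: {one_bit:.1f} GB")
    print(f"idealized ratio: {fp16 / one_bit:.0f}x")
```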
SQL-native memory infrastructure for AI agents and applications
Memori is an AI memory engine that provides persistent, queryable memory for agents and applications using SQL-native storage. It stores structured memories with semantic search, temporal awareness, and relationship tracking, enabling AI systems to remember user preferences, past interactions, and contextual facts across sessions. With 12,900 GitHub stars, it offers a database-native approach to the agent memory problem.
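To show what SQL-native agent memory can look like in principle, here is a minimal sketch using Python's built-in `sqlite3` module. The table schema and the `remember` and `recall` helpers are hypothetical and are not Memori's API; a plain keyword match stands in for semantic search.

```python
import sqlite3


def open_memory_store(path=":memory:"):
    """Open (or create) a small memory table. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS memories (
               id INTEGER PRIMARY KEY,
               user_id TEXT NOT NULL,
               kind TEXT NOT NULL,
               content TEXT NOT NULL,
               created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn


def remember(conn, user_id, kind, content):
    """Persist one memory for a user."""
    conn.execute(
        "INSERT INTO memories (user_id, kind, content) VALUES (?, ?, ?)",
        (user_id, kind, content),
    )
    conn.commit()


def recall(conn, user_id, keyword):
    """Return a user's memories containing the keyword, newest first."""
    cursor = conn.execute(
        "SELECT kind, content FROM memories "
        "WHERE user_id = ? AND content LIKE ? ORDER BY id DESC",
        (user_id, f"%{keyword}%"),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    store = open_memory_store()
    remember(store, "u1", "preference", "prefers dark mode")
    remember(store, "u1", "fact", "works in the Berlin office")
    print(recall(store, "u1", "dark"))
```

Keeping memories in ordinary tables means they can be filtered, joined, and audited with standard SQL, which is the appeal of a database-native approach.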
Production-grade reinforcement learning framework for LLM training
verl is an open-source reinforcement learning framework designed specifically for training and aligning large language models. Built for production use with support for distributed training across multiple GPUs and nodes, it implements RLHF, DPO, and other alignment algorithms that make LLMs follow instructions, avoid harmful outputs, and generate higher quality responses. Over 580 contributors and 20,000 GitHub stars signal strong adoption.
Browser automation framework turning websites into action APIs
Notte is a browser automation framework for AI agents that converts any website into a structured action API. Instead of scraping pages for text, Notte lets agents interact with sites by clicking buttons, filling forms, and navigating flows. Built with hybrid AI-plus-deterministic scripting, it includes digital personas, CAPTCHA solving, and proxy management for reliable automation at scale.
AI-native data application framework with SQL generation and agents
DB-GPT is an open-source AI-native data app framework combining SQL generation, database chat, RAG, and multi-agent orchestration for data-centric workflows. It supports natural language to SQL conversion, automated data analysis, and custom data app development. Integrates with MySQL, PostgreSQL, SQLite, and more. 28,000+ GitHub stars, Apache 2.0 licensed. Positioned as an alternative to MindsDB for teams building AI-powered data applications and internal database tools.
Enterprise feature platform for real-time ML
Tecton is an enterprise feature platform for building and serving ML features at scale. Founded by the team that built Uber's Michelangelo ML platform, it provides managed feature engineering, real-time feature computation from streaming data, feature monitoring, and a unified feature store with offline/online consistency. Used by production ML teams to eliminate training-serving skew and accelerate model deployment cycles.