YC-backed cloud deployment platform for Rust and Python applications
Shuttle is a YC-backed cloud deployment platform that simplifies deploying Rust and Python backend applications. Developers annotate their code with Shuttle macros to declare infrastructure needs like databases, caches, and secrets, and Shuttle provisions the resources automatically on deployment. Features instant deployment, automatic HTTPS, and infrastructure-from-code. Over 6,100 GitHub stars.
GitHub's Kubernetes controller for autoscaling GitHub Actions runners
actions-runner-controller (ARC) is GitHub's official Kubernetes controller for managing self-hosted GitHub Actions runners. It automatically scales runner pods up and down based on workflow demand, provisioning runners when jobs queue and terminating them when complete. Supports runner groups, custom runner images, and organization-level runner management. Over 6,100 GitHub stars.
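The scale-to-demand behavior described above boils down to a simple reconcile-time decision. The sketch below illustrates that decision in plain Python; it is a toy model of the idea, not ARC's actual reconciler logic, and the function name and parameters are hypothetical.

```python
# Toy sketch of the scaling decision a controller like ARC makes on each
# reconcile: desired runner count tracks queued plus in-progress jobs,
# clamped to configured min/max replicas. Illustrative only; ARC's real
# logic lives in its Kubernetes reconciler.

def desired_runners(queued_jobs: int, running_jobs: int,
                    min_replicas: int, max_replicas: int) -> int:
    demand = queued_jobs + running_jobs
    return max(min_replicas, min(demand, max_replicas))

print(desired_runners(queued_jobs=7, running_jobs=3, min_replicas=1, max_replicas=8))  # capped at max
print(desired_runners(queued_jobs=0, running_jobs=0, min_replicas=1, max_replicas=8))  # idles at floor
```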
DeepSeek's FP8 general matrix multiplication kernels for efficient inference
DeepGEMM is DeepSeek's open-source library of FP8 matrix multiplication CUDA kernels optimized for LLM inference and training on modern NVIDIA GPUs. It provides efficient GEMM operations using 8-bit floating point precision that reduce memory bandwidth requirements while maintaining model accuracy. Designed for integration into inference engines and training frameworks. Over 6,300 GitHub stars.
DeepSeek's expert-parallel communication library for MoE model training
DeepEP is DeepSeek's open-source communication library optimized for expert-parallel training of Mixture-of-Experts models. It provides efficient GPU-to-GPU data routing for distributing tokens to expert networks across multiple devices during MoE model training and inference. Enables the distributed expert parallelism that powers DeepSeek's competitive model efficiency. Over 9,100 GitHub stars.
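The token-routing step that DeepEP accelerates can be sketched as bookkeeping: each token's router scores select its top-k experts, and tokens are bucketed per expert so each device only receives the tokens its local experts must process. This is a plain-Python illustration of that dispatch, not DeepEP's API.

```python
# Toy sketch of expert-parallel dispatch: a token's top-k router scores
# decide which expert buckets it lands in. DeepEP implements this routing
# as optimized GPU-to-GPU communication; this is just the bookkeeping.
from collections import defaultdict

def dispatch(router_scores, k=2):
    """router_scores: per-token lists of scores, one score per expert.
    Returns {expert_id: [token_ids]} buckets."""
    buckets = defaultdict(list)
    for token_id, scores in enumerate(router_scores):
        top_k = sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)[:k]
        for expert in top_k:
            buckets[expert].append(token_id)
    return dict(buckets)

scores = [[0.7, 0.1, 0.2], [0.1, 0.6, 0.3], [0.2, 0.3, 0.5]]
print(dispatch(scores, k=2))  # each token lands in its two highest-scoring experts' buckets
```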
Lightweight microVM execution layer for AI agent code sandboxing
Vercel Sandbox provides a lightweight microVM execution environment for running untrusted code generated by AI agents safely. It creates isolated sandboxes that prevent generated code from accessing the host system, network, or other processes. Designed for AI coding platforms that need to execute user or agent-generated code without security risks to the host infrastructure.
Python library for declarative data loading that LLMs can generate
dlt (data load tool) is a Python library for building data pipelines with declarative, schema-aware loading that is simple enough for LLMs to generate correctly. It extracts data from APIs, databases, and files, normalizes nested structures, handles schema evolution, and loads into warehouses and lakes. Supports 30+ destinations including BigQuery, Snowflake, DuckDB, and PostgreSQL. Over 5,200 GitHub stars.
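The "normalizes nested structures" step is the heart of declarative loading: nested dicts become flattened columns, and lists of objects split into child tables keyed back to the parent. The helper below is a toy sketch of that idea, not dlt's actual API.

```python
# Toy sketch of the normalization a loader like dlt performs: nested dicts
# flatten into column names, and lists of objects split into child tables
# linked to the parent. Hypothetical helper, not dlt's real API.

def normalize(record, table, parent_key=None, tables=None):
    tables = tables if tables is not None else {}
    row = {} if parent_key is None else {"_parent_id": parent_key}
    for key, value in record.items():
        if isinstance(value, dict):
            for sub_key, sub_val in value.items():
                row[f"{key}__{sub_key}"] = sub_val            # flatten nested dict
        elif isinstance(value, list):
            for child in value:                               # split into child table
                normalize(child, f"{table}__{key}", record.get("id"), tables)
        else:
            row[key] = value
    tables.setdefault(table, []).append(row)
    return tables

api_payload = {"id": 1, "user": {"name": "Ada"}, "orders": [{"sku": "X1"}, {"sku": "X2"}]}
print(normalize(api_payload, "customers"))
```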
YC-backed multimodal RAG platform for documents, images, and video
Morphik is a YC-backed multimodal RAG platform that ingests and retrieves information from documents, images, tables, and video content. It processes complex document layouts including charts, diagrams, and multi-column formats that traditional text-only RAG systems handle poorly. Provides API-first integration for building knowledge bases that understand visual as well as textual information.
Self-hosted cloud development environments for teams and AI agents
Coder provisions self-hosted cloud development environments on any infrastructure including Kubernetes, Docker, AWS, GCP, and Azure. Developers connect through VS Code, JetBrains IDEs, or browser-based editors to standardized environments with pre-configured dependencies. Features template-based provisioning, automatic shutdown, and audit logging. Over 12,800 GitHub stars, with growing adoption for AI agent workloads.
Google's application kernel for container sandboxing and security
gVisor is Google's open-source container runtime sandbox that provides an additional layer of isolation between containerized applications and the host kernel. It implements a user-space application kernel that intercepts system calls, preventing container escapes and limiting the attack surface. Used in Google Cloud Run, GKE Sandbox, and other Google Cloud services. Over 18,000 GitHub stars.
StackBlitz's browser-based Node.js runtime for instant dev environments
WebContainers by StackBlitz runs Node.js inside the browser via a WebAssembly-based runtime, enabling full development environments without server infrastructure. It powers StackBlitz and Bolt with instant project startup, npm package installation, and dev server execution entirely client-side. Supports Node.js APIs, filesystem operations, and terminal emulation within browser security constraints.
AI chatbot framework for WeChat with multi-model and plugin support
chatgpt-on-wechat is an open-source framework for deploying AI chatbots on WeChat, the dominant messaging platform in China. It supports OpenAI, Claude, Gemini, Qwen, and local models through a plugin architecture. Features group chat management, image generation, voice messages, and knowledge base integration. Over 42,700 GitHub stars reflecting massive adoption in the Chinese developer community.
Multilingual emotional text-to-speech with 80+ language support
Fish Speech is an open-source text-to-speech system supporting 80+ languages with emotional expression, zero-shot voice cloning, and real-time streaming. It generates natural speech with controllable emotions, speaking styles, and prosody. Features a web interface, API server, and integration with AI agent frameworks for voice-enabled applications. Over 29,000 GitHub stars.
Open-source voice cloning and text-to-speech with few-shot learning
GPT-SoVITS is an open-source voice cloning and text-to-speech system that generates natural-sounding speech from just a few seconds of reference audio. It combines GPT-style language modeling with SoVITS voice synthesis for zero-shot and few-shot voice cloning across multiple languages. Supports Chinese, English, Japanese, Korean, and Cantonese with over 56,000 GitHub stars.
DeepSeek's optimized attention kernel for Multi-Head Latent Attention
FlashMLA is DeepSeek's open-source CUDA kernel implementing efficient Multi-Head Latent Attention, the attention mechanism used in DeepSeek-V2 and V3 models. It provides optimized GPU kernels that significantly reduce memory usage and improve inference speed for MLA-based architectures. Represents DeepSeek's contribution to open AI infrastructure with over 12,600 GitHub stars.
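MLA's memory saving comes from caching one compressed latent vector per token instead of full per-head K and V. The back-of-envelope comparison below uses illustrative dimensions, not DeepSeek's exact model configuration.

```python
# Back-of-envelope KV-cache comparison behind MLA: standard multi-head
# attention caches full K and V per head, while MLA caches a single
# compressed latent per token and reconstructs K/V from it. Dimensions
# below are illustrative, not DeepSeek-V3's actual configuration.

def mha_cache_bytes(tokens, heads, head_dim, dtype_bytes=2):
    return tokens * heads * head_dim * 2 * dtype_bytes   # K and V, per head

def mla_cache_bytes(tokens, latent_dim, dtype_bytes=2):
    return tokens * latent_dim * dtype_bytes             # one shared latent

seq, heads, head_dim, latent = 32768, 32, 128, 512
print(mha_cache_bytes(seq, heads, head_dim) / 2**20, "MiB")   # full KV cache
print(mla_cache_bytes(seq, latent) / 2**20, "MiB")            # compressed latent cache
```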
Local model inference engine with OpenAI-compatible API and web UI
Xinference is a local inference engine that runs LLMs, embedding models, image generation, and audio models with an OpenAI-compatible API. It provides a web dashboard for model management, supports vLLM, llama.cpp, and transformers backends, and handles multi-GPU deployment automatically. Supports 100+ models including Qwen, Llama, Mistral, and DeepSeek with over 9,200 GitHub stars.
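"OpenAI-compatible" means concretely that the server accepts the same `/v1/chat/completions` request shape, so existing OpenAI clients work by swapping the base URL. The stdlib-only sketch below builds such a request; the host, port, and model name are placeholders for a locally running instance.

```python
# What "OpenAI-compatible" means in practice: the server speaks the same
# /v1/chat/completions request shape, so any OpenAI client works by
# pointing it at the local endpoint. Host, port, and model name below are
# placeholders, assuming a locally launched Xinference server.
import json
import urllib.request

payload = {
    "model": "qwen2.5-instruct",          # whichever model you launched
    "messages": [{"role": "user", "content": "Say hello."}],
    "temperature": 0.2,
}
request = urllib.request.Request(
    "http://localhost:9997/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(request) would return the familiar OpenAI-style
# JSON: {"choices": [{"message": {"role": "assistant", "content": ...}}]}
print(json.dumps(payload, indent=2))
```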
ModelScope's fine-tuning framework supporting 600+ models
ms-swift is ModelScope's open-source framework for fine-tuning over 600 large language and multimodal models. It supports SFT, DPO, RLHF, LoRA, QLoRA, and full fine-tuning with both a web UI and a CLI. Optimized for the Chinese AI ecosystem with native ModelScope Hub integration alongside Hugging Face support. Over 13,500 GitHub stars.
Alibaba's agent framework built for the Qwen model family
Qwen-Agent is Alibaba's open-source framework for building AI agents powered by the Qwen model family. It provides tool use, planning, memory, and multi-agent orchestration with native optimization for Qwen models including function calling and code interpretation. Supports RAG, browser automation, and custom tool development with over 15,900 GitHub stars.
Adaptive web scraping library with anti-bot evasion and smart selectors
Scrapling is a Python web scraping library that uses adaptive selectors and anti-bot evasion techniques to extract data from websites reliably. It generates selectors that survive website layout changes by understanding element context rather than relying on brittle CSS paths. Features stealth browser automation, automatic retry logic, and proxy rotation. Over 34,500 GitHub stars.
No-code AI web scraping platform with visual workflow builder
Maxun is a no-code web scraping platform that uses AI to extract structured data from websites through a visual workflow builder. Users point and click on the data they want to extract, and Maxun generates resilient scraping workflows that handle pagination, authentication, and dynamic content. Features anti-bot detection avoidance, scheduled runs, and API access for integration. Over 15,300 GitHub stars.
All-in-one embeddings database with RAG, search, and agent capabilities
txtai is a self-contained AI search and RAG platform that combines vector embeddings, semantic search, LLM pipelines, and agent workflows in a single Python library. It handles embedding generation, similarity search, extractive QA, summarization, translation, and custom pipelines without external dependencies. Runs locally with over 12,400 GitHub stars and Apache 2.0 license.
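The core loop txtai packages up is: embed documents, index them, and rank by cosine similarity at query time. In the toy sketch below a bag-of-words counter stands in for a real transformer embedding model; it illustrates the retrieval mechanics, not txtai's API.

```python
# The loop an embeddings database wraps: embed documents, index them, and
# rank by cosine similarity at query time. A toy bag-of-words embedding
# stands in for a real transformer model here.
import math
from collections import Counter

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

docs = ["maintenance window for the database", "beach vacation photos", "database backup schedule"]
index = [(doc, embed(doc)) for doc in docs]

query = embed("database maintenance")
results = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
print(results[0][0])  # the database maintenance document ranks first
```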
End-to-end open-source platform for training and evaluating foundation models
Oumi is an end-to-end open-source platform for training, fine-tuning, and evaluating foundation models at any scale. It covers data preparation, distributed training, reinforcement learning from human feedback, evaluation benchmarks, and model deployment in a unified framework. Supports training from scratch to post-training alignment with over 9,100 GitHub stars.
Intelligent model router that balances cost and quality across LLM providers
RouteLLM by LMSYS routes LLM requests to the most cost-effective model that can handle each query's complexity. It uses learned routing models to classify whether a query needs a powerful expensive model or can be handled by a cheaper alternative, reducing costs by up to 85% while maintaining quality. Supports OpenAI, Anthropic, and other providers through an OpenAI-compatible API.
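The routing idea can be shown in miniature as a threshold policy over an estimated difficulty score. RouteLLM learns this classifier from preference data; the heuristic, threshold, and model names below are purely illustrative.

```python
# The routing idea in miniature: estimate how hard a query is and send
# easy ones to a cheap model. RouteLLM learns this classifier from data;
# the heuristic and model names here are purely illustrative.

def complexity_score(query: str) -> float:
    hard_markers = ("prove", "derive", "step by step", "refactor", "debug")
    score = min(len(query.split()) / 50, 1.0)            # longer ≈ harder (toy proxy)
    score += 0.5 * any(marker in query.lower() for marker in hard_markers)
    return min(score, 1.0)

def route(query: str, threshold: float = 0.4) -> str:
    return "strong-model" if complexity_score(query) >= threshold else "cheap-model"

print(route("What is the capital of France?"))                    # cheap-model
print(route("Prove that the sum of two even numbers is even."))   # strong-model
```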
Multi-LoRA inference server for serving hundreds of fine-tuned models
LoRAX is an inference server that serves hundreds of fine-tuned LoRA models from a single base model deployment. It dynamically loads and unloads LoRA adapters on demand, sharing the base model's GPU memory across all adapters. Built on text-generation-inference with OpenAI-compatible API. Enables multi-tenant model serving without per-model GPU allocation. Over 3,700 GitHub stars.
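The memory trick this exploits is that each fine-tune is the shared base weight W plus a tiny low-rank delta B·A, so hundreds of adapters fit where one full model copy would. A plain-Python sketch of that arithmetic, with illustrative sizes and made-up tenant names:

```python
# The trick multi-LoRA serving exploits: every fine-tune is the shared
# base weight W plus a small low-rank delta B @ A, so the base is loaded
# once and only tiny adapters swap per request. Plain-Python matrix math,
# illustrative sizes and hypothetical tenant names.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(W, D):
    return [[w + d for w, d in zip(rw, rd)] for rw, rd in zip(W, D)]

W = [[1.0, 0.0], [0.0, 1.0]]                      # shared base weight (loaded once)
adapters = {
    "tenant-a": ([[0.1], [0.0]], [[0.0, 0.2]]),   # B (2x1) and A (1x2): a rank-1 delta
    "tenant-b": ([[0.0], [0.3]], [[0.4, 0.0]]),
}

def effective_weight(adapter_id):
    B, A = adapters[adapter_id]                   # swapped per request; base untouched
    return add(W, matmul(B, A))

print(effective_weight("tenant-a"))
```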
Builder.io visual canvas with Figma-to-code and code-to-Figma sync
Fusion by Builder.io is a visual development canvas that provides bidirectional sync between Figma designs and production code. Designers edit in Figma while developers work in code, and changes propagate in both directions. Supports React, Vue, Svelte, Angular, and Qwik output with a visual editor for non-developers to make content and layout changes without touching code.