Managed foundation models on AWS
Microsoft's cloud AI platform, offering Azure OpenAI Service for GPT and DALL-E models with enterprise security, compliance, and regional data residency. Includes AI Studio for a model catalog, fine-tuning, and prompt engineering. The default AI platform for Microsoft-centric enterprises that need access to frontier models with the governance and compliance guarantees Azure provides.
Production-grade inference with serverless and on-demand GPUs
Open-source model serving platform optimized for large language models and generative AI. Supports Hugging Face models, LoRA adapters, and continuous batching for efficient multi-user serving. Built on PyTorch with OpenAI-compatible endpoints. Designed for teams who need production-grade LLM serving with lower latency and better resource utilization than generic model serving frameworks.
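The "continuous batching" mentioned above is the key scheduling idea behind efficient multi-user LLM serving: finished requests leave the batch immediately and queued requests take their slots, so no GPU capacity idles while work is waiting. A toy sketch of that scheduler (the `Request` class and `continuous_batching` function are illustrative names, not the serving platform's actual API):

```python
from dataclasses import dataclass
from collections import deque

@dataclass
class Request:
    rid: int
    tokens_left: int  # decode steps remaining for this request

def continuous_batching(requests, max_batch=4):
    """Toy scheduler: each decode step, finished requests are evicted
    immediately and queued requests are admitted into the freed slots,
    instead of waiting for the whole batch to finish (static batching)."""
    queue = deque(requests)
    running = []
    finish_step = {}
    step = 0
    while queue or running:
        # Admit new requests into any free batch slots.
        while queue and len(running) < max_batch:
            running.append(queue.popleft())
        step += 1
        # One decode step for every request currently in the batch.
        for r in running:
            r.tokens_left -= 1
        # Evict finished requests right away, freeing slots for the queue.
        for r in [r for r in running if r.tokens_left == 0]:
            finish_step[r.rid] = step
            running.remove(r)
    return finish_step
```

With a batch size of 2 and requests of lengths 1, 3, and 1, the two short requests complete in steps 1 and 2 because neither waits for the long request to drain, which is exactly the latency and utilization win over static batching.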
Ultra-fast LPU inference with industry-leading token generation speed
Enterprise AI platform for fine-tuning and deploying custom language models. Offers the Command R family of models, an Embed API for retrieval, and a Rerank API for search relevance. Known for strong enterprise features, including data privacy guarantees, custom model training, and retrieval-augmented generation capabilities that help organizations build AI applications grounded in their proprietary data.
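The reranking contract described above can be sketched in a few lines. This is not the platform's actual API: real rerank endpoints use cross-encoder models, while this toy version scores by query-term overlap. The shape is the same, though: (query, candidate documents) in, relevance-ordered documents out.

```python
def rerank(query, documents, top_n=3):
    """Toy reranker: score each candidate document by how many query
    terms it contains, then return the top_n documents by score.
    Illustrative only; production rerankers use learned models."""
    q_terms = set(query.lower().split())
    scored = []
    for idx, doc in enumerate(documents):
        d_terms = set(doc.lower().split())
        overlap = len(q_terms & d_terms)
        scored.append((overlap / (len(q_terms) or 1), idx, doc))
    # Sort by score descending; break ties by original position.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(idx, doc) for _, idx, doc in scored[:top_n]]
```

In a search pipeline this step typically sits after a fast first-stage retriever, re-ordering its top candidates before they reach the LLM.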
Fast inference platform for open-source models
Meta's openly licensed large language model family, available for commercial use. Llama 3 models range from 8B to 405B parameters, offering competitive performance with full weight access. Hosted on Hugging Face and available through major cloud providers. One of the most impactful open-weight AI releases, enabling companies and researchers to build, fine-tune, and deploy custom AI solutions without API dependencies.
Run LLMs locally with one command
Tool for running large language models locally on your machine with a simple CLI interface. Download and run Llama 3, Mistral, Gemma, Phi, Code Llama, and dozens of other open-source models with a single command. Features model management, GPU acceleration (NVIDIA/AMD/Apple Silicon), OpenAI-compatible API server, Modelfile for customization, and multi-model switching. Ideal for offline AI development, privacy-sensitive use cases, and local testing. 120K+ GitHub stars.
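The Modelfile customization mentioned above lets you package a base model with your own parameters and system prompt. A minimal sketch (the model name and values are placeholders; `FROM`, `PARAMETER`, and `SYSTEM` are standard Modelfile directives):

```
# Modelfile: derive a custom model from a local base model
FROM llama3
# Lower temperature for more deterministic answers (illustrative value)
PARAMETER temperature 0.3
SYSTEM "You are a concise assistant that answers in plain English."
```

Building and running it is two commands: `ollama create my-assistant -f Modelfile`, then `ollama run my-assistant`. The OpenAI-compatible API server means existing OpenAI SDK clients can point at the local endpoint instead of the hosted API.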
Data framework for LLM applications
A leading Python framework for building LLM-powered applications, with a focus on data-aware and agentic workflows. Provides tools for RAG (Retrieval-Augmented Generation), document indexing, vector store integrations, query engines, and multi-agent orchestration. 150+ data connectors for various sources. Works with OpenAI, Anthropic, local models, and more. Includes LlamaHub for community tools and LlamaCloud for managed RAG pipelines. 40K+ GitHub stars.
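The index-then-query pattern behind RAG frameworks like this one can be shown end to end in miniature. `TinyRAG` below is purely illustrative (not the framework's API): it builds an inverted index over documents, retrieves the best matches for a question, and splices them into a grounded prompt, which is the skeleton that real document indexes, vector stores, and query engines flesh out.

```python
from collections import defaultdict

class TinyRAG:
    """Minimal RAG pipeline sketch: index documents, retrieve the
    closest matches to a question, and build a context-grounded prompt.
    Real frameworks add chunking, embeddings, vector stores, and the
    actual LLM call on top of this skeleton."""

    def __init__(self):
        self.docs = []
        self.index = defaultdict(set)  # term -> set of doc ids

    def add(self, text):
        doc_id = len(self.docs)
        self.docs.append(text)
        for term in text.lower().split():
            self.index[term].add(doc_id)

    def retrieve(self, question, k=2):
        # Count how many question terms each document shares.
        hits = defaultdict(int)
        for term in question.lower().split():
            for doc_id in self.index.get(term, ()):
                hits[doc_id] += 1
        ranked = sorted(hits, key=lambda d: (-hits[d], d))
        return [self.docs[d] for d in ranked[:k]]

    def build_prompt(self, question, k=2):
        context = "\n".join(f"- {d}" for d in self.retrieve(question, k))
        return (f"Answer using only this context:\n{context}\n\n"
                f"Question: {question}")
```

The generated prompt is what would be sent to the LLM; grounding the answer in retrieved context is what lets RAG applications answer from proprietary data the model was never trained on.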