# fine-tuning

10 tools tagged

ms-swift

ModelScope's fine-tuning framework supporting 600+ models

ms-swift is ModelScope's open-source framework for fine-tuning more than 600 large language and multimodal models. It supports SFT, DPO, RLHF, LoRA, QLoRA, and full fine-tuning through both a web UI and a CLI. It is geared toward the Chinese AI ecosystem, with native ModelScope Hub integration alongside Hugging Face support, and has over 13,500 GitHub stars.

Open Source

torchtune

Meta's official PyTorch library for LLM fine-tuning

torchtune is Meta's official PyTorch-native library for fine-tuning large language models. It provides composable building blocks for training recipes covering LoRA, QLoRA, full fine-tuning, DPO, and knowledge distillation. It supports the Llama, Mistral, Gemma, Qwen, and Phi model families, with distributed training across multiple GPUs. It is designed as a hackable, dependency-minimal alternative to higher-level frameworks.

Open Source

LLaMA-Factory

Unified framework for fine-tuning 100+ large language models

LLaMA-Factory is an open-source toolkit providing a unified interface for fine-tuning over 100 LLMs and vision-language models. It supports SFT, RLHF with PPO and DPO, LoRA and QLoRA for memory-efficient training, and continuous pre-training. The LLaMA Board web UI enables no-code configuration, while CLI and YAML workflows serve advanced users. It integrates with Hugging Face and ModelScope, and with vLLM and SGLang for model deployment.

Open Source

Unsloth

Up to 2x faster LLM fine-tuning with 70% less VRAM on a single GPU

Unsloth is an open-source framework for fine-tuning large language models up to 2x faster while using 70% less VRAM. Built with custom Triton kernels, it supports 500+ model architectures including Llama 4, Qwen 3, and DeepSeek on consumer NVIDIA GPUs. Unsloth Studio adds a no-code web UI for dataset creation, training observability, model comparison, and GGUF export for Ollama and vLLM deployment.

Open Source

OpenAI API

API for GPT-4, o1, DALL-E, Whisper, and embeddings

OpenAI's official API platform for GPT-4o, the o1 and o3 reasoning models, DALL-E image generation, Whisper speech-to-text, and text embeddings. It offers the Assistants API, function calling, JSON mode, fine-tuning, and batch processing. It is among the most widely used AI APIs, powering applications that range from chatbots to complex multi-step agent systems.

API (usage-based)

Google Vertex AI

Google Cloud ML platform with Gemini and custom models

Google Cloud's end-to-end ML platform, offering Gemini models, a Model Garden with 150+ models, AutoML, and custom training pipelines. It includes Vertex AI Search, Conversation, and Agent Builder for enterprise AI applications. It targets organizations building production AI systems at scale within the Google Cloud ecosystem, with enterprise governance and compliance built in.

API (usage-based)

Hugging Face

The GitHub of ML: model hub, datasets, and inference

Open-source platform for building, sharing, and deploying machine learning models and datasets. It hosts 500k+ models, 100k+ datasets, and Spaces for interactive demos. As the central hub of the open-source AI ecosystem, it provides model discovery, inference APIs, and collaborative tools for researchers and developers worldwide.

Freemium, Open Source

Replicate

Run and deploy ML models via API with simple pricing

Cloud platform that lets developers run 50,000+ open-source ML models through a simple API without managing GPUs or infrastructure. Replicate hosts production-ready models such as FLUX, Stable Diffusion, Llama, and Whisper for image, text, audio, and video. It also offers custom model deployment, LoRA support, automatic scaling, version history with rollback, and pay-per-use pricing.

API (usage-based)

Fireworks AI

Production-grade inference with serverless and on-demand GPUs

High-performance inference platform serving open-source and custom AI models at global scale, processing 13+ trillion tokens daily at ~180K requests per second. Fireworks AI delivers 1,000+ tokens per second on large models through quantization-aware tuning and adaptive speculation, with serverless, fine-tuning, and dedicated GPU options across text, image, and audio modalities.

Freemium

Together AI

Fast inference platform for open-source models

Cloud platform for running, fine-tuning, and training open-source AI models, with inference up to 4x faster than traditional deployments. Together AI supports serverless endpoints and dedicated GPUs, fine-tuning of 100B+ parameter models such as DeepSeek-V3 and Qwen3-235B, and asynchronous batch processing that scales to 30B tokens for cost-effective large workloads.

API (usage-based)