# fine-tuning
10 tools tagged
Showing 10 of 10 tools
ms-swift
ModelScope's fine-tuning framework supporting 600+ models
ms-swift is ModelScope's open-source framework for fine-tuning over 600 large language and multimodal models. It supports SFT, DPO, RLHF, LoRA, QLoRA, and full fine-tuning with a web UI and CLI interface. Optimized for the Chinese AI ecosystem with native ModelScope Hub integration alongside Hugging Face support. Over 13,500 GitHub stars.
torchtune
Meta's official PyTorch library for LLM fine-tuning
torchtune is Meta's official PyTorch-native library for fine-tuning large language models. It provides composable building blocks for training recipes covering LoRA, QLoRA, full fine-tuning, DPO, and knowledge distillation. Supports Llama, Mistral, Gemma, Qwen, and Phi model families with distributed training across multiple GPUs. Designed as a hackable, dependency-minimal alternative to higher-level frameworks.
LLaMA-Factory
Unified framework for fine-tuning 100+ large language models
LLaMA-Factory is an open-source toolkit providing a unified interface for fine-tuning over 100 LLMs and vision-language models. It supports SFT, RLHF with PPO and DPO, LoRA and QLoRA for memory-efficient training, and continuous pre-training. The LLaMA Board web UI enables no-code configuration, while CLI and YAML workflows serve advanced users. Integrates with Hugging Face, ModelScope, vLLM, and SGLang for model deployment.
Unsloth
2x faster LLM fine-tuning with 70% less VRAM on a single GPU
Unsloth is an open-source framework for fine-tuning large language models up to 2x faster while using 70% less VRAM. Built with custom Triton kernels, it supports 500+ model architectures including Llama 4, Qwen 3, and DeepSeek on consumer NVIDIA GPUs. Unsloth Studio adds a no-code web UI for dataset creation, training observability, model comparison, and GGUF export for Ollama and vLLM deployment.
OpenAI API
API for GPT-5 family models, multimodal generation, embeddings, and agents
Official API platform for the GPT-5 family, reasoning/thinking variants, multimodal generation, speech, embeddings, and agent workflows. Features the Responses API, tool calling, structured outputs, batch processing, fine-tuning, and SDK support. It remains one of the most widely integrated AI APIs in the developer ecosystem, but model choice, retention settings, rate limits, and pricing tiers require active governance in production.
Google Vertex AI
Google Cloud ML platform with Gemini and custom models
Google Cloud's end-to-end ML platform with Gemini models, Model Garden featuring 150+ models, AutoML, and custom training pipelines. Features Vertex AI Search, Conversation, and Agent Builder for enterprise AI applications. The comprehensive platform for organizations building production AI systems at scale within the Google Cloud ecosystem, with enterprise governance and compliance built in.
Hugging Face
The GitHub of ML — model hub, datasets, and inference
Open-source platform for building, sharing, and deploying machine learning models and datasets. Hosts 500k+ models, 100k+ datasets, and Spaces for interactive demos. The central hub of the open-source AI ecosystem, providing model discovery, inference APIs, and collaborative tools that make it the GitHub of machine learning for researchers and developers worldwide.
Replicate
Run and deploy ML models via API with simple pricing
Cloud platform that lets developers run thousands of open-source and proprietary public ML models through a simple API without managing GPUs or infrastructure. Replicate hosts models for image, text, audio, and video, supports Cog-based custom deployments and private models, and now operates as a distinct Cloudflare brand with pay-by-time or input/output pricing depending on the model.
Fireworks AI
Production-grade inference with serverless and on-demand GPUs
High-performance inference platform serving open-source and custom AI models at global scale, processing 13+ trillion tokens daily at ~180K requests per second. Fireworks AI delivers 1,000+ tokens per second on large models through quantization-aware tuning and adaptive speculation, with serverless, fine-tuning, and dedicated GPU options across text, image, and audio modalities.
Together AI
Open-weight inference, fine-tuning, and GPU-cloud platform
Together AI is a cloud platform for running, fine-tuning, batching, and training open-weight AI models. It supports serverless inference, dedicated endpoints, LoRA and full fine-tuning, GPU clusters, code-execution sandboxes, and async batch jobs up to 30B tokens per model. Current docs list fast-moving families such as Qwen, Kimi, GLM, GPT-OSS, DeepSeek, Llama, MiniMax, and Mistral.