# fine-tuning

10 tools tagged

ms-swift

ModelScope's fine-tuning framework supporting 600+ models

ms-swift is ModelScope's open-source framework for fine-tuning more than 600 large language and multimodal models. It supports SFT, DPO, RLHF, LoRA, QLoRA, and full fine-tuning through a web UI and a CLI. It is optimized for the Chinese AI ecosystem, with native ModelScope Hub integration alongside Hugging Face support, and has over 13,500 GitHub stars.

Open Source

torchtune

Meta's official PyTorch library for LLM fine-tuning

torchtune is Meta's official PyTorch-native library for fine-tuning large language models. It provides composable building blocks for training recipes covering LoRA, QLoRA, full fine-tuning, DPO, and knowledge distillation. It supports the Llama, Mistral, Gemma, Qwen, and Phi model families, with distributed training across multiple GPUs, and is designed as a hackable, dependency-minimal alternative to higher-level frameworks.

Open Source

LLaMA-Factory

Unified framework for fine-tuning 100+ large language models

LLaMA-Factory is an open-source toolkit providing a unified interface for fine-tuning over 100 LLMs and vision-language models. It supports SFT, RLHF with PPO and DPO, LoRA and QLoRA for memory-efficient training, and continuous pre-training. The LLaMA Board web UI enables no-code configuration, while CLI and YAML workflows serve advanced users. Integrates with Hugging Face, ModelScope, vLLM, and SGLang for model deployment.

Open Source

Unsloth

2x faster LLM fine-tuning with 70% less VRAM on a single GPU

Unsloth is an open-source framework for fine-tuning large language models up to 2x faster while using 70% less VRAM. Built with custom Triton kernels, it supports 500+ model architectures including Llama 4, Qwen 3, and DeepSeek on consumer NVIDIA GPUs. Unsloth Studio adds a no-code web UI for dataset creation, training observability, model comparison, and GGUF export for Ollama and vLLM deployment.

Open Source

OpenAI API

API for GPT-5 family models, multimodal generation, embeddings, and agents

The OpenAI API is the official platform for the GPT-5 family, including reasoning variants, multimodal generation, speech, embeddings, and agent workflows. It offers the Responses API, tool calling, structured outputs, batch processing, fine-tuning, and SDK support. It remains one of the most widely integrated AI APIs in the developer ecosystem, but model choice, retention settings, rate limits, and pricing tiers require active governance in production.

api-usage-based

Google Vertex AI

Google Cloud ML platform with Gemini and custom models

Google Vertex AI is Google Cloud's end-to-end ML platform, offering Gemini models, a Model Garden with 150+ models, AutoML, and custom training pipelines. It includes Vertex AI Search, Conversation, and Agent Builder for enterprise AI applications. It targets organizations building production AI systems at scale within the Google Cloud ecosystem, with enterprise governance and compliance built in.

api-usage-based

Hugging Face

The GitHub of ML — model hub, datasets, and inference

Hugging Face is an open-source platform for building, sharing, and deploying machine learning models and datasets. It hosts 500k+ models, 100k+ datasets, and Spaces for interactive demos. As the central hub of the open-source AI ecosystem, it provides model discovery, inference APIs, and collaborative tools for researchers and developers worldwide.

freemium, Open Source

Replicate

Run and deploy ML models via API with simple pricing

Replicate is a cloud platform that lets developers run thousands of public ML models, both open-source and proprietary, through a simple API without managing GPUs or infrastructure. It hosts models for image, text, audio, and video, and supports Cog-based custom deployments and private models. It now operates as a distinct Cloudflare brand, billing either by run time or by input/output volume depending on the model.

api-usage-based

Fireworks AI

Production-grade inference with serverless and on-demand GPUs

Fireworks AI is a high-performance inference platform serving open-source and custom AI models at global scale, processing 13+ trillion tokens daily at ~180K requests per second. It delivers 1,000+ tokens per second on large models through quantization-aware tuning and adaptive speculation, and offers serverless, fine-tuning, and dedicated GPU options across text, image, and audio modalities.

freemium

Together AI

Open-weight inference, fine-tuning, and GPU-cloud platform

Together AI is a cloud platform for running, fine-tuning, batching, and training open-weight AI models. It supports serverless inference, dedicated endpoints, LoRA and full fine-tuning, GPU clusters, code-execution sandboxes, and async batch jobs up to 30B tokens per model. Current docs list fast-moving families such as Qwen, Kimi, GLM, GPT-OSS, DeepSeek, Llama, MiniMax, and Mistral.

api-usage-based