RouteLLM addresses a core cost problem in LLM-powered applications: most requests are simple enough for cheaper models, but some require the capability of frontier models. The system uses trained classifier models that score each incoming request's complexity and route it to the most cost-effective model that can handle it adequately. Simple queries go to fast, cheap models, while complex queries route to powerful, expensive ones.
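The core idea can be sketched in a few lines. This is a minimal illustration, not RouteLLM's implementation: the real routers are trained classifiers, whereas the `complexity_score` heuristic, model names, and threshold below are all hypothetical stand-ins.

```python
def complexity_score(prompt: str) -> float:
    """Stand-in for a trained router: in this toy version, longer prompts
    and reasoning-heavy keywords push the score toward 1.0."""
    signals = ["prove", "derive", "step by step", "analyze", "refactor"]
    hits = sum(1 for s in signals if s in prompt.lower())
    return min(1.0, 0.1 * len(prompt.split()) / 20 + 0.3 * hits)

def route(prompt: str, threshold: float = 0.5) -> str:
    """Send the query to the cheap model unless its score crosses the threshold."""
    return "strong-model" if complexity_score(prompt) >= threshold else "weak-model"
```

A trivial lookup like "What is 2+2?" scores low and stays on the weak model, while a multi-step proof request crosses the threshold and escalates.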
The routing models are trained on preference data from the Chatbot Arena, learning the relationship between query characteristics and model capability requirements. This data-driven approach produces routing decisions that reflect real-world quality judgments rather than heuristic rules. The system provides configurable quality thresholds that let operators tune the cost-quality tradeoff based on their application's requirements.
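The effect of the threshold knob can be made concrete: given router scores for a batch of queries (the score values below are hypothetical), the threshold directly determines what fraction of traffic reaches the expensive model, and therefore the cost.

```python
def strong_fraction(scores: list[float], threshold: float) -> float:
    """Fraction of queries routed to the strong model at a given threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

# Hypothetical router scores for eight queries.
scores = [0.10, 0.20, 0.35, 0.50, 0.65, 0.80, 0.90, 0.95]

# A looser threshold sends more traffic to the strong model (higher quality,
# higher cost); a stricter one keeps more traffic on the weak model.
print(strong_fraction(scores, 0.3))  # 6 of 8 queries escalate
print(strong_fraction(scores, 0.7))  # only 3 of 8 escalate
```

Operators would sweep this threshold against a quality metric on held-out queries to find the cheapest setting that still meets their quality bar.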
Developed by LMSYS, the team behind Chatbot Arena, RouteLLM provides an OpenAI-compatible API that serves as a drop-in replacement for direct model API calls. Applications point their API requests at RouteLLM instead of a specific model, and the router handles model selection transparently. With over 4,800 GitHub stars, RouteLLM enables significant cost reduction for applications that currently send all requests to frontier models regardless of complexity.