What This Stack Does
As AI applications move to production, managing LLM provider interactions becomes a significant operational concern. This stack provides the infrastructure layer for routing, fallback, cost optimization, and monitoring across all your LLM API calls.
Local Development and Unified Routing
Ollama provides local inference for development, testing, and privacy-sensitive workloads. Running models locally at zero marginal cost means you can iterate on prompts, run test suites, and develop without burning API credits.
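To give a sense of how lightweight the local loop is, here is a minimal sketch that calls an Ollama server on its default port (11434) through its REST API. It assumes a model such as llama3 has already been pulled with `ollama pull llama3`; the prompt is just an example.

```python
import requests

# Call a locally running Ollama server (default port 11434).
# Assumes "llama3" has been pulled beforehand with `ollama pull llama3`.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Summarize the benefits of local LLM inference in one sentence.",
        "stream": False,  # return a single JSON object instead of a token stream
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["response"])
```

Because the call is local, the same loop can run inside a test suite or a prompt-iteration script without any API spend.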
LiteLLM serves as the API gateway that unifies 100+ LLM providers behind a single OpenAI-compatible interface. Route requests to different providers based on model capability, cost, or latency, with automatic fallbacks when a provider fails.
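As a rough illustration, the sketch below sends the same OpenAI-style request to several providers through litellm.completion, falling back manually when one fails. The model names are examples and provider keys are read from the usual environment variables; LiteLLM also ships built-in fallback handling via its Router, so the explicit loop is only there to make the idea visible.

```python
import litellm

# One OpenAI-style call shape; the provider is selected by the model string.
messages = [{"role": "user", "content": "Classify this ticket: 'App crashes on login.'"}]

# Try a primary model, then fall back to alternatives on failure.
for model in ["gpt-4o-mini", "claude-3-haiku-20240307", "ollama/llama3"]:
    try:
        result = litellm.completion(model=model, messages=messages)
        print(model, "->", result.choices[0].message.content)
        break
    except Exception as exc:
        print(f"{model} failed ({exc}); trying next provider")
```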
Observability and Production Optimization
Langfuse provides the observability layer: it traces every LLM call with inputs, outputs, latency, token counts, and cost. Self-hostable and open source, it gives you the monitoring data needed to optimize prompts and track output quality.
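One low-effort way to get traces into Langfuse is LiteLLM's Langfuse callback: set the Langfuse keys in the environment and register the callback. The keys and host below are placeholders (the host would point at your self-hosted instance), and the metadata fields are illustrative of what can be attached to a trace.

```python
import os
import litellm

# Placeholder credentials; a self-hosted Langfuse sets LANGFUSE_HOST to its own URL.
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["LANGFUSE_HOST"] = "https://langfuse.example.internal"  # hypothetical URL

# Report every successful LiteLLM call to Langfuse.
litellm.success_callback = ["langfuse"]

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Draft a one-line release note."}],
    metadata={"trace_name": "release-notes", "session_id": "demo-session"},
)
print(response.choices[0].message.content)
```

With the callback in place, every call made through LiteLLM shows up in Langfuse with its prompt, completion, latency, and token cost, so no per-call instrumentation is needed.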
Portkey adds production-grade gateway features: semantic caching, request retries, budget limits, and detailed analytics. For teams with significant LLM API spend, Portkey's optimization features can reduce costs by 30-50%.
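Here is a minimal sketch of the Portkey client, assuming the portkey-ai SDK and a virtual key that maps to stored provider credentials. The API key, virtual key, and the cache/retry config are placeholders meant to illustrate Portkey's gateway config format, not an exact schema.

```python
from portkey_ai import Portkey

# Placeholder keys; the config attaches gateway behavior to every request.
client = Portkey(
    api_key="PORTKEY_API_KEY",         # placeholder
    virtual_key="openai-virtual-key",  # placeholder: maps to a stored provider key
    config={
        "cache": {"mode": "semantic"},  # serve semantically similar prompts from cache
        "retry": {"attempts": 3},       # retry transient provider failures
    },
)

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Explain semantic caching in one sentence."}],
)
print(completion.choices[0].message.content)
```

Because the client is OpenAI-compatible, the caching, retry, and budget policies apply without changing application code beyond the client construction.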
The Bottom Line
Together, these tools create a complete LLM infrastructure stack: local development with Ollama, production routing with LiteLLM, observability with Langfuse, and optimization with Portkey. Each component is independently replaceable.