What This Stack Does
Multi-agent systems — where specialized AI agents collaborate to complete complex tasks — represent the next evolution beyond single-model applications. This stack provides the frameworks, infrastructure, and development tools needed to build agent systems that plan, research, code, and iterate autonomously. It's designed for developers moving beyond single-prompt AI into architectures where multiple agents with different roles work together.
Two Approaches to Agent Orchestration
CrewAI offers the more approachable of the two orchestration styles. You define agents with roles (Researcher, Writer, Analyst, Coder), assign them tasks with dependencies, and CrewAI manages the execution flow — agents communicate, delegate sub-tasks, and produce structured outputs. The role-based abstraction makes it straightforward to model real-world workflows where different specialists collaborate, and CrewAI's growing library of tools and integrations accelerates development.
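The role-and-task pattern can be sketched in plain Python. This is a conceptual illustration, not CrewAI's actual API — the `Agent`, `Task`, and `run_crew` names are invented here, and the stubbed `perform` method stands in for a real LLM call:

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    role: str

    def perform(self, description: str, context: dict) -> str:
        # A real agent would call an LLM here; we return a stub result.
        return f"[{self.role}] completed: {description}"

@dataclass
class Task:
    name: str
    description: str
    agent: Agent
    depends_on: list = field(default_factory=list)

def run_crew(tasks):
    """Execute tasks in dependency order, passing prior outputs as context."""
    done, outputs = set(), {}
    while len(done) < len(tasks):
        for task in tasks:
            if task.name in done or any(d not in done for d in task.depends_on):
                continue
            # Collect the outputs of prerequisite tasks as this task's context.
            context = {d: outputs[d] for d in task.depends_on}
            outputs[task.name] = task.agent.perform(task.description, context)
            done.add(task.name)
    return outputs

researcher = Agent("Researcher")
writer = Agent("Writer")
outputs = run_crew([
    Task("research", "gather sources on topic X", researcher),
    Task("draft", "write article from research", writer, depends_on=["research"]),
])
print(outputs["draft"])
```

The key idea CrewAI automates is the dependency resolution: the Writer never runs until the Researcher's output exists to feed it.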
AutoGen, maintained by Microsoft, offers a more flexible conversation-based approach. Agents communicate through message passing, enabling patterns like round-robin discussions, hierarchical delegation, and human-in-the-loop workflows. AutoGen's GroupChat pattern is powerful for scenarios where multiple agents need to discuss and converge on a solution. The code execution capability lets agents write and run code autonomously.
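The round-robin group-chat pattern reduces to a simple loop over a shared transcript. The sketch below is illustrative plain Python, not AutoGen's API; the lambda reply functions and the `DONE` termination marker are assumptions standing in for LLM calls and AutoGen's termination conditions:

```python
class ChatAgent:
    def __init__(self, name, reply_fn):
        self.name = name
        self.reply_fn = reply_fn  # stand-in for an LLM call

    def reply(self, transcript):
        return self.reply_fn(transcript)

def group_chat(agents, opening, max_rounds=4, done_marker="DONE"):
    """Round-robin: each agent sees the full transcript and appends a message."""
    transcript = [("user", opening)]
    for _ in range(max_rounds):
        for agent in agents:
            msg = agent.reply(transcript)
            transcript.append((agent.name, msg))
            if done_marker in msg:  # converged: stop the conversation
                return transcript
    return transcript

planner = ChatAgent("planner", lambda t: "step 1: outline the solution")
critic = ChatAgent("critic", lambda t: "looks complete, DONE")
log = group_chat([planner, critic], "solve task Y")
for speaker, msg in log:
    print(f"{speaker}: {msg}")
```

Because every agent sees the full transcript, later speakers can build on or critique earlier ones — the property that makes group chat useful for converging on a solution.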
The Shared Infrastructure Layer
LangChain and LangGraph serve as the underlying toolkit. LangChain provides the tool integration layer — connecting agents to APIs, databases, search engines, and custom functions. LangGraph handles stateful workflows with graph-based execution, checkpointing, and human-in-the-loop breakpoints. Both CrewAI and AutoGen can use LangChain tools, making LangChain the common integration layer regardless of which orchestration framework you choose.
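Graph-based execution with checkpointing can be boiled down to nodes that transform shared state and name their successor. This is a minimal sketch of the idea, not LangGraph's API — the node functions, the `NODES` registry, and the checkpoint format are all invented for illustration:

```python
def research(state):
    state["notes"] = f"notes on {state['topic']}"
    return "write"          # name of the next node

def write(state):
    state["draft"] = f"draft based on {state['notes']}"
    return None             # terminal node: no successor

NODES = {"research": research, "write": write}

def run_graph(entry, state, checkpoints=None):
    """Run nodes until one returns None, snapshotting state after each step."""
    node = entry
    while node is not None:
        node = NODES[node](state)
        if checkpoints is not None:
            checkpoints.append(dict(state))  # snapshot for resume/inspection
    return state

ckpts = []
final = run_graph("research", {"topic": "agents"}, ckpts)
print(final["draft"])
```

The per-step snapshots are what enable LangGraph-style resumption and human-in-the-loop breakpoints: execution can stop at any checkpoint, a human can inspect or edit the state, and the graph resumes from there.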
LiteLLM as the LLM gateway is essential for multi-agent systems because different agents benefit from different models. Your Researcher agent might use a web-search-capable model, your Coder agent might use Claude for its reasoning depth, and your Summarizer might use a fast, cheap model like GPT-4o mini. LiteLLM routes each agent's requests to the optimal provider while tracking total costs — critical when agent conversations can generate hundreds of LLM calls per task.
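The routing-plus-cost-tracking idea looks roughly like this. The sketch below is a stand-in for a LiteLLM gateway, not its API: the role-to-model mapping mirrors the examples in the text, but the per-1K-token prices are placeholders, not real pricing, and `Gateway.complete` stubs out the actual provider call:

```python
MODEL_FOR_ROLE = {
    "researcher": "gpt-4o",          # web-search-capable model
    "coder": "claude-3-5-sonnet",    # deeper reasoning
    "summarizer": "gpt-4o-mini",     # fast and cheap
}

PRICE_PER_1K_TOKENS = {              # placeholder prices, not real ones
    "gpt-4o": 0.005,
    "claude-3-5-sonnet": 0.003,
    "gpt-4o-mini": 0.0002,
}

class Gateway:
    def __init__(self):
        self.total_cost = 0.0

    def complete(self, role, prompt, tokens_used):
        """Route the request to the role's model and accumulate its cost."""
        model = MODEL_FOR_ROLE[role]
        self.total_cost += tokens_used / 1000 * PRICE_PER_1K_TOKENS[model]
        return f"{model} response to: {prompt}"  # stand-in for the real call

gw = Gateway()
gw.complete("coder", "refactor this function", tokens_used=2000)
gw.complete("summarizer", "summarize the diff", tokens_used=500)
print(round(gw.total_cost, 4))
```

With hundreds of calls per task, a single accumulator like `total_cost` (which LiteLLM maintains for you) is the difference between a known spend and a surprise bill.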
The Knowledge and Local Development Layer
Weaviate provides the knowledge layer — multi-agent systems frequently need shared memory and accumulated knowledge that persists across interactions. Weaviate's vector database stores this knowledge with semantic search retrieval. Ollama handles local development and testing at zero cost, since running agent systems against cloud APIs during development is expensive. LiteLLM makes the switch from local to cloud transparent: change the model string and everything else stays the same.
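The shared-memory pattern is easiest to see with a toy vector store. The sketch below stands in for a Weaviate collection: the `Memory` class is invented for illustration, and `embed` is a crude character-frequency vector rather than a real embedding model, just enough to show the store-and-recall flow:

```python
import math

def embed(text):
    # Crude character-frequency "embedding" — a real system would call an
    # embedding model here.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class Memory:
    def __init__(self):
        self.items = []  # (text, vector) pairs shared by all agents

    def store(self, text):
        self.items.append((text, embed(text)))

    def recall(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

mem = Memory()
mem.store("pricing research for competitor analysis")
mem.store("deployment checklist for staging")
print(mem.recall("competitor pricing"))
```

Any agent in the crew can `store` a finding and any other can `recall` it later by meaning rather than exact wording — the persistence and semantic retrieval that Weaviate provides at production scale.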