2 tools tagged
Showing 2 of 2 tools
Hot-swap between local LLM models via OpenAI-compatible API
llama-swap is an open-source tool that manages multiple local LLM models behind a single OpenAI-compatible API endpoint. It automatically loads and unloads models on demand, letting developers hot-swap between different models without restarting services. With 3.1K+ GitHub stars, it solves the common pain point of running multiple specialized models on limited hardware.
Run and fine-tune Vision Language Models locally on Mac
Open-source Python package for running and fine-tuning Vision Language Models locally on Mac using Apple's MLX framework. Supports multimodal inference with images, audio, and video across Qwen, DeepSeek, Phi, and Gemma architectures. Features OpenAI-compatible API server, Gradio chat UI, and KV cache optimization. 3.8K+ GitHub stars.