aicoolies logo
Google AI Edge Gallery logo

Google AI Edge Gallery

Run open-source LLMs on your phone, fully offline and private

Share
open-sourceOpen Source
Visit Website →

Google AI Edge Gallery is an open-source mobile app that lets you download and run large language models like Gemma directly on Android and iOS devices with zero cloud dependency. Built on MediaPipe and LiteRT, it features AI chat with reasoning mode, multimodal image analysis, real-time audio transcription, and autonomous agent skills—all running entirely on-device for complete privacy. A reference implementation for developers building offline-first AI experiences.

Google AI Edge Gallery brings powerful generative AI directly to mobile devices without requiring any internet connection or cloud service. Developed by Google as an open-source showcase, the app lets users download models from the Gemma family and run multi-turn conversations, analyze images, transcribe audio, and execute multi-step agent workflows entirely on their phone's hardware. All inference happens locally, meaning prompts, images, and personal data never leave the device.

The app includes several distinct capabilities: AI Chat with Thinking Mode reveals the model's step-by-step reasoning during conversations; Ask Image provides multimodal analysis using the device camera for object detection and visual question-answering; Audio Scribe handles real-time speech-to-text transcription and translation without cloud APIs; and Prompt Lab gives developers fine-grained control over parameters like temperature and top-k sampling. The Agent Skills feature enables autonomous, multi-step workflows that can query Wikipedia, look up locations, and chain tool calls together—all on-device.

Built on Google's MediaPipe framework and LiteRT runtime (formerly TensorFlow Lite), the gallery supports loading custom models from Hugging Face and includes hardware benchmarking to compare model performance across different devices. Licensed under Apache 2.0, it serves both as a polished consumer app—reaching the App Store top 10—and as a production-ready reference for developers building privacy-first mobile AI applications.

Pricing

Free and open source (Apache 2.0)

Platforms

Android 12+, iOS 17+

Categories

Tags

Use Cases

Alternatives

Ollama logo

Ollama

Run LLMs locally with one command

Tool for running large language models locally on your machine with a simple CLI interface. Download and run Llama 3, Mistral, Gemma, Phi, Code Llama, and dozens of other open-source models with a single command. Features model management, GPU acceleration (NVIDIA/AMD/Apple Silicon), OpenAI-compatible API server, Modelfile for customization, and multi-model switching. Ideal for offline AI development, privacy-sensitive use cases, and local testing. 120K+ GitHub stars.

open-sourceOpen Source
LM Studio logo

LM Studio

Run local LLMs with an intuitive desktop GUI and OpenAI-compatible API server.

Free desktop application by Element Labs for discovering, downloading, and running open-source LLMs locally. Features a curated Hugging Face model browser, side-by-side model comparison, parameter tuning, and an OpenAI-compatible API server on localhost:1234. Powered by llama.cpp with Metal acceleration for Apple Silicon.

free

MLX-VLM

Run and fine-tune Vision Language Models locally on Mac

Open-source Python package for running and fine-tuning Vision Language Models locally on Mac using Apple's MLX framework. Supports multimodal inference with images, audio, and video across Qwen, DeepSeek, Phi, and Gemma architectures. Features OpenAI-compatible API server, Gradio chat UI, and KV cache optimization. 3.8K+ GitHub stars.

open-sourceOpen Source
Jan logo

Jan

Offline-first AI assistant for local inference

Jan is an open-source offline-first AI assistant with 25K+ GitHub stars running LLMs locally without sending data externally. Features a ChatGPT-like interface with one-click model downloads from Hugging Face, conversation management, customizable prompts, and an OpenAI-compatible local API server. Supports GGUF models via llama.cpp with GPU acceleration on NVIDIA and Apple Silicon. Built with Electron for macOS, Windows, and Linux with full data privacy.

open-sourceOpen Source

Related Tools

Deep Lake logo

Deep Lake

AI data runtime for multimodal datasets and vector search

Deep Lake is an open-source AI data runtime from Activeloop for storing, versioning, and querying multimodal data and embeddings. It fits teams building RAG, training, evaluation, or dataset-heavy agent workflows that need a bridge between vector search, structured metadata, and large image, text, audio, or video collections.

open-sourceOpen Source
SeekDB logo

SeekDB

AI-native state store with hybrid vector and full-text search

SeekDB is an open-source AI-native state store from the OceanBase ecosystem that combines MySQL-compatible data access with hybrid vector and full-text retrieval. It targets agent and AI application teams that need embedded or server deployment, copy-on-write style sandboxes, and searchable state without gluing together several separate storage layers.

open-sourceOpen Source
Marqo logo

Marqo

Embedding-first search and discovery engine for AI-powered product experiences.

Marqo is an open-source tensor search engine that combines embedding generation and vector search in a single API, removing the need to manage separate embedding pipelines and vector databases. Built for product discovery and multi-modal search, it lets teams index text, images, and structured data together, returning ranked results based on semantic similarity rather than keyword overlap.

freemium
Magika logo

Magika

AI-powered file-type detection at Google scale

Open-source AI-powered file-type detection tool from Google that uses a custom deep-learning model under a few megabytes to identify more than 200 binary and textual content types in milliseconds, even on a single CPU. Magika ships as a CLI, Python package, JavaScript/TypeScript library, and an ONNX model, achieves around 99% accuracy on its test set, and is already used at Google scale across Gmail, Drive, and Safe Browsing as well as by VirusTotal and abuse.ch.

freeOpen Source
Zep logo

Zep

Context engineering platform for AI agents with temporal knowledge graphs

Zep is a context engineering platform that assembles relationship-aware context for AI agents from conversations, business data, documents, and events. It maintains a temporal knowledge graph that automatically extracts entities and relationships, tracking how context evolves over time. Zep delivers formatted context blocks optimized for LLMs with sub-200ms latency, integrating with LangChain, LlamaIndex, AutoGen, and Google ADK through Python, TypeScript, and Go SDKs.

freemium
Hindsight logo

Hindsight

Agent memory system that learns, not just remembers

Hindsight is an agent memory system that enables AI agents to learn from experience rather than just store conversations. It organizes memories into three biomimetic categories: World knowledge for facts, Experiences for agent events, and Mental Models for learned understanding. The system provides retain, recall, and reflect operations backed by a temporal knowledge graph with parallel retrieval strategies including semantic, keyword, graph traversal, and temporal search.

freemiumOpen Source