NCNN

High-performance mobile neural network inference

Open Source

NCNN is Tencent's high-performance neural network inference framework optimized for mobile and embedded platforms. It is written in pure C++ with zero dependencies and features ARM NEON assembly optimization, Vulkan GPU acceleration, and careful memory management for a minimal footprint. It imports models from PyTorch, ONNX, Caffe, TensorFlow, and Keras, and supports 8-bit quantization and half-precision storage for efficient on-device deployment across Android, iOS, and Linux.

NCNN is Tencent's production-grade neural network inference framework designed from the ground up for mobile and embedded devices. Unlike frameworks that adapt desktop implementations for mobile, NCNN was built with mobile constraints as first-class requirements: zero external dependencies, minimal binary size, and hand-tuned ARM NEON assembly for maximum throughput on the processors that power smartphones and edge devices. It runs inside more than 30 Tencent applications, including WeChat, QQ, and Qzone.

The framework imports models from the major training frameworks. For PyTorch, its PNNX converter translates models directly, without going through the often-unreliable ONNX intermediate step. NCNN supports quantized 8-bit integer and half-precision floating-point inference for reduced model size and faster execution, and provides Vulkan-based GPU acceleration that works across Android, iOS, and desktop platforms. Its operator coverage spans the CNN, RNN, Transformer, and detection architectures commonly used in mobile AI features.
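To make the 8-bit quantization idea concrete, here is a minimal C++ sketch of symmetric per-tensor int8 quantization. It is an illustration of the general technique only, not NCNN's implementation or API: the `QuantizedTensor` type and the `quantize_int8` and `dequantize` functions are hypothetical names introduced for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type (not part of NCNN): int8 values plus the
// scale needed to map them back to floating point.
struct QuantizedTensor {
    std::vector<std::int8_t> values;
    float scale;  // real_value is approximately scale * quantized_value
};

// Symmetric per-tensor quantization: the largest magnitude in the
// input maps to +/-127, everything else is rounded to the nearest step.
inline QuantizedTensor quantize_int8(const std::vector<float>& input) {
    assert(!input.empty());
    float max_abs = 0.0f;
    for (float v : input) {
        max_abs = std::max(max_abs, std::fabs(v));
    }
    // For an all-zero input any scale works; use 1 to avoid dividing by zero.
    const float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;

    QuantizedTensor out;
    out.scale = scale;
    out.values.reserve(input.size());
    for (float v : input) {
        const long q = std::lround(v / scale);
        const long clamped = std::max(-127L, std::min(127L, q));
        out.values.push_back(static_cast<std::int8_t>(clamped));
    }
    return out;
}

// Maps int8 values back to floats; the result differs from the original
// by at most about half a quantization step per element.
inline std::vector<float> dequantize(const QuantizedTensor& t) {
    std::vector<float> out;
    out.reserve(t.values.size());
    for (std::int8_t q : t.values) {
        out.push_back(t.scale * static_cast<float>(q));
    }
    return out;
}
```

Each value is stored in one byte instead of four, which is where the reduction in model size comes from. Production frameworks typically go further (for example per-channel scales and calibration on sample data), which this sketch does not attempt.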

NCNN is aware of ARM big.LITTLE core layouts and can schedule inference threads across performance and efficiency CPU cores to balance speed against power consumption, a critical concern for battery-powered devices. The framework's memory management is designed for environments with limited RAM, reusing buffers and minimizing allocations during inference. With over 22,000 GitHub stars and active maintenance from Tencent's engineering team, NCNN remains one of the most mature and widely deployed options for developers shipping on-device AI features on mobile platforms.
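The buffer-reuse idea mentioned above can be sketched in a few lines of C++. This is a simplified illustration under stated assumptions, not NCNN's allocator: `BufferPool` and its methods are hypothetical names, and the first-fit reuse policy was chosen for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical illustration (not NCNN's allocator): keeps released
// buffers around so later requests can reuse their memory instead of
// allocating again on every inference step.
class BufferPool {
public:
    // Returns a buffer holding `size` floats, reusing a released buffer
    // when one with enough capacity is available (first fit).
    std::vector<float> acquire(std::size_t size) {
        assert(size > 0);
        for (std::size_t i = 0; i < free_.size(); ++i) {
            if (free_[i].capacity() >= size) {
                std::vector<float> buf = std::move(free_[i]);
                free_.erase(free_.begin() + static_cast<std::ptrdiff_t>(i));
                buf.resize(size);  // capacity suffices, so no reallocation
                ++reuses_;
                return buf;
            }
        }
        ++allocations_;
        return std::vector<float>(size);
    }

    // Hands a buffer back to the pool for later reuse.
    void release(std::vector<float>&& buf) {
        free_.push_back(std::move(buf));
    }

    std::size_t allocations() const { return allocations_; }
    std::size_t reuses() const { return reuses_; }

private:
    std::vector<std::vector<float>> free_;
    std::size_t allocations_ = 0;
    std::size_t reuses_ = 0;
};
```

A caller would `acquire` a buffer for a layer's output, pass it back with `release(std::move(buf))` once the next layer has consumed it, and let later layers pick it up again. In a network with many layers this keeps the number of heap allocations close to the number of buffers that are live at the same time, rather than the number of layers.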

Pricing

Free and open source under the BSD 3-Clause license

Platforms

Android, iOS, Linux, Windows, macOS

Related Tools

Marqo

Embedding-first search and discovery engine for AI-powered product experiences.

Marqo is an open-source tensor search engine that combines embedding generation and vector search in a single API, removing the need to manage separate embedding pipelines and vector databases. Built for product discovery and multi-modal search, it lets teams index text, images, and structured data together, returning ranked results based on semantic similarity rather than keyword overlap.

freemium

Freestyle

Sandboxes for coding agents — Linux VMs, Git, and deploys in one box

Freestyle is YC-backed sandbox infrastructure built for AI coding agents, shipping secure Linux VMs with nested virtualization, Git servers, and one-click web deploys. It lets agents run real workloads, branch repos, and deploy apps under short-lived identities while billing only for active compute. Used in production by vly.ai, Rork, and Vibeflow.

freemium

OpenSRE

Open-source toolkit for building AI SRE incident response agents

OpenSRE is an open-source Python toolkit from Tracer Cloud for building AI SRE agents that investigate and respond to production incidents. It ships with connectors to Prometheus, Grafana, Kubernetes and incident platforms, plus a simulation harness that replays past incidents so teams can benchmark agent accuracy before trusting it on live pager rotations.

Open Source

Magika

AI-powered file-type detection at Google scale

Magika is an open-source, AI-powered file-type detection tool from Google. It uses a custom deep-learning model of only a few megabytes to identify more than 200 binary and textual content types in milliseconds, even on a single CPU. Magika ships as a CLI, Python package, JavaScript/TypeScript library, and an ONNX model, achieves around 99% accuracy on its test set, and is already used at Google scale across Gmail, Drive, and Safe Browsing as well as by VirusTotal and abuse.ch.

Free, Open Source

Zep

Context engineering platform for AI agents with temporal knowledge graphs

Zep is a context engineering platform that assembles relationship-aware context for AI agents from conversations, business data, documents, and events. It maintains a temporal knowledge graph that automatically extracts entities and relationships, tracking how context evolves over time. Zep delivers formatted context blocks optimized for LLMs with sub-200ms latency, integrating with LangChain, LlamaIndex, AutoGen, and Google ADK through Python, TypeScript, and Go SDKs.

freemium

Hindsight

Agent memory system that learns, not just remembers

Hindsight is an agent memory system that enables AI agents to learn from experience rather than just store conversations. It organizes memories into three biomimetic categories: World knowledge for facts, Experiences for agent events, and Mental Models for learned understanding. The system provides retain, recall, and reflect operations backed by a temporal knowledge graph with parallel retrieval strategies including semantic, keyword, graph traversal, and temporal search.

Freemium, Open Source