Magika is an open-source file-type detection tool from Google that replaces classic signature-based utilities with a compact deep-learning model. The custom model weighs only a few megabytes, was trained on roughly 100 million samples across more than 200 content types, and identifies binary and textual formats in milliseconds on a single CPU. The project reports around 99% accuracy on its test set and already powers file-type routing at Google scale: Gmail, Drive, and Safe Browsing use it to route hundreds of billions of samples per week to the right security and content-policy scanners.

The tool is deliberately polyglot. A Rust-powered CLI lets you run `magika file.bin` on a server or developer machine; the Python package exposes a `Magika` class with streaming APIs and scoring thresholds; a JavaScript/TypeScript package targets Node and browsers for client-side detection; and the underlying Keras-trained ONNX model can be embedded in any language with an ONNX runtime. Magika has been integrated with VirusTotal and abuse.ch and is commonly used as a pre-filter in malware-analysis pipelines, data-lake ingestion, DLP tools, and forensic triage, where the `file` utility and libmagic fall short on obfuscated or renamed inputs.
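As a rough illustration of the Python route, the sketch below wraps the `Magika` class in a small helper that returns a label and a score for a byte string. The helper name `detect` and its `detector` parameter are hypothetical, and the result attribute names differ between package releases, so the code probes for both spellings it assumes may exist; check the documentation for the version you install.

```python
from typing import Any, Optional, Tuple


def detect(data: bytes, detector: Optional[Any] = None) -> Tuple[str, float]:
    """Return (label, score) for a byte string.

    `detect` is a hypothetical helper, not part of the magika package.
    Any object with an `identify_bytes` method can be passed in, which
    keeps the helper testable without loading the model.
    """
    if detector is None:
        # Imported lazily so this module loads even without magika installed.
        from magika import Magika

        detector = Magika()

    result = detector.identify_bytes(data)
    output = result.output

    # Assumption: newer releases expose `output.label` and `result.score`,
    # while older ones used `output.ct_label` and `output.score`.
    label = getattr(output, "label", None)
    if label is None:
        label = getattr(output, "ct_label", None)
    score = getattr(result, "score", None)
    if score is None:
        score = getattr(output, "score", None)
    if label is None or score is None:
        raise ValueError("unrecognised result shape from detector")

    return str(label), float(score)


# Typical use, assuming `pip install magika` has been run:
#   label, score = detect(b"#!/usr/bin/env python3\nprint('hi')\n")
```

Passing the detector in explicitly also lets a long-running service create one `Magika` instance and reuse it across calls instead of reloading the model each time.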

For AI-infrastructure teams, Magika slots in wherever you need fast, language-agnostic content detection without external calls. It is Apache-2.0 licensed, so it can ship inside commercial products; it runs offline, so it is safe in regulated environments; and it returns a content-type label plus a confidence score that you can threshold per use case. Typical deployments put Magika in front of virus scanners, attachment filters, LLM upload pipelines, and automated reverse-engineering workflows: anywhere a wrong file-type guess would send a file to the wrong processor. The project is actively maintained by Google's security team on GitHub with regular model and dataset updates.
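Per-use-case thresholding can be pictured as a small routing table keyed by content-type label. The sketch below is pure Python and does not call Magika itself; the processor names, the threshold values, and the `Detection`, `ROUTES`, and `route` names are all hypothetical, chosen only to show the pattern.

```python
from typing import Dict, NamedTuple, Tuple


class Detection(NamedTuple):
    """A (label, score) pair as a detector such as Magika might report it."""

    label: str
    score: float


# Hypothetical routing table: label -> (processor name, minimum score).
# Both the processor names and the thresholds are illustrative only.
ROUTES: Dict[str, Tuple[str, float]] = {
    "pdf": ("document_scanner", 0.80),
    "pebin": ("malware_sandbox", 0.50),
    "javascript": ("script_analyzer", 0.90),
}

# Where files go when the label is unknown or the score is too low.
FALLBACK = "manual_review"


def route(detection: Detection) -> str:
    """Pick a processor, falling back when confidence is below the bar."""
    entry = ROUTES.get(detection.label)
    if entry is None:
        return FALLBACK
    processor, min_score = entry
    return processor if detection.score >= min_score else FALLBACK
```

A lower bar for executables and a higher one for scripts reflects one possible policy: send anything that might be a binary to the sandbox, but only hand a file to a language-specific analyzer when the detector is fairly sure.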

