# NVIDIA

2 tools tagged

Triton Inference Server

NVIDIA's optimized AI model serving platform

Triton Inference Server is NVIDIA's open-source inference serving platform. It deploys AI models built with TensorRT, PyTorch, ONNX, TensorFlow, OpenVINO, Python, and other backends across cloud, data center, and edge environments. It supports dynamic batching, model ensembles, concurrent model execution on GPUs and CPUs, and real-time, streaming, and batch inference patterns. It also includes Model Analyzer for profiling and Model Navigator for automated optimization.
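As a rough illustration of what a real-time inference request to such a server can look like, the sketch below builds the URL and JSON body for an HTTP call in the style of the KServe v2 inference protocol, which Triton's HTTP endpoint follows. The model name `my_model`, the input tensor name `INPUT0`, the `FP32` datatype, and the host `localhost:8000` are assumptions for illustration only; a real deployment defines its own names and shapes in the model configuration. Nothing is sent over the network here.

```python
import json


def build_infer_request(model_name, input_name, values, host="localhost:8000"):
    """Return (url, body) for a KServe v2 style inference call.

    All names passed in are hypothetical placeholders; this only builds
    the request and does not contact a server.
    """
    url = f"http://{host}/v2/models/{model_name}/infer"
    payload = {
        "inputs": [
            {
                "name": input_name,            # assumed input tensor name
                "shape": [1, len(values)],     # one row holding all values
                "datatype": "FP32",            # assumed tensor datatype
                "data": [float(v) for v in values],
            }
        ]
    }
    return url, json.dumps(payload)


url, body = build_infer_request("my_model", "INPUT0", [1, 2, 3])
print(url)
print(body)
```

The body would be sent with an HTTP POST to the printed URL. With dynamic batching enabled on the server side, several such single-row requests arriving close together may be combined into one batch before execution.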

Open Source

PersonaPlex

NVIDIA's real-time persona-driven voice dialogue model

PersonaPlex is NVIDIA's open-source, full-duplex speech-to-speech conversational AI model that enables persona control through text-based role prompts and audio-based voice conditioning. Built on the Moshi architecture, it produces natural, low-latency spoken interactions with consistent persona across conversations. The model supports multiple pre-packaged voice embeddings for both natural and varied speaking styles, making it suitable for building interactive voice agents and assistants.

Open Source