
Host ML models on Amazon SageMaker using Triton: TensorRT models

AWS Machine Learning Blog

TensorRT is an SDK developed by NVIDIA that provides a high-performance deep learning inference library. It’s optimized for NVIDIA GPUs and accelerates deep learning inference in production environments. Triton Inference Server supports TensorRT as a model format, serving compiled TensorRT engines from its model repository.
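As a hedged illustration of how Triton serves a TensorRT model, the sketch below shows a minimal `config.pbtxt` for a model repository entry; the model name (`resnet50_trt`) and the tensor names, data types, and shapes are hypothetical placeholders, not values from the article. Triton's TensorRT backend uses the `tensorrt_plan` platform and expects the serialized engine as `model.plan` inside a numbered version directory:

```
# Repository layout (hypothetical model name):
#   resnet50_trt/
#   ├── config.pbtxt
#   └── 1/
#       └── model.plan      # serialized TensorRT engine
#
# resnet50_trt/config.pbtxt
name: "resnet50_trt"
platform: "tensorrt_plan"   # tells Triton to use the TensorRT backend
max_batch_size: 8
input [
  {
    name: "input"           # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"          # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

Because TensorRT engines are compiled for a specific GPU architecture and TensorRT version, the `model.plan` file must be built on hardware matching the SageMaker instance type that will host it.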
