ONNX on AWS Inferentia
AWS Neuron does not directly support compiling models in the ONNX file format. The recommended approach is to first convert the ONNX model to PyTorch using a publicly available tool such as onnx2pytorch. Once the ONNX model has been converted to PyTorch, it can be compiled with the torch_neuron.trace() function to produce a model that can run on Neuron.

This document is relevant for: Inf1, Inf2, Trn1, Trn1n
In summary: convert the ONNX model to PyTorch with onnx2pytorch (https://github.com/Talmaj/onnx2pytorch), then compile the resulting PyTorch model with torch_neuron.trace() from the Neuron SDK.
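The two-step workflow above can be sketched as follows. This is a minimal illustration, not a verified recipe: it assumes the `onnx`, `onnx2pytorch`, and `torch-neuron` packages are installed, that compilation runs on (or targets) an Inf1 instance, and that the model is an image classifier taking a 1x3x224x224 input. The file names and input shape are placeholders; adjust them for your model.

```python
import torch
import onnx
from onnx2pytorch import ConvertModel  # https://github.com/Talmaj/onnx2pytorch
import torch.neuron  # AWS Neuron SDK; provides torch.neuron.trace

# Step 1: load the ONNX model and convert it to an equivalent
# PyTorch module.
onnx_model = onnx.load("model.onnx")      # illustrative path
pytorch_model = ConvertModel(onnx_model)
pytorch_model.eval()

# Step 2: compile for Neuron. trace() needs an example input whose
# shape matches what the model expects (assumed 224x224 RGB here).
example_input = torch.zeros(1, 3, 224, 224)
neuron_model = torch.neuron.trace(pytorch_model,
                                  example_inputs=[example_input])

# Save the compiled artifact; it can later be reloaded with
# torch.jit.load() and run on Inferentia.
neuron_model.save("model_neuron.pt")
```

Note that onnx2pytorch does not support every ONNX operator, so the conversion in step 1 may fail for some models; in that case the model would need to be re-exported or rewritten in PyTorch directly before compiling with Neuron.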