Amazon Elastic Inference allows you to attach just the right amount of GPU-powered acceleration to any Amazon EC2 instance, Amazon SageMaker instance, or Amazon ECS task to reduce the cost of running deep learning inference by up to 75%. With Amazon Elastic Inference, you can choose the instance type that is best suited to the overall CPU and memory needs of your application, and then separately configure the amount of inference acceleration that you need, with no code changes. Until now, you could provision a maximum of 4 GB of GPU memory on Elastic Inference. Now, you can choose among three new accelerator types, which have 2 GB, 4 GB, and 8 GB of GPU memory, respectively. Amazon Elastic Inference supports TensorFlow, Apache MXNet, and ONNX models, with more frameworks coming soon.
The new Elastic Inference Accelerators are available in US East (Virginia), US West (Oregon), US East (Ohio), Asia Pacific (Seoul) and EU (Ireland). Support for other regions is coming soon.
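As a rough illustration of configuring acceleration separately from the instance type, the sketch below picks the smallest accelerator that covers a given GPU memory requirement and builds the parameters for an EC2 launch request. The accelerator type names (`eia2.medium`, `eia2.large`, `eia2.xlarge`) are assumptions that do not appear in this announcement, and the helper functions `pick_accelerator` and `build_run_instances_params` are hypothetical names introduced only for this example. The code makes no AWS calls; it only assembles a dictionary.

```python
# Assumed mapping of accelerator type name to GPU memory in GB.
# The type names are NOT stated in the announcement; only the
# 2 GB / 4 GB / 8 GB sizes are.
ACCELERATOR_MEMORY_GB = {
    "eia2.medium": 2,
    "eia2.large": 4,
    "eia2.xlarge": 8,
}


def pick_accelerator(required_gb):
    """Return the smallest accelerator type with at least required_gb of GPU memory."""
    candidates = [
        (memory_gb, name)
        for name, memory_gb in ACCELERATOR_MEMORY_GB.items()
        if memory_gb >= required_gb
    ]
    if not candidates:
        raise ValueError(f"No accelerator offers {required_gb} GB of GPU memory")
    # min() compares tuples by memory first, so the smallest fit wins.
    return min(candidates)[1]


def build_run_instances_params(image_id, instance_type, required_gb):
    """Build keyword arguments for an EC2 launch request (hypothetical helper).

    The instance type is chosen for CPU and memory needs; the accelerator
    is chosen independently, based only on the GPU memory requirement.
    """
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "ElasticInferenceAccelerators": [
            {"Type": pick_accelerator(required_gb), "Count": 1}
        ],
    }


if __name__ == "__main__":
    # Placeholder AMI ID and instance type, for illustration only.
    print(build_run_instances_params("ami-00000000", "c5.large", required_gb=3))
```

The resulting dictionary is shaped so that it could be passed to boto3 as `ec2.run_instances(**params)`. A real launch would likely need additional setup, such as networking and permissions, that this sketch does not cover.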