Introducing Amazon SageMaker ml.p3dn.24xlarge instances, optimized for distributed machine learning with up to 4x the network bandwidth of ml.p3.16xlarge instances

The ml.p3dn.24xlarge instances provide up to 100 Gbps of networking throughput, 96 custom Intel® Xeon® Scalable (Skylake) vCPUs, 8 NVIDIA® V100 Tensor Core GPUs with 32 GB of memory each, a 300 GB/s NVLink GPU interconnect, and 1.8 TB of local NVMe-based SSD storage. Compared to the next largest P3 instance (ml.p3.16xlarge), the 4x increase in network throughput, together with faster processors and local NVMe-based SSD storage, enables developers to efficiently distribute machine learning training jobs across several ml.p3dn.24xlarge instances and to remove data-transfer and preprocessing bottlenecks.
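As a sketch of how a training job might be distributed across several ml.p3dn.24xlarge instances through the SageMaker CreateTrainingJob API, the snippet below builds the request parameters (the job name, role ARN, image URI, and S3 paths are placeholders, not values from this announcement):

```python
# Sketch of a CreateTrainingJob request that spreads training across two
# ml.p3dn.24xlarge instances. All names, ARNs, and S3 URIs are placeholders.
def build_training_job_request(instance_count=2):
    return {
        "TrainingJobName": "example-distributed-job",             # placeholder
        "AlgorithmSpecification": {
            "TrainingImage": "<your-training-image-uri>",         # placeholder
            "TrainingInputMode": "File",
        },
        "RoleArn": "arn:aws:iam::123456789012:role/ExampleRole",  # placeholder
        "ResourceConfig": {
            "InstanceType": "ml.p3dn.24xlarge",
            "InstanceCount": instance_count,  # distribute across instances
            "VolumeSizeInGB": 50,
        },
        "OutputDataConfig": {"S3OutputPath": "s3://example-bucket/output/"},
        "StoppingCondition": {"MaxRuntimeInSeconds": 86400},
    }

request = build_training_job_request(instance_count=2)
# A real call would pass this to boto3:
#   boto3.client("sagemaker").create_training_job(**request)
```

Increasing InstanceCount is what turns a single-node job into a distributed one; the extra network bandwidth of this instance type is what makes scaling out this way efficient.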

Below is a comparison of Amazon SageMaker ml.p3dn.24xlarge instances with the largest existing Amazon SageMaker P3 instance, ml.p3.16xlarge:

                           ml.p3.16xlarge    ml.p3dn.24xlarge
  NVIDIA V100 GPUs         8                 8
  Memory per GPU           16 GB             32 GB
  vCPUs                    64                96 (Skylake)
  Network bandwidth        25 Gbps           100 Gbps
  Local NVMe SSD storage   -                 1.8 TB

Amazon SageMaker ml.p3dn.24xlarge instances are available in the US East (N. Virginia) and US West (Oregon) AWS Regions. With these instances, customers can use the 1.8 TB of local NVMe-based SSD storage, eliminating the need to create and pay for additional ML storage volumes. Visit the Amazon SageMaker documentation to learn more about using local NVMe-based SSD storage on this instance type, and visit the P3 page to learn more about how AWS customers are using P3 instances.
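From inside a training container, the local scratch space actually available can be checked with a few lines of standard-library Python (the "/opt/ml" path shown in the comment is the usual SageMaker working area, assumed here rather than stated in this announcement):

```python
import shutil

def free_space_gib(path):
    """Return the free disk space at `path`, in GiB."""
    usage = shutil.disk_usage(path)
    return usage.free / 1024**3

# In a SageMaker training container one might check the working area, e.g.
# free_space_gib("/opt/ml") -- that mount point is an assumption, not from
# this post. Here we check the root filesystem so the snippet runs anywhere.
print(f"Free space: {free_space_gib('/'):.1f} GiB")
```

A check like this can help a training script decide how much intermediate data it can safely cache on the instance's local NVMe storage instead of streaming it repeatedly from S3.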



https://aws.amazon.com/about-aws/whats-new/2019/10/introducing-amazon-sagemaker-mlp3dn24xlarge-instances/