- AMD launches AMD Instinct™ MI200 series accelerators, previews 3rd Gen AMD EPYC™ processors with AMD 3D V-Cache, and provides new details on expanded set of next-generation EPYC™ processors powered by “Zen 4” and “Zen 4c” CPU cores —
- Meta chooses EPYC™ CPUs for its data center —
SANTA CLARA, Calif. — November 9, 2021 — AMD (NASDAQ: AMD) held its virtual Accelerated Data Center Premiere, launching the new AMD Instinct™ MI200 series accelerators, the world’s fastest accelerator for high performance computing (HPC) and artificial intelligence (AI) workloads[i], and providing a preview of the innovative 3rd Gen AMD EPYC™ processors with AMD 3D V-Cache. AMD also revealed new information about its next-generation “Zen 4” processor core and announced the new “Zen 4c” processor core, both of which will power future AMD server processors and are designed to extend the company’s leadership products for the data center.
“We are in a high-performance computing megacycle that is driving demand for more compute to power the services and devices that impact every aspect of our daily lives,” said Dr. Lisa Su, president and CEO, AMD. “We are building significant momentum in the data center with our leadership product portfolio, including Meta’s adoption of AMD EPYC to power their infrastructure and the buildout of Frontier, the first U.S. exascale supercomputer, which will be powered by EPYC and AMD Instinct processors. In addition, today we announced a breadth of new products that build on that momentum in next-generation EPYC processors with new innovations in design, leadership 3D packaging technology, and 5nm high-performance manufacturing to further extend our leadership for cloud, enterprise and HPC customers.”
Meta Adopts EPYC CPUs [03:09 – 05:29]
AMD announced Meta is the latest major hyperscale cloud company that has adopted AMD EPYC CPUs to power its data centers. AMD and Meta worked together to define an open, cloud-scale, single-socket server designed for performance and power efficiency, based on the 3rd Gen EPYC processor. Further details will be discussed at the Open Compute Global Summit later this week.
Advanced Packaging Driving Data Center Performance [05:35 – 18:00]
AMD previewed the use of innovative 3D chiplet packaging technology in the data center with the first server CPU using high-performance 3D die stacking. The 3rd Gen AMD EPYC processors with AMD 3D V-Cache, codenamed “Milan-X,” represent an innovative step forward in CPU design and packaging, and will offer a 50% average performance uplift across targeted technical computing workloads[ii].
- 3rd Gen EPYC with AMD 3D V-Cache will offer the same capabilities and features as the 3rd Gen EPYC processors and will be drop-in compatible with a BIOS upgrade, delivering easy adoption and performance enhancements.
- Microsoft Azure HPC virtual machines featuring 3rd Gen EPYC with AMD 3D V-Cache are available today in Private Preview, with broad rollout in the coming weeks. More information on performance and availability is available here.
- 3rd Gen EPYC CPUs with AMD 3D V-Cache will launch in Q1 2022. Partners including Cisco, Dell Technologies, Lenovo, HPE and Supermicro are planning to offer server solutions with these processors.
Delivering Exascale Class Performance for Accelerated Computing [18:02 – 31:50]
AMD launched the AMD Instinct MI200 series accelerators. Based on the AMD CDNA™ 2 architecture, the MI200 series accelerators are the most advanced accelerators in the world[iii] and provide up to 4.9x higher peak performance for HPC workloads[iv] and up to 1.2x higher peak FLOPS of mixed-precision performance for leadership AI training, helping to fuel the convergence of HPC and AI.
- Utilized in the Frontier supercomputer at Oak Ridge National Laboratory, the HPC and AI performance capabilities in AMD Instinct MI200 series accelerators will be key in enabling researchers and scientists to accelerate their time to science and discovery.
“Zen 4” Powered Data Center, Designed for Leadership Performance [31:52 – 36:22]
AMD provided new details on the expanded next generation AMD EPYC processors codenamed “Genoa” and “Bergamo.”
- “Genoa” is expected to be the world’s highest performance processor for general purpose computing. It will have up to 96 high-performance “Zen 4” cores produced on optimized 5nm technology, and will support the next generation of memory and I/O technologies with DDR5 and PCIe® 5. “Genoa” will also include support for CXL, enabling significant memory expansion capabilities for data center applications. “Genoa” is on track for production and launch in 2022.
- “Bergamo” is a high-core-count CPU, tailor-made for cloud native applications, featuring 128 high-performance “Zen 4c” cores. AMD optimized the new “Zen 4c” core for cloud-native computing, tuning the core design for density and increased power efficiency to enable higher core count processors with breakthrough performance per socket. “Bergamo” shares the same software and security features as “Genoa” and is socket-compatible with it. “Bergamo” is on track to ship in the first half of 2023.
You can watch the full video here and learn more about all of the products discussed during the event here.
Supporting Resources
- Learn more about the AMD Instinct™ accelerators
- Learn more about AMD EPYC™ processors
- Learn more about the Oak Ridge National Laboratory’s Frontier supercomputer
- Become a fan of AMD on Facebook
- Follow AMD on Twitter
- Connect with AMD on LinkedIn
[i] MI200-01: World’s fastest data center GPU is the AMD Instinct™ MI250X. Calculations conducted by AMD Performance Labs as of Sep 15, 2021, for the AMD Instinct™ MI250X (128GB HBM2e OAM module) accelerator at 1,700 MHz peak boost engine clock resulted in 95.7 TFLOPS peak theoretical double precision matrix (FP64 Matrix), 47.9 TFLOPS peak theoretical double precision (FP64), 95.7 TFLOPS peak theoretical single precision matrix (FP32 Matrix), 47.9 TFLOPS peak theoretical single precision (FP32), 383.0 TFLOPS peak theoretical half precision (FP16), and 383.0 TFLOPS peak theoretical Bfloat16 format precision (BF16) floating-point performance. Calculations conducted by AMD Performance Labs as of Sep 18, 2020, for the AMD Instinct™ MI100 (32GB HBM2 PCIe® card) accelerator at 1,502 MHz peak boost engine clock resulted in 11.54 TFLOPS peak theoretical double precision (FP64), 46.1 TFLOPS peak theoretical single precision matrix (FP32), 23.1 TFLOPS peak theoretical single precision (FP32), and 184.6 TFLOPS peak theoretical half precision (FP16) floating-point performance. Published results on the Nvidia Ampere A100 (80GB) GPU accelerator, boost engine clock of 1410 MHz, resulted in 19.5 TFLOPS peak double precision Tensor Core (FP64 Tensor Core), 9.7 TFLOPS peak double precision (FP64), 19.5 TFLOPS peak single precision (FP32), 78 TFLOPS peak half precision (FP16), 312 TFLOPS peak half precision Tensor Core (FP16 Tensor Core), 39 TFLOPS peak Bfloat16 (BF16), and 312 TFLOPS peak Bfloat16 format precision Tensor Core (BF16 Tensor Core) theoretical floating-point performance. The TF32 data format is not IEEE compliant and not included in this comparison. https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf, page 15, Table 1.
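The peak theoretical figures in footnote [i] follow the standard formula of shader count × clock × FLOPs issued per shader per cycle. A minimal sketch in Python, assuming the publicly listed MI250X configuration of 220 compute units with 64 stream processors each (14,080 total; those counts come from AMD’s published specs, not this release):

```python
# Sketch: deriving peak theoretical TFLOPS figures like those in footnote [i].
# Shader count is from AMD's public MI250X specifications, not this release;
# treat these numbers as illustrative assumptions.

STREAM_PROCESSORS = 14_080   # 220 compute units x 64 stream processors
PEAK_CLOCK_HZ = 1_700e6      # 1,700 MHz peak boost engine clock

def peak_tflops(flops_per_clock_per_sp: float) -> float:
    """Peak theoretical TFLOPS = stream processors x clock x FLOPs/SP/cycle."""
    return STREAM_PROCESSORS * PEAK_CLOCK_HZ * flops_per_clock_per_sp / 1e12

print(f"FP64 vector: {peak_tflops(2):.1f} TFLOPS")   # FMA = 2 FLOPs per cycle -> 47.9
print(f"FP64 matrix: {peak_tflops(4):.1f} TFLOPS")   # matrix cores double the rate -> 95.7
print(f"FP16:        {peak_tflops(16):.1f} TFLOPS")  # packed half precision -> 383.0
```

The printed values match the 47.9, 95.7, and 383.0 TFLOPS figures the footnote cites.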
[ii] MLNX-021R: AMD internal testing as of 09/27/2021 on 2x 64C 3rd Gen EPYC with AMD 3D V-Cache (Milan-X) compared to 2x 64C AMD 3rd Gen EPYC 7763 CPUs, using the cumulative average of each of the following benchmarks’ maximum test result scores: ANSYS® Fluent® 2021.1, ANSYS® CFX® 2021.R2, and Altair Radioss 2021. Results may vary.
[iii] MI200-31: As of October 20th, 2021, the AMD Instinct™ MI200 series accelerators are the “most advanced server accelerators (GPUs) for the data center,” defined as the only server accelerators built on advanced 6nm manufacturing technology: AMD uses 6nm for the AMD Instinct MI200 series server accelerators, while Nvidia uses 7nm for the Nvidia Ampere A100 GPU. https://developer.nvidia.com/blog/nvidia-ampere-architecture-in-depth/
[iv] MI200-02: Calculations conducted by AMD Performance Labs as of Sep 15, 2021, for the AMD Instinct™ MI250X accelerator (128GB HBM2e OAM module) at 1,700 MHz peak boost engine clock resulted in 95.7 TFLOPS peak theoretical double precision matrix (FP64 Matrix) floating-point performance. Published results on the Nvidia Ampere A100 (80GB) GPU accelerator resulted in 19.5 TFLOPS peak theoretical double precision (FP64 Tensor Core) floating-point performance. Results found at: https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/nvidia-ampere-architecture-whitepaper.pdf, page 15, Table 1.
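The “up to 4.9x” figure in footnote [iv] is simply the ratio of the two peak numbers the footnote cites, which can be checked directly:

```python
# Ratio behind the "up to 4.9x" HPC claim in footnote [iv]:
# MI250X peak FP64 Matrix vs. A100 peak FP64 Tensor Core, both taken from the footnote.
mi250x_fp64_matrix_tflops = 95.7
a100_fp64_tensor_tflops = 19.5

ratio = mi250x_fp64_matrix_tflops / a100_fp64_tensor_tflops
print(f"{ratio:.1f}x")  # prints "4.9x"
```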