Much attention has been paid to how Intel and AMD plan to package their dies in the future to increase overall performance and offset higher manufacturing costs. For AMD, that next step is V-Cache, an additional L3 cache (SRAM) chiplet designed to be 3D stacked on top of an existing Zen 3 chiplet, tripling the total L3 cache available. Today, AMD’s V-Cache technology is finally available to the wider market, as AMD announces that its EPYC 7003X “Milan-X” server processors have now reached general availability.
As first announced late last year, AMD is bringing its 3D V-Cache technology to the enterprise market via Milan-X, an advanced variant of its Milan-based 3rd generation EPYC 7003 processors. AMD is releasing four new processors ranging from 16 to 64 cores, all with Zen 3 cores and 768 MB of L3 cache via 3D stacked V-Cache.
AMD’s Milan-X processors are an upgraded version of its existing Milan-based 3rd generation EPYC 7003 lineup, which we reviewed in June last year. Their most significant advancement is the large 768 MB of L3 cache, made possible by AMD’s V-Cache 3D stacking technology. The 3D V-Cache die is built on TSMC’s N7 process node, the same node used for Milan’s Zen 3 chiplets. It measures 36 mm² and places 64 MiB of cache above the 32 MiB already found on each Zen 3 chiplet.
Focusing on key specs and technologies, the latest Milan-X AMD EPYC 7003-X processors feature 128 PCIe 4.0 lanes, which can be exposed through a mix of full-length PCIe 4.0 slots and controllers, depending on how motherboard and server vendors choose to use them. There are also four memory controllers capable of supporting two DIMMs per controller, enabling the use of eight-channel DDR4 memory.
The overall chip layout for Milan-X is a giant nine-die MCM, with eight CCDs and a large I/O die, and that goes for all Milan-X SKUs. Critically, AMD chose to equip all of its new EPYC V-Cache chips with the maximum 768 MB of L3 cache, which means all eight CCDs must be present, from the top SKU (EPYC 7773X) to the bottom SKU (EPYC 7373X). Rather than removing CCDs, AMD varies the number of CPU cores enabled in each CCD. Each CCD includes 32 MB of L3 cache, with an additional 64 MB of stacked 3D V-Cache on top, for a total of 96 MB of L3 cache per CCD (8 x 96 = 768).
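The cache arithmetic above can be checked with a short sketch. The capacities and core counts come from the article; the even spread of enabled cores across the eight CCDs is an assumption made purely for illustration, and the function names are hypothetical.

```python
# Figures quoted in the article text.
CCDS_PER_PACKAGE = 8
BASE_L3_PER_CCD_MB = 32   # L3 on the Zen 3 chiplet itself
VCACHE_PER_CCD_MB = 64    # stacked 3D V-Cache die

# Core counts of the four Milan-X SKUs.
MILAN_X_CORES = {
    "EPYC 7773X": 64,
    "EPYC 7573X": 32,
    "EPYC 7473X": 24,
    "EPYC 7373X": 16,
}


def total_l3_mb(stacked_mb=VCACHE_PER_CCD_MB):
    """Total L3 per package: each CCD contributes its base L3 plus any stacked cache."""
    return CCDS_PER_PACKAGE * (BASE_L3_PER_CCD_MB + stacked_mb)


def cores_per_ccd(cores):
    """Enabled cores per CCD, ASSUMING cores are spread evenly over all eight CCDs."""
    return cores // CCDS_PER_PACKAGE


def l3_per_core_mb(cores):
    """Package-wide L3 divided by the number of enabled cores."""
    return total_l3_mb() / cores


for sku, cores in MILAN_X_CORES.items():
    print(f"{sku}: {cores_per_ccd(cores)} cores/CCD, "
          f"{l3_per_core_mb(cores):.0f} MB L3 per core")
```

Passing `stacked_mb=0` gives the 256 MB of a regular Milan part, which is where the "tripling" of L3 comes from.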
In terms of memory compatibility, nothing has changed compared to previous Milan chips. Each EPYC 7003-X chip supports eight DDR4-3200 memory modules per socket, with capacities of up to 4 TB per chip and 8 TB in a 2P system. The new Milan-X EPYC 7003-X chips also share the same SP3 socket as the existing line, and as such are compatible with current LGA 4094 motherboards after a firmware update.
AMD EPYC 7003 Milan/Milan-X processors
|Processor|Cores|Threads|Base freq (MHz)|Boost freq (MHz)|L3 cache|PCIe|Memory|TDP (W)|Price (USD)|
|---|---|---|---|---|---|---|---|---|---|
|EPYC 7773X|64|128|2200|3500|768 MB|128×4.0|8x DDR4-3200|280|$8800|
|EPYC 7763|64|128|2450|3400|256 MB|128×4.0|8x DDR4-3200|280|$7890|
|EPYC 7573X|32|64|2800|3600|768 MB|128×4.0|8x DDR4-3200|280|$5590|
|EPYC 75F3|32|64|2950|4000|256 MB|128×4.0|8x DDR4-3200|280|$4860|
|EPYC 7473X|24|48|2800|3700|768 MB|128×4.0|8x DDR4-3200|240|$3900|
|EPYC 74F3|24|48|3200|4000|256 MB|128×4.0|8x DDR4-3200|240|$2900|
|EPYC 7373X|16|32|3050|3800|768 MB|128×4.0|8x DDR4-3200|240|$4185|
|EPYC 73F3|16|32|3500|4000|256 MB|128×4.0|8x DDR4-3200|240|$3521|
Looking at the new EPYC 7003 stack with 3D V-Cache technology, the top SKU is the EPYC 7773X. It features 64 Zen 3 cores with 128 threads, a base frequency of 2.2 GHz and a maximum boost frequency of 3.5 GHz. The EPYC 7573X has 32 cores and 64 threads, with a higher base frequency of 2.8 GHz and a boost frequency of up to 3.6 GHz. The EPYC 7773X and 7573X both have a base TDP of 280 W, although AMD specifies that all four EPYC 7003-X chips have a configurable TDP between 225 W and 280 W.
The entry-level chip in the new lineup is the EPYC 7373X, which has 16 cores with 32 threads, a base frequency of 3.05 GHz and a boost frequency of 3.8 GHz. Moving up the stack, the EPYC 7473X is a 24c/48t option with a base frequency of 2.8 GHz and a boost frequency of up to 3.7 GHz. Both have a TDP of 240 W, but as with the larger parts, AMD has confirmed that the 16- and 24-core models have a configurable TDP between 225 W and 280 W.
Notably, all of these new Milan-X chips show some clock speed regression compared with their regular Milan (maximum per-core performance) counterparts. In the case of the 7773X, only the base clock speed drops, while the other SKUs all give up a little on both base and boost clock speeds. The drop is necessitated by the V-Cache, which adds about 26 billion transistors in a full Milan-X configuration and eats into the chips’ power budget. With AMD opting to keep the TDPs consistent, the clock speeds have been reduced a bit to compensate. As always, AMD’s processors will run as fast as thermal and TDP headroom allow, but chips equipped with V-Cache will hit those limits a little sooner.
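To put numbers on the regression, the following sketch computes base and boost deltas from the spec table. The clock figures are the table's; pairing each Milan-X SKU with the regular Milan part of the same core count and TDP is an assumption made here for illustration, and `clock_delta` is a hypothetical helper name.

```python
# (base MHz, boost MHz) as listed in the spec table.
CLOCKS_MHZ = {
    "EPYC 7773X": (2200, 3500), "EPYC 7763": (2450, 3400),
    "EPYC 7573X": (2800, 3600), "EPYC 75F3": (2950, 4000),
    "EPYC 7473X": (2800, 3700), "EPYC 74F3": (3200, 4000),
    "EPYC 7373X": (3050, 3800), "EPYC 73F3": (3500, 4000),
}

# ASSUMPTION: each Milan-X part is compared with the regular Milan part
# that has the same core count and TDP in the table.
PAIRS = [
    ("EPYC 7773X", "EPYC 7763"),
    ("EPYC 7573X", "EPYC 75F3"),
    ("EPYC 7473X", "EPYC 74F3"),
    ("EPYC 7373X", "EPYC 73F3"),
]


def clock_delta(x_sku, ref_sku):
    """Return (base delta, boost delta) in MHz; negative means Milan-X is slower."""
    x_base, x_boost = CLOCKS_MHZ[x_sku]
    ref_base, ref_boost = CLOCKS_MHZ[ref_sku]
    return x_base - ref_base, x_boost - ref_boost


for x_sku, ref_sku in PAIRS:
    d_base, d_boost = clock_delta(x_sku, ref_sku)
    print(f"{x_sku} vs {ref_sku}: base {d_base:+d} MHz, boost {d_boost:+d} MHz")
```

On the table's figures, the 7773X gives up 250 MHz of base clock but lists a boost clock 100 MHz higher than the 7763, while the other three pairs lose between 150 MHz and 450 MHz on both.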
AMD’s target market for the new Milan-X chips is customers who need to maximize per-core performance; specifically, the subset of workloads that benefit from the additional cache. This is why Milan-X chips are not a full replacement for the EPYC 7xF3 chips, as not all workloads will respond to the additional cache. Thus, the two lines will share the top spot as AMD’s fastest per-core EPYC SKUs.
For its part, AMD is pitching the new chips at the CAD/CAM market in particular, for tasks such as finite element analysis and electronic design automation. According to the company, it saw a more than 66% increase in RTL verification speeds on Synopsys’ VCS verification software in an apples-to-apples comparison between Milan processors with and without V-Cache. As with other chips that incorporate larger caches, the greatest benefits will be found in workloads that overflow contemporary-sized caches but fit neatly into the larger cache. Minimizing costly trips to main memory means processor cores spend less time stalled and more time doing useful work.
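The "overflows the old cache, fits in the new one" idea can be illustrated with a deliberately crude capacity check. This sketch uses the per-CCD figures from the article (32 MB without V-Cache, 96 MB with), since Zen 3 cores share L3 within their own CCD. It ignores associativity, access patterns and data shared between cores, so it is a thought aid under those assumptions rather than a performance model; the function names and labels are hypothetical.

```python
# Per-CCD L3 capacity in MB, as quoted in the article.
L3_PER_CCD_MB = {"Milan": 32, "Milan-X": 96}


def fits_in_l3(working_set_mb, part):
    """Capacity-only check: does the per-CCD working set fit in that part's L3?"""
    return working_set_mb <= L3_PER_CCD_MB[part]


def vcache_outlook(working_set_mb):
    """Classify a per-CCD working set by where it stops fitting in L3."""
    if fits_in_l3(working_set_mb, "Milan"):
        return "fits-both"          # already cache-resident without V-Cache
    if fits_in_l3(working_set_mb, "Milan-X"):
        return "fits-milan-x-only"  # the case the article describes as the biggest win
    return "spills-both"            # still goes to main memory either way


for size_mb in (16, 64, 200):
    print(f"{size_mb} MB per CCD: {vcache_outlook(size_mb)}")
```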
Microsoft found something similar last year, when it unveiled a public preview of its Azure HBv3 virtual machines in November. At the time, the company released some performance numbers from its internal testing, mostly on HPC-related workloads. To compare Milan-X directly to Milan, Microsoft used data from EPYC 7003 and EPYC 7003-X in its HBv3 VM platforms. It should also be noted that the tests were performed on dual-socket systems, as all EPYC 7003-X processors announced today can be used in both 1P and 2P deployments.
The performance data published by Microsoft Azure is encouraging, and its internal testing suggests that the additional L3 cache plays a significant role. In computational fluid dynamics, Microsoft noted that the speedup was greater with fewer elements, so this needs to be taken into consideration. Microsoft said that with its current HBv3 series, customers can expect gains of up to 80% in computational fluid dynamics performance over previous HBv3 VM systems with Milan.
In conclusion, AMD’s EPYC 7003-X processors are now generally available to the public. With prices listed on a 1K unit order basis, AMD says the EPYC 7773X with 64C/128T will be available for around $8800, while the 32C/64T model, the EPYC 7573X, will cost around $5590. Going down, the EPYC 7473X with 24C/48T will cost $3900, and the entry EPYC 7373X with 16C/32T will cost a bit more at $4185.
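For readers comparing SKUs, a small sketch can turn the list prices into a price per core and a premium over the regular Milan part. All prices and core counts are the ones in the spec table; matching each Milan-X chip with the non-X chip of the same core count is an assumption for illustration, and the helper names are hypothetical.

```python
# (cores, list price in USD) taken from the spec table.
SKUS = {
    "EPYC 7773X": (64, 8800), "EPYC 7763": (64, 7890),
    "EPYC 7573X": (32, 5590), "EPYC 75F3": (32, 4860),
    "EPYC 7473X": (24, 3900), "EPYC 74F3": (24, 2900),
    "EPYC 7373X": (16, 4185), "EPYC 73F3": (16, 3521),
}

# ASSUMPTION: Milan-X parts are matched to regular Milan parts by core count.
PAIRS = [
    ("EPYC 7773X", "EPYC 7763"),
    ("EPYC 7573X", "EPYC 75F3"),
    ("EPYC 7473X", "EPYC 74F3"),
    ("EPYC 7373X", "EPYC 73F3"),
]


def price_per_core(sku):
    """List price divided by core count, in USD."""
    cores, price = SKUS[sku]
    return price / cores


def vcache_premium(x_sku, ref_sku):
    """Extra list price of the Milan-X part over its assumed counterpart, in USD."""
    return SKUS[x_sku][1] - SKUS[ref_sku][1]


for x_sku, ref_sku in PAIRS:
    print(f"{x_sku}: ${price_per_core(x_sku):.2f}/core, "
          f"${vcache_premium(x_sku, ref_sku)} over {ref_sku}")
```

On these figures the 16-core EPYC 7373X is the most expensive per core, at roughly $262 against $137.50 for the 64-core EPYC 7773X.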
Because these prices apply to 1,000-unit orders, the retail price for a single unit is likely to be slightly higher. Still, the majority of AMD’s customers are server and cloud providers, so most buyers will undoubtedly be purchasing in bulk. Many of AMD’s major server OEM partners are also expected to start shipping systems using the new chips, including Dell, Supermicro, Lenovo and HPE.
Finally, consumers will have their own chance to get their hands on an AMD V-Cache enabled processor next month, when AMD’s second V-Cache product, the Ryzen 7 5800X3D, is released. The desktop processor is based on a single CCD with a whopping 96 MB of available L3 cache, in contrast to the much larger EPYC chips.