Everspin Announces New MRAM Products And Partnerships

Magnetoresistive RAM manufacturer Everspin has announced their first MRAM-based storage products and issued two other press releases about recent accomplishments. Until now, Everspin’s business model has been to sell discrete MRAM components, but they’re now introducing an NVMe SSD based on their MRAM. Everspin’s MRAM is one of the highest-performing and most durable non-volatile memory technologies on the market today, but its density and capacity fall far short of NAND flash, 3D XPoint, and even DRAM. As a result, use of MRAM has largely been confined to embedded systems and industrial computing that need consistent performance and high reliability, but have very modest capacity requirements. MRAM has also seen some use as a non-volatile cache or configuration memory in some storage array controllers. The new nvNITRO family of MRAM drives is intended to be used as a storage accelerator: a high-IOPS low-latency write cache or transaction log, with performance exceeding that of any single-controller drive based on NAND flash.

Everspin’s current generation of spin-torque MRAM has a capacity of 256Mb per die with a DDR3 interface (albeit with very different timings from the JEDEC standard for DRAM). The initial nvNITRO products will use 32 or 64 MRAM chips to offer capacities of 1GB or 2GB on a PCIe 3 x8 card. MRAM has high enough endurance that the nvNITRO does not need to perform any wear leveling, which allows for a drastically simpler controller design and means performance does not degrade over time or as the drive is filled up. The nvNITRO also does not need any large spare area or overprovisioning. Read and write performance are nearly identical, while flash memory suffers from much slower writes than reads, which forces flash-based SSDs to buffer and combine writes in order to offer good performance. Everspin did not have complete performance specifications available at time of writing, but the numbers they did offer are very impressive: 6µs overall latency for 4kB transfers (compared to 20µs for the Intel SSD DC P3700), and 1.5M IOPS (4kB) at QD32 (compared to 1.2M IOPS read/200k IOPS write for the HGST Ultrastar SN260). The nvNITRO does rely somewhat on higher queue depths to deliver full performance, but it is still able to deliver over 1M IOPS at QD16 and around 800k IOPS at QD8, and QD1 performance is around 175k IOPS read/150k IOPS write. MRAM supports fine-grained access, so the nvNITRO performs well even with small transfer sizes: Everspin has hit 2.2M IOPS for 512B transfers, although that is not an official performance specification or measurement from the final product.
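The capacity and throughput figures above can be cross-checked with a little arithmetic. The following is a minimal sketch rather than anything from Everspin: the function names are made up for illustration, "4kB" is assumed to mean 4096 bytes, and the latency estimate uses Little's law (requests in flight divided by completion rate), which only gives an average.

```python
# Back-of-the-envelope figures for the first-generation nvNITRO.
# All inputs are numbers quoted in the article; nothing here is measured.
# Function names are hypothetical and exist only for this illustration.

DIE_CAPACITY_MBIT = 256  # spin-torque MRAM die capacity, in megabits


def card_capacity_gb(num_dies: int, die_mbit: int = DIE_CAPACITY_MBIT) -> float:
    """Raw capacity in GB (8 bits per byte, 1024 MB per GB)."""
    return num_dies * die_mbit / 8 / 1024


def throughput_gb_per_s(iops: float, transfer_bytes: int) -> float:
    """Data rate implied by an IOPS figure, in decimal GB/s."""
    return iops * transfer_bytes / 1e9


def mean_latency_us(queue_depth: int, iops: float) -> float:
    """Little's law: average time per request = requests in flight / rate."""
    return queue_depth / iops * 1e6


if __name__ == "__main__":
    for dies in (32, 64):
        print(f"{dies} dies -> {card_capacity_gb(dies):.0f} GB")
    # Assumption: "4kB" means 4096 bytes.
    print(f"1.5M IOPS x 4kB -> {throughput_gb_per_s(1.5e6, 4096):.2f} GB/s")
    print(f"QD1 @ 175k IOPS -> {mean_latency_us(1, 175e3):.1f} us per request")
    print(f"QD32 @ 1.5M IOPS -> {mean_latency_us(32, 1.5e6):.1f} us per request")
```

Under these assumptions, 1.5M IOPS at 4kB works out to roughly 6.1 GB/s, and the QD1 read figure implies about 5.7µs per request, which lines up with the 6µs latency Everspin quotes.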

As part of today’s announcements, Everspin is introducing MRAM support for Xilinx UltraScale FPGAs in the form of scripts for Xilinx’s Memory Interface Generator tool. This will allow customers to integrate MRAM into their designs as easily as they would use SDRAM or SRAM. The nvNITRO drives are a demonstration of this capability, as the SSD controller is implemented on a Xilinx FPGA. The FPGA provides the PCIe upstream link as a standard feature, the memory controller is new from Everspin, and Everspin has developed a custom NVMe implementation to take advantage of the low latency and simple management afforded by MRAM. Everspin claims a 30% performance advantage over an unspecified NVRAM drive based on battery-backed DRAM, and attributes it primarily to their lightweight NVMe protocol implementation. In addition to NVMe, the nvNITRO can be configured to allow all or part of the memory to be directly accessible for memory-mapped IO, bypassing the protocol overhead of NVMe.

The initial version of the nvNITRO is built with an off-the-shelf FPGA development board and mounts the MRAM on a pair of SO-DIMMs. Later this year Everspin will introduce new, denser versions on a custom PCIe card, as well as M.2 drives and 2.5″ U.2 drives that use a 15mm height to accommodate two stacked PCBs. By the end of the year, Everspin will be shipping their next-generation 1Gb ST-MRAM with a DDR4 interface, and the nvNITRO will use that to expand to capacities of up to 16GB in the PCIe half-height half-length card form factor, 8GB in 2.5″ U.2, and at least 512MB for M.2.

Everspin has not announced pricing for the nvNITRO products. The first generation nvNITRO products are currently sampling to select customers and will be for sale in the second quarter of this year, primarily through storage vendors and system integrators as a pre-installed option.

New Design Win For Current MRAM

Everspin is also announcing another design win for their older field-switched MRAM technology. JAG Jakob Ltd is adopting Everspin’s 16Mb MRAM parts for their PdiCS process control systems, with MRAM serving as both working memory and code storage. These systems have extremely strict uptime requirements, hard realtime performance requirements and service lifetimes of up to 20 years; there are very few memory technologies on the market that can satisfy all of those requirements. Everspin will continue to develop their line of MRAM devices that compete against SRAM and NOR flash even as their higher-capacity offerings adopt DRAM-like interfaces.

NVIDIA Announces Jetson TX2: Parker Comes To NVIDIA’s Embedded System Kit

For a few years now, NVIDIA has been offering their line of Jetson embedded system kits. Originally launched using Tegra K1 in 2014, the first Jetson was designed to be a dev kit for groups looking to build their own Tegra-based devices from scratch. Instead, NVIDIA was surprised to find that groups would use the Jetson board as-is and build their devices around it. This unexpected market led NVIDIA to pivot a bit on what Jetson would be, resulting in the second-generation Jetson TX1, a proper embedded system board that can be used for both development purposes and production devices.

This relaunched Jetson came at an interesting time for NVIDIA, which was right when their fortunes in neural networking/deep learning took off in earnest. Though the Jetson TX1 and underlying Tegra X1 SoC lack the power needed for high-performance use cases – these are after all based on an SoC designed for mobile applications – they have enough power for lower-performance inferencing. As a result, the Jetson TX1 has become an important part of NVIDIA’s neural networking triad, offering their GPU architecture and its various benefits for devices doing inferencing at the “edge” of a system.

Now about a year and a half after the launch of the Jetson TX1, NVIDIA is going to be giving the Jetson platform a significant update in the form of the Jetson TX2. This updated Jetson is not as radical a change as the TX1 before it was – NVIDIA seems to have found a good place in terms of form factor and the platform’s core feature set – but NVIDIA is looking to take what worked with TX1 and further ramp up the performance of the platform.

The big change here is the upgrade to NVIDIA’s newest-generation Parker SoC. While Parker never made it into third-party mobile designs, NVIDIA has been leveraging it internally for the Drive system and other projects, and now it will finally become the heart of the Jetson platform as well. Relative to the Tegra X1 in the previous Jetson, Parker is a bigger and better version of the SoC. The GPU architecture is upgraded to NVIDIA’s latest-generation Pascal architecture, and on the CPU side NVIDIA adds a pair of Denver 2 CPU cores to the existing quad-core Cortex-A57 cluster. Equally important, Parker finally goes back to a 128-bit memory bus, greatly boosting the memory bandwidth available to the SoC. The resulting SoC is fabbed on TSMC’s 16nm FinFET process, giving NVIDIA a much-welcomed improvement in power efficiency.

Paired with Parker on the Jetson TX2 as supporting hardware is 8GB of LPDDR4-3733 DRAM, a 32GB eMMC flash module, a 2×2 802.11ac + Bluetooth wireless radio, and a Gigabit Ethernet controller. The resulting board is still 50mm x 87mm in size, with NVIDIA intending it to be drop-in compatible with Jetson TX1.

Given these upgrades to the core hardware, unsurprisingly NVIDIA’s primary marketing angle with the Jetson TX2 is on its performance relative to the TX1. In a bit of a departure from the TX1, NVIDIA is canonizing two performance modes on the TX2: Max-Q and Max-P. Max-Q is the company’s name for TX2’s energy efficiency mode; at 7.5W, this mode clocks the Parker SoC for efficiency over performance – essentially placing it right before the bend in the power/performance curve – with NVIDIA claiming that this mode offers 2x the energy efficiency of the Jetson TX1. In this mode, TX2 should have similar performance to TX1 in the latter’s max performance mode.

Meanwhile the board’s Max-P mode is its maximum performance mode. In this mode NVIDIA sets the board TDP to 15W, allowing the TX2 to hit higher performance at the cost of some energy efficiency. NVIDIA claims that Max-P offers up to 2x the performance of the Jetson TX1, though as GPU clockspeeds aren’t double TX1’s, it’s going to be a bit more sensitive on an application-by-application basis.

NVIDIA Jetson TX2 Performance Modes
                       Max-Q    Max-P                                  Max Clocks
GPU Frequency          854MHz   1122MHz                                1302MHz
Cortex-A57 Frequency   1.2GHz   Stand-Alone: 2GHz / w/Denver: 1.4GHz   2GHz+
Denver 2 Frequency     N/A      Stand-Alone: 2GHz / w/A57: 1.4GHz      2GHz
TDP                    7.5W     15W                                    N/A

In terms of clockspeeds, NVIDIA has disclosed that in Max-Q mode, the GPU is clocked at 854MHz while the Cortex-A57 cluster is at 1.2GHz. Going to Max-P increases the GPU clockspeed further to 1122MHz, and allows for multiple CPU options; either the Cortex-A57 cluster or Denver 2 cluster can be run at 2GHz, or both can be run at 1.4GHz. When it comes to all-out performance, though, even Max-P mode is below the TX2’s limits; the GPU clock can top out at just over 1300MHz and CPU clocks can reach 2GHz or better. Power states are configurable, so customers can dial in the TDPs and clockspeeds they want; however, NVIDIA notes that using the maximum clocks pushes the Parker SoC further outside of its efficiency range.
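To put the two modes side by side, here is a small sketch that compares the disclosed GPU clocks against the TDPs. The PowerMode class and helper names are illustrative only, and GPU clock is just a rough proxy: the TDP covers the whole board, CPU clocks change between modes as well, and real performance will vary by application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerMode:
    """One Jetson TX2 operating point, using figures quoted in the article."""
    name: str
    gpu_mhz: int
    tdp_watts: float


MAX_Q = PowerMode("Max-Q", gpu_mhz=854, tdp_watts=7.5)
MAX_P = PowerMode("Max-P", gpu_mhz=1122, tdp_watts=15.0)
GPU_MAX_MHZ = 1302  # top GPU clock; NVIDIA gives no TDP for this point


def gpu_clock_ratio(faster: PowerMode, slower: PowerMode) -> float:
    """How much higher the GPU clock is in one mode than in another."""
    return faster.gpu_mhz / slower.gpu_mhz


def tdp_ratio(faster: PowerMode, slower: PowerMode) -> float:
    """How much more board power one mode is allowed than another."""
    return faster.tdp_watts / slower.tdp_watts


if __name__ == "__main__":
    print(f"Max-P vs Max-Q GPU clock: {gpu_clock_ratio(MAX_P, MAX_Q):.2f}x")
    print(f"Max-P vs Max-Q TDP:       {tdp_ratio(MAX_P, MAX_Q):.2f}x")
    print(f"Max clock vs Max-P GPU:   {GPU_MAX_MHZ / MAX_P.gpu_mhz:.2f}x")
```

By this crude measure, Max-P buys about 31% more GPU clock for twice the power budget, which fits the description of Max-Q as sitting just before the bend in the power/performance curve.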

Finally, along with announcing the Jetson TX2 module itself, NVIDIA is also announcing a Jetson TX2 development kit. The dev kit will actually ship first – it ships next week in the US and Europe, with other regions in April – and contains a TX2 module along with a carrier board to provide I/O breakout and interfaces to various features such as USB, HDMI, and Ethernet. Judging from the pictures NVIDIA has sent over, the TX2 carrier board is very similar (if not identical) to the TX1 carrier board, so like the TX2 itself, it should be familiar to existing Jetson developers.

With the dev kit leading the charge for Jetson TX2, NVIDIA will be selling it for $599 retail/$299 education, the same price the Jetson TX1 dev kit launched at back in 2015. Meanwhile the stand-alone Jetson TX2 module will be arriving in Q2’17, priced at $399 in 1K unit quantities. In the case of the module, this means prices have gone up a bit since the last generation; the TX2 is hitting the market at $100 higher than where the TX1 launched.

AMD Prepares 32-Core Naples CPUs for 1P and 2P Servers: Coming in Q2

For users keeping track of AMD’s rollout of its new Zen microarchitecture, stage one was the launch of Ryzen, its new desktop-oriented product line last week. Stage three is the APU launch, focusing mainly on mobile parts. In the middle is stage two, Naples, and arguably the meatier element to AMD’s Zen story.

A lot of fuss has been made about Ryzen and Zen, with AMD’s re-launch back into high-performance x86. If you go by column inches, the consumer-focused Ryzen platform is the one most talked about and many would argue, the most important. In our interview with Dr. Lisa Su, CEO of AMD, the launch of Ryzen was a big hurdle in that journey. However, in the next sentence, Dr. Su lists Naples as another big hurdle, and if you decide to spend some time with one of the regular technology industry analysts, they will tell you that Naples is where AMD’s biggest chunk of the pie is. Enterprise is where the money is.

So while the consumer product line gets columns, the enterprise product line gets profits and high margins. Launching an enterprise product that gains even a few points of market share from the very large blue incumbent can add billions of dollars to the bottom line, as well as provide some innovation now that there are two big players on the field. One could argue there are three players if you consider that ARM holds a few niche areas; however, one of the big barriers to ARM adoption, aside from the lack of a high-performance single core, is the transition from x86 to ARM instruction sets, which requires a rewrite of code. If AMD can rejoin as a big player in x86 enterprise, it puts a small stop on some of ARM’s ambitions and lets AMD aim to take a big enough chunk out of Intel.

With today’s announcement, AMD is setting the scene for its upcoming Naples platform. Naples will not be the official name of the product line, and as we discussed with Dr. Su, Opteron is one option being debated internally at AMD as the product name. Nonetheless, Naples builds on Ryzen, using the same core design but implementing it in a big way.

The top end Naples processor will have a total of 32 cores, with simultaneous multi-threading (SMT), to give a total of 64 threads. This will be paired with eight channels of DDR4 memory, up to two DIMMs per channel for a total of 16 DIMMs, and altogether a single CPU will support 128 PCIe 3.0 lanes. Naples also qualifies as a system-on-a-chip (SoC), with a measure of internal IO for storage, USB and other things, and thus may be offered without a chipset.

Naples will be offered as either a single processor platform (1P), or a dual processor platform (2P). In dual processor mode, and thus a system with 64 cores and 128 threads, each processor will use 64 of its PCIe lanes as a communication bus between the processors as part of AMD’s Infinity Fabric. The Infinity Fabric uses a custom protocol over these lanes, but bandwidth is designed to be on the order of PCIe. As each CPU uses 64 PCIe lanes to talk to the other, each of the CPUs can give 64 lanes to the rest of the system, totaling 128 PCIe 3.0 lanes again.
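The lane and thread accounting for 1P and 2P systems can be written out as a short sketch. The constants come from the figures above; the function names are illustrative, and the model assumes only the two configurations AMD has described.

```python
# Resource accounting for Naples in 1P and 2P configurations.
# Constants are the figures quoted in the article; the helper names are
# hypothetical and only the two announced configurations are modelled.

CORES_PER_CPU = 32
THREADS_PER_CORE = 2       # simultaneous multi-threading
LANES_PER_CPU = 128        # PCIe 3.0 lanes on each package
FABRIC_LANES_PER_CPU = 64  # lanes repurposed for Infinity Fabric in 2P


def total_threads(sockets: int) -> int:
    """Hardware threads across all sockets."""
    return sockets * CORES_PER_CPU * THREADS_PER_CORE


def external_pcie_lanes(sockets: int) -> int:
    """PCIe lanes left for devices after the socket-to-socket link."""
    if sockets == 1:
        return LANES_PER_CPU
    if sockets == 2:
        return sockets * (LANES_PER_CPU - FABRIC_LANES_PER_CPU)
    raise ValueError("only 1P and 2P configurations are described")


if __name__ == "__main__":
    for sockets in (1, 2):
        print(f"{sockets}P: {total_threads(sockets)} threads, "
              f"{external_pcie_lanes(sockets)} PCIe lanes for devices")
```

Both configurations end up with 128 lanes for devices, so moving to 2P doubles cores and memory but not I/O.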

On the memory side, with eight channels and two DIMMs per channel, AMD is stating that they officially support up to 2TB of DRAM per socket, making 4TB in a single server. The total memory bandwidth available to a single CPU clocks in at 170 GB/s.
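As a sanity check on those memory figures, the sketch below computes theoretical peak bandwidth and the DIMM size implied by 2TB per socket. AMD has not stated the memory speed behind the 170 GB/s number; DDR4-2666 is an assumption chosen here because it lands close to that figure, and the 8-byte channel width is the standard 64-bit DDR4 data bus. The function names are illustrative.

```python
# Peak memory bandwidth and implied DIMM size for a Naples socket.
# ASSUMPTION: DDR4-2666 is a guess that happens to match ~170 GB/s;
# the article does not state the memory speed behind AMD's figure.

CHANNELS = 8
DIMMS_PER_CHANNEL = 2
CHANNEL_WIDTH_BYTES = 8  # standard 64-bit DDR4 data bus per channel


def peak_bandwidth_gb_s(mega_transfers_per_s: int,
                        channels: int = CHANNELS) -> float:
    """Theoretical peak bandwidth in decimal GB/s."""
    return channels * mega_transfers_per_s * CHANNEL_WIDTH_BYTES / 1000


def dimm_size_gb(capacity_tb: float) -> float:
    """DIMM capacity needed to reach a per-socket total with every slot filled."""
    return capacity_tb * 1024 / (CHANNELS * DIMMS_PER_CHANNEL)


if __name__ == "__main__":
    print(f"DDR4-2666 (assumed): {peak_bandwidth_gb_s(2666):.1f} GB/s")
    print(f"DDR4-2400 (AMD demo): {peak_bandwidth_gb_s(2400):.1f} GB/s")
    print(f"2TB per socket needs {dimm_size_gb(2):.0f}GB DIMMs")
```

With all 16 slots populated, reaching 2TB per socket requires 128GB DIMMs.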

While not specifically mentioned in the announcement today, we do know that Naples is not a single monolithic die on the order of 500mm² or up. Naples uses four of AMD’s Zeppelin dies (the Ryzen dies) in a single package. With each Zeppelin die coming in at 195.2mm², a monolithic equivalent would mean a total of roughly 780mm² of silicon and around 19.2 billion transistors, which is far bigger than anything GlobalFoundries has ever produced, let alone tried at 14nm. During our interview with Dr. Su, we postulated that multi-die packages would be the way forward on future process nodes given the difficulty of creating these large imposing dies, and the response from Dr. Su indicated that this was a prominent direction to go in.
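The silicon totals are simple multiples of the per-die figures, as this short sketch shows. The per-die transistor count is not quoted directly above; it is derived here by dividing the 19.2 billion total by four.

```python
# Package-level silicon totals for Naples from the per-die figures.

DIES_PER_PACKAGE = 4
ZEPPELIN_AREA_MM2 = 195.2
TOTAL_TRANSISTORS_BILLION = 19.2  # package total quoted in the article


def total_area_mm2(dies: int = DIES_PER_PACKAGE) -> float:
    """Combined die area if all dies were one monolithic piece of silicon."""
    return dies * ZEPPELIN_AREA_MM2


def transistors_per_die_billion(dies: int = DIES_PER_PACKAGE) -> float:
    """Derived value: package total divided evenly across the dies."""
    return TOTAL_TRANSISTORS_BILLION / dies


if __name__ == "__main__":
    print(f"Total area: {total_area_mm2():.1f} mm^2")
    print(f"Per die: {transistors_per_die_billion():.1f}B transistors")
```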

Each die provides two memory channels, which brings us up to eight channels in total. However, each die only has 16 PCIe 3.0 lanes (24 if you want to count PCH/NVMe), meaning that some form of mux/demux, PCIe switch, or accelerated interface is being used. This could be extra silicon on package, given AMD’s approach of a single die variant of its Zen design to this point.

Note that we’ve seen multi-die packages before in previous products from both AMD and Intel. Despite both companies playing with multi-die or 2.5D technology (AMD with Fury, Intel with EMIB), we are led to believe that these CPUs are similar to previous multi-chip designs, albeit with Infinity Fabric going through them. At what bandwidth, we do not know at this point. It is also pertinent to note that there is a lot of talk going around about the strength of AMD’s Infinity Fabric, as well as how threads are manipulated within a silicon die itself, which has two core complexes of four cores each. This is something we are investigating on the consumer side, but it will likely be very relevant on the enterprise side as well.

In the land of benchmark numbers we can’t verify (yet), AMD showed demonstrations at the recent Ryzen Tech Day. The main demonstration was a sparse matrix calculation on a 3D dataset for seismic analysis. In this test, solving a 15-diagonal matrix of 1 billion samples took 35 seconds on an Intel machine vs 18 seconds on an AMD machine (both machines using 44 cores and DDR4-1866). When allowed to use its full 64 cores and DDR4-2400 memory, AMD shaved another four seconds off. Again, we can’t verify these results, and it’s a single data point, but a diagonal matrix solver would be a suitable representation for an enterprise workload. We were told that the clock frequencies for each chip were at stock; however, AMD did say that the Naples clocks were not yet finalized.
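Expressed as speedups, AMD's unverified demo timings look like this. The sketch only restates the numbers above as ratios; the 14-second figure is the 18-second result minus the four seconds AMD says it gained with 64 cores and DDR4-2400.

```python
# Speedups implied by AMD's (unverified) seismic demo timings.

INTEL_SECONDS = 35.0
NAPLES_44_CORE_SECONDS = 18.0
NAPLES_FULL_SECONDS = NAPLES_44_CORE_SECONDS - 4.0  # 64 cores, DDR4-2400


def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate finished than the baseline."""
    return baseline_s / candidate_s


if __name__ == "__main__":
    print(f"44 cores each: {speedup(INTEL_SECONDS, NAPLES_44_CORE_SECONDS):.2f}x")
    print(f"Naples at 64 cores: {speedup(INTEL_SECONDS, NAPLES_FULL_SECONDS):.2f}x")
```

That is a little under 2x at matched core counts and memory speed, and 2.5x with the full configuration, on this one workload.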

What we don’t know are power numbers, frequencies, processor lists, pricing, partners, segmentation, and all the meaty stuff. We expect AMD to offer a strong attack on the 1P/2P server markets, which is where 99% of the enterprise is focused, particularly where high-performance virtualization is needed, or storage. How Naples migrates into the workstation space is an unknown, but I hope it does. We’re working with AMD to secure samples for Johan and me in advance of the Q2 launch.
