FMS 2014: Marvell Announces NVMe-Enabled PCIe 3.0 x4 88SS1093 SSD Controller

Two weeks ago Marvell announced its first PCIe SSD controller with NVMe support, the 88SS1093. It supports a PCIe 3.0 x4 interface with up to 4GB/s of bandwidth between the controller and the host, although Marvell has yet to announce any actual performance specs. While PCIe 3.0 x4 is in theory capable of delivering 4GB/s, in our experience the efficiency of PCIe has been about 80%, so in reality I would expect peak sequential performance of around 3GB/s. There is no word on the channel count of the controller, but if history provides any guidance the 88SS1093 should feature eight NAND channels, similar to its SATA siblings. Silicon-wise, the controller is built on a 28nm CMOS process and features three CPU cores.
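The bandwidth estimate above can be reproduced with a little arithmetic. The following is only a rough sketch: the 8 GT/s per-lane rate and 128b/130b encoding are standard PCIe 3.0 figures, while the 80% efficiency is the rule-of-thumb number from the text, not a Marvell specification. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope PCIe 3.0 x4 bandwidth estimate.
GT_PER_S_PER_LANE = 8.0      # PCIe 3.0 raw signaling rate per lane (GT/s)
ENCODING = 128 / 130         # 128b/130b line encoding overhead
LANES = 4
ASSUMED_EFFICIENCY = 0.80    # assumption: the ~80% figure quoted in the text


def pcie_bandwidth_gb_per_s(lanes=LANES, rate_gt=GT_PER_S_PER_LANE, encoding=ENCODING):
    """Theoretical one-direction bandwidth in GB/s (1 GB = 1e9 bytes)."""
    # Each transfer carries one bit per lane; divide by 8 to convert bits to bytes.
    return lanes * rate_gt * encoding / 8


theoretical = pcie_bandwidth_gb_per_s()
expected = theoretical * ASSUMED_EFFICIENCY
print(f"theoretical: {theoretical:.2f} GB/s, expected: {expected:.2f} GB/s")
```

Under these assumptions the link tops out a little under 4GB/s, and the 80% estimate lands slightly above 3GB/s, which is in line with the rounded figures used in the text.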

The 88SS1093 has support for 15nm MLC and TLC as well as 3D NAND, although I fully expect it to be compatible with Micron’s and SK Hynix’s 16nm NAND as well (i.e. 15nm TLC is just the smallest it can go). TLC support is enabled by the use of LDPC error correction, which is part of Marvell’s third-generation NANDEdge technology. Capacities of up to 2TB are supported and the controller fits in both 2.5″ and M.2 designs thanks to its small package size and thermal optimization (or should I say throttling).

The 88SS1093 is currently sampling to Marvell’s key customers, with product availability in 2015. Given how well Intel’s SSD DC P3700 fared in our tests, I am excited to see more NVMe designs popping up. Marvell is known to be the go-to controller source for many of the major SSD manufacturers (SanDisk and Micron/Crucial to name a couple), so the 88SS1093 will play an important part in bringing NVMe to the client market.

Examining Huawei's Benchmark Optimizations in the Ascend P7

While benchmark optimization has been a hot topic, recently it has faded into the background as the industry adjusted. Previously, we saw changes such as an automatic 10% GPU overclock that was almost never achieved in normal applications, and behavior that would automatically plug in all cores and set the CPU frequency to maximum. Now, most OEMs have stopped this behavior. Even where an OEM hasn’t stopped it, there are options that make it possible to use the altered CPU/GPU governor in all applications.

Unfortunately, I have to talk about a case where this isn’t true. While I’ve been working on reviewing the Ascend P7 and have found a lot to like, I am sure that the Ascend P7 alters CPU governor behavior in certain benchmarks. For those unfamiliar with the Huawei Ascend P7, it’s Huawei’s flagship smartphone. It’s equipped with a Kirin 910T SoC, which has four Cortex A9r4 CPU cores running at a maximum of 1.8 GHz, and two gigabytes of RAM. It also has a five-inch display with a 1080p resolution.

To test for differences in governor behavior, we’ll start by looking at how the P7 normally behaves when faced with a benchmark workload. I haven’t seen any differences in GPU behavior, as the GPU seems to stay clocked at an appropriate level regardless of the benchmark. The CPU, however, is noticeably reluctant to reach 1.8 GHz. For the most part this only happens for short periods, and there is a great deal of variation in clock speeds, with an average of about 1.3 GHz throughout the test.

With the Play Store version of the benchmark, we can see a significant difference in the CPU frequency curve. There’s far more time spent at 1.8 GHz, and the frequency profile is incredibly tight outside of the beginning and end. The average frequency is around 1.7 GHz, which is significantly higher than what we see in the renamed version of the benchmark.
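For readers who want to make the same kind of comparison, a frequency log can be reduced to an average clock and the share of samples spent at the maximum. The sketch below assumes samples in kHz, which is the unit the Linux cpufreq sysfs files use. The sample values are invented purely to exercise the code and are not measurements from the P7, and `summarize` is a hypothetical helper name.

```python
from statistics import mean

MAX_KHZ = 1_800_000  # 1.8 GHz, the P7's maximum CPU frequency per the text


def summarize(samples_khz, max_khz=MAX_KHZ):
    """Return (average frequency in GHz, fraction of samples at max frequency)."""
    avg_ghz = mean(samples_khz) / 1_000_000
    at_max = sum(1 for s in samples_khz if s >= max_khz) / len(samples_khz)
    return avg_ghz, at_max


# Made-up samples for illustration only, not data from the article.
illustrative = [1_800_000, 1_800_000, 1_600_000, 1_800_000]
avg_ghz, at_max = summarize(illustrative)
print(f"average: {avg_ghz:.2f} GHz, time at max: {at_max:.0%}")
```

Running the same summary over logs from the Play Store and renamed versions of a benchmark would make any gap in average frequency easy to quantify.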

While this graph is somewhat boring, it’s important as it shows that only three cores are plugged for the full duration of the test. Any noticeable deviation from this pattern would definitely be concerning.
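On Linux-based devices, the set of plugged cores is typically exposed as a range list such as `0-2` in `/sys/devices/system/cpu/online`. The sketch below shows one way to turn that string into a core count. It is an assumption about how such data could be gathered, not a description of the tooling used for the graphs in this article, and both function names are hypothetical.

```python
def count_online_cpus(mask: str) -> int:
    """Count CPUs in a sysfs-style list such as '0-2' or '0,2-3'."""
    count = 0
    for part in mask.strip().split(","):
        if not part:
            continue
        if "-" in part:
            # A range like '0-2' is inclusive on both ends.
            first, last = part.split("-")
            count += int(last) - int(first) + 1
        else:
            count += 1
    return count


def read_online_cpus(path="/sys/devices/system/cpu/online") -> int:
    """Read the online mask from sysfs (may need adb shell access on a phone)."""
    with open(path) as f:
        return count_online_cpus(f.read())


print(count_online_cpus("0-2"))  # three cores plugged
print(count_online_cpus("0-3"))  # all four cores plugged
```

Polling this value at a fixed interval during a benchmark run would produce the kind of core-count trace discussed here.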

When running the same workload on the Play Store version of GFXBench, we see that four cores are plugged for almost the entirety of the test. While I’m not surprised to see this kind of behavior when combined with altered frequency scaling, it’s a bit disappointing. Strangely, this policy doesn’t seem to be universal either as I haven’t seen evidence of altered behavior in Huawei’s Snapdragon devices. This sort of optimization seems to be exclusive to the HiSilicon devices. Such behavior is visible in 3DMark as well, although it doesn’t seem to happen in Basemark OS II or Basemark X 1.1.

Huawei Ascend P7 Performance
                        Play Store   Renamed   Perf Increase
GFXBench T-Rex          12.3         10.6      +16%
3DMark Ice Storm U/L    7462         5816      +28.3%
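The percentage column follows directly from the two scores in each row. As a quick check, this small sketch recomputes it from the table values; `perf_increase` is just an illustrative helper name.

```python
def perf_increase(play_store: float, renamed: float) -> float:
    """Percentage by which the Play Store score exceeds the renamed score."""
    return (play_store / renamed - 1) * 100


# Scores taken from the table above: (Play Store, Renamed).
results = {
    "GFXBench T-Rex": (12.3, 10.6),
    "3DMark Ice Storm U/L": (7462, 5816),
}

for name, (play_store, renamed) in results.items():
    print(f"{name}: +{perf_increase(play_store, renamed):.1f}%")
```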

While normally such optimizations have a small effect, in the case of the affected benchmarks the difference is noticeable and quite significant. Needless to say, it’s not really acceptable that Huawei is doing this, and I’m disappointed that they have chosen this path.

In response to this issue, Huawei stated the following:

“CPU configuration is adjusted dynamically according to the workload in different scenarios. Benchmark running is a typical scenario which requires heavy workload, therefore main frequency of CPU will rise to its highest level and will remain so for a while. For P7, the highest frequency is 1.8GHz. It seldom requires CPU to work at the highest frequency for long in others scenarios. Even if the highest level appears, it will only last for a very short time (for example 400 ms). Situation is the same for most devices in the market.”

Unfortunately, I’m not sure how this statement explains the situation, as two identical workloads performed differently. While I was hoping to see an end to rather silly games like this, it seems that it will take longer than I first expected for OEMs to stop this kind of behavior. Ultimately, such games don’t affect anyone who actually knows how to benchmark SoCs and evaluate performance, and one only needs to look to the PC industry to see that such efforts will ultimately be discovered and defeated.

 

MSI GS60 Ghost Pro 3K Review

MSI has several lines of gaming notebooks catering to different types of users. At the high-end is the GT series that supports the fastest mobile CPUs and GPUs while the GE series caters more towards the cost-conscious buyers. Somewhere in the middl…