Examining Huawei’s Benchmark Optimizations in the Ascend P7
While benchmark optimization has been a hot topic, it has recently faded into the background as the industry adjusted. Previously, we saw changes such as an automatic 10% GPU overclock that was almost never achieved in normal applications, and behavior that would automatically plug in all cores and set the CPU frequency to maximum. Now, most OEMs have stopped this behavior. Even where an OEM hasn't stopped it, there are options that make it possible to use the altered CPU/GPU governor in all applications.
Unfortunately, I have to talk about a case where this isn't true. While I've been working on reviewing the Ascend P7 and have found a lot to like, I am sure that the Ascend P7 alters CPU governor behavior in certain benchmarks. For those unfamiliar with the Huawei Ascend P7, it's considered to be Huawei's flagship smartphone. It's equipped with a Kirin 910T SoC, which has four Cortex A9r4 CPUs running at a maximum of 1.8 GHz, along with two gigabytes of RAM and a five-inch display with a 1080p resolution.
To test for differences in governor behavior, we'll start by looking at how the P7 normally behaves when faced with a benchmark workload. I haven't seen any differences in GPU behavior, as the GPU seems to stay clocked at an appropriate level regardless of the benchmark. The CPU governor, however, is noticeably reluctant to reach 1.8 GHz. For the most part this happens only for short periods, and there is a great deal of variation in clock speeds, with an average of about 1.3 GHz throughout the test.
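The article does not say how these frequency traces were captured. One possible way to collect a comparable trace on a Linux-based device is to poll the standard cpufreq sysfs nodes while the benchmark runs. The sketch below assumes those nodes are readable on the device; the function names `read_cur_freq_khz` and `sample_freqs` are hypothetical and chosen only for illustration.

```python
import time
from pathlib import Path
from typing import List, Optional

# Standard Linux cpufreq location. Whether it is readable on a given
# device (and without root) is an assumption, not something the article states.
CPU_ROOT = Path("/sys/devices/system/cpu")


def read_cur_freq_khz(cpu: int, root: Path = CPU_ROOT) -> Optional[int]:
    """Return one core's current frequency in kHz, or None if unreadable.

    A missing or unreadable node is treated as "no reading" (for example,
    the core may be hot-unplugged) rather than as an error.
    """
    node = root / f"cpu{cpu}" / "cpufreq" / "scaling_cur_freq"
    try:
        return int(node.read_text().strip())
    except (OSError, ValueError):
        return None


def sample_freqs(n_cpus: int, samples: int, interval_s: float,
                 root: Path = CPU_ROOT) -> List[List[Optional[int]]]:
    """Poll every core `samples` times, sleeping `interval_s` between polls."""
    trace: List[List[Optional[int]]] = []
    for i in range(samples):
        trace.append([read_cur_freq_khz(c, root) for c in range(n_cpus)])
        if i + 1 < samples:
            time.sleep(interval_s)
    return trace
```

Calling `sample_freqs(4, 600, 0.1)` in the background during a run would give roughly a minute of per-core samples. Polling from the device itself adds a small load of its own, so the interval should not be made too short.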
With the Play Store version of the benchmark, we can see a significant difference in the CPU frequency curve. There's far more time spent at 1.8 GHz, and the frequency profile is incredibly tight outside of the beginning and end. The average frequency is around 1.7 GHz, which is significantly higher than what we see in the renamed version of the benchmark.
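The two figures quoted for each run, average frequency and time spent at the 1.8 GHz ceiling, can be computed from a list of equally spaced frequency samples. The following is a minimal sketch. The helper `summarize_trace` is hypothetical, and the sample values are made up for illustration and are not measurements from the P7.

```python
from statistics import mean
from typing import Dict, Sequence


def summarize_trace(freqs_khz: Sequence[int], max_khz: int) -> Dict[str, float]:
    """Summarize one core's frequency samples (kHz) taken at equal intervals."""
    if not freqs_khz:
        raise ValueError("trace is empty")
    at_max = sum(1 for f in freqs_khz if f >= max_khz)
    return {
        "mean_ghz": mean(freqs_khz) / 1e6,          # kHz -> GHz
        "share_at_max": at_max / len(freqs_khz),    # fraction of samples at the ceiling
    }


# Hypothetical samples for illustration only, NOT data from the article.
renamed = [1_800_000, 1_200_000, 1_000_000, 1_200_000]
play_store = [1_800_000, 1_800_000, 1_800_000, 1_600_000]

renamed_summary = summarize_trace(renamed, max_khz=1_800_000)
play_store_summary = summarize_trace(play_store, max_khz=1_800_000)
```

With these invented samples, the renamed run averages 1.3 GHz with a quarter of its samples at the ceiling, while the Play Store run averages 1.75 GHz with three quarters at the ceiling. A real comparison would use far more samples and several runs of each version.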
While this graph is somewhat boring, it's important as it shows that only three cores are plugged for the full duration of the test. Any noticeable deviation from this pattern would definitely be concerning.
When running the same workload on the Play Store version of GFXBench, we see that four cores are plugged in for almost the entirety of the test. While I'm not surprised to see this kind of behavior when combined with altered frequency scaling, it's a bit disappointing. Strangely, this policy doesn't seem to be universal either, as I haven't seen evidence of altered behavior in Huawei's Snapdragon devices. This sort of optimization seems to be exclusive to the HiSilicon devices. Such behavior is visible in 3DMark as well, although it doesn't seem to happen in Basemark OS II or Basemark X 1.1.
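The number of plugged-in cores can be checked in a similar way. On Linux, `/sys/devices/system/cpu/online` reports online cores as a list of indices and ranges such as `0-2` or `0,2-3`. The sketch below parses that format. Whether the node is readable on a particular device is an assumption, and the helper names are hypothetical.

```python
from typing import List


def parse_cpu_list(text: str) -> List[int]:
    """Expand a kernel CPU list such as "0-2" or "0,2-3" into core indices."""
    cpus: List[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty string or stray comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def online_core_count(text: str) -> int:
    """Count the cores named in the contents of the `online` node."""
    return len(parse_cpu_list(text))
```

Sampling this value at a fixed interval during each run would show whether the core count holds steady or changes with the version of the benchmark being run.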
Table: Huawei Ascend P7 Performance (3DMark Ice Storm U/L)
While such optimizations normally have a small effect, in the affected benchmarks the difference is quite significant. Needless to say, it's not really acceptable that Huawei is doing this, and I'm disappointed that they have chosen this path.
In response to this issue, Huawei stated the following:
"CPU configuration is adjusted dynamically according to the workload in different scenarios. Benchmark running is a typical scenario which requires heavy workload, therefore main frequency of CPU will rise to its highest level and will remain so for a while. For P7, the highest frequency is 1.8GHz. It seldom requires CPU to work at the highest frequency for long in others scenarios. Even if the highest level appears, it will only last for a very short time (for example 400 ms). Situation is the same for most devices in the market."
Unfortunately, I'm not sure how this statement explains the situation, as two identical workloads performed differently. While I was hoping to see an end to rather silly games like this, it seems that the road to OEMs stopping this kind of behavior will be longer than I first expected. Ultimately, such games don't affect anyone who actually knows how to benchmark SoCs and evaluate performance, and one only needs to look to the PC industry to see that such efforts will ultimately be discovered and defeated.