News


The AMD A8-7650K APU Review, Also New Testing Methodology

The staggered birth of Kaveri has been an interesting story to cover, but it has been difficult to keep all the pieces in the forefront of memory. The initial launch in January 2014 saw a small number of SKUs, such as the A10-7850K and the A8-7600, and since then new models have trickled onto the shelves at a rate of one or two a quarter. We’ve seen 65W SKUs, such as the A10-7800, which offer 45W modes as well. Today we’re reviewing the most recent Kaveri processor to hit the market, the A8-7650K, rated at 95W and officially priced at $105/$95.

AMD’s Carrizo-L APUs Unveiled: 12-25W Quad Core Puma+

One of the important press releases to come out of AMD’s Financial Analyst Day offers some insight into how AMD is approaching the Carrizo and Carrizo-L platforms. Ryan’s round-up of the Financial Analyst Day covers the broad details about Zen and the big x86 cores; Carrizo and Carrizo-L, by contrast, focus on AMD’s mobile strategy and bring the next iterations of the Bulldozer architecture (Excavator) and the Cat family of low power SoCs (Puma+). We covered some of AMD’s releases on Carrizo back in February, but despite the similar name, Carrizo-L targets a slightly different market by virtue of its different architecture.

Carrizo-L features ‘Puma+’, a name that suggests an updated version of the Puma cores seen in Beema. What the ‘plus’ means has not been disclosed, and both Puma and Puma+ are reported to be 28nm, but chances are that the design has attacked the low hanging fruit in the processor design rather than being purely a frequency bump. Carrizo-L will be advertised as part of the new ‘AMD 7000 Series’ APUs, featuring up to four separate low power cores at up to 2.5GHz, a TDP of up to 25W and support for up to DDR3-1866. These are aimed squarely at the Atom ecosystem within a similar power budget.

AMD Carrizo-L

| | A8-7410 | A6-7310 | A4-7210 | E2-7110 | E1-7010 |
|---|---|---|---|---|---|
| Cores / Threads | 4 / 4 | 4 / 4 | 4 / 4 | 4 / 4 | 2 / 2 |
| CPU Frequency | Up to 2.5 GHz | Up to 2.4 GHz | Up to 2.2 GHz | Up to 1.8 GHz | Up to 1.5 GHz |
| TDP | 12-25W | 12-25W | 12-25W | 12-15W | 10W |
| L2 Cache | 2MB | 2MB | 2MB | 2MB | 1MB |
| DRAM Frequency | DDR3L-1866 | DDR3L-1600 | DDR3L-1600 | DDR3L-1600 | DDR3L-1333 |
| Radeon Graphics | R5 | R4 | R3 | ‘Radeon’ | ‘Radeon’ |
| Streaming Processors | 128? | 128? | 128? | 128? | 128? |
| GPU Frequency | Unknown | Unknown | Unknown | Unknown | Unknown |

AMD states that these APUs are already available in Greater China, with a global rollout commencing in due course. All APUs are listed with AMD Radeon graphics, although the Rx number gives no indication of the number of streaming processors in the graphics part. A similar situation happened with Beema, where all the parts came with 128 SPs and differed only in frequency, which is likely the case here. The SoC design also ensures all the IO is onboard, including an AMD Secure Processor, which for Puma was a Cortex-A5 supporting ARM TrustZone. It is likely that Carrizo-L also uses only a single memory channel, similar to Beema.

One of the more interesting elements is that Carrizo and Carrizo-L will share a socket, known as FP4. This means the processors are pin compatible, and what we know about Carrizo at this point suggests that both segments will play within the same sort of power budget (10-25W vs 15-35W). This allows OEMs to build two designs with almost identical hardware under the hood except for the SoC – would you prefer a single/dual Excavator design, or a faster frequency quad-core Puma+ design? This also leaves scope for differentiated integrated graphics performance, as mobile Kaveri at up to 25W had up to 384 SPs, or 3x what we are expecting with Carrizo-L. A lot of the performance metrics in this part will be down to binning the various designs, which adjusts the cost.
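
To make the shared power budget concrete, here is a minimal sketch of the arithmetic using only the figures quoted above. The helper name `overlap` is ours, and the ranges are simply the 10-25W and 15-35W budgets as stated, not measured values.

```python
from typing import Optional, Tuple

Range = Tuple[int, int]


def overlap(a: Range, b: Range) -> Optional[Range]:
    """Return the wattage window covered by both ranges, or None if disjoint."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


CARRIZO_L_W = (10, 25)  # Carrizo-L power budget quoted in the text
CARRIZO_W = (15, 35)    # Carrizo power budget quoted in the text

shared = overlap(CARRIZO_L_W, CARRIZO_W)
print(f"Shared TDP window: {shared[0]}-{shared[1]}W")

# Mobile Kaveri's 384 SPs against the 128 SPs expected for Carrizo-L.
sp_ratio = 384 / 128
print(f"Kaveri has {sp_ratio:.0f}x the streaming processors")
```

The 15-25W window is where an OEM could, in principle, drop either SoC into the same chassis design.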

At some point we will source a Carrizo-L low-power notebook in order to test the hardware – it would be an interesting data point to get a corresponding Carrizo design as well.

Source: AMD

MediaTek Unveils Helio X20 Tri-Cluster 10-Core SoC

Today MediaTek announces their brand new flagship SoC for smartphones and tablets, the Helio X20. MediaTek continues their Helio SoC branding announced earlier in the year, making the X20 the second SoC in the X-lineup and the first one to be actually released with the new product name from the beginning (as the X10 was a direct name change from the MT6795).

Right off the bat, MediaTek manages to raise eyebrows with what is the first 10-core System-on-a-Chip design. The 10 processor cores are arranged in a tri-cluster configuration, a departure from the myriad of dual-cluster big.LITTLE heterogeneous CPU designs. The three clusters consist of a low power quad-core A53 cluster clocked at 1.4GHz, a power/performance balanced quad-core A53 cluster at 2.0GHz, and an extreme performance dual-core A72 cluster clocked at 2.5GHz. To achieve this tri-cluster design, MediaTek chose to employ a custom interconnect IP called the MediaTek Coherent System Interconnect (MCSI).

We’ll get back to the new CPU arrangement in a bit, but first let’s look at what the rest of the SoC offers. MediaTek is proud to present its first CDMA2000 compatible integrated modem with the X20. This is an important stepping stone as the company attempts to enter the US market and breach Qualcomm’s stronghold on North American modems and SoCs. Besides C2K, the X20’s modem allows for LTE Release 11 Category 6 with 20+20MHz Carrier Aggregation (downstream), supporting speeds of up to 300Mbps in the downstream direction and 50Mbps upstream. The new modem is also supposed to use 30% less power than the one in the Helio X10.
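
As a rough illustration of where the 300Mbps figure comes from, the sketch below multiplies the number of aggregated 20MHz carriers by a per-carrier peak rate. The 150Mbps per-carrier value is our assumption (it is the usual single-carrier LTE Cat. 4 peak), and the function name is hypothetical; real-world throughput depends on the network and radio conditions.

```python
# Assumption: roughly 150Mbps of peak downstream per 20MHz LTE carrier,
# i.e. the usual single-carrier Cat. 4 figure.
PER_CARRIER_DL_MBPS = 150


def aggregated_dl_mbps(carriers: int, per_carrier_mbps: int = PER_CARRIER_DL_MBPS) -> int:
    """Peak downstream rate when `carriers` 20MHz carriers are aggregated."""
    return carriers * per_carrier_mbps


# 20+20MHz downstream carrier aggregation, as on the X20's Cat. 6 modem.
cat6_dl = aggregated_dl_mbps(2)
print(f"2x20MHz C.A.: {cat6_dl}Mbps downstream")
```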

The SoC also has an integrated 802.11ac Wi-Fi with what seems to be a single spatial stream rated in the spec sheets up to 280Mbps.

MediaTek Helio X20 vs The Competition

| SoC | MediaTek Helio X20 (MT6797) | MediaTek Helio X10 (MT6795) | Qualcomm Snapdragon 808 (MSM8992) | Qualcomm Snapdragon 620 (MSM8976) |
|---|---|---|---|---|
| CPU | 4x Cortex A53 @1.4GHz + 4x Cortex A53 @2.0GHz + 2x Cortex A72 @2.3-2.5GHz | 4x Cortex A53 @2.2GHz + 4x Cortex A53 @2.2GHz | 4x Cortex A53 @1.44GHz + 2x Cortex A57 @1.82GHz | 4x Cortex A53 @1.2GHz + 4x Cortex A72 @1.8GHz |
| Memory Controller | 2x 32-bit @ 933MHz LPDDR3, 14.9GB/s b/w | 2x 32-bit @ 933MHz LPDDR3, 14.9GB/s b/w | 2x 32-bit @ 933MHz LPDDR3, 14.9GB/s b/w | 2x 32-bit @ 933MHz LPDDR3, 14.9GB/s b/w |
| GPU | Mali T8??MP4 @700MHz | PowerVR G6200 @700MHz | Adreno 418 @600MHz | “Next-gen” Adreno |
| Encode/Decode | 2160p30 10-bit H.264/HEVC/VP9 decode; 2160p30 HEVC w/HDR encode | 2160p30 10-bit H.264/HEVC/VP9 decode; 2160p30 HEVC encode | 2160p30, 1080p120 H.264 & HEVC decode; 2160p30, 1080p120 H.264 encode | 2160p30, 1080p120 H.264 & HEVC |
| Camera/ISP | Dual ISP, 32MP @ 24fps | 13MP | Dual ISP, 21MP | Dual ISP, 21MP |
| Integrated Modem | LTE Cat. 6, 300Mbps DL / 50Mbps UL, 2x20MHz C.A. (DL) | LTE Cat. 4, 150Mbps DL / 50Mbps UL | “X10 LTE” Cat. 9, 450Mbps DL / 50Mbps UL, 3x20MHz C.A. (DL) | “X8 LTE” Cat. 7, 300Mbps DL / 100Mbps UL, 2x20MHz C.A. (DL & UL) |

Video encoding and decoding capabilities seem to be carried over from the MT6795 / X10, but MediaTek advertises a 30% and 40% improvement in decoding and encoding power consumption respectively.

Still on the multimedia side, we see a new integrated Cortex-M4 companion core. It serves both as an audio processor for low-power audio decoding, speech enhancement and voice recognition, and as a sensor hub, acting as a microcontroller that offloads sensor data processing from the main CPU cores. This means that when the device is playing audio with the display turned off, only the M4 is in use, which prolongs battery life.

On the GPU side, the X20 seems to be the first officially announced SoC with a Mali T800 series GPU. MediaTek explains that this is a still-unreleased ARM Mali high-end GPU similar to the T880. MediaTek has initially chosen a more conservative MP4 configuration clocked at 700MHz, although final specifications are being withheld at this time. It should be noted that MediaTek has traditionally never aimed very high in terms of GPU configurations. Even so, the GPU in the X20 could still remain competitive under prolonged sustained loads, as we have seen larger Mali implementations, such as those in Samsung’s Exynos SoCs, unable to stay within their thermal envelope at their maximum rated frequencies. MediaTek’s initial relative estimates put the X20 at a 40% improvement in performance with a 40% drop in power compared to the Helio X10’s G6200.

On the memory side, MediaTek stays with a 2x32-bit LPDDR3 memory interface running at 933MHz. MediaTek reasons that the SoC is limited to 1440p devices and that LPDDR3 memory should be plenty to satisfy the SoC’s bandwidth requirements (a notion I agree with, given the GPU configuration).
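
For reference, the 14.9GB/s figure quoted for this interface falls straight out of the bus width and clock. The short sketch below works through it; the function name is ours, and the only assumption beyond the quoted specs is that LPDDR3, being double data rate, moves two transfers per clock.

```python
def peak_bandwidth_gb_s(channels: int, bus_width_bits: int, clock_mhz: float,
                        transfers_per_clock: int = 2) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = channels * bus_width_bits / 8        # 2 x 32 bits -> 8 bytes
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock  # DDR: 2 per clock
    return bytes_per_transfer * transfers_per_second / 1e9


# 2x 32-bit LPDDR3 at 933MHz, as on the Helio X20.
bw = peak_bandwidth_gb_s(channels=2, bus_width_bits=32, clock_mhz=933)
print(f"{bw:.1f} GB/s")  # prints "14.9 GB/s"
```

This is a theoretical ceiling; sustained bandwidth in practice will be lower.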

Going back to the signature 10-Core/Tri-Cluster architecture of the SoC, MediaTek explains that this was a choice of power optimization over conventional two-cluster big.LITTLE designs. b.L works by employing heterogeneous CPU clusters – these may differ in architecture, but can also be identical architectures which then differ in their electrical characteristics and their target operating speeds. We’ve covered how power consumption curves behave in our Exynos 5433 deep-dive, and MediaTek presents a similar overview when explaining the X20’s architecture.

One option in traditional 2-cluster designs is to employ a low-power, low-performance cluster, typically a lower-power in-order CPU architecture such as ARM’s A53. This is paired with a higher-power, high-performance cluster: either a larger CPU core such as the A57/A72, or a frequency-optimized A53 as employed in some past MediaTek SoCs and, most recently, in HiSilicon’s Kirin 930 found in the Huawei P8.

Contrary to what MediaTek presents as an “introduction of a Mid cluster”, I like to see MediaTek’s tri-cluster approach as an extension of the existing dual A53 cluster designs, where the added A72 cluster is truly optimized for only the highest frequencies. Indeed, we are told that the A72 cluster can reach up to 2.5GHz on a TSMC 20nm process. ARM targets similar clocks for the A72, but only on 14/16nm FinFET processes, so seeing MediaTek go this high on 20nm is impressive, even if it’s only a two-core cluster. It will be interesting to see how MediaTek chooses the lower frequency limits on each cluster, especially for the A72 CPUs, and how these options will be presented to OEMs.

The end result is a promised 30% improvement in power consumption over a similar 2-cluster approach. This comes from the finer granularity in the performance/power curve and an increase in available performance-power points on which the scheduler can place a thread. A process that is too heavy to reside on the smallest cluster due to performance constraints, but not demanding enough to require the big cluster’s full performance, can now reside on the medium cluster at much greater efficiency than if it had been running on the big cluster at reduced clocks. MediaTek uses CorePilot, a custom-developed scheduler implementation that is both power aware and very advanced (based on our internal testing of other MediaTek SoCs). My experience and research with it on existing devices was fairly positive, so I’m sure the X20’s new v3.0 implementation of CorePilot will be able to take good advantage of the tri-cluster design.
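
The placement argument can be sketched in a few lines. This is a toy model, not CorePilot: every performance and power number below is made up purely for illustration, and the selection rule (take the lowest-power operating point that still meets the thread’s demand) is our simplification of what a power-aware scheduler does.

```python
from typing import List, NamedTuple, Optional


class OperatingPoint(NamedTuple):
    cluster: str
    freq_ghz: float
    perf: float      # arbitrary performance units (hypothetical)
    power_mw: float  # hypothetical power draw at this point


# Entirely illustrative numbers; real curves would come from silicon measurements.
TRI_CLUSTER: List[OperatingPoint] = [
    OperatingPoint("little A53", 1.0, 100, 120),
    OperatingPoint("little A53", 1.4, 140, 200),
    OperatingPoint("mid A53", 1.6, 160, 300),
    OperatingPoint("mid A53", 2.0, 200, 450),
    OperatingPoint("big A72", 1.0, 180, 600),
    OperatingPoint("big A72", 2.5, 450, 1800),
]

# The same SoC without the middle cluster, standing in for a 2-cluster design.
TWO_CLUSTER = [p for p in TRI_CLUSTER if p.cluster != "mid A53"]


def pick_point(points: List[OperatingPoint], demand: float) -> Optional[OperatingPoint]:
    """Lowest-power operating point whose performance meets the demand."""
    candidates = [p for p in points if p.perf >= demand]
    return min(candidates, key=lambda p: p.power_mw) if candidates else None


demand = 180  # too heavy for the little cluster, far below the big cluster's peak
tri = pick_point(TRI_CLUSTER, demand)
two = pick_point(TWO_CLUSTER, demand)
print(f"tri-cluster: {tri.cluster} @ {tri.freq_ghz}GHz, {tri.power_mw}mW")
print(f"two-cluster: {two.cluster} @ {two.freq_ghz}GHz, {two.power_mw}mW")
```

With these made-up numbers the medium-weight thread lands on the mid cluster in the tri-cluster layout, while the two-cluster layout has to run it on the big cluster at reduced clocks for more power, which is the effect described above.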

The biggest open question is what the MCSI (the interconnect) is capable of. ARM announced its CCI-500 interconnect back in February, which incidentally also promised support for up to 4 CPU clusters. MediaTek hinted that the MCSI may be a design based on ARM’s CCI, but we’re still not sure whether this means a loosely based design or a direct improvement of ARM’s IP. Cache coherence is a major design effort, and if MediaTek saw this custom IP as an effort worth committing to, then the MCSI may have some improvements we’re still not clear on.

The Helio X20 is certainly an interesting SoC and I’m eager to see how the tri-cluster design performs in practice. The X20 samples in H2 2015, and devices using it are planned to ship in Q1 2016. In that time-frame, the X20’s primary competitor appears to be Qualcomm’s Snapdragon 620, so it will definitely be a battle for the “super-mid” (as MediaTek likes to put it) crown.
