News


Intel at ISSCC 2015: Reaping the Benefits of 14nm and Going Beyond 10nm

As part of the International Solid-State Circuits Conference (ISSCC) every year, Intel presents a number of papers covering its internal research. The theme this year at ISSCC is ‘Silicon Systems – Small Chips for Big Data’, and Intel previewed a number of its presentations with the media and analysts last week before the conference. Hot topics include developments at 14nm that could potentially be ported to real-world devices, technological developments at 22nm using Tri-Gate CMOS for adaptive, autonomous and resilient systems, and a few quick words regarding 10nm and beyond.

Taking Moore’s Law Beyond 10nm

Part of ISSCC will be a round table with representatives from Intel, Qualcomm, a couple of other industry companies and university researchers, discussing how 10nm will tackle Moore’s Law and how it can be extended down to 7nm. The graphs shown at IDF 2014 make an appearance again, showing cost per square millimeter and cost per transistor, courtesy of Mark Bohr (Intel Senior Fellow, Logic Technology Development):

The fact that 14nm resulted in a better-than-trend drop in cost per transistor was explained as some smart internal reworking: by recognizing that certain areas of the die require different masking and optimizing the masking process accordingly, the cost can be reduced rather than relying on fewer general-purpose masks (though it is still a balance).

It was explained that while 10nm will have more masking steps than 14nm, the delays that bogged down 14nm and made it late to market will not be present at 10nm – or will at least be reduced. We were told that the increase in development complexity at 14nm, which required more internal testing stages and masking implementations, was a major reason for the delay, along with the need for sufficient yields to go ahead with the launch. As a result, Intel is improving the efficiency of testing at each stage and expediting the transfer of wafers through its testing protocols in order to avoid delays. Intel tells us that its 10nm pilot lines are running 50% faster than 14nm did as a result of these adjustments. So while the additional masking steps at 10nm ultimately increase fixed costs, Intel still quotes that its methods result in a reduction in cost per transistor without needing a completely new patterning process. EUV lithography was discussed, but Intel seems to be hoping to avoid it until absolutely necessary, as EUV development so far has progressed more slowly than expected.

10nm will come with its own innovations, and getting down to 7nm will require new materials and processes, which Intel wants to develop through a progressive integration between process development and the product design teams. New materials and device structures are key elements on that list, and while III-V materials were discussed in the ISSCC preview, no exact details were given.

Along with addressing the general challenges in getting down to 7nm, Intel’s research group is also looking at future integrated systems, specifically 2.5D (separate dies on an interposer) and 3D (stacked dies). While 2.5D and 3D are not direct replacements for smaller manufacturing nodes – they just allow you to lay down more transistors at a higher cost – they are being examined as potential solutions for containing power consumption in certain situations (2.5D) or for building better size-limited integrated topologies (3D). Specifically, Intel is looking at scenarios where logic blocks using different fabrication methods are laid out in their own layers and stacked, rather than implemented on a single layer of a single die (think memory, digital logic, and analog communications on a single chip).

These kinds of configurations may appear in smartphones, tablets, or other devices that use highly integrated chips where multiple types of fabrication are necessary, and where manufacturers can charge the premium price needed to cover the additional costs. We have discussed in the past how 2.5D and 3D configurations can improve performance, especially when it comes to memory density and graphics bandwidth; however, the added cost (according to Intel) means that premium will remain, even at high volume.

Reaping the Benefits of 14nm

Intel is highlighting a trio of papers at ISSCC regarding 14nm. One of the areas ripe for exploitation at 14nm is data transfer, especially transmitters. To that end, Intel is showing a 14nm Tri-Gate CMOS serializer/deserializer transmitter capable of 16-40 Gbps, using both NRZ (non-return-to-zero) and PAM4 (four-level pulse-amplitude modulation) modes within a 0.03 mm² die area.
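As a rough illustration of why PAM4 matters at these data rates (this is generic signalling theory, not a detail of Intel’s design), NRZ carries one bit per symbol while PAM4 carries two, so PAM4 halves the symbol rate needed for a given bit rate. The short Python sketch below maps a bit stream onto both schemes; the level values and function names are purely illustrative.

# Illustrative only: map a bit stream onto NRZ and PAM4 symbol levels.
# NRZ sends 1 bit/symbol (two levels); PAM4 sends 2 bits/symbol (four levels),
# so PAM4 needs half the symbol rate for the same bit rate.

def nrz_symbols(bits):
    # One symbol per bit: 0 -> -1, 1 -> +1
    return [1 if b else -1 for b in bits]

def pam4_symbols(bits):
    # One symbol per two bits, Gray-coded onto four levels
    levels = {(0, 0): -3, (0, 1): -1, (1, 1): +1, (1, 0): +3}
    return [levels[pair] for pair in zip(bits[0::2], bits[1::2])]

bits = [1, 0, 1, 1, 0, 0, 1, 0]
print("NRZ :", nrz_symbols(bits))   # 8 symbols for 8 bits
print("PAM4:", pam4_symbols(bits))  # 4 symbols for 8 bits

# At 40 Gbps: NRZ needs 40 Gbaud, PAM4 only 20 Gbaud.
print("Symbol rate at 40 Gbps -> NRZ: 40 Gbaud, PAM4: 20 Gbaud")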

Also on the data transfer side is a paper regarding the lowest-power 10Gb/s serial link and the first complete serial link using 14nm Tri-Gate CMOS. Intel has working 14nm silicon showing 59mW power consumption within a 0.065 mm² die area, with the link configuring its data rate to provide the cleanest data response.
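For context, 59mW at 10Gb/s works out to just under 6 picojoules per bit, which is the usual way serial link efficiency is compared. The figures below are taken from the paper summary above; the rest is plain arithmetic.

power_w = 59e-3        # 59 mW reported for the complete serial link
data_rate_bps = 10e9   # 10 Gb/s
energy_per_bit = power_w / data_rate_bps
print(f"{energy_per_bit * 1e12:.1f} pJ/bit")  # ~5.9 pJ/bit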

Perhaps the most exciting 14nm development is in memory, with Intel describing an in-house 84Mb SRAM design that uses the world’s smallest bitcell (0.050 µm²). At 14nm it represents a doubling of density to 14.5 Mb per square millimeter, and it also offers a substantially lower minimum voltage for a given frequency compared to the previous 22nm process. As shown in the graph on the slide, 0.6V is good for 1.5 GHz, and it can scale up to 3 GHz. It is also worth noting that the 14nm yield gradient is more conducive to lower-voltage operation than the 22nm process. While it may seem odd to promote an 84Mb (10.5 MB) design, Intel explained that it can be scaled up to 100 Mb or more, making it better suited to embedded devices than something like Crystal Well on the desktop.
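Those density numbers hang together with a bit of arithmetic: a 0.050 µm² bitcell corresponds to a raw density of 20 Mb per mm², so the quoted 14.5 Mb/mm² implies an array efficiency of roughly 70-75% once peripheral circuitry is included, and the 84Mb macro would occupy somewhere near 6 mm². The efficiency and area figures below are our extrapolation from Intel’s published numbers, not values Intel quoted.

bitcell_um2 = 0.050               # reported bitcell area in µm²
reported_density_mb_mm2 = 14.5    # reported effective array density

raw_bits_per_mm2 = 1e6 / bitcell_um2          # 1 mm² = 1e6 µm², so ~20 million bits
raw_density_mb_mm2 = raw_bits_per_mm2 / 1e6   # ~20 Mb/mm² for the bitcells alone

array_efficiency = reported_density_mb_mm2 / raw_density_mb_mm2   # ~72%
macro_area_mm2 = 84 / reported_density_mb_mm2                     # ~5.8 mm² for 84Mb

print(f"raw bitcell density     : {raw_density_mb_mm2:.1f} Mb/mm^2")
print(f"implied array efficiency: {array_efficiency:.0%}")
print(f"implied 84Mb macro area : {macro_area_mm2:.1f} mm^2")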

Still Developing on 22nm

While 14nm is great for density, lower voltage and lower power, other on-die features are often produced at a looser resolution in order to ensure compatibility, and the 22nm node also offers a great research platform for testing new on-die features that can be scaled down at a later date. To this end, Intel Labs is also presenting a couple of papers about in-house test chips for new features.

The first test chip concerns data retention within register files. Depending on external circumstances such as temperature and age, this adaptive and resilient domino register file test chip is designed to realign timing margins, detect errors as they occur, and adjust its behavior to compensate. The logic Intel is presenting is also designed to cater for die variation and voltage droop, making it more of a universal solution. At a higher level it sounds like the situation when NAND flash gets old and the onboard controller has to compensate for shifting voltage margins.
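Conceptually, this class of design is a feedback loop: monitor error-detection flags at the register file, and when errors start to appear (from temperature, aging, or droop) widen the timing guard band, then relax it again once the path is clean. The sketch below is our own simplified model of that idea, not Intel’s implementation; the function name, step sizes and limits are invented for illustration.

# Simplified, hypothetical model of an adaptive timing-margin controller.
# Not Intel's design: error sampling, step sizes, and limits are made up.

def adjust_guard_band(error_count, guard_band_ps, step_ps=5,
                      min_ps=10, max_ps=60):
    """Widen the timing guard band when errors are detected,
    and slowly relax it back when the path is error-free."""
    if error_count > 0:
        return min(guard_band_ps + step_ps, max_ps)   # back off quickly
    return max(guard_band_ps - 1, min_ps)             # recover slowly

guard = 20
for errors in [0, 0, 3, 1, 0, 0, 0]:   # error counts per monitoring window
    guard = adjust_guard_band(errors, guard)
    print(f"errors={errors}  guard band={guard} ps")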

The second test chip brings things down to the execution units in Intel’s graphics, dealing with fast, autonomous and independent dynamic voltage scaling. The combination of a low-dropout regulator (LDO) for low voltages, such as at idle, and a switched-capacitor voltage regulator (SCVR) for high voltages allows the appropriate current injection to deal with voltage droop, while also resulting in a large energy reduction. When applied, this should allow either a power drop at the same frequency or a higher frequency at the same voltage. Currently the numbers provided by Intel are all from internal silicon rather than anything in the wild, and the approach will be examined at smaller nodes in due course.
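To make the dual-regulator idea concrete, here is a deliberately simplified selection rule along the lines described: the LDO handles low-voltage, idle-type states and the SCVR takes over for high-voltage, high-activity states. The crossover voltage and names below are assumptions for illustration, not figures from Intel’s paper.

# Hypothetical dynamic voltage scaling helper for a per-EU supply rail.
# The 0.65 V crossover is an invented illustration, not an Intel figure.

LDO_SCVR_CROSSOVER_V = 0.65

def pick_regulator(target_v):
    """Return which regulator should drive the execution unit rail."""
    if target_v < LDO_SCVR_CROSSOVER_V:
        return "LDO"    # low voltage, e.g. idle states
    return "SCVR"       # high voltage, heavy graphics load

for v in (0.45, 0.60, 0.75, 0.95):
    print(f"target {v:.2f} V -> {pick_regulator(v)}")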

Intel at ISSCC

ISSCC always throws out some interesting information about what is actually going on under the hood of the silicon we use almost every day, which we tend to think of as a black box that slowly gets better over time. In reality, new features are fully researched and documented before being included in the next model, all while trying to keep a balance between power usage and efficiency. On the CPU architecture side of the equation, we reported that Broadwell features needed to show a 2% performance or efficiency improvement for every 1% increase in power, a steeper requirement than the 1:1 ratio previously used. For all intents and purposes, this means that if the same strategy is applied at 10nm and beyond, we are in for a very interesting time. It was also interesting to hear about Intel speeding things up at 10nm to avoid the delays incurred at 14nm, as well as its thoughts on future technologies.

The papers Intel is presenting should be available via the ISSCC website as the presentations take place, along with a few others that pique our interest. This should get us ready for some interesting developments come Intel’s Developer Forum later in the year. 

NVIDIA Mobile Overclocking – There and Back Again

The past few months have been a bit interesting on the mobile side of the fence for NVIDIA. Starting with the R346 drivers (347.09 beta and later), NVIDIA apparently decided to lock down overclocking of their mobile GPUs. While this is basically a non-issue for the vast majority of users, for the vocal minority of enthusiasts the reaction has been understandably harsh. Accusations have ranged from “bait and switch” (e.g. selling a laptop GPU that could be overclocked and then removing that “feature”) to “this is what happens when there’s no competition”, and everything in between.

NVIDIA for their part has had a few questionable posts as well, at one point stating that overclocking was just “a bug introduced into our drivers” – a bug that has apparently been around for how many years now? And with a 135MHz overclocking limit no less… But there’s light at the end of the tunnel, as NVIDIA has now posted that they will be re-enabling overclocking with their next driver update, due in March. So that’s the brief history, but let’s talk about a few other aspects of what all of this means.

First and foremost, anyone that claims enabling/disabling overclocking of mobile GPUs is going to have a huge impact on NVIDIA’s bottom line is, in my view, spouting hyperbole and trying to create a lot of drama. I understand there are people that want this feature, and that’s fine, but for every person that looks at overclocking a mobile GPU there are going to be 100 more (1000 more?) that never give overclocking a first thought, let alone a second one. And for many of those people, disabling overclocking entirely isn’t really a bad idea – a way to protect the user from themselves, basically. I also don’t think that removing overclocking was ever done due to the lack of competition, though it might have had a small role to play. At most, I think NVIDIA might have disabled overclocking because it’s a way to keep people from effectively turning a GTX 780M into a GTX 880M, or the current GTX 980M into a… GTX 1080M (or whatever they call the next version).

NVIDIA’s story carries plenty of weight with me, as I’ve been reviewing and helping people with laptops for over a decade. Their initial comment was, “Overclocking is by no means a trivial feature, and depends on thoughtful design of thermal, electrical, and other considerations. By overclocking a notebook, a user risks serious damage to the system that could result in non-functional systems, reduced notebook life, or many other effects.” This is absolutely true, and I’ve seen plenty of laptops where the GPU has failed after 2-3 years of use, and that’s without overclocking. I’ve also seen a few GTX 780M notebooks where running at stock speeds isn’t 100% stable, especially for prolonged periods of time. Sometimes it’s possible to fix the problem; many people simply end up buying a new laptop and moving on, disgruntled at the OEM for building a shoddy notebook.

For users taking something like a Razer Blade and trying to overclock the GPU, I also think pushing the limits of the hardware beyond what the OEM certified is just asking for trouble. Gaming GPUs and “thin and light” are generally at opposite ends of the laptop spectrum, and in our experience these laptops can already get pretty toasty while gaming. So if you have a laptop that is already nearing the throttling point, overclocking the GPU is going to increase the likelihood of throttling, or potentially even damage the hardware. Again, I’ve seen enough failed laptops to know there’s definitely an element of risk – many laptops seem to struggle to run reliably for more than 2-3 years under frequent gaming workloads, so increasing the cooling demands is just going to exacerbate the problem.

On the other hand, if you have a large gaming notebook with a lot of cooling potential and the system generally doesn’t get hot, sure, it’s nice to be able to push the hardware a bit further if you want. Built-in throttling features should also protect the hardware from any immediate damage. We don’t normally investigate overclocking potential on notebooks as it can vary even between units of the same model, and in many cases it voids the warranty, but enthusiasts are a breed apart. My personal opinion is that for a gaming laptop, you should try to keep GPU temperatures under 85C to be safe (which is what most OEMs tend to target); when laptops exceed that “safe zone” (with or without overclocking), I worry about the long-term reliability prospects. If you have a GPU that’s running at 70C under load, however, you can probably reliably run the clocks at the maximum +135MHz that NVIDIA allows.
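If you do want to experiment, it is at least easy to keep an eye on temperatures while you test. The small sketch below polls GPU temperature via nvidia-smi (it assumes nvidia-smi is installed and on the PATH) and simply reports whether you are staying inside the 85C comfort zone discussed above; it observes only and does not change any clocks.

# Poll GPU temperature during a gaming/stress session and report headroom.
# Assumes nvidia-smi is available; this only observes, it never sets clocks.
import subprocess, time

SAFE_LIMIT_C = 85   # the comfort zone discussed above

def gpu_temp_c():
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=temperature.gpu",
         "--format=csv,noheader,nounits"])
    return int(out.decode().strip().splitlines()[0])

for _ in range(10):             # sample once a minute for ten minutes
    t = gpu_temp_c()
    status = ("headroom for a modest overclock" if t < SAFE_LIMIT_C
              else "too hot, back off the overclock")
    print(f"{t} C: {status}")
    time.sleep(60)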

We’re currently in the process of testing a couple of gaming notebooks, and given the timing we’re going to use this as an opportunity to try some overclocking – with the older 344.75 drivers for now. We’ll have a separate article digging into the overclocking results later, but again I’d urge users to err on the side of caution rather than trying to redline their mobile GPUs. In practice that means mobile GPU overclocking is mostly going to be of use to people with larger gaming notebooks – generally the high-end Alienware, ASUS, Clevo, and MSI models. There may be other laptops where you can squeeze out some extra performance (e.g. some models with GTX 850M/860M, or maybe even some older GT 750M laptops), but keep an eye on the thermals if you want to go that route.
