NVIDIA Announces Tesla M40 & M4 Server Cards - Data Center Machine Learning

Slowly but steadily NVIDIA has been rotating Maxwell GPUs into the company’s lineup of Tesla server cards. Though Maxwell is not well-suited to the kind of high precision HPC work that the Tesla lineup was originally crafted for, Maxwell is plenty suitable for just about every other server use NVIDIA can think of. As a result the company has been launching what’s best described as new breeds of Maxwell cards in the last few months.

After August’s announcement of the Tesla M60 and M6 cards – with a focus on VDI and video encoding – NVIDIA is back today for the announcement of the next set of Tesla cards, the M40 and the M4. In what the company is dubbing their “hyperscale accelerators,” NVIDIA is launching these two cards with a focus on capturing a larger portion of the machine learning market.

NVIDIA Tesla Family Specification Comparison

| | Tesla M40 | Tesla M4 | Tesla M60 | Tesla K40 |
|---|---|---|---|---|
| Stream Processors | 3072 | 1024 | 2 x 2048 (4096) | 2880 |
| Boost Clock(s) | ~1140MHz | ~1075MHz | ~1180MHz | 810MHz, 875MHz |
| Memory Clock | 6GHz GDDR5 | 5.5GHz GDDR5 | 5GHz GDDR5 | 6GHz GDDR5 |
| Memory Bus Width | 384-bit | 128-bit | 2 x 256-bit | 384-bit |
| VRAM | 12GB | 4GB | 2 x 8GB (16GB) | 12GB |
| Single Precision (FP32) | 7 TFLOPS | 2.2 TFLOPS | 9.7 TFLOPS | 4.29 TFLOPS |
| Double Precision (FP64) | 0.21 TFLOPS (1/32) | 0.07 TFLOPS (1/32) | 0.3 TFLOPS (1/32) | 1.43 TFLOPS (1/3) |
| Transistor Count | 8B | 2.94B | 2x 5.2B | 7.1B |
| TDP | 250W | 50W-75W | 225W-300W | 235W |
| Cooling | Passive | Passive (Low Profile) | Active/Passive | Active/Passive |
| Manufacturing Process | TSMC 28nm | TSMC 28nm | TSMC 28nm | TSMC 28nm |
| GPU | GM200 | GM206 | GM204 | GK110 |
| Target Market | Machine Learning | Machine Learning | VDI | Compute |

First let’s quickly talk about the cards themselves. The Tesla M40 marks the introduction of the GM200 GPU to the Tesla lineup, with NVIDIA looking to put their best single precision (FP32) GPU to good use. This is a 250 Watt full power and fully enabled GM200 card – though with Maxwell this distinction loses some meaning – with NVIDIA outfitting the card with 12GB of GDDR5 VRAM clocked at 6GHz. We know that Maxwell doesn’t support on-chip ECC for the RAM and caches, but it’s not clear at this time whether soft-ECC is supported for the VRAM. Otherwise, with the exception of the change in coolers this card is a spitting image of the consumer GeForce GTX Titan X.

Joining the Tesla M40 is the Tesla M4. As hinted at by its single-digit product number, the M4 is a small, low-power card. In fact this is the first Tesla card to be released in a PCIe half-height low-profile form factor, with NVIDIA specifically aiming for dense clusters of these cards. Tesla M4 is based on GM206 – this being the GPU’s first use in a Tesla product as well – and is paired with 4GB of GDDR5 clocked at 5.5GHz. NVIDIA offers multiple power/performance configurations of the M4 depending on the server owner’s needs, ranging from 50W to 75W, with the highest power mode rated to deliver up to 2.2 TFLOPS of FP32 performance.
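The headline FP32 figures for the two new cards can be reproduced from the stream processor counts and boost clocks in the specification table. The sketch below is only a back-of-the-envelope check: it assumes each stream processor retires one fused multiply-add (two floating-point operations) per clock, and that memory bandwidth is simply bus width times effective data rate. The function names are illustrative and are not part of any NVIDIA tool.

```python
# Inputs copied from the specification table above.
# name: (stream processors, boost clock in MHz, bus width in bits, memory data rate in GHz)
CARDS = {
    "Tesla M40": (3072, 1140, 384, 6.0),
    "Tesla M4": (1024, 1075, 128, 5.5),
}


def peak_fp32_tflops(stream_processors, boost_mhz):
    """Peak FP32 rate, assuming one FMA (2 FLOPs) per stream processor per clock."""
    return 2 * stream_processors * boost_mhz * 1e6 / 1e12


def memory_bandwidth_gb_per_s(bus_width_bits, data_rate_ghz):
    """Peak memory bandwidth: bytes transferred per cycle times effective data rate."""
    return bus_width_bits / 8 * data_rate_ghz


for name, (sp, mhz, bus, rate) in CARDS.items():
    print(f"{name}: {peak_fp32_tflops(sp, mhz):.2f} TFLOPS FP32, "
          f"{memory_bandwidth_gb_per_s(bus, rate):.0f} GB/s")
```

Under these assumptions the M40 works out to roughly 7 TFLOPS and the M4 to roughly 2.2 TFLOPS, matching the table. The same formula does not reproduce the table's Tesla K40 figure from its listed boost clocks, so that number presumably reflects a lower clock that the table does not list.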

Both the Tesla M40 and M4 are being pitched at the machine learning market, which has been a strong focus for NVIDIA since the very start of the year. The company believes that machine learning is the next great frontier for GPUs, capitalizing on neural net research that has shown GPUs to be capable of both quickly training and quickly executing neural nets. Neural nets in turn are increasingly being used as more efficient means for companies to process vast amounts of audio & video data (e.g. the Facebooks of the world).

To that end we have seen the company focus on machine learning in the automotive sector with products such as the Drive PX system and lay out their long-term plans for machine learning with the forthcoming Pascal architecture at GTC 2015. In the interim then we have the Tesla M40 and Tesla M4 for building machine learning setups with NVIDIA’s current-generation architecture.

Given their performance and power profiles, Tesla M40 and M4 are intended to split the machine learning market on the basis of training versus execution. The powerful M40 is well-suited for quicker training of neural nets and other systems, while the more compact M4 is well-suited for dense clusters of systems actually executing various machine learning tasks. Interestingly, NVIDIA is pitching the M40 and not the more powerful M60 for training tasks; as NVIDIA briefly discussed among their long-term plans at GTC 2015, current training algorithms don’t scale very well beyond a couple of GPUs, so users are better off with a couple of top-tier GM200 GPUs than a larger array of densely packed GM204 GPUs. As a result the M40 occupies an interesting position as the company’s top Tesla card for machine learning tasks that aren’t trivially scalable to many GPUs.

Meanwhile, along with today’s hardware announcement NVIDIA is also announcing a new software suite to tie together their hyperscale ambitions. Dubbed the “NVIDIA Hyperscale Suite,” it is a collection of software targeted at end-user facing web services. Arguably the lynchpin of the suite is NVIDIA’s GPU REST Engine, a service that lets RESTful APIs utilize the GPU, in turn allowing web services to easily access GPU resources. NVIDIA anticipates the GPU REST Engine enabling everything from search acceleration to image classification, and to start things off they are providing the NVIDIA Image Compute Engine, a REST-capable service for GPU image resizing. The company is also providing their cuDNN neural net software as part of the suite, along with versions of FFmpeg with support for NVIDIA’s hardware video encode and decode blocks to speed up video processing and transcoding.

Wrapping things up, as is common with Tesla product releases, today’s announcements will predate the hardware itself by a bit. NVIDIA tells us that the Tesla M40 and the hyperscale software suite will be available later this year (with just over a month and a half remaining). Meanwhile the Tesla M4 will be released in Q1 of 2016. NVIDIA has not announced card pricing at this time.

NVIDIA Announces Record Revenue for Q3 FY 2016

Today NVIDIA announced their earnings for the third quarter of their fiscal year 2016 (yes, their fiscal year is almost a full year ahead of the calendar), and the company posted record revenues for the quarter of $1.305 billion. This is up 7% from last year, and 13% from last quarter. Gross margin was 56.3%, with an operating income of $245 million and a net income of $246 million for the quarter. This resulted in diluted earnings per share of $0.44, which was up 42% year-over-year.

NVIDIA Q3 2016 Financial Results (GAAP)

| | Q3’2016 | Q2’2016 | Q3’2015 | Q/Q | Y/Y |
|---|---|---|---|---|---|
| Revenue (in millions USD) | $1305 | $1153 | $1225 | +13% | +7% |
| Gross Margin | 56.3% | 55.0% | 55.2% | +1.3% | +1.1% |
| Operating Income (in millions USD) | $245 | $76 | $213 | +222% | +15% |
| Net Income (in millions USD) | $246 | $26 | $173 | +846% | +42% |
| EPS | $0.44 | $0.05 | $0.31 | +780% | +42% |
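The Q/Q and Y/Y columns are plain percentage changes against the prior quarter and the year-ago quarter. As a quick check, the sketch below recomputes them from the GAAP dollar figures in the table; the helper name `pct_change` is illustrative. Note that the gross margin row is handled differently in the table: its changes are differences in percentage points, not relative changes.

```python
def pct_change(current, previous):
    """Relative change from previous to current, in percent."""
    return (current - previous) / previous * 100


# (Q3'2016, Q2'2016, Q3'2015) in millions USD, from the GAAP table above.
GAAP = {
    "Revenue": (1305, 1153, 1225),
    "Operating Income": (245, 76, 213),
    "Net Income": (246, 26, 173),
}

for label, (q3_16, q2_16, q3_15) in GAAP.items():
    print(f"{label}: Q/Q {pct_change(q3_16, q2_16):+.0f}%, "
          f"Y/Y {pct_change(q3_16, q3_15):+.0f}%")
```

The very large Q/Q jumps in operating and net income follow from the unusually small Q2 base rather than from an exceptional Q3.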

NVIDIA also reports Non-GAAP figures, which exclude stock-based compensation and acquisition costs, restructuring, and warranty. Gross margin was slightly higher at 56.5% compared to GAAP results, with operating income at $308 million and net income of $255 million. Earnings per share on a Non-GAAP basis were $0.46. The Non-GAAP numbers are important this quarter because of the large write-down NVIDIA took last quarter on their Icera modem division.

NVIDIA Q3 2016 Financial Results (Non-GAAP)

| | Q3’2016 | Q2’2016 | Q3’2015 | Q/Q | Y/Y |
|---|---|---|---|---|---|
| Revenue (in millions USD) | $1305 | $1153 | $1225 | +13% | +7% |
| Gross Margin | 56.5% | 56.6% | 55.5% | -0.1% | +1.0% |
| Operating Income (in millions USD) | $308 | $231 | $264 | +33% | +17% |
| Net Income (in millions USD) | $255 | $190 | $220 | +34% | +16% |
| EPS | $0.46 | $0.34 | $0.39 | +35% | +18% |

NVIDIA saw great gains in GPU sales, which make up the bulk of the company’s revenue. GPU-based revenue was up 12% year-over-year and up 16% over last quarter, with gaming GPU revenue up 40% over last year and now sitting at record levels. The Quadro side of the house did not fare so well, with revenues of $190 million, which is up 8% over last quarter but down 8% compared to the same time last year. Tesla and GRID revenue was $80 million, up 13% since last quarter but down 8% year-over-year.

Tegra processors are still a mixed bag for NVIDIA. They have tried their hand in the mobile phone and tablet space with little success, but they have seen good performance from Tegra in automotive applications, and this continues to be the growth area for Tegra. For the quarter, Tegra revenue was $129 million, which is down 23% year-over-year. This decline comes from the tablet and smartphone space, since automotive revenue was $79 million, which is up 11% since last quarter and up more than 50% year-over-year. There is still hope for Tegra, but it appears less and less likely to be in the tablet space. NVIDIA did win the Google Pixel C tablet, but it’s not yet clear how it will fare in the difficult tablet market.

NVIDIA also still receives $66 million per quarter from Intel due to a patent license agreement.

NVIDIA Quarterly Revenue Comparison (GAAP)

| In millions USD | Q3’2016 | Q2’2016 | Q3’2015 | Q/Q | Y/Y |
|---|---|---|---|---|---|
| GPU | $1110 | $959 | $991 | +16% | +12% |
| Tegra Processor | $129 | $128 | $168 | +1% | -23% |
| Other | $66 | $66 | $66 | flat | flat |

During Q3 2016, NVIDIA paid back $53 million in dividends to shareholders and bought back 4.6 million shares. Their goal for FY 2016 is to repay $800 million, and through three quarters, they are now at $604 million. NVIDIA is planning on paying approximately $1.0 billion to shareholders for their next fiscal year.

Breaking down the numbers a bit more, NVIDIA has seen big growth in the gaming segment, with revenues increasing from $468 million in Q1 FY 2015 to $761 million this quarter. Year-over-year, the gaming market has grown 44%, at a time when the PC industry as a whole has contracted. PC gaming appears to be alive and well. This has covered the drop in NVIDIA’s other segments, with the biggest drop being PC & Tegra OEM, which fell from $350 million in revenue last year to just $192 million this quarter, a drop of 45% year-over-year. Automotive is growing, but it is still some way from matching the tablet market for sales.

Overall, any time you can set a record for a quarter it is clearly good news. Not all of NVIDIA’s business is growing as quickly as they would like, but luckily for them, their largest segment is the one that is growing at a much quicker pace than the rest of the industry.

Looking ahead to next quarter, NVIDIA is expecting revenues of $1.30 billion, plus or minus 2%, with GAAP margins of 56.7%.
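To put the guidance in context, the plus-or-minus 2% band can be turned into a dollar range and compared with this quarter's revenue. The sketch below does that arithmetic; the function name `guidance_range` is illustrative, and the only inputs are the $1.30 billion midpoint, the 2% tolerance, and the $1,305 million Q3 revenue reported above.

```python
def guidance_range(midpoint_musd, tolerance_pct):
    """Return the (low, high) revenue range in millions USD for a +/- percent band."""
    delta = midpoint_musd * tolerance_pct / 100
    return midpoint_musd - delta, midpoint_musd + delta


Q3_REVENUE_MUSD = 1305  # this quarter's reported revenue, in millions USD

low, high = guidance_range(1300, 2)
print(f"Guided range: ${low:.0f}M to ${high:.0f}M")
print(f"Q3 revenue falls inside the range: {low <= Q3_REVENUE_MUSD <= high}")
```

In other words, the guidance brackets this quarter's record figure, which implies roughly flat revenue at the midpoint.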

Source: NVIDIA Investor Relations
