    HBM on GPU: Thermal Challenges and Solutions

By Team_Prime US News | January 14, 2026

Peek inside the package of AMD's or Nvidia's most advanced AI products and you'll find a familiar arrangement: The GPU is flanked on two sides by high-bandwidth memory (HBM), the most advanced memory chips available. These memory chips are placed as close as possible to the computing chips they serve in order to cut down on the biggest bottleneck in AI computing: the energy and delay involved in getting billions of bits per second from memory into logic. But what if you could bring computing and memory even closer together by stacking the HBM on top of the GPU?

Imec recently explored this scenario using advanced thermal simulations, and the answer, delivered in December at the 2025 IEEE International Electron Device Meeting (IEDM), was a bit grim: 3D stacking doubles the operating temperature inside the GPU, rendering it inoperable. But the team, led by Imec's James Myers, didn't just give up. They identified several engineering optimizations that ultimately could whittle the temperature difference down to nearly zero.

Imec started with a thermal simulation of a GPU and four HBM dies as you'd find them today, inside what's called a 2.5D package. That is, both the GPU and the HBM sit on a substrate called an interposer, with minimal distance between them. The two types of chips are connected by thousands of micrometer-scale copper interconnects built into the interposer's surface. In this configuration, the model GPU consumes 414 watts and reaches a peak temperature of just under 70 °C, typical for a processor. The memory chips consume an additional 40 W or so and get considerably less hot. The heat is removed from the top of the package by the kind of liquid cooling that has become common in new AI data centers.
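The reported numbers imply an effective junction-to-coolant thermal resistance, a useful yardstick for the stacking results that follow. Here is a minimal back-of-envelope sketch; the 30 °C coolant temperature and the lumping of all GPU power into a single resistance are assumptions for illustration, since the article reports only the power and peak temperatures.

```python
# Lumped junction-temperature model: T_peak = T_coolant + P * R_th.
T_COOLANT_C = 30.0  # assumed liquid-coolant temperature, not from the article

def r_thermal(t_peak_c, power_w, t_coolant_c=T_COOLANT_C):
    """Effective junction-to-coolant thermal resistance, in K/W."""
    return (t_peak_c - t_coolant_c) / power_w

r_25d = r_thermal(70.0, 414.0)   # 2.5D baseline: GPU beside the HBM
r_3d = r_thermal(140.0, 414.0)   # naive 3D stack: HBM on top of the GPU
print(round(r_25d, 3), round(r_3d, 3), round(r_3d / r_25d, 2))  # → 0.097 0.266 2.75
```

Under this assumed coolant temperature, the naive stack's effective resistance comes out roughly 2.75 times the 2.5D baseline's, consistent with the doubling of peak temperature the simulations found.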

    RELATED: Future Chips Will Be Hotter Than Ever

"While this approach is currently used, it doesn't scale well for the future, especially because it blocks two sides of the GPU, limiting future GPU-to-GPU connections inside the package," Yukai Chen, a senior researcher at Imec, told engineers at IEDM. In contrast, "the 3D approach leads to higher bandwidth, lower latency… a very important improvement is the package footprint."

Unfortunately, as Chen and his colleagues found, the most straightforward version of stacking, simply placing the HBM chips on top of the GPU and adding a block of blank silicon to fill in a gap at the center, shot temperatures in the GPU up to a scorching 140 °C, well past a typical GPU's 80 °C limit.

System Technology Co-optimization

The Imec team set about trying a variety of technology and system optimizations aimed at lowering the temperature. The first thing they tried was to throw out a layer of silicon that was now redundant. To understand why, you first have to get a grip on what HBM really is.

This type of memory is a stack of as many as 12 high-density DRAM dies. Each has been thinned down to tens of micrometers and is shot through with vertical connections. These thinned dies are stacked one atop another and connected by tiny balls of solder, and this stack of memory is vertically connected to another piece of silicon, called the base die. The base die is a logic chip designed to multiplex the data, packing it into the limited number of wires that can fit across the millimeter-scale gap to the GPU.

But with the HBM now on top of the GPU, there's no need for such a data pump. Bits can flow directly into the processor without regard for how many wires happen to fit along the side of the chip. Of course, this change means moving the memory-control circuits from the base die into the GPU and therefore altering the processor's floorplan, says Myers. But there should be ample room, he suggests, because the GPU will no longer need the circuits used to demultiplex incoming memory data.

    RELATED: The Hot, Hot Future of Chips

Cutting out this middleman of memory cooled things down by only a little less than 4 °C. But, importantly, it should massively boost the bandwidth between the memory and the processor, which is key to another optimization the team tried: slowing down the GPU.

That might seem contrary to the whole purpose of better AI computing, but in this case it's a bonus. Large language models are what are called "memory bound" problems. That is, memory bandwidth is the main limiting factor. But Myers' team estimated that 3D stacking HBM on the GPU would boost bandwidth fourfold. With that added headroom, even slowing the GPU's clock by 50 percent still leads to a performance win, while cooling everything down by more than 20 °C. In practice, the processor might not need to be slowed down quite that much. Increasing the clock frequency to 70 percent led to a GPU that was only 1.7 °C hotter, Myers says.
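The clock-versus-bandwidth trade-off can be sketched with a simple roofline-style model, in which attainable throughput is the minimum of the compute peak and bandwidth times arithmetic intensity. All absolute numbers below are hypothetical; only the ratios (fourfold bandwidth, half-speed clock) come from the article.

```python
# Roofline-style sketch of the clock-vs-bandwidth trade-off.
def attainable_tflops(peak_tflops, bandwidth_tbs, intensity_flop_per_byte):
    """Attainable throughput: min of compute peak and memory-limited rate."""
    return min(peak_tflops, bandwidth_tbs * intensity_flop_per_byte)

PEAK = 100.0      # hypothetical GPU compute peak, TFLOPS
BW = 4.0          # hypothetical 2.5D HBM bandwidth, TB/s
INTENSITY = 10.0  # flop/byte; low value => memory bound, as in LLM inference

baseline = attainable_tflops(PEAK, BW, INTENSITY)               # 2.5D package
stacked_half_clock = attainable_tflops(0.5 * PEAK, 4 * BW, INTENSITY)
print(baseline, stacked_half_clock)  # → 40.0 50.0
```

Because the workload is memory bound at baseline, quadrupling bandwidth moves the bottleneck to compute, so even a half-speed clock delivers more useful throughput while dissipating far less power.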

    Optimized HBM

Another big drop in temperature came from making the HBM stack and the area around it more thermally conductive. That included merging the four stacks into two wider stacks, thereby eliminating a heat-trapping space; thinning down the top, usually thicker, die of the stack; and filling in more of the space around the HBM with blank pieces of silicon to conduct more heat.
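Filling the space around the HBM with blank silicon helps because silicon conducts heat orders of magnitude better than the mold compound it displaces. A rough one-dimensional conduction sketch, using textbook conductivities and an illustrative geometry (none of these values come from the article):

```python
# Steady-state 1-D conduction through a slab: q = k * A * dT / L.
K_SILICON = 150.0  # W/(m*K), bulk silicon near room temperature
K_MOLD = 1.0       # W/(m*K), typical epoxy mold compound

def heat_flow_w(k, area_m2, dt_k, thickness_m):
    """Heat flow through a slab of thermal conductivity k."""
    return k * area_m2 * dt_k / thickness_m

area = (5e-3) ** 2  # illustrative 5 mm x 5 mm fill region
gap = 100e-6        # illustrative 100 micrometer thickness
ratio = heat_flow_w(K_SILICON, area, 10.0, gap) / heat_flow_w(K_MOLD, area, 10.0, gap)
print(ratio)  # ≈ 150: silicon moves ~150x more heat for the same geometry
```

The ratio depends only on the two conductivities, which is why dummy silicon fill is such a cheap thermal win.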

With all of that, the stack now operated at about 88 °C. One final optimization brought things back to near 70 °C. Normally, some 95 percent of a chip's heat is removed from the top of the package, where in this case water carries the heat away. But adding similar cooling to the bottom as well drove the stacked chips down a final 17 °C.
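Two-sided cooling behaves like adding a second thermal-resistance path in parallel: 1/R_total = 1/R_top + 1/R_bottom. The resistance and power values below are hypothetical placeholders; only the qualitative effect, that any added bottom path lowers the total resistance and thus the temperature rise, carries over from the article.

```python
# Two heat-removal paths in parallel: 1/R_total = 1/R_top + 1/R_bottom.
def parallel(r_top, r_bottom):
    """Combined thermal resistance of two parallel cooling paths, K/W."""
    return 1.0 / (1.0 / r_top + 1.0 / r_bottom)

P_W = 450.0      # hypothetical total power (GPU plus HBM)
R_TOP = 0.10     # K/W, hypothetical top liquid-cooling path
R_BOTTOM = 0.25  # K/W, hypothetical added bottom cooling path

dt_single = P_W * R_TOP                      # top-only cooling
dt_double = P_W * parallel(R_TOP, R_BOTTOM)  # cooling on both sides
print(round(dt_single, 1), round(dt_double, 1))  # → 45.0 32.1
```

Even a bottom path with noticeably higher resistance than the top one shaves a double-digit number of kelvins off the temperature rise, which is the same qualitative effect Imec observed.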

Although the research presented at IEDM shows it may be possible, HBM-on-GPU isn't necessarily the best choice, Myers says. "We're simulating other system configurations to help build confidence that this is or isn't the best choice," he says. "GPU-on-HBM is of interest to some in industry," because it puts the GPU closer to the cooling. But it would likely be a more complex design, because the GPU's power and data would have to flow vertically through the HBM to reach it.
