June 5, 2025, 6:55 p.m.

Reconfiguring the future of memory? A technological adventure centered around low-power AI chips

On June 2, Nikkei Asia reported that SoftBank Group and Intel are collaborating to develop new memory chips designed specifically for AI, a sign of how urgently global AI infrastructure needs high-performance, low-power memory. The partnership shows strategic foresight in its structure and goals, but from a purely technological perspective, the technical paths, design concepts, manufacturing arrangements, and system integration issues behind the project leave considerable room for discussion and doubt.

The project plans to develop a new stacked DRAM chip that differs from existing high-bandwidth memory (HBM) in architecture and interconnect method, with the aim of cutting power consumption by roughly half. In theory, HBM offers a much better bandwidth-to-power ratio than traditional DDR memory because it stacks multiple DRAM layers vertically using through-silicon vias (TSVs) and provides high-speed data access through a logic die at the bottom of the stack. In practice, however, HBM has exposed a series of technical bottlenecks: high cost, low yield, complex packaging, and difficult thermal management. If the new project aims for a breakthrough that starts from the wiring method, it must confront the following core technical challenges.

First, whether the new wiring method can physically deliver sufficient data throughput remains unproven. HBM's main advantage is the bandwidth gained from its vertical interconnect structure, so changing the wiring method will inevitably mean reworking the TSV layout or introducing an alternative. Any careless handling of signal integrity, clock synchronization, or latency control will directly affect the stability of the memory subsystem. Whether a new stacking technology would still rely on existing 2.5D or 3D packaging platforms is also a key question for its feasibility.

Second, the stated goal of "reducing power consumption by half" lacks clear technical support and empirical data. DRAM power consumption comes mainly from refresh operations, the charging and discharging of capacitance during reads and writes, and I/O drive currents. Reducing it significantly would likely require lowering the operating voltage, reducing the refresh rate, or introducing a new voltage regulation mechanism, and each of these adjustments may come at the expense of storage density, reliability, or even response speed. Until the project states which low-power approach it will take, the claim of a 50% power reduction through "wiring changes" alone remains technically unverified.
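To make the arithmetic behind this doubt concrete, the following is a minimal sketch, not a model of the actual chip. It relies on the standard approximation that dynamic switching power scales with C x V^2 x f, and on a split of total DRAM power into refresh, array access, and I/O. The shares used in the example are hypothetical placeholders chosen only for illustration; neither the report nor the project has published such a breakdown, and all function names are invented for this sketch.

```python
import math


def voltage_scale_for_power_ratio(power_ratio: float) -> float:
    """Voltage scaling needed to reach a target dynamic-power ratio.

    Uses the standard approximation P ~ C * V^2 * f with capacitance and
    frequency held constant, so V must scale with sqrt(power_ratio).
    """
    if not 0.0 < power_ratio <= 1.0:
        raise ValueError("power_ratio must be in (0, 1]")
    return math.sqrt(power_ratio)


def total_power_ratio(shares: dict, reductions: dict) -> float:
    """Remaining fraction of total power after per-component reductions.

    shares: fraction of total power per component (must sum to 1).
    reductions: fraction saved within each component (missing means 0).
    """
    if not math.isclose(sum(shares.values()), 1.0):
        raise ValueError("shares must sum to 1")
    return sum(
        share * (1.0 - reductions.get(name, 0.0))
        for name, share in shares.items()
    )


def required_single_component_reduction(share: float, target_total_ratio: float) -> float:
    """Reduction needed in ONE component alone to hit a total-power target.

    A result greater than 1 means the target cannot be reached by
    improving that component alone.
    """
    return (1.0 - target_total_ratio) / share


if __name__ == "__main__":
    # HYPOTHETICAL shares, for illustration only; not measured data.
    shares = {"refresh": 0.2, "array": 0.4, "io": 0.4}

    print(voltage_scale_for_power_ratio(0.5))
    print(total_power_ratio(shares, {"io": 0.5}))
    print(required_single_component_reduction(shares["io"], 0.5))
```

Under these placeholder shares, halving I/O power alone would leave total power at about 80% of the original, and reaching a 50% total through I/O alone would require a reduction above 100%, which is impossible. Likewise, halving dynamic power through voltage alone would require scaling the voltage to roughly 71% of its original value. The real outcome depends entirely on the actual power breakdown, which is exactly the detail the project has not disclosed.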

Saimemory, the chip design company at the center of the project, is tasked with design and patent management, while manufacturing is entrusted to foundries. This division of labor is common in today's semiconductor industry. From the perspective of technological integration, however, separating memory design from manufacturing creates a very high coordination burden. Memory chips, especially stacked DRAM, are highly sensitive to process parameters. If the design company cannot take part closely in process control and wafer-level debugging, the design is prone to diverge from actual mass-production performance, particularly in thermal stability, packaging stress, and power integrity, and is likely to need repeated revision at later stages.

In addition, the project draws on patents from universities, including the University of Tokyo. University research often supplies cutting-edge concepts and experimental data, but turning laboratory results into engineered products is not simple, especially in semiconductors. Whether these university technologies are compatible with Intel's existing memory control architecture, and whether they can be integrated smoothly into existing AI chip platforms, are difficult technical questions that remain unanswered. Many academic patents have run into a "technology gap" during commercialization: small-scale laboratory prototypes often suffer from parameter drift, material mismatch, and packaging failure once they face high-density integration and mass production.

SoftBank regards this memory chip as a core component of its AI training data centers, so it also needs to be examined from a system design perspective. The memory load of AI training differs significantly from that of ordinary servers or consumer computing: it depends more heavily on highly concurrent access, caching mechanisms, and efficient cooperation with GPUs or NPUs. Even if the new DRAM chips offer better energy efficiency, it will be difficult to achieve a system-level performance gain if their latency characteristics do not match the data throughput strategy of AI chips or fail to meet the data consistency requirements of a unified memory architecture. Memory access patterns in AI frameworks are often highly irregular, with large jumps between addresses, and whether the new DRAM can maintain consistent access latency under them still requires rigorous technical evaluation.

The chip prototype is planned to be completed within two years, a time frame that may be optimistic for a memory chip based on a brand-new architecture. HBM itself went through more than five years of iterative optimization between its proposal and commercialization. If this project has to rebuild the whole chain, from circuit design, signal integrity verification, thermal simulation, material selection, and EDA tool adaptation through to tape-out and testing, the design stage alone will consume a large amount of engineering resources. Relying only on some of Intel's technologies and university research results, without deep integration of mass-production experience and packaging capabilities, the two-year plan may struggle to meet the functional completeness and reliability required of commercial-grade products.

On the manufacturing side, outsourcing production to external contract manufacturers adds further supply chain complexity. Advanced DRAM manufacturing is currently concentrated in a handful of manufacturers worldwide, and the technical barriers are high. A mature process may limit performance and density, while an advanced process brings pressure on manufacturing cost and yield. More importantly, DRAM manufacturing demands extremely tight control over parameters such as exposure alignment, etching depth, and material uniformity. If Saimemory pursues the manufacture of highly complex stacked chips without the support of an integrated fab, it will face very high technical barriers.

The project's total investment is expected to be approximately 10 billion yen, a small sum compared with the development costs of today's mainstream DRAM products. At companies such as Samsung, SK Hynix, and Micron, R&D investment in a single DRAM generation often runs into the billions of dollars, and that excludes the later costs of packaging, testing, and software driver adaptation. It is highly doubtful that an investment of this scale can support the full chain of technical work, from prototype development and platform adaptation to reliability testing.

Furthermore, the project's plan to apply for government support suggests that it cannot fully sustain itself in funding and resources. Government support can ease early-stage cost pressure to some extent, but without sustained commercial momentum and a clear technical roadmap, technology projects that depend on government resources may end up in the awkward position of having completed technical verification while lacking market traction.

Taken together, although this new AI memory chip project sets out to improve on today's power-hungry AI training platforms, many problems remain unsolved in the architectural innovation, manufacturing chain integration, system adaptation, and technical verification it depends on. In particular, the project has not yet provided sufficient technical detail or experimental data to support its claim of a significant power reduction achieved at the wiring level. Given practical problems such as the separation of design and manufacturing, the mismatch between investment and development timeline, and the difficulty of commercializing university patents, whether the project can deliver the expected technical breakthroughs will depend on its engineering execution and system integration performance, which remain to be seen.
