
The Northpole chip developed by IBM brings together memory and processing, leading to vast improvements in image recognition and other computing tasks. Credit: IBM Corp.
A brain-inspired computer chip that can supercharge artificial intelligence (AI) by working faster with much less power has been developed by researchers at IBM in San Jose, California. Their massive Northpole processor chip eliminates the need to frequently access external memory, and therefore performs tasks such as image recognition faster than existing architectures – while consuming much less power.
“Its energy efficiency is absolutely astonishing,” says Damien Querlioz, a nanoelectronics researcher at Paris-Saclay University in Palaiseau. The work, published in Science1, “shows that computing and memory can be integrated on a large scale,” he says. “I think this paper will shake up common thinking in computer architecture.”
Northpole runs neural networks: multilayered arrays of simple computational units programmed to recognize patterns in data. The bottom layer takes in data, such as the pixels of an image; each successive layer detects patterns of increasing complexity and passes the information on to the next layer. The top layer produces an output that can, for example, express how likely an image is to contain a cat, a car, or other objects.
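To make the layering concrete, here is a minimal sketch in Python (an illustration of the general idea, not IBM's software): the weights are random and untrained, the layer sizes are arbitrary, and the three class names are assumptions used only to label the output.

```python
# Minimal sketch of a layered pattern-recognition network: each layer transforms
# its input and hands the result to the next layer; the top layer turns the final
# scores into class probabilities. Weights here are random, purely for illustration.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, weights):
    """One layer: a weighted combination of inputs followed by a simple nonlinearity."""
    return np.maximum(weights @ x, 0.0)          # ReLU keeps only 'detected' patterns

def softmax(scores):
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Hypothetical sizes: a 28x28 grey-scale image flattened into 784 pixel values.
pixels = rng.random(784)                          # bottom layer's input: raw pixels
hidden1 = rng.standard_normal((128, 784)) * 0.01  # untrained example weights
hidden2 = rng.standard_normal((32, 128)) * 0.01
output_w = rng.standard_normal((3, 32)) * 0.01    # three example classes: cat, car, other

h = layer(pixels, hidden1)        # first layer: simple patterns (edges, blobs)
h = layer(h, hidden2)             # next layer: combinations of those patterns
probs = softmax(output_w @ h)     # top layer: likelihood of each class

for name, p in zip(["cat", "car", "other"], probs):
    print(f"{name}: {p:.2f}")
```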
Slowed down by a bottleneck
Some computer chips can handle these calculations efficiently, but they still need to use external memory, called RAM, each time they compute a layer. Shuttling data between chips in this way slows things down – a phenomenon known as the von Neumann bottleneck, after the mathematician John von Neumann, who first conceived the standard computer architecture based on a processing unit and a separate memory unit.
The von Neumann bottleneck is one of the most significant factors slowing down computer applications – including AI. It also results in energy inefficiencies. Study co-author and IBM computer engineer Dharmendra Modha says he once estimated that simulating the human brain on this type of architecture might require the equivalent of the output of 12 nuclear reactors.
Northpole is composed of 256 computing units, or cores, each of which has its own memory. “You’re essentially breaking the von Neumann bottleneck,” says Modha, IBM’s chief scientist for brain-inspired computing at the company’s Almaden Research Center in San Jose.
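As a rough illustration of why core-local memory matters, the toy Python sketch below (an assumption-laden analogy, not Northpole's actual design) contrasts a conventional layout, which fetches weights from external RAM for every layer, with a core that already holds its slice of the weights on chip.

```python
# Toy structural sketch: in a von Neumann-style layout the weights live in external
# RAM and must be fetched for every layer, whereas a core with its own local memory
# keeps its weights next to the arithmetic units, so no off-chip transfer is needed.
import numpy as np

class ExternalRAM:
    """Stands in for off-chip memory; every read is a (slow) round trip."""
    def __init__(self, weights):
        self.weights = weights
        self.reads = 0

    def fetch(self, layer_idx):
        self.reads += 1                      # count the off-chip transfers
        return self.weights[layer_idx]

class CoreWithLocalMemory:
    """One core; its share of the weights is stored alongside the compute."""
    def __init__(self, local_weights):
        self.local_weights = local_weights   # held on-chip, no fetch needed

    def run_layer(self, x, layer_idx):
        return np.maximum(self.local_weights[layer_idx] @ x, 0.0)

weights = [np.eye(4), np.eye(4)]             # trivial example weights for two layers

# Conventional path: fetch weights from external RAM before every layer.
ram = ExternalRAM(weights)
x = np.ones(4)
for i in range(len(weights)):
    x = np.maximum(ram.fetch(i) @ x, 0.0)
print("off-chip reads (conventional):", ram.reads)   # one per layer

# Core-local path: the core already holds its weights, so nothing leaves the core.
core = CoreWithLocalMemory(weights)
y = np.ones(4)
for i in range(len(weights)):
    y = core.run_layer(y, i)
print("off-chip reads (core-local):", 0)
```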
Modha says the cores are linked together in a network inspired by the white-matter connections between parts of the human cerebral cortex. This and other design principles – most of which existed before, but had never been combined in a single chip – enable Northpole to beat existing AI machines by a substantial margin in standard benchmark tests of image recognition. It also uses one-fifth of the energy of state-of-the-art AI chips, despite not using the latest and smallest manufacturing processes. The authors estimate that if the Northpole design were implemented with the most up-to-date manufacturing process, its efficiency would be 25 times better than that of current designs.
On the right track
But even Northpole’s 224 megabytes of RAM are not enough for large language models, such as those behind the chatbot ChatGPT, which take up several thousand megabytes of data even in their most stripped-down versions. And the chip can run only pre-programmed neural networks that have been ‘trained’ beforehand on a separate machine. Still, the paper’s authors say the Northpole architecture could be useful in speed-critical applications such as self-driving cars.
Northpole brings memory units as physically close as possible to the computing elements in each core. Elsewhere, researchers are developing more radical innovations using new materials and manufacturing processes that enable the memory units themselves to perform calculations, which in theory could boost both speed and efficiency even further.
Another chip, described last month2, performs in-memory calculations using memristors – circuit elements that can switch between being a resistor and a conductor. “Both approaches, IBM’s and ours, hold promise for reducing the latency and energy costs associated with data transfer,” says Bin Gao of Tsinghua University in Beijing, a co-author of the memristor study.
Another approach, developed by several teams – including at a separate IBM laboratory in Zurich, Switzerland3 – stores information by changing the crystal structure of circuit elements. It remains to be seen whether these new approaches can be scaled up economically.