Researchers have developed a method for speeding up electronic devices such as mobile phones and computers without replacing any of their existing components. The approach could potentially double processing speed while using less energy.
Modern devices contain several kinds of processing chips, including the central processing unit (CPU), the graphics processing unit (GPU), hardware accelerators for artificial intelligence (AI), and digital signal processors (DSPs) for audio. Typically, these components handle data separately and sequentially, which can slow down overall processing.
To overcome this bottleneck, a team of scientists proposes a framework in which processing units work in parallel rather than one after another. The method, called “simultaneous and heterogeneous multi-threading” (SHMT), lets different units work on the same region of program code at the same time.
Unlike “software pipelining,” which lets different components work simultaneously on different tasks, SHMT distributes work more flexibly. Different processing units can work on the same portion of code at the same time, then move on to new tasks once their share is done.
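The idea of several units sharing one region of code and picking up new work as soon as they finish can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the researchers' runtime: the unit names, the per-chunk delays, and the function `run_region` are hypothetical, and Python threads with `time.sleep` stand in for real devices.

```python
import queue
import threading
import time


def run_region(data, chunk_size, unit_delays):
    """Square every element of `data`, sharing chunks among simulated units.

    `unit_delays` maps a hypothetical unit name to a simulated per-chunk
    cost in seconds. No real CPU/GPU/accelerator is involved.
    """
    chunks = queue.Queue()
    for start in range(0, len(data), chunk_size):
        chunks.put(start)

    result = [None] * len(data)
    chunks_done = {name: 0 for name in unit_delays}

    def worker(name, delay):
        while True:
            try:
                start = chunks.get_nowait()  # take the next unclaimed chunk
            except queue.Empty:
                return  # nothing left in this region
            time.sleep(delay)  # stand-in for device compute time
            for i in range(start, min(start + chunk_size, len(data))):
                result[i] = data[i] * data[i]
            chunks_done[name] += 1  # each thread updates only its own key

    threads = [
        threading.Thread(target=worker, args=(name, delay))
        for name, delay in unit_delays.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result, chunks_done


if __name__ == "__main__":
    # Assumed relative speeds, chosen only for illustration.
    units = {"cpu": 0.004, "gpu": 0.001, "accelerator": 0.002}
    squares, done = run_region(list(range(64)), chunk_size=4, unit_delays=units)
    print(done)  # faster units typically end up processing more chunks
```

Because every unit pulls from the same queue, no unit sits idle while chunks remain, which is the contrast with assigning the whole region to a single device up front.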
In addition to speeding up processing, SHMT is more energy-efficient. Many tasks normally assigned to power-hungry components such as the GPU can be delegated to hardware accelerators that draw less power.
The approach was tested on a prototype system comprising a multi-core ARM CPU, an Nvidia GPU, and a TPU hardware accelerator. It delivered nearly twice the performance and 51% lower energy consumption than a conventional system.
Adopting this software framework on existing systems could reduce hardware costs. Because more efficient workload management means less cooling for large data centers, it could also cut carbon emissions and water demand.
However, the researchers note that their study is based on a prototype, and that further work is needed to assess how the method can be applied in practice and which types of applications would benefit most.
More information about Simultaneous and Heterogeneous Multi-Threading (SHMT)
Simultaneous and Heterogeneous Multi-Threading (SHMT) is a programming and execution model that allows the different processing units in a computer to work concurrently on the same region of code. It borrows ideas from simultaneous multithreading (SMT) and heterogeneous computing.
In SMT, a single physical processor core is capable of executing instructions from multiple threads or processes simultaneously by duplicating certain portions of the processor’s architectural state, such as registers and program counters, for each thread. This allows the core to make better use of its execution units and other hardware resources by keeping them busy with instructions from different threads when one thread stalls due to a cache miss or other long-latency operation.
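The benefit of filling stall cycles with work from another thread can be shown with a toy model. This is a deliberately simplified sketch, not a description of any real processor: it assumes a core that issues at most one instruction per cycle, gives thread 0 fixed priority, and represents each instruction only by its latency. The function name `cycles_needed` is hypothetical.

```python
def cycles_needed(threads, smt):
    """Count cycles for a one-issue-per-cycle core to finish all threads.

    Each thread is a list of instruction latencies: 1 is an ordinary
    instruction, a larger value is a long-latency operation (for example a
    cache miss) after which that thread cannot issue again until the
    latency has elapsed.
    smt=False: threads run one after another and the core waits out every
    latency. smt=True: the core may issue from any thread that is ready.
    """
    if not smt:
        return sum(sum(t) for t in threads)

    pc = [0] * len(threads)        # per-thread program counter
    ready_at = [0] * len(threads)  # cycle at which each thread may issue again
    cycle = 0
    while any(pc[i] < len(t) for i, t in enumerate(threads)):
        for i, t in enumerate(threads):
            if pc[i] < len(t) and ready_at[i] <= cycle:
                ready_at[i] = cycle + t[pc[i]]
                pc[i] += 1
                break  # only one issue slot per cycle
        cycle += 1
    # Account for the latency of the last instruction issued by each thread.
    return max([cycle] + ready_at)


if __name__ == "__main__":
    thread_a = [1, 5, 1]      # second instruction stalls for 5 cycles
    thread_b = [1, 1, 1, 1]
    print(cycles_needed([thread_a, thread_b], smt=False))  # 11
    print(cycles_needed([thread_a, thread_b], smt=True))   # 7
```

In the SMT case, the four instructions of the second thread execute entirely within the first thread's stall, so the pair finishes in the time the first thread would have needed alone.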
Heterogeneous computing, on the other hand, involves the use of specialized processing elements or cores optimized for different types of workloads or applications, such as graphics processing units (GPUs) for highly parallel workloads and digital signal processors (DSPs) for signal processing tasks.
SHMT combines these two concepts at the level of the whole system rather than a single core. Where SMT keeps the execution units of one core busy with instructions from several threads, SHMT keeps several different processing units, such as a CPU, a GPU, and an AI accelerator, busy with parts of the same region of code. A software runtime divides the work into pieces, schedules them on the available units, and combines the results.
For example, a large computation that would conventionally be handed to the GPU alone could be split so that the GPU, the CPU, and an AI accelerator each process a share of the data at the same time. A unit that finishes its share early can take on further pieces instead of sitting idle.
SHMT can potentially provide performance and energy benefits by making fuller use of hardware that is already present, without the need for additional components. However, it also introduces complexity in software: the runtime requires more sophisticated scheduling and resource management mechanisms, and because some accelerators may compute at lower precision than a CPU or GPU, it may also need to keep the quality of results within acceptable limits.
Source: arXiv