Making the most of Arm NN for GPU inference: FP16 and FastMath

Arm Blogs
Roberto Lopez Mendez, Arm
Jan. 27, 2021
Most operations in deep learning involve massive amounts of data but only simple control logic. As parallel processors, GPUs are well suited to this type of task. Current high-end mobile GPUs can provide substantial throughput thanks to their hundreds of Arithmetic Logic Units (ALUs). In fact, GPUs are built for a single purpose: parallel data processing, initially for 3D graphics and later for more general parallel computing.
Additionally, GPUs are energy-efficient processors. Nowadays, the number of tera-operations per second per watt (TOPS/W) is used to evaluate the energy efficiency of mobile processors and embedded devices. GPUs achieve comparatively high TOPS/W because of their relatively simple control units and lower operating frequencies.
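To make the efficiency metric concrete, here is a minimal sketch of the arithmetic, reading TOPS/W as tera-operations per second divided by average power in watts. The function name `tops_per_watt` and the input figures are hypothetical, chosen only for illustration; they are not measurements of any real GPU.

```python
def tops_per_watt(ops_per_second: float, power_watts: float) -> float:
    """Return energy efficiency in tera-operations per second per watt."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    # Convert operations/second to tera-operations/second, then divide by power.
    return ops_per_second / 1e12 / power_watts


# Hypothetical figures, used only to show the calculation (not measured data).
example_ops_per_second = 2.0e12  # 2 tera-operations per second sustained
example_power_watts = 4.0        # 4 W average power draw

efficiency = tops_per_watt(example_ops_per_second, example_power_watts)
print(f"{efficiency:.2f} TOPS/W")  # prints "0.50 TOPS/W"
```

Because the metric is a ratio, a processor can improve it either by sustaining more operations per second or by drawing less power for the same throughput.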
