vimarsana.com


Atomics in AArch64

CPU fun

Introduction

In this post we’ll look at the performance of a simple atomic operation on a couple of Arm® AArch64 machines. In particular, we’ll show the improvement that comes from using the simple, single-instruction atomics of the Armv8.1-A architecture in preference to the more general Load-Locked, Store-Conditional (LL-SC) implementation used in earlier versions of the architecture.

The improved performance of the newer architecture was mentioned in a tweet, and since I already had a benchmark for this for “The Book”, re-running it and writing up the results seemed worthwhile.

The Problem

Atomics

In a parallel program there are occasions when different threads need to update shared state in a safe way. At a high level that can be achieved using locks and critical sections. However, that just pushes the problem down a level, since the locks themselves must be implemented. That leads us (and hardware architects!) to realise that the hardware must provide instructions which
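To make the kind of operation under discussion concrete, here is a minimal C++ sketch of several threads safely updating one shared counter. It is not the benchmark from the post: the function name `count_with_threads` and its parameters are hypothetical, chosen only for illustration. The point is that the single call to `fetch_add` is the atomic update; on AArch64, compilers typically lower it either to a load-exclusive/store-exclusive retry loop (the LL-SC form) when targeting the base Armv8.0-A architecture, or to a single atomic add instruction when targeting Armv8.1-A or later.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper (not from the original post): each of `num_threads`
// threads adds 1 to a shared counter `increments_per_thread` times, and the
// final counter value is returned.
std::uint64_t count_with_threads(unsigned num_threads,
                                 std::uint64_t increments_per_thread) {
    std::atomic<std::uint64_t> counter{0};

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (std::uint64_t i = 0; i < increments_per_thread; ++i) {
                // The atomic read-modify-write. Relaxed ordering is enough
                // here because only the final total is observed, and that
                // happens after every thread has been joined.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }

    // Wait for all workers before reading the result.
    for (auto &worker : workers) {
        worker.join();
    }
    return counter.load();
}
```

Because every increment is atomic, no update is lost: calling `count_with_threads(4, 100000)` returns 400000 however the threads interleave. Replacing the atomic with a plain integer would make the same loop a data race.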

© 2025 Vimarsana
