When Should You Use _mm_sfence, _mm_lfence, and _mm_mfence?
Multi-threaded code needs mechanisms that keep shared data consistent and synchronized across threads. Intel's intrinsics include three fence functions, _mm_sfence, _mm_lfence, and _mm_mfence, that control memory ordering on x86.
Memory Ordering in x86
x86 CPUs have a strongly ordered memory model, but C and C++ have weaker ones, so the compiler may reorder memory accesses that the hardware itself would keep in order. Hence, source code still needs explicit ordering, such as std::atomic operations, to prevent data corruption or race conditions.
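As a minimal sketch of such a precaution, the following publishes a plain value through a std::atomic flag using release/acquire ordering. The names payload, ready, publish, and try_consume are hypothetical and chosen for illustration only. On x86 these orderings usually compile to ordinary loads and stores with no fence instruction; what they constrain is compile-time reordering.

```cpp
#include <atomic>

// Hypothetical shared state, for illustration only.
static int payload = 0;
static std::atomic<bool> ready{false};

// Producer side: write the data, then publish it.
void publish(int value) {
    payload = value;                               // plain store
    ready.store(true, std::memory_order_release);  // earlier writes cannot move below this store
}

// Consumer side: returns true and fills `out` once the data has been published.
bool try_consume(int& out) {
    if (!ready.load(std::memory_order_acquire)) {  // pairs with the release store
        return false;
    }
    out = payload;                                 // ordered after the producer's write
    return true;
}
```

In real use, publish and try_consume would run on different threads, with the consumer polling until try_consume returns true.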
_mm_sfence
_mm_sfence is primarily used after non-temporal (NT) stores (_mm_stream_*). NT stores are weakly ordered, meaning they can become globally visible out of order relative to other stores. _mm_sfence creates a barrier that ensures all earlier stores, including NT stores, are globally visible before any later store.
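The pattern described above can be sketched as follows, assuming an x86-64 target where SSE2 is available by default. The buffer, the buffer_ready flag, and both function names are hypothetical; only _mm_stream_si32 and _mm_sfence are real intrinsics.

```cpp
#include <immintrin.h>  // _mm_stream_si32, _mm_sfence
#include <atomic>

// Hypothetical buffer and flag, for illustration only.
static int buffer[4];
static std::atomic<bool> buffer_ready{false};

// Producer: fill the buffer with NT stores, fence, then raise the flag.
void fill_and_publish(int value) {
    for (int i = 0; i < 4; ++i) {
        _mm_stream_si32(&buffer[i], value + i);  // weakly ordered NT store
    }
    _mm_sfence();  // NT stores above are globally visible before any later store
    buffer_ready.store(true, std::memory_order_release);
}

// Consumer: returns true and writes the sum of the buffer once it is published.
bool read_sum(int& out) {
    if (!buffer_ready.load(std::memory_order_acquire)) {
        return false;
    }
    out = 0;
    for (int i = 0; i < 4; ++i) {
        out += buffer[i];
    }
    return true;
}
```

Without the _mm_sfence call, the flag store could become visible to another core before the NT stores have left the write-combining buffers.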
_mm_lfence
_mm_lfence is rarely needed as a load fence, because ordinary x86 loads are already ordered with respect to each other. For ordering purposes it is only relevant when loading from Write-Combining (WC) memory regions, such as video RAM. _mm_lfence also prevents later instructions from executing until it has completed, which can be useful for microbenchmarking.
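For the microbenchmarking use, a rough sketch might look like the following. It assumes GCC or Clang, where <x86intrin.h> provides __rdtsc (MSVC exposes it through <intrin.h> instead). The helper time_ticks is hypothetical, and the result is in timestamp-counter ticks, which are not necessarily core clock cycles.

```cpp
#include <x86intrin.h>  // __rdtsc, _mm_lfence (GCC/Clang)
#include <cstdint>

// Hypothetical helper: returns the timestamp-counter ticks spent in `work`.
template <typename F>
std::uint64_t time_ticks(F&& work) {
    _mm_lfence();                     // let earlier instructions finish first
    std::uint64_t start = __rdtsc();
    _mm_lfence();                     // keep `work` from starting before the first reading
    work();
    _mm_lfence();                     // wait for `work` to complete
    std::uint64_t end = __rdtsc();
    _mm_lfence();                     // keep later code from overlapping the second reading
    return end - start;
}
```

A caller would pass a lambda containing the code under test, for example `time_ticks([&] { /* code to measure */ })`, and would normally repeat the measurement many times to average out noise.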
_mm_mfence
_mm_mfence is a full barrier: later loads cannot read values until all preceding stores have become globally visible. This store-then-load ordering is what sequential consistency requires and is the one ordering that x86 does not otherwise guarantee. It can be useful if you implement your own version of std::atomic or need to enforce this ordering explicitly between a store and a later load.
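A minimal sketch of where this matters is a Dekker-style handshake, in which each thread announces its intent with a store and then checks the other thread's flag with a load. The wants array and the try_enter and leave functions are hypothetical. The sketch also assumes the compiler treats _mm_mfence as a compile-time barrier, which mainstream compilers do in practice; in portable code, std::atomic_thread_fence(std::memory_order_seq_cst) is the standard way to request the same ordering.

```cpp
#include <immintrin.h>  // _mm_mfence
#include <atomic>

// Hypothetical intent flags, one per thread (thread ids are 0 and 1).
static std::atomic<int> wants[2] = {};

// Announce intent, then check the other thread's flag.
// Returns true if the other thread has not announced intent.
bool try_enter(int self) {
    int other = 1 - self;
    wants[self].store(1, std::memory_order_relaxed);  // a plain store on x86
    _mm_mfence();  // the store above is globally visible before the load below reads
    return wants[other].load(std::memory_order_relaxed) == 0;
}

// Withdraw intent.
void leave(int self) {
    wants[self].store(0, std::memory_order_release);
}
```

Without the fence, each core could satisfy its load while its own store still sits in the store buffer, so both threads could see 0 and enter at the same time.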
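Summary
Use _mm_sfence after NT stores, _mm_lfence mainly for loads from WC memory or for microbenchmarking, and _mm_mfence when a store must become globally visible before a later load.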