Volatile
The subject of memory barriers is quite complex. It even trips up the experts from time to time. When we talk about a memory barrier we are really combining two different ideas.
· Acquire fence: A memory barrier in which other reads & writes are not allowed to move before the fence.
· Release fence: A memory barrier in which other reads & writes are not allowed to move after the fence.
A memory barrier that creates only one of the two is sometimes called a half-fence. A memory barrier that creates both is sometimes called a full-fence.
The volatile keyword creates half-fences. Reads of volatile fields have acquire semantics while writes have release semantics. That means no instruction can be moved before a read or after a write.
The lock keyword creates full-fences on both boundaries (entry and exit). That means no instruction can be moved either before or after each boundary.
However, all of this is moot if we are only concerned with one thread. Ordering, as it is perceived by that thread, is always preserved. In fact, without that fundamental guarantee no program would ever work right. The real issue is with how other threads perceive reads and writes. That is where you need to be concerned.
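As a minimal sketch of how the half-fences of volatile matter across threads, the example below publishes a value through a volatile flag. The class and field names (VolatileFlagDemo, _payload, _ready) are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Threading;

public static class VolatileFlagDemo
{
    // Hypothetical fields, for illustration only.
    private static int _payload;
    private static volatile bool _ready;

    public static int Run()
    {
        _payload = 0;
        _ready = false;
        int observed = -1;

        var reader = new Thread(() =>
        {
            // Volatile read (acquire): the read of _payload below
            // cannot be moved before this read of _ready.
            while (!_ready) { Thread.SpinWait(1); }
            observed = _payload;
        });
        reader.Start();

        _payload = 42;  // ordinary write
        _ready = true;  // volatile write (release): the write above cannot be moved after it

        reader.Join();
        return observed;
    }
}
```

Because the write to _payload cannot move after the volatile write, and the read of _payload cannot move before the volatile read, a reader that sees _ready == true also sees the payload that was written before it.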
MemoryBarrier:
Synchronizes memory access as follows: The processor executing the current thread cannot reorder instructions in such a way that memory accesses prior to the call to MemoryBarrier execute after memory accesses that follow the call to MemoryBarrier.
Thread.MemoryBarrier();
Mutex:
When two or more threads need to access a shared resource at the same time, the system needs a synchronization mechanism to ensure that only one thread at a time uses the resource. Mutex is a synchronization primitive that grants exclusive access to the shared resource to only one thread. If a thread acquires a mutex, the second thread that wants to acquire that mutex is suspended until the first thread releases the mutex.
The Mutex class enforces thread identity, so a mutex can be released only by the thread that acquired it. By contrast, the Semaphore class does not enforce thread identity.
An abandoned mutex often indicates a serious error in the code. When a thread exits without releasing the mutex, the data structures protected by the mutex might not be in a consistent state. The next thread to acquire the mutex receives an AbandonedMutexException; it can handle this exception and proceed, if the integrity of the data structures can be verified.
// This example shows how a Mutex is used to synchronize access to a protected resource. Unlike Monitor, Mutex can be used with
// WaitHandle.WaitAll and WaitAny, and can be passed across AppDomain boundaries.
// Create a new Mutex. The creating thread does not own the Mutex.
private static Mutex mut = new Mutex();
// Wait until it is safe to enter.
mut.WaitOne();
// Release the Mutex.
mut.ReleaseMutex();
Mutexes are of two types: local mutexes, which are unnamed, and named system mutexes. A local mutex exists only within your process. It can be used by any thread in your process that has a reference to the Mutex object that represents the mutex. Each unnamed Mutex object represents a separate local mutex.
On a server that is running Terminal Services, a named system mutex can have two levels of visibility. If its name begins with the prefix "Global\", the mutex is visible in all terminal server sessions. If its name begins with the prefix "Local\", the mutex is visible only in the terminal server session where it was created.
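The WaitOne / ReleaseMutex pattern described above can be sketched with a local (unnamed) mutex guarding a shared counter. MutexCounterDemo and its members are hypothetical names; this is an illustration under those assumptions, not the MSDN sample.

```csharp
using System;
using System.Threading;

public static class MutexCounterDemo
{
    // Local (unnamed) mutex; the creating thread does not own it.
    private static readonly Mutex Mut = new Mutex();
    private static int _counter;

    public static int Run(int threadCount, int incrementsPerThread)
    {
        _counter = 0;
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < incrementsPerThread; j++)
                {
                    Mut.WaitOne();               // wait until it is safe to enter
                    try { _counter++; }          // protected resource
                    finally { Mut.ReleaseMutex(); } // must be released by the owning thread
                }
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return _counter;
    }
}
```

Releasing in a finally block keeps the mutex from being abandoned if the protected code throws.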
Here is my take on the subject, along with an attempt to provide a quasi-complete list in one answer. If I run across any others I will edit my answer from time to time.
Mechanisms that are generally agreed upon to cause implicit barriers:
· All Monitor methods, including the C# keyword lock.
· All Interlocked methods.
· Thread.VolatileRead and Thread.VolatileWrite.
· Thread.MemoryBarrier.
· The volatile keyword.
· Anything that starts a thread or causes a delegate to execute on another thread, including QueueUserWorkItem, Task.Factory.StartNew, Thread.Start, compiler-supplied BeginInvoke methods, etc.
· Using a signaling mechanism such as ManualResetEvent, AutoResetEvent, CountdownEvent, Semaphore, Barrier, etc.
· Using marshaling operations such as Control.Invoke, Dispatcher.Invoke, SynchronizationContext.Post, etc.
Mechanisms that are speculated (but not known for certain) to cause implicit barriers:
· Thread.Sleep (proposed by myself and possibly others, because code that exhibits a memory barrier problem can be fixed with this method).
· Thread.Yield
· Thread.SpinWait
· Lazy<T>, depending on which LazyThreadSafetyMode is specified.
Other notable mentions:
· Default add and remove handlers for events in C#, since they use lock or Interlocked.CompareExchange.
· x86 stores have release fence semantics.
· Microsoft's implementation of the CLI has release fence semantics on writes despite the fact that the ECMA specification does not mandate it.
· MarshalByRefObject seems to suppress certain optimizations in subclasses, which may make it appear as if an implicit memory barrier were present. Thanks to Hans Passant for discovering this and bringing it to my attention. [1]
[1] This explains why BackgroundWorker works correctly without having volatile on the underlying field for the CancellationPending property.
public sealed class Singleton {
    private Singleton() {}
    private static Singleton value;
    private static object syncRoot = new Object();
    public static Singleton Value {
        get {
            if (Singleton.value == null) {
                lock (syncRoot) {
                    if (Singleton.value == null) {
                        Singleton newVal = new Singleton();
                        // Ensure all writes used to construct the new value have been flushed.
                        System.Threading.Thread.MemoryBarrier();
                        Singleton.value = newVal; // publish the new value
                    }
                }
            }
            return Singleton.value;
        }
    }
}
Interlocked:
Provides atomic operations for variables that are shared by multiple threads.
The methods of this class help protect against errors that can occur when the scheduler switches contexts while a thread is updating a variable that can be accessed by other threads, or when two threads are executing concurrently on separate processors. The members of this class do not throw exceptions.
The Increment and Decrement methods increment or decrement a variable and store the resulting value in a single operation. On most computers, incrementing a variable is not an atomic operation, requiring the following steps:
1. Load a value from an instance variable into a register.
2. Increment or decrement the value.
3. Store the value in the instance variable.
If you do not use Increment and Decrement, a thread can be preempted after executing the first two steps. Another thread can then execute all three steps. When the first thread resumes execution, it overwrites the value in the instance variable, and the effect of the increment or decrement performed by the second thread is lost.
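A minimal sketch of the fix for the lost-update scenario above: each thread uses Interlocked.Increment, so the load, increment, and store happen as one atomic operation. InterlockedCounterDemo and its members are hypothetical names used for illustration.

```csharp
using System;
using System.Threading;

public static class InterlockedCounterDemo
{
    private static int _counter;

    public static int Run(int threadCount, int incrementsPerThread)
    {
        _counter = 0;
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < incrementsPerThread; j++)
                {
                    // Atomic read-modify-write; a plain _counter++ here could lose updates.
                    Interlocked.Increment(ref _counter);
                }
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return _counter;
    }
}
```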
The Exchange method atomically exchanges the values of the specified variables. The CompareExchange method combines two operations: comparing two values and storing a third value in one of the variables, based on the outcome of the comparison. The compare and exchange operations are performed as an atomic operation.
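CompareExchange is commonly used in a retry loop to build other atomic updates. The sketch below, with the hypothetical helper AtomicMax.Update, raises a shared value to a candidate only if the candidate is larger; it is one possible illustration of the pattern, not part of the Interlocked class.

```csharp
using System;
using System.Threading;

public static class AtomicMax
{
    // Hypothetical helper: atomically sets target to max(target, candidate).
    public static void Update(ref int target, int candidate)
    {
        int current;
        do
        {
            current = Volatile.Read(ref target);
            if (candidate <= current) return; // nothing to do
            // Store candidate only if target still equals current;
            // otherwise another thread changed it, so retry.
        }
        while (Interlocked.CompareExchange(ref target, candidate, current) != current);
    }
}
```

CompareExchange returns the value that was in target before the call, so comparing it with current tells the caller whether the store actually happened.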
Read: Returns a 64-bit value, loaded as an atomic operation.
The Read method is unnecessary on 64-bit systems, because 64-bit read operations are already atomic.
On 32-bit systems, 64-bit read operations are not atomic unless performed using Read.
Semaphore
Use the Semaphore class to control access to a pool of resources. Threads enter the semaphore by calling the WaitOne method, which is inherited from the WaitHandle class, and release the semaphore by calling the Release method.
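As a rough sketch of a resource pool, the example below lets several worker threads compete for a semaphore with a fixed number of slots and records the highest number of threads that were inside at once. SemaphoreDemo, Run, and the 10 ms sleep that stands in for real work are assumptions made for illustration.

```csharp
using System;
using System.Threading;

public static class SemaphoreDemo
{
    // Returns the highest number of threads observed inside the semaphore at once.
    public static int Run(int workers, int poolSize)
    {
        int inside = 0, maxInside = 0;
        var gate = new object();
        using (var pool = new Semaphore(poolSize, poolSize))
        {
            var threads = new Thread[workers];
            for (int i = 0; i < workers; i++)
            {
                threads[i] = new Thread(() =>
                {
                    pool.WaitOne(); // enter the semaphore
                    try
                    {
                        lock (gate) { inside++; if (inside > maxInside) maxInside = inside; }
                        Thread.Sleep(10); // stand-in for using a pooled resource
                        lock (gate) { inside--; }
                    }
                    finally { pool.Release(); } // leave the semaphore
                });
                threads[i].Start();
            }
            foreach (var t in threads) t.Join();
        }
        return maxInside;
    }
}
```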
ReaderWriterLockSlim
Represents a lock that is used to manage access to a resource, allowing multiple threads for reading or exclusive access for writing.
ReaderWriterLock is used to synchronize access to a resource. At any given time, it allows either concurrent read access for multiple threads, or write access for a single thread. In a situation where a resource is changed infrequently, a ReaderWriterLock provides better throughput than a simple one-at-a-time lock, such as Monitor.
ReaderWriterLock works best where most accesses are reads, while writes are infrequent and of short duration. Multiple readers alternate with single writers, so that neither readers nor writers are blocked for long periods.
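A minimal sketch of the read-mostly pattern using ReaderWriterLockSlim: many threads may hold the read lock at once, while a writer gets exclusive access. The SynchronizedCache class and its members are hypothetical names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class SynchronizedCache
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<int, string> _items = new Dictionary<int, string>();

    // Multiple threads can be inside TryRead at the same time.
    public bool TryRead(int key, out string value)
    {
        _lock.EnterReadLock();
        try { return _items.TryGetValue(key, out value); }
        finally { _lock.ExitReadLock(); }
    }

    // Only one thread can be inside Write, and no readers are admitted meanwhile.
    public void Write(int key, string value)
    {
        _lock.EnterWriteLock();
        try { _items[key] = value; }
        finally { _lock.ExitWriteLock(); }
    }
}
```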
Barrier
Enables multiple tasks to cooperatively work on an algorithm in parallel through multiple phases.
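As an illustrative sketch, the participants below all call SignalAndWait at the end of each phase, and nobody moves on until everyone has arrived. BarrierDemo and Run are hypothetical names; the post-phase action simply counts completed phases.

```csharp
using System;
using System.Threading;

public static class BarrierDemo
{
    // Returns the number of phases that all participants completed together.
    public static int Run(int participants, int phases)
    {
        int completedPhases = 0;
        // The post-phase action runs once per phase, after every participant
        // has arrived and before any of them is released.
        using (var barrier = new Barrier(participants, b => completedPhases++))
        {
            var threads = new Thread[participants];
            for (int i = 0; i < participants; i++)
            {
                threads[i] = new Thread(() =>
                {
                    for (int p = 0; p < phases; p++)
                    {
                        // ... this participant's share of the work for phase p ...
                        barrier.SignalAndWait(); // wait for the other participants
                    }
                });
                threads[i].Start();
            }
            foreach (var t in threads) t.Join();
        }
        return completedPhases;
    }
}
```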
CountdownEvent
Represents a synchronization primitive that is signaled when its count reaches zero.
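A small sketch of the idea: the event starts with a count equal to the number of workers, each worker calls Signal when it finishes, and Wait returns once the count reaches zero. CountdownDemo and Run are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading;

public static class CountdownDemo
{
    // Returns how many workers had finished at the moment Wait returned.
    public static int Run(int workers)
    {
        int finished = 0;
        var threads = new Thread[workers];
        using (var countdown = new CountdownEvent(workers))
        {
            for (int i = 0; i < workers; i++)
            {
                threads[i] = new Thread(() =>
                {
                    Interlocked.Increment(ref finished); // stand-in for real work
                    countdown.Signal();                  // decrement the count by one
                });
                threads[i].Start();
            }

            countdown.Wait(); // blocks until the count reaches zero
            int observed = Volatile.Read(ref finished);

            // Join before leaving the using block so no worker is still
            // inside Signal when the event is disposed.
            foreach (var t in threads) t.Join();
            return observed;
        }
    }
}
```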
EventWaitHandle
Reset: Sets the state of the event to nonsignaled, causing threads to block.
Set: Sets the state of the event to signaled, allowing one or more waiting threads to proceed.
WaitOne: Blocks the current thread until the current WaitHandle receives a signal.
ExpensiveData _data = null;
bool _dataInitialized = false;
object _dataLock = new object();
// ...
ExpensiveData dataToUse = LazyInitializer.EnsureInitialized(ref _data, ref _dataInitialized, ref _dataLock);
ManualResetEvent allows threads to communicate with each other by signaling. Typically, this communication concerns a task which one thread must complete before other threads can proceed.
When a thread begins an activity that must complete before other threads proceed, it calls Reset to put ManualResetEvent in the non-signaled state. This thread can be thought of as controlling the ManualResetEvent. Threads that call WaitOne on the ManualResetEvent will block, awaiting the signal. When the controlling thread completes the activity, it calls Set to signal that the waiting threads can proceed. All waiting threads are released.
Once it has been signaled, ManualResetEvent remains signaled until it is manually reset. That is, calls to WaitOne return immediately.
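The behavior described above can be sketched as follows: several threads block in WaitOne, the controlling thread calls Set, all of them are released, and the event stays signaled afterwards. ManualResetEventDemo and Run are hypothetical names for this illustration.

```csharp
using System;
using System.Threading;

public static class ManualResetEventDemo
{
    // Returns the number of waiting threads that were released by Set.
    public static int Run(int waiters)
    {
        int released = 0;
        // false = start in the non-signaled state, as if Reset had been called.
        using (var ready = new ManualResetEvent(false))
        {
            var threads = new Thread[waiters];
            for (int i = 0; i < waiters; i++)
            {
                threads[i] = new Thread(() =>
                {
                    ready.WaitOne();                     // block until signaled
                    Interlocked.Increment(ref released);
                });
                threads[i].Start();
            }

            // ... the controlling thread completes its activity here ...
            ready.Set(); // releases all waiting threads

            foreach (var t in threads) t.Join();

            // The event remains signaled until Reset is called,
            // so a further wait returns immediately.
            if (!ready.WaitOne(0)) throw new InvalidOperationException("event should still be signaled");
        }
        return released;
    }
}
```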