x86: fix long standing bug with usb after hibernation with 4GB ram

diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt

index 49e27cc19385385c25618d661566c13024a6c2fa..42b01bc2e1b4f01f414b340a6204fb3ea1087953 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -144,9 +144,47 @@ over a rather long period of time, but improvements are always welcome!
         whether the increased speed is worth it.
  
  8.     Although synchronize_rcu() is a bit slower than is call_rcu(),
-       it usually results in simpler code.  So, unless update performance
-       is important or the updaters cannot block, synchronize_rcu()
-       should be used in preference to call_rcu().
+       it usually results in simpler code.  So, unless update
+       performance is critically important or the updaters cannot block,
+       synchronize_rcu() should be used in preference to call_rcu().
+
+       An especially important property of the synchronize_rcu()
+       primitive is that it automatically self-limits: if grace periods
+       are delayed for whatever reason, then the synchronize_rcu()
+       primitive will correspondingly delay updates.  In contrast,
+       code using call_rcu() should explicitly limit update rate in
+       cases where grace periods are delayed, as failing to do so can
+       result in excessive realtime latencies or even OOM conditions.
+
+       Ways of gaining this self-limiting property when using call_rcu()
+       include:
+
+       a.      Keeping a count of the number of data-structure elements
+               used by the RCU-protected data structure, including those
+               waiting for a grace period to elapse.  Enforce a limit
+               on this number, stalling updates as needed to allow
+               previously deferred frees to complete.
+
+               Alternatively, limit only the number awaiting deferred
+               free rather than the total number of elements.
+
+       b.      Limiting update rate.  For example, if updates occur only
+               once per hour, then no explicit rate limiting is required,
+               unless your system is already badly broken.  The dcache
+               subsystem takes this approach -- updates are guarded
+               by a global lock, limiting their rate.
+
+       c.      Trusted update -- if updates can only be done manually by
+               superuser or some other trusted user, then it might not
+               be necessary to automatically limit them.  The theory
+               here is that superuser already has lots of ways to crash
+               the machine.
+
+       d.      Use call_rcu_bh() rather than call_rcu(), in order to take
+               advantage of call_rcu_bh()'s faster grace periods.
+
+       e.      Periodically invoke synchronize_rcu(), permitting a limited
+               number of updates per grace period.
  
  9.     All RCU list-traversal primitives, which include
         list_for_each_rcu(), list_for_each_entry_rcu(),
@@ -183,3 +221,49 @@ over a rather long period of time, but improvements are always welcome!
         disable irq on a given acquisition of that lock will result in
         deadlock as soon as the RCU callback happens to interrupt that
         acquisition's critical section.
+
+13.    RCU callbacks can be and are executed in parallel.  In many cases,
+       the callback code is simply a wrapper around kfree(), so that this
+       is not an issue (or, more accurately, to the extent that it is
+       an issue, the memory-allocator locking handles it).  However,
+       if the callbacks do manipulate a shared data structure, they
+       must use whatever locking or other synchronization is required
+       to safely access and/or modify that data structure.
+
+14.    SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu())
+       may only be invoked from process context.  Unlike other forms of
+       RCU, it -is- permissible to block in an SRCU read-side critical
+       section (demarcated by srcu_read_lock() and srcu_read_unlock()),
+       hence the "SRCU": "sleepable RCU".  Please note that if you
+       don't need to sleep in read-side critical sections, you should
+       be using RCU rather than SRCU, because RCU is almost always
+       faster and easier to use than is SRCU.
+
+       Also unlike other forms of RCU, explicit initialization
+       and cleanup are required via init_srcu_struct() and
+       cleanup_srcu_struct().  These are passed a "struct srcu_struct"
+       that defines the scope of a given SRCU domain.  Once initialized,
+       the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock()
+       and synchronize_srcu().  A given synchronize_srcu() waits only
+       for SRCU read-side critical sections governed by srcu_read_lock()
+       and srcu_read_unlock() calls that have been passed the same
+       srcu_struct.  This property is what makes sleeping read-side
+       critical sections tolerable -- a given subsystem delays only
+       its own updates, not those of other subsystems using SRCU.
+       Therefore, SRCU is less prone to OOM the system than RCU would
+       be if RCU's read-side critical sections were permitted to
+       sleep.
+
+       The ability to sleep in read-side critical sections does not
+       come for free.  First, corresponding srcu_read_lock() and
+       srcu_read_unlock() calls must be passed the same srcu_struct.
+       Second, grace-period-detection overhead is amortized only
+       over those updates sharing a given srcu_struct, rather than
+       being globally amortized as it is for other forms of RCU.
+       Therefore, SRCU should be used in preference to rw_semaphore
+       only in extremely read-intensive situations, or in situations
+       requiring SRCU's read-side deadlock immunity or low read-side
+       realtime latency.
+
+       Note that rcu_assign_pointer() and rcu_dereference() relate to
+       SRCU just as they do to other forms of RCU.
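
The second form of item 8(a) above, limiting only the number of elements
awaiting deferred free, can be modeled in userspace roughly as follows.
This is a sketch under stated assumptions, not kernel code and not part
of the patch: defer_free(), grace_period_elapsed(), MAX_PENDING, and the
single-threaded "grace period" are hypothetical stand-ins for call_rcu()
and a real grace period.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_PENDING 4	/* hypothetical cap on elements awaiting deferred free */

struct pending {
	struct pending *next;
	void *obj;
};

static struct pending *pending_head;	/* stand-in for the callback queue */
static int nr_pending;			/* elements awaiting a grace period */
static int nr_stalls;			/* times an updater had to wait */

/* Stand-in for the end of a grace period: run every deferred free. */
static void grace_period_elapsed(void)
{
	while (pending_head) {
		struct pending *p = pending_head;

		pending_head = p->next;
		free(p->obj);
		free(p);
		nr_pending--;
	}
	assert(nr_pending == 0);
}

/*
 * Stand-in for call_rcu() with the limit from item 8(a) applied: if too
 * many elements already await deferred free, stall the updater until the
 * earlier frees have completed.  Returns 0 on success, -1 on allocation
 * failure.
 */
static int defer_free(void *obj)
{
	struct pending *p;

	if (nr_pending >= MAX_PENDING) {
		nr_stalls++;
		/* Real kernel code would block here instead of faking it. */
		grace_period_elapsed();
	}

	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	p->obj = obj;
	p->next = pending_head;
	pending_head = p;
	nr_pending++;
	return 0;
}
```

In this model, queuing ten objects with MAX_PENDING set to 4 stalls the
updater twice and never lets more than four elements await a free.  The
point is only that the bound is enforced on the update path, which is the
self-limiting property that synchronize_rcu() provides automatically.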