Message ID | 1320265849-5744-24-git-send-email-paulmck@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
On Wed, Nov 02, 2011 at 01:30:45PM -0700, Paul E. McKenney wrote: > The RCU implementations, including SRCU, are designed to be used in a > lock-like fashion, so that the read-side lock and unlock primitives must > execute in the same context for any given read-side critical section. > This constraint is enforced by lockdep-RCU. However, there is a need for > something that acts more like a reference count than a lock, in order > to allow (for example) the reference to be acquired within the context > of an exception, while that same reference is released in the context of > the task that encountered the exception. The cost of this capability is > that the read-side operations incur the overhead of disabling interrupts. > Some optimization is possible, and will be carried out if warranted. > > Note that although the current implementation allows a given reference to > be acquired by one task and then released by another, all known possible > implementations that allow this have scalability problems. Therefore, > a given reference must be released by the same task that acquired it, > though perhaps from an interrupt or exception handler running within > that task's context. This new bulkref API seems in dire need of documentation. :) > --- a/include/linux/srcu.h > +++ b/include/linux/srcu.h > @@ -181,4 +181,54 @@ static inline void srcu_read_unlock(struct srcu_struct *sp, int idx) > __srcu_read_unlock(sp, idx); > } > > +/* Definitions for bulkref_t, currently defined in terms of SRCU. */ > + > +typedef struct srcu_struct bulkref_t; > +int init_srcu_struct_fields(struct srcu_struct *sp); > + > +static inline int init_bulkref(bulkref_t *brp) > +{ > + return init_srcu_struct_fields(brp); > +} Why can't this call init_srcu_struct and avoid the need to use the previously unexported internal function? - Josh Triplett
On Wed, Nov 02, 2011 at 09:34:41PM -0700, Josh Triplett wrote: > On Wed, Nov 02, 2011 at 01:30:45PM -0700, Paul E. McKenney wrote: > > The RCU implementations, including SRCU, are designed to be used in a > > lock-like fashion, so that the read-side lock and unlock primitives must > > execute in the same context for any given read-side critical section. > > This constraint is enforced by lockdep-RCU. However, there is a need for > > something that acts more like a reference count than a lock, in order > > to allow (for example) the reference to be acquired within the context > > of an exception, while that same reference is released in the context of > > the task that encountered the exception. The cost of this capability is > > that the read-side operations incur the overhead of disabling interrupts. > > Some optimization is possible, and will be carried out if warranted. > > > > Note that although the current implementation allows a given reference to > > be acquired by one task and then released by another, all known possible > > implementations that allow this have scalability problems. Therefore, > > a given reference must be released by the same task that acquired it, > > though perhaps from an interrupt or exception handler running within > > that task's context. > > This new bulkref API seems in dire need of documentation. :) > > > --- a/include/linux/srcu.h > > +++ b/include/linux/srcu.h > > @@ -181,4 +181,54 @@ static inline void srcu_read_unlock(struct srcu_struct *sp, int idx) > > __srcu_read_unlock(sp, idx); > > } > > > > +/* Definitions for bulkref_t, currently defined in terms of SRCU. */ > > + > > +typedef struct srcu_struct bulkref_t; > > +int init_srcu_struct_fields(struct srcu_struct *sp); > > + > > +static inline int init_bulkref(bulkref_t *brp) > > +{ > > + return init_srcu_struct_fields(brp); > > +} > > Why can't this call init_srcu_struct and avoid the need to use the > previously unexported internal function? Seems reasonable now that you mention it. ;-) Thanx, Paul
On Thu, Nov 03, 2011 at 06:34:35AM -0700, Paul E. McKenney wrote: > On Wed, Nov 02, 2011 at 09:34:41PM -0700, Josh Triplett wrote: > > On Wed, Nov 02, 2011 at 01:30:45PM -0700, Paul E. McKenney wrote: > > > The RCU implementations, including SRCU, are designed to be used in a > > > lock-like fashion, so that the read-side lock and unlock primitives must > > > execute in the same context for any given read-side critical section. > > > This constraint is enforced by lockdep-RCU. However, there is a need for > > > something that acts more like a reference count than a lock, in order > > > to allow (for example) the reference to be acquired within the context > > > of an exception, while that same reference is released in the context of > > > the task that encountered the exception. The cost of this capability is > > > that the read-side operations incur the overhead of disabling interrupts. > > > Some optimization is possible, and will be carried out if warranted. > > > > > > Note that although the current implementation allows a given reference to > > > be acquired by one task and then released by another, all known possible > > > implementations that allow this have scalability problems. Therefore, > > > a given reference must be released by the same task that acquired it, > > > though perhaps from an interrupt or exception handler running within > > > that task's context. > > > > This new bulkref API seems in dire need of documentation. :) > > > > > --- a/include/linux/srcu.h > > > +++ b/include/linux/srcu.h > > > @@ -181,4 +181,54 @@ static inline void srcu_read_unlock(struct srcu_struct *sp, int idx) > > > __srcu_read_unlock(sp, idx); > > > } > > > > > > +/* Definitions for bulkref_t, currently defined in terms of SRCU. */ > > > + > > > +typedef struct srcu_struct bulkref_t; > > > +int init_srcu_struct_fields(struct srcu_struct *sp); > > > + > > > +static inline int init_bulkref(bulkref_t *brp) > > > +{ > > > + return init_srcu_struct_fields(brp); > > > +} > > > > Why can't this call init_srcu_struct and avoid the need to use the > > previously unexported internal function? > > Seems reasonable now that you mention it. ;-) Except that doing so results in lockdep initialization that cannot be used. :-( Thanx, Paul
On Wed, 2011-11-02 at 13:30 -0700, Paul E. McKenney wrote: > The RCU implementations, including SRCU, are designed to be used in a > lock-like fashion, so that the read-side lock and unlock primitives must > execute in the same context for any given read-side critical section. > This constraint is enforced by lockdep-RCU. However, there is a need for > something that acts more like a reference count than a lock, in order > to allow (for example) the reference to be acquired within the context > of an exception, while that same reference is released in the context of > the task that encountered the exception. The cost of this capability is > that the read-side operations incur the overhead of disabling interrupts. > Some optimization is possible, and will be carried out if warranted. > > Note that although the current implementation allows a given reference to > be acquired by one task and then released by another, all known possible > implementations that allow this have scalability problems. Therefore, > a given reference must be released by the same task that acquired it, > though perhaps from an interrupt or exception handler running within > that task's context. I'm having trouble with the naming as well as the need for an explicit new API. To me this looks like a regular (S)RCU variant, nothing to do with references per-se (aside from the fact that SRCU is a refcounted rcu variant). Also WTF is this bulk stuff about? Its still a single ref at a time, not 10s or 100s or whatnot. > +static inline int bulkref_get(bulkref_t *brp) > +{ > + unsigned long flags; > + int ret; > + > + local_irq_save(flags); > + ret = __srcu_read_lock(brp); > + local_irq_restore(flags); > + return ret; > +} > + > +static inline void bulkref_put(bulkref_t *brp, int idx) > +{ > + unsigned long flags; > + > + local_irq_save(flags); > + __srcu_read_unlock(brp, idx); > + local_irq_restore(flags); > +} This seems to be the main gist of the patch, which to me sounds utterly ridiculous. Why not document that srcu_read_{un,}lock() aren't IRQ safe and if you want to use it from those contexts you have to fix it up yourself. RCU lockdep doesn't do the full validation so it won't actually catch it if you mess up the irq states, but I guess if you want we could look at adding that. > diff --git a/kernel/srcu.c b/kernel/srcu.c > index 73ce23f..10214c8 100644 > --- a/kernel/srcu.c > +++ b/kernel/srcu.c > @@ -34,13 +34,14 @@ > #include <linux/delay.h> > #include <linux/srcu.h> > > -static int init_srcu_struct_fields(struct srcu_struct *sp) > +int init_srcu_struct_fields(struct srcu_struct *sp) > { > sp->completed = 0; > mutex_init(&sp->mutex); > sp->per_cpu_ref = alloc_percpu(struct srcu_struct_array); > return sp->per_cpu_ref ? 0 : -ENOMEM; > } > +EXPORT_SYMBOL_GPL(init_srcu_struct_fields); What do we need this export for? Usually we don't add exports unless there's a use-case. Since Srikar requested this nonsense, I guess the user is uprobes, but that isn't a module, so no export needed.
On Mon, Nov 28, 2011 at 01:41:11PM +0100, Peter Zijlstra wrote: > On Wed, 2011-11-02 at 13:30 -0700, Paul E. McKenney wrote: > > The RCU implementations, including SRCU, are designed to be used in a > > lock-like fashion, so that the read-side lock and unlock primitives must > > execute in the same context for any given read-side critical section. > > This constraint is enforced by lockdep-RCU. However, there is a need for > > something that acts more like a reference count than a lock, in order > > to allow (for example) the reference to be acquired within the context > > of an exception, while that same reference is released in the context of > > the task that encountered the exception. The cost of this capability is > > that the read-side operations incur the overhead of disabling interrupts. > > Some optimization is possible, and will be carried out if warranted. > > > > Note that although the current implementation allows a given reference to > > be acquired by one task and then released by another, all known possible > > implementations that allow this have scalability problems. Therefore, > > a given reference must be released by the same task that acquired it, > > though perhaps from an interrupt or exception handler running within > > that task's context. > > I'm having trouble with the naming as well as the need for an explicit > new API. > > To me this looks like a regular (S)RCU variant, nothing to do with > references per-se (aside from the fact that SRCU is a refcounted rcu > variant). Also WTF is this bulk stuff about? Its still a single ref at a > time, not 10s or 100s or whatnot. It is a bulk reference in comparison to a conventional atomic_inc()-style reference count, which is normally associated with a specific structure. In contrast, doing a bulkref_get() normally protects a group of structures, everything covered by the bulkref_t. Yes, in theory you could have a global reference counter that protected a group of structures, but in practice we both know that this would not end well. ;-) > > +static inline int bulkref_get(bulkref_t *brp) > > +{ > > + unsigned long flags; > > + int ret; > > + > > + local_irq_save(flags); > > + ret = __srcu_read_lock(brp); > > + local_irq_restore(flags); > > + return ret; > > +} > > + > > +static inline void bulkref_put(bulkref_t *brp, int idx) > > +{ > > + unsigned long flags; > > + > > + local_irq_save(flags); > > + __srcu_read_unlock(brp, idx); > > + local_irq_restore(flags); > > +} > > This seems to be the main gist of the patch, which to me sounds utterly > ridiculous. Why not document that srcu_read_{un,}lock() aren't IRQ safe > and if you want to use it from those contexts you have to fix it up > yourself. I thought I had documented this, but I guess not. I will add that. I lost you on the "fix it up yourself" -- what are you suggesting that someone needing to use RCU in this manner actually do? > RCU lockdep doesn't do the full validation so it won't actually catch it > if you mess up the irq states, but I guess if you want we could look at > adding that. Ah, I had missed that. Yes, it would be very good if that could be added. The vast majority of the uses exit the RCU read-side critical section in the same context that they enter it, so it would be good to check. > > diff --git a/kernel/srcu.c b/kernel/srcu.c > > index 73ce23f..10214c8 100644 > > --- a/kernel/srcu.c > > +++ b/kernel/srcu.c > > @@ -34,13 +34,14 @@ > > #include <linux/delay.h> > > #include <linux/srcu.h> > > > > -static int init_srcu_struct_fields(struct srcu_struct *sp) > > +int init_srcu_struct_fields(struct srcu_struct *sp) > > { > > sp->completed = 0; > > mutex_init(&sp->mutex); > > sp->per_cpu_ref = alloc_percpu(struct srcu_struct_array); > > return sp->per_cpu_ref ? 0 : -ENOMEM; > > } > > +EXPORT_SYMBOL_GPL(init_srcu_struct_fields); > > What do we need this export for? Usually we don't add exports unless > there's a use-case. Since Srikar requested this nonsense, I guess the > user is uprobes, but that isn't a module, so no export needed. Yep, the user is uprobes. The export is for rcutorture, which can run as a module. Thanx, Paul
On Mon, 2011-11-28 at 09:15 -0800, Paul E. McKenney wrote: > > I'm having trouble with the naming as well as the need for an explicit > > new API. > > > > To me this looks like a regular (S)RCU variant, nothing to do with > > references per-se (aside from the fact that SRCU is a refcounted rcu > > variant). Also WTF is this bulk stuff about? Its still a single ref at a > > time, not 10s or 100s or whatnot. > > It is a bulk reference in comparison to a conventional atomic_inc()-style > reference count, which is normally associated with a specific structure. > In contrast, doing a bulkref_get() normally protects a group of structures, > everything covered by the bulkref_t. > > Yes, in theory you could have a global reference counter that protected > a group of structures, but in practice we both know that this would not > end well. ;-) Well, all the counter based RCUs are basically that. And yes, making them scale is 'interesting', however you've done pretty well so far ;-) I just hate the name in that it totally obscures the fact that its regular SRCU. > > > +static inline int bulkref_get(bulkref_t *brp) > > > +{ > > > + unsigned long flags; > > > + int ret; > > > + > > > + local_irq_save(flags); > > > + ret = __srcu_read_lock(brp); > > > + local_irq_restore(flags); > > > + return ret; > > > +} > > > + > > > +static inline void bulkref_put(bulkref_t *brp, int idx) > > > +{ > > > + unsigned long flags; > > > + > > > + local_irq_save(flags); > > > + __srcu_read_unlock(brp, idx); > > > + local_irq_restore(flags); > > > +} > > > > This seems to be the main gist of the patch, which to me sounds utterly > > ridiculous. Why not document that srcu_read_{un,}lock() aren't IRQ safe > > and if you want to use it from those contexts you have to fix it up > > yourself. > > I thought I had documented this, but I guess not. I will add that. Oh, I hadn't checked, it could be. > I lost you on the "fix it up yourself" -- what are you suggesting that > someone needing to use RCU in this manner actually do? local_irq_save(flags); srcu_read_lock(&my_srcu_domain); local_irq_restore(flags); and local_irq_save(flags); srcu_read_unlock(&my_srcu_domain); local_irq_restore(flags) Doesn't look to be too hard, or confusing. > > RCU lockdep doesn't do the full validation so it won't actually catch it > > if you mess up the irq states, but I guess if you want we could look at > > adding that. > > Ah, I had missed that. Yes, it would be very good if that could be added. > The vast majority of the uses exit the RCU read-side critical section in > the same context that they enter it, so it would be good to check. /me adds to TODO list.
On Mon, Nov 28, 2011 at 07:17:59PM +0100, Peter Zijlstra wrote: > On Mon, 2011-11-28 at 09:15 -0800, Paul E. McKenney wrote: > > > > I'm having trouble with the naming as well as the need for an explicit > > > new API. > > > > > > To me this looks like a regular (S)RCU variant, nothing to do with > > > references per-se (aside from the fact that SRCU is a refcounted rcu > > > variant). Also WTF is this bulk stuff about? Its still a single ref at a > > > time, not 10s or 100s or whatnot. > > > > It is a bulk reference in comparison to a conventional atomic_inc()-style > > reference count, which is normally associated with a specific structure. > > In contrast, doing a bulkref_get() normally protects a group of structures, > > everything covered by the bulkref_t. > > > > Yes, in theory you could have a global reference counter that protected > > a group of structures, but in practice we both know that this would not > > end well. ;-) > > Well, all the counter based RCUs are basically that. And yes, making > them scale is 'interesting', however you've done pretty well so far ;-) Fair point, and thank you for the vote of confidence. ;-) Nevertheless, when most people talk to me about explicit reference counters, they are thinking in terms of a reference counter within a structure protecting that structure. > I just hate the name in that it totally obscures the fact that its > regular SRCU. OK, what names would you suggest? > > > > +static inline int bulkref_get(bulkref_t *brp) > > > > +{ > > > > + unsigned long flags; > > > > + int ret; > > > > + > > > > + local_irq_save(flags); > > > > + ret = __srcu_read_lock(brp); > > > > + local_irq_restore(flags); > > > > + return ret; > > > > +} > > > > + > > > > +static inline void bulkref_put(bulkref_t *brp, int idx) > > > > +{ > > > > + unsigned long flags; > > > > + > > > > + local_irq_save(flags); > > > > + __srcu_read_unlock(brp, idx); > > > > + local_irq_restore(flags); > > > > +} > > > > > > This seems to be the main gist of the patch, which to me sounds utterly > > > ridiculous. Why not document that srcu_read_{un,}lock() aren't IRQ safe > > > and if you want to use it from those contexts you have to fix it up > > > yourself. > > > > I thought I had documented this, but I guess not. I will add that. > > Oh, I hadn't checked, it could be. It wasn't. I just now fixed it in my local git tree. ;-) > > I lost you on the "fix it up yourself" -- what are you suggesting that > > someone needing to use RCU in this manner actually do? > > local_irq_save(flags); > srcu_read_lock(&my_srcu_domain); > local_irq_restore(flags); > > and > > local_irq_save(flags); > srcu_read_unlock(&my_srcu_domain); > local_irq_restore(flags) > > Doesn't look to be too hard, or confusing. Ah, OK, I was under the mistaken impression that lockdep would splat if you did (for example) srcu_read_lock() in an exception handler and srcu_read_unlock() in the context of the task that took the exception. > > > RCU lockdep doesn't do the full validation so it won't actually catch it > > > if you mess up the irq states, but I guess if you want we could look at > > > adding that. > > > > Ah, I had missed that. Yes, it would be very good if that could be added. > > The vast majority of the uses exit the RCU read-side critical section in > > the same context that they enter it, so it would be good to check. > > /me adds to TODO list. Thank you! Please CC me on this one -- the above fixup would start failing once lockdep checked for this, right? Thanx, Paul
On Mon, 2011-11-28 at 10:31 -0800, Paul E. McKenney wrote: > > local_irq_save(flags); > > srcu_read_lock(&my_srcu_domain); > > local_irq_restore(flags); > > > > and > > > > local_irq_save(flags); > > srcu_read_unlock(&my_srcu_domain); > > local_irq_restore(flags) > > > > Doesn't look to be too hard, or confusing. > > Ah, OK, I was under the mistaken impression that lockdep would splat > if you did (for example) srcu_read_lock() in an exception handler and > srcu_read_unlock() in the context of the task that took the exception. I don't think it will, lockdep does very little actual validation on the RCU locks other than recording they're held. But if they do, the planned TODO item will get inversed. Should be easy enough to test I guess.
On Mon, 2011-11-28 at 10:31 -0800, Paul E. McKenney wrote: > Nevertheless, when most people talk to me about explicit reference > counters, they are thinking in terms of a reference counter within a > structure protecting that structure. Right and when I see bulk I think of exactly those but think +=,-= instead of ++,--. > > I just hate the name in that it totally obscures the fact that its > > regular SRCU. > > OK, what names would you suggest? How about nothing at all? Simply use the existing SRCU API?
On Mon, 2011-11-28 at 19:35 +0100, Peter Zijlstra wrote: > On Mon, 2011-11-28 at 10:31 -0800, Paul E. McKenney wrote: > > > local_irq_save(flags); > > > srcu_read_lock(&my_srcu_domain); > > > local_irq_restore(flags); > > > > > > and > > > > > > local_irq_save(flags); > > > srcu_read_unlock(&my_srcu_domain); > > > local_irq_restore(flags) > > > > > > Doesn't look to be too hard, or confusing. > > > > Ah, OK, I was under the mistaken impression that lockdep would splat > > if you did (for example) srcu_read_lock() in an exception handler and > > srcu_read_unlock() in the context of the task that took the exception. > > I don't think it will, lockdep does very little actual validation on the > RCU locks other than recording they're held. But if they do, the planned > TODO item will get inversed. > > Should be easy enough to test I guess. OK, so I had me a little peek at lockdep and you're right, it will complain. Still uprobes can do: local_irq_save(flags); __srcu_read_lock(&mr_srcu_domain); local_irq_restore(flags); However if you object to exposing the __srcu functions (which I can understand) you could expose these two functions as srcu_read_{,un}lock_raw() or so, to mirror the non-validation also found in rcu_dereference_raw()
On Tue, Nov 29, 2011 at 02:33:35PM +0100, Peter Zijlstra wrote: > On Mon, 2011-11-28 at 19:35 +0100, Peter Zijlstra wrote: > > On Mon, 2011-11-28 at 10:31 -0800, Paul E. McKenney wrote: > > > > local_irq_save(flags); > > > > srcu_read_lock(&my_srcu_domain); > > > > local_irq_restore(flags); > > > > > > > > and > > > > > > > > local_irq_save(flags); > > > > srcu_read_unlock(&my_srcu_domain); > > > > local_irq_restore(flags) > > > > > > > > Doesn't look to be too hard, or confusing. > > > > > > Ah, OK, I was under the mistaken impression that lockdep would splat > > > if you did (for example) srcu_read_lock() in an exception handler and > > > srcu_read_unlock() in the context of the task that took the exception. > > > > I don't think it will, lockdep does very little actual validation on the > > RCU locks other than recording they're held. But if they do, the planned > > TODO item will get inversed. > > > > Should be easy enough to test I guess. > > OK, so I had me a little peek at lockdep and you're right, it will > complain. OK, I will cross that test off my list for today. ;-) > Still uprobes can do: > > local_irq_save(flags); > __srcu_read_lock(&mr_srcu_domain); > local_irq_restore(flags); And this is exactly what the bulkref stuff does, so we at least agree on the implementation. > However if you object to exposing the __srcu functions (which I can > understand) you could expose these two functions as > srcu_read_{,un}lock_raw() or so, to mirror the non-validation also found > in rcu_dereference_raw() Good point, the _raw suffix is used elsewhere in RCU for "turn off lockdep", so it makes sense to use it here as well. I will change to srcu_read_lock_raw() and srcu_read_unlock_raw(). And that has the added benefit of getting rid of the alternative names for the initialization and cleanup functions, so sounds good! Thank you! Thanx, Paul
diff --git a/include/linux/srcu.h b/include/linux/srcu.h index d4b1244..d5334d0 100644 --- a/include/linux/srcu.h +++ b/include/linux/srcu.h @@ -181,4 +181,54 @@ static inline void srcu_read_unlock(struct srcu_struct *sp, int idx) __srcu_read_unlock(sp, idx); } +/* Definitions for bulkref_t, currently defined in terms of SRCU. */ + +typedef struct srcu_struct bulkref_t; +int init_srcu_struct_fields(struct srcu_struct *sp); + +static inline int init_bulkref(bulkref_t *brp) +{ + return init_srcu_struct_fields(brp); +} + +static inline void cleanup_bulkref(bulkref_t *brp) +{ + cleanup_srcu_struct(brp); +} + +static inline int bulkref_get(bulkref_t *brp) +{ + unsigned long flags; + int ret; + + local_irq_save(flags); + ret = __srcu_read_lock(brp); + local_irq_restore(flags); + return ret; +} + +static inline void bulkref_put(bulkref_t *brp, int idx) +{ + unsigned long flags; + + local_irq_save(flags); + __srcu_read_unlock(brp, idx); + local_irq_restore(flags); +} + +static inline void bulkref_wait_old(bulkref_t *brp) +{ + synchronize_srcu(brp); +} + +static inline void bulkref_wait_old_expedited(bulkref_t *brp) +{ + synchronize_srcu_expedited(brp); +} + +static inline long bulkref_batches_completed(bulkref_t *brp) +{ + return srcu_batches_completed(brp); +} + #endif diff --git a/kernel/srcu.c b/kernel/srcu.c index 73ce23f..10214c8 100644 --- a/kernel/srcu.c +++ b/kernel/srcu.c @@ -34,13 +34,14 @@ #include <linux/delay.h> #include <linux/srcu.h> -static int init_srcu_struct_fields(struct srcu_struct *sp) +int init_srcu_struct_fields(struct srcu_struct *sp) { sp->completed = 0; mutex_init(&sp->mutex); sp->per_cpu_ref = alloc_percpu(struct srcu_struct_array); return sp->per_cpu_ref ? 0 : -ENOMEM; } +EXPORT_SYMBOL_GPL(init_srcu_struct_fields); #ifdef CONFIG_DEBUG_LOCK_ALLOC