mbox series

[0/3] rbd: exclusive mapping (-o exclusive) fixes

Message ID 20240724062914.667734-1-idryomov@gmail.com
Headers show
Series rbd: exclusive mapping (-o exclusive) fixes | expand

Message

Ilya Dryomov July 24, 2024, 6:29 a.m. UTC
Hello,

This addresses incorrect assumptions on rbd_dev->lock_state in
rbd_img_exclusive_lock() and rbd_add_acquire_lock() which could lead to
issues with exclusive mappings in the face of watch errors.

Thanks,

                Ilya


Ilya Dryomov (3):
  rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
  rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings
  rbd: don't assume rbd_is_lock_owner() for exclusive mappings

 drivers/block/rbd.c | 35 +++++++++++++++--------------------
 1 file changed, 15 insertions(+), 20 deletions(-)

Comments

Ilya Dryomov July 25, 2024, 9:31 a.m. UTC | #1
On Thu, Jul 25, 2024 at 10:45 AM Dongsheng Yang
<dongsheng.yang@linux.dev> wrote:
>
>
>
> 在 2024/7/24 星期三 下午 2:29, Ilya Dryomov 写道:
> > Expanding on the previous commit, assuming that rbd_is_lock_owner()
> > always returns true (i.e. that we are either in RBD_LOCK_STATE_LOCKED
> > or RBD_LOCK_STATE_QUIESCING) if the mapping is exclusive is wrong too.
> > In case ceph_cls_set_cookie() fails, the lock would be temporarily
> > released even if the mapping is exclusive, meaning that we can end up
> > even in RBD_LOCK_STATE_UNLOCKED.
> >
> > IOW, exclusive mappings are really "just" about disabling automatic
> > lock transitions (as documented in the man page), not about grabbing
> > the lock and holding on to it whatever it takes.
>
> Hi Ilya,
>         Could you explain more about "disabling atomic lock transitions"? To be
> honest, I was thinking --exclusive means "grabbing
> the lock and holding on to it whatever it takes."

Hi Dongsheng,

Here are the relevant excerpts from the documentation [1]:

> To maintain multi-client access, the exclusive-lock feature
> implements automatic cooperative lock transitions between clients.
>
> Whenever a client that holds an exclusive lock on an RBD image gets
> a request to release the lock, it stops handling writes, flushes its
> caches and releases the lock.
>
> By default, the exclusive-lock feature does not prevent two or more
> concurrently running clients from opening the same RBD image and
> writing to it in turns (whether on the same node or not). In effect,
> their writes just get linearized as the lock is automatically
> transitioned back and forth in a cooperative fashion.
>
> To disable automatic lock transitions between clients, the
> RBD_LOCK_MODE_EXCLUSIVE flag may be specified when acquiring the
> exclusive lock. This is exposed by the --exclusive option for rbd
> device map command.

This is mostly equivalent to "grab the lock and hold on to it", but
it's not guaranteed that the lock would never be released.  If a watch
error occurs, the lock cookie needs to be updated after the watch is
reestablished.  If ceph_cls_set_cookie() fails, we have no choice but
to release the lock and immediately attempt to reacquire it because
otherwise the lock cookie would disagree with that of the new watch.

The code in question has always behaved this way.  Prior to commit
14bb211d324d ("rbd: support updating the lock cookie without releasing
the lock"), the lock was (briefly) released on watch errors
unconditionally.

[1] https://docs.ceph.com/en/latest/rbd/rbd-exclusive-locks/

Thanks,

                Ilya
Dongsheng Yang July 25, 2024, 10:08 a.m. UTC | #2
在 2024/7/25 星期四 下午 5:31, Ilya Dryomov 写道:
> On Thu, Jul 25, 2024 at 10:45 AM Dongsheng Yang
> <dongsheng.yang@linux.dev> wrote:
>>
>>
>>
>> 在 2024/7/24 星期三 下午 2:29, Ilya Dryomov 写道:
>>> Expanding on the previous commit, assuming that rbd_is_lock_owner()
>>> always returns true (i.e. that we are either in RBD_LOCK_STATE_LOCKED
>>> or RBD_LOCK_STATE_QUIESCING) if the mapping is exclusive is wrong too.
>>> In case ceph_cls_set_cookie() fails, the lock would be temporarily
>>> released even if the mapping is exclusive, meaning that we can end up
>>> even in RBD_LOCK_STATE_UNLOCKED.
>>>
>>> IOW, exclusive mappings are really "just" about disabling automatic
>>> lock transitions (as documented in the man page), not about grabbing
>>> the lock and holding on to it whatever it takes.
>>
>> Hi Ilya,
>>          Could you explain more about "disabling atomic lock transitions"? To be
>> honest, I was thinking --exclusive means "grabbing
>> the lock and holding on to it whatever it takes."
> 
> Hi Dongsheng,
> 
> Here are the relevant excerpts from the documentation [1]:
> 
>> To maintain multi-client access, the exclusive-lock feature
>> implements automatic cooperative lock transitions between clients.
>>
>> Whenever a client that holds an exclusive lock on an RBD image gets
>> a request to release the lock, it stops handling writes, flushes its
>> caches and releases the lock.
>>
>> By default, the exclusive-lock feature does not prevent two or more
>> concurrently running clients from opening the same RBD image and
>> writing to it in turns (whether on the same node or not). In effect,
>> their writes just get linearized as the lock is automatically
>> transitioned back and forth in a cooperative fashion.
>>
>> To disable automatic lock transitions between clients, the
>> RBD_LOCK_MODE_EXCLUSIVE flag may be specified when acquiring the
>> exclusive lock. This is exposed by the --exclusive option for rbd
>> device map command.
> 
> This is mostly equivalent to "grab the lock and hold on to it", but
> it's not guaranteed that the lock would never be released.  If a watch
> error occurs, the lock cookie needs to be updated after the watch is
> reestablished.  If ceph_cls_set_cookie() fails, we have no choice but
> to release the lock and immediately attempt to reacquire it because
> otherwise the lock cookie would disagree with that of the new watch.
> 
> The code in question has always behaved this way.  Prior to commit
> 14bb211d324d ("rbd: support updating the lock cookie without releasing
> the lock"), the lock was (briefly) released on watch errors
> unconditionally.

Thanx Ilya, it clarify things. then

Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>

For all of these 3 patches.

Thanx
> 
> [1] https://docs.ceph.com/en/latest/rbd/rbd-exclusive-locks/
> 
> Thanks,
> 
>                  Ilya
>
Ilya Dryomov July 25, 2024, 10:22 a.m. UTC | #3
On Thu, Jul 25, 2024 at 12:08 PM Dongsheng Yang
<dongsheng.yang@linux.dev> wrote:
>
>
>
> 在 2024/7/25 星期四 下午 5:31, Ilya Dryomov 写道:
> > On Thu, Jul 25, 2024 at 10:45 AM Dongsheng Yang
> > <dongsheng.yang@linux.dev> wrote:
> >>
> >>
> >>
> >> 在 2024/7/24 星期三 下午 2:29, Ilya Dryomov 写道:
> >>> Expanding on the previous commit, assuming that rbd_is_lock_owner()
> >>> always returns true (i.e. that we are either in RBD_LOCK_STATE_LOCKED
> >>> or RBD_LOCK_STATE_QUIESCING) if the mapping is exclusive is wrong too.
> >>> In case ceph_cls_set_cookie() fails, the lock would be temporarily
> >>> released even if the mapping is exclusive, meaning that we can end up
> >>> even in RBD_LOCK_STATE_UNLOCKED.
> >>>
> >>> IOW, exclusive mappings are really "just" about disabling automatic
> >>> lock transitions (as documented in the man page), not about grabbing
> >>> the lock and holding on to it whatever it takes.
> >>
> >> Hi Ilya,
> >>          Could you explain more about "disabling atomic lock transitions"? To be
> >> honest, I was thinking --exclusive means "grabbing
> >> the lock and holding on to it whatever it takes."
> >
> > Hi Dongsheng,
> >
> > Here are the relevant excerpts from the documentation [1]:
> >
> >> To maintain multi-client access, the exclusive-lock feature
> >> implements automatic cooperative lock transitions between clients.
> >>
> >> Whenever a client that holds an exclusive lock on an RBD image gets
> >> a request to release the lock, it stops handling writes, flushes its
> >> caches and releases the lock.
> >>
> >> By default, the exclusive-lock feature does not prevent two or more
> >> concurrently running clients from opening the same RBD image and
> >> writing to it in turns (whether on the same node or not). In effect,
> >> their writes just get linearized as the lock is automatically
> >> transitioned back and forth in a cooperative fashion.
> >>
> >> To disable automatic lock transitions between clients, the
> >> RBD_LOCK_MODE_EXCLUSIVE flag may be specified when acquiring the
> >> exclusive lock. This is exposed by the --exclusive option for rbd
> >> device map command.
> >
> > This is mostly equivalent to "grab the lock and hold on to it", but
> > it's not guaranteed that the lock would never be released.  If a watch
> > error occurs, the lock cookie needs to be updated after the watch is
> > reestablished.  If ceph_cls_set_cookie() fails, we have no choice but
> > to release the lock and immediately attempt to reacquire it because
> > otherwise the lock cookie would disagree with that of the new watch.
> >
> > The code in question has always behaved this way.  Prior to commit
> > 14bb211d324d ("rbd: support updating the lock cookie without releasing
> > the lock"), the lock was (briefly) released on watch errors
> > unconditionally.
>
> Thanx Ilya, it clarify things. then
>
> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
>
> For all of these 3 patches.

Thanks for the speedy review!  I have fixed the typo in the description
of patch 1 and also appended stable tags to patches 2 and 3.

                Ilya