Message ID | 20240630184912.37335-1-pobrn@protonmail.com |
---|---|
State | New |
Headers | show |
Series | [v4] memfd: `MFD_NOEXEC_SEAL` should not imply `MFD_ALLOW_SEALING` | expand |
Hi On Sun, Jun 30, 2024 at 11:49 AM Barnabás Pőcze <pobrn@protonmail.com> wrote: > > `MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC` > to prevent further modifications to the executable bits as per the comment > in the uapi header file: > > not executable and sealed to prevent changing to executable > > However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC") > that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets > `F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`. > > Nothing implies that it should be so, and indeed up until the second version > of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`, > `F_SEAL_SEAL` was not removed, however, it was changed in the third revision > of the patchset[1] without a clear explanation. > > This behaviour is surprising for application developers, there is no > documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional > effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2` > it has the effect of making all memfds initially sealable. > The documentation is in linux main (653c5c75115c), I hope this gives clarity to the usage of MFD_NOEXEC_SEAL flag to application developers, furthermore I'm working on man page for memfd_create. > So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested, > thereby returning to the pre-Linux 6.3 behaviour of only allowing > sealing when `MFD_ALLOW_SEALING` is specified. > > Now, this is technically a uapi break. However, the damage is expected > to be minimal. To trigger user visible change, a program has to do the > following steps: > > - create memfd: > - with `MFD_NOEXEC_SEAL`, > - without `MFD_ALLOW_SEALING`; > - try to add seals / check the seals. > > But that seems unlikely to happen intentionally since this change > essentially reverts the kernel's behaviour to that of Linux <6.3, > so if a program worked correctly on those older kernels, it will > likely work correctly after this change. > During V3 patch discussion, I sent my reasoning, but here are summaries: - As one might have noticed, unlike other flags in memfd_create, MFD_NOEXEC_SEAL is actually a combination of multiple flags. The idea is to make it easier to use memfd in the most common way, which is NOEXEC + F_SEAL_EXEC + MFD_ALLOW_SEALING. - The new sysctl vm.noexec = 1 helps existing applications move to a more secure way of using memfd. IMO, MFD_ALLOW_SEALING is included by default because most applications would rather have it than not. In any case, an app can set F_SEAL_SEAL to disable the sealing. - MFD_NOEXEC_SEAL has been added for more than one year, multiple applications and distributions have backported and utilized it. Altering ABI now presents a degree of risk and may lead to disruptions. Best regards, -Jeff > I have used Debian Code Search and GitHub to try to find potential > breakages, and I could only find a single one. dbus-broker's > memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING` > behaviour, and tries to work around it[2]. This workaround will > break. Luckily, this only affects the test suite, it does not affect > the normal operations of dbus-broker. There is a PR with a fix[3]. > > I also carried out a smoke test by building a kernel with this change > and booting an Arch Linux system into GNOME and Plasma sessions. > > There was also a previous attempt to address this peculiarity by > introducing a new flag[4]. > > [0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/ > [1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/ > [2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114 > [3]: https://github.com/bus1/dbus-broker/pull/366 > [4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/ > > Cc: stable@vger.kernel.org > Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> > --- > > * v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/ > * v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/ > * v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/ > > This fourth version returns to removing the inconsistency as opposed to documenting > its existence, with the same code change as v1 but with a somewhat extended commit > message. This is sent because I believe it is worth at least a try; it can be easily > reverted if bigger application breakages are discovered than initially imagined. > > --- > mm/memfd.c | 9 ++++----- > tools/testing/selftests/memfd/memfd_test.c | 2 +- > 2 files changed, 5 insertions(+), 6 deletions(-) > > diff --git a/mm/memfd.c b/mm/memfd.c > index 7d8d3ab3fa37..8b7f6afee21d 100644 > --- a/mm/memfd.c > +++ b/mm/memfd.c > @@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create, > > inode->i_mode &= ~0111; > file_seals = memfd_file_seals_ptr(file); > - if (file_seals) { > - *file_seals &= ~F_SEAL_SEAL; > + if (file_seals) > *file_seals |= F_SEAL_EXEC; > - } > - } else if (flags & MFD_ALLOW_SEALING) { > - /* MFD_EXEC and MFD_ALLOW_SEALING are set */ > + } > + > + if (flags & MFD_ALLOW_SEALING) { > file_seals = memfd_file_seals_ptr(file); > if (file_seals) > *file_seals &= ~F_SEAL_SEAL; > diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c > index 95af2d78fd31..7b78329f65b6 100644 > --- a/tools/testing/selftests/memfd/memfd_test.c > +++ b/tools/testing/selftests/memfd/memfd_test.c > @@ -1151,7 +1151,7 @@ static void test_noexec_seal(void) > mfd_def_size, > MFD_CLOEXEC | MFD_NOEXEC_SEAL); > mfd_assert_mode(fd, 0666); > - mfd_assert_has_seals(fd, F_SEAL_EXEC); > + mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC); > mfd_fail_chmod(fd, 0777); > close(fd); > } > -- > 2.45.2 > >
On 2024-06-30, Barnabás Pőcze <pobrn@protonmail.com> wrote: > `MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC` > to prevent further modifications to the executable bits as per the comment > in the uapi header file: > > not executable and sealed to prevent changing to executable > > However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC") > that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets > `F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`. > > Nothing implies that it should be so, and indeed up until the second version > of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`, > `F_SEAL_SEAL` was not removed, however, it was changed in the third revision > of the patchset[1] without a clear explanation. > > This behaviour is surprising for application developers, there is no > documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional > effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2` > it has the effect of making all memfds initially sealable. > > So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested, > thereby returning to the pre-Linux 6.3 behaviour of only allowing > sealing when `MFD_ALLOW_SEALING` is specified. This behaviour makes sense, I'm a little sad I didn't catch it when I was fixing vm.memfd_noexec. There is a possibility for breakage, but we should give it a shot, given how new the API is (and the API itself was also broken until Linux 6.6 anyway)... Feel free to take my Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Thanks. > Now, this is technically a uapi break. However, the damage is expected > to be minimal. To trigger user visible change, a program has to do the > following steps: > > - create memfd: > - with `MFD_NOEXEC_SEAL`, > - without `MFD_ALLOW_SEALING`; > - try to add seals / check the seals. > > But that seems unlikely to happen intentionally since this change > essentially reverts the kernel's behaviour to that of Linux <6.3, > so if a program worked correctly on those older kernels, it will > likely work correctly after this change. > > I have used Debian Code Search and GitHub to try to find potential > breakages, and I could only find a single one. dbus-broker's > memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING` > behaviour, and tries to work around it[2]. This workaround will > break. Luckily, this only affects the test suite, it does not affect > the normal operations of dbus-broker. There is a PR with a fix[3]. > > I also carried out a smoke test by building a kernel with this change > and booting an Arch Linux system into GNOME and Plasma sessions. > > There was also a previous attempt to address this peculiarity by > introducing a new flag[4]. > > [0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/ > [1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/ > [2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114 > [3]: https://github.com/bus1/dbus-broker/pull/366 > [4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/ > > Cc: stable@vger.kernel.org > Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> > --- > > * v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/ > * v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/ > * v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/ > > This fourth version returns to removing the inconsistency as opposed to documenting > its existence, with the same code change as v1 but with a somewhat extended commit > message. This is sent because I believe it is worth at least a try; it can be easily > reverted if bigger application breakages are discovered than initially imagined. > > --- > mm/memfd.c | 9 ++++----- > tools/testing/selftests/memfd/memfd_test.c | 2 +- > 2 files changed, 5 insertions(+), 6 deletions(-) > > diff --git a/mm/memfd.c b/mm/memfd.c > index 7d8d3ab3fa37..8b7f6afee21d 100644 > --- a/mm/memfd.c > +++ b/mm/memfd.c > @@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create, > > inode->i_mode &= ~0111; > file_seals = memfd_file_seals_ptr(file); > - if (file_seals) { > - *file_seals &= ~F_SEAL_SEAL; > + if (file_seals) > *file_seals |= F_SEAL_EXEC; > - } > - } else if (flags & MFD_ALLOW_SEALING) { > - /* MFD_EXEC and MFD_ALLOW_SEALING are set */ > + } > + > + if (flags & MFD_ALLOW_SEALING) { > file_seals = memfd_file_seals_ptr(file); > if (file_seals) > *file_seals &= ~F_SEAL_SEAL; > diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c > index 95af2d78fd31..7b78329f65b6 100644 > --- a/tools/testing/selftests/memfd/memfd_test.c > +++ b/tools/testing/selftests/memfd/memfd_test.c > @@ -1151,7 +1151,7 @@ static void test_noexec_seal(void) > mfd_def_size, > MFD_CLOEXEC | MFD_NOEXEC_SEAL); > mfd_assert_mode(fd, 0666); > - mfd_assert_has_seals(fd, F_SEAL_EXEC); > + mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC); > mfd_fail_chmod(fd, 0777); > close(fd); > } > -- > 2.45.2 > >
Hi Gentle ping. Is there any chance we could move forward with this? I am not aware of any breakage it would cause; but longer the wait, the higher the likelihood. Regards, Barnabás Pőcze 2024. június 30., vasárnap 20:49 keltezéssel, Barnabás Pőcze <pobrn@protonmail.com> írta: > `MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC` > to prevent further modifications to the executable bits as per the comment > in the uapi header file: > > not executable and sealed to prevent changing to executable > > However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC") > that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets > `F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`. > > Nothing implies that it should be so, and indeed up until the second version > of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`, > `F_SEAL_SEAL` was not removed, however, it was changed in the third revision > of the patchset[1] without a clear explanation. > > This behaviour is surprising for application developers, there is no > documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional > effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2` > it has the effect of making all memfds initially sealable. > > So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested, > thereby returning to the pre-Linux 6.3 behaviour of only allowing > sealing when `MFD_ALLOW_SEALING` is specified. > > Now, this is technically a uapi break. However, the damage is expected > to be minimal. To trigger user visible change, a program has to do the > following steps: > > - create memfd: > - with `MFD_NOEXEC_SEAL`, > - without `MFD_ALLOW_SEALING`; > - try to add seals / check the seals. > > But that seems unlikely to happen intentionally since this change > essentially reverts the kernel's behaviour to that of Linux <6.3, > so if a program worked correctly on those older kernels, it will > likely work correctly after this change. > > I have used Debian Code Search and GitHub to try to find potential > breakages, and I could only find a single one. dbus-broker's > memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING` > behaviour, and tries to work around it[2]. This workaround will > break. Luckily, this only affects the test suite, it does not affect > the normal operations of dbus-broker. There is a PR with a fix[3]. > > I also carried out a smoke test by building a kernel with this change > and booting an Arch Linux system into GNOME and Plasma sessions. > > There was also a previous attempt to address this peculiarity by > introducing a new flag[4]. > > [0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/ > [1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/ > [2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114 > [3]: https://github.com/bus1/dbus-broker/pull/366 > [4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/ > > Cc: stable@vger.kernel.org > Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> > --- > > * v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/ > * v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/ > * v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/ > > This fourth version returns to removing the inconsistency as opposed to documenting > its existence, with the same code change as v1 but with a somewhat extended commit > message. This is sent because I believe it is worth at least a try; it can be easily > reverted if bigger application breakages are discovered than initially imagined. > > --- > mm/memfd.c | 9 ++++----- > tools/testing/selftests/memfd/memfd_test.c | 2 +- > 2 files changed, 5 insertions(+), 6 deletions(-) > > diff --git a/mm/memfd.c b/mm/memfd.c > index 7d8d3ab3fa37..8b7f6afee21d 100644 > --- a/mm/memfd.c > +++ b/mm/memfd.c > @@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create, > > inode->i_mode &= ~0111; > file_seals = memfd_file_seals_ptr(file); > - if (file_seals) { > - *file_seals &= ~F_SEAL_SEAL; > + if (file_seals) > *file_seals |= F_SEAL_EXEC; > - } > - } else if (flags & MFD_ALLOW_SEALING) { > - /* MFD_EXEC and MFD_ALLOW_SEALING are set */ > + } > + > + if (flags & MFD_ALLOW_SEALING) { > file_seals = memfd_file_seals_ptr(file); > if (file_seals) > *file_seals &= ~F_SEAL_SEAL; > diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c > index 95af2d78fd31..7b78329f65b6 100644 > --- a/tools/testing/selftests/memfd/memfd_test.c > +++ b/tools/testing/selftests/memfd/memfd_test.c > @@ -1151,7 +1151,7 @@ static void test_noexec_seal(void) > mfd_def_size, > MFD_CLOEXEC | MFD_NOEXEC_SEAL); > mfd_assert_mode(fd, 0666); > - mfd_assert_has_seals(fd, F_SEAL_EXEC); > + mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC); > mfd_fail_chmod(fd, 0777); > close(fd); > } > -- > 2.45.2 >
diff --git a/mm/memfd.c b/mm/memfd.c index 7d8d3ab3fa37..8b7f6afee21d 100644 --- a/mm/memfd.c +++ b/mm/memfd.c @@ -356,12 +356,11 @@ SYSCALL_DEFINE2(memfd_create, inode->i_mode &= ~0111; file_seals = memfd_file_seals_ptr(file); - if (file_seals) { - *file_seals &= ~F_SEAL_SEAL; + if (file_seals) *file_seals |= F_SEAL_EXEC; - } - } else if (flags & MFD_ALLOW_SEALING) { - /* MFD_EXEC and MFD_ALLOW_SEALING are set */ + } + + if (flags & MFD_ALLOW_SEALING) { file_seals = memfd_file_seals_ptr(file); if (file_seals) *file_seals &= ~F_SEAL_SEAL; diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c index 95af2d78fd31..7b78329f65b6 100644 --- a/tools/testing/selftests/memfd/memfd_test.c +++ b/tools/testing/selftests/memfd/memfd_test.c @@ -1151,7 +1151,7 @@ static void test_noexec_seal(void) mfd_def_size, MFD_CLOEXEC | MFD_NOEXEC_SEAL); mfd_assert_mode(fd, 0666); - mfd_assert_has_seals(fd, F_SEAL_EXEC); + mfd_assert_has_seals(fd, F_SEAL_SEAL | F_SEAL_EXEC); mfd_fail_chmod(fd, 0777); close(fd); }
`MFD_NOEXEC_SEAL` should remove the executable bits and set `F_SEAL_EXEC` to prevent further modifications to the executable bits as per the comment in the uapi header file: not executable and sealed to prevent changing to executable However, commit 105ff5339f498a ("mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC") that introduced this feature made it so that `MFD_NOEXEC_SEAL` unsets `F_SEAL_SEAL`, essentially acting as a superset of `MFD_ALLOW_SEALING`. Nothing implies that it should be so, and indeed up until the second version of the of the patchset[0] that introduced `MFD_EXEC` and `MFD_NOEXEC_SEAL`, `F_SEAL_SEAL` was not removed, however, it was changed in the third revision of the patchset[1] without a clear explanation. This behaviour is surprising for application developers, there is no documentation that would reveal that `MFD_NOEXEC_SEAL` has the additional effect of `MFD_ALLOW_SEALING`. Additionally, combined with `vm.memfd_noexec=2` it has the effect of making all memfds initially sealable. So do not remove `F_SEAL_SEAL` when `MFD_NOEXEC_SEAL` is requested, thereby returning to the pre-Linux 6.3 behaviour of only allowing sealing when `MFD_ALLOW_SEALING` is specified. Now, this is technically a uapi break. However, the damage is expected to be minimal. To trigger user visible change, a program has to do the following steps: - create memfd: - with `MFD_NOEXEC_SEAL`, - without `MFD_ALLOW_SEALING`; - try to add seals / check the seals. But that seems unlikely to happen intentionally since this change essentially reverts the kernel's behaviour to that of Linux <6.3, so if a program worked correctly on those older kernels, it will likely work correctly after this change. I have used Debian Code Search and GitHub to try to find potential breakages, and I could only find a single one. dbus-broker's memfd_create() wrapper is aware of this implicit `MFD_ALLOW_SEALING` behaviour, and tries to work around it[2]. This workaround will break. Luckily, this only affects the test suite, it does not affect the normal operations of dbus-broker. There is a PR with a fix[3]. I also carried out a smoke test by building a kernel with this change and booting an Arch Linux system into GNOME and Plasma sessions. There was also a previous attempt to address this peculiarity by introducing a new flag[4]. [0]: https://lore.kernel.org/lkml/20220805222126.142525-3-jeffxu@google.com/ [1]: https://lore.kernel.org/lkml/20221202013404.163143-3-jeffxu@google.com/ [2]: https://github.com/bus1/dbus-broker/blob/9eb0b7e5826fc76cad7b025bc46f267d4a8784cb/src/util/misc.c#L114 [3]: https://github.com/bus1/dbus-broker/pull/366 [4]: https://lore.kernel.org/lkml/20230714114753.170814-1-david@readahead.eu/ Cc: stable@vger.kernel.org Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> --- * v3: https://lore.kernel.org/linux-mm/20240611231409.3899809-1-jeffxu@chromium.org/ * v2: https://lore.kernel.org/linux-mm/20240524033933.135049-1-jeffxu@google.com/ * v1: https://lore.kernel.org/linux-mm/20240513191544.94754-1-pobrn@protonmail.com/ This fourth version returns to removing the inconsistency as opposed to documenting its existence, with the same code change as v1 but with a somewhat extended commit message. This is sent because I believe it is worth at least a try; it can be easily reverted if bigger application breakages are discovered than initially imagined. --- mm/memfd.c | 9 ++++----- tools/testing/selftests/memfd/memfd_test.c | 2 +- 2 files changed, 5 insertions(+), 6 deletions(-)