Message ID | 20220328175033.2437312-1-roberto.sassu@huawei.com |
---|---|
Headers | show |
Series | bpf: Secure and authenticated preloading of eBPF programs | expand |
On Mon, Mar 28, 2022 at 10:51 AM Roberto Sassu <roberto.sassu@huawei.com> wrote: > > eBPF already allows programs to be preloaded and kept running without > intervention from user space. There is a dedicated kernel module called > bpf_preload, which contains the light skeleton of the iterators_bpf eBPF > program. If this module is enabled in the kernel configuration, its loading > will be triggered when the bpf filesystem is mounted (unless the module is > built-in), and the links of iterators_bpf are pinned in that filesystem > (they will appear as the progs.debug and maps.debug files). > > However, the current mechanism, if used to preload an LSM, would not offer > the same security guarantees of LSMs integrated in the security subsystem. > Also, it is not generic enough to be used for preloading arbitrary eBPF > programs, unless the bpf_preload code is heavily modified. > > More specifically, the security problems are: > - any program can be pinned to the bpf filesystem without limitations > (unless a MAC mechanism enforces some restrictions); > - programs being executed can be terminated at any time by deleting the > pinned objects or unmounting the bpf filesystem. > > The usability problems are: > - only a fixed amount of links can be pinned; > - only links can be pinned, other object types are not supported; > - code to pin objects has to be written manually; > - preloading multiple eBPF programs is not practical, bpf_preload has to be > modified to include additional light skeletons. > > Solve the security problems by mounting the bpf filesystem from the kernel, > by preloading authenticated kernel modules (e.g. with module.sig_enforce) > and by pinning objects to that filesystem. This particular filesystem > instance guarantees that desired eBPF programs run until the very end of > the kernel lifecycle, since even root cannot interfere with it. > > Solve the usability problems by generalizing the pinning function, to > handle not only links but also maps and progs. Also increment the object > reference count and call the pinning function directly from the preload > method (currently in the bpf_preload kernel module) rather than from the > bpf filesystem code itself, so that a generic eBPF program can do those > operations depending on its objects (this also avoids the limitation of the > fixed-size array for storing the objects to pin). > > Then, simplify the process of pinning objects defined by a generic eBPF > program by automatically generating the required methods in the light > skeleton. Also, generate a separate kernel module for each eBPF program to > preload, so that existing ones don't have to be modified. Finally, support > preloading multiple eBPF programs by allowing users to specify a list from > the kernel configuration, at build time, or with the new kernel option > bpf_preload_list=, at run-time. > > To summarize, this patch set makes it possible to plug in out-of-tree LSMs > matching the security guarantees of their counterpart in the security > subsystem, without having to modify the kernel itself. The same benefits > are extended to other eBPF program types. > > Only one remaining problem is how to support auto-attaching eBPF programs > with LSM type. It will be solved with a separate patch set. > > Patches 1-2 export some definitions, to build out-of-tree kernel modules > with eBPF programs to preload. Patches 3-4 allow eBPF programs to pin > objects by themselves. Patches 5-10 automatically generate the methods for > preloading in the light skeleton. Patches 11-14 make it possible to preload > multiple eBPF programs. Patch 15 automatically generates the kernel module > for preloading an eBPF program, patch 16 does a kernel mount of the bpf > filesystem, and finally patches 17-18 test the functionality introduced. > This approach of moving tons of pretty generic code into codegen of lskel seems suboptimal. Why so much code has to be codegenerated? Especially that tiny module code? Can you please elaborate on why it can't be done in a way that doesn't require such extensive light skeleton codegen changes? > Roberto Sassu (18): > bpf: Export bpf_link_inc() > bpf-preload: Move bpf_preload.h to include/linux > bpf-preload: Generalize object pinning from the kernel > bpf-preload: Export and call bpf_obj_do_pin_kernel() > bpf-preload: Generate static variables > bpf-preload: Generate free_objs_and_skel() > bpf-preload: Generate preload() > bpf-preload: Generate load_skel() > bpf-preload: Generate code to pin non-internal maps > bpf-preload: Generate bpf_preload_ops > bpf-preload: Store multiple bpf_preload_ops structures in a linked > list > bpf-preload: Implement new registration method for preloading eBPF > programs > bpf-preload: Move pinned links and maps to a dedicated directory in > bpffs > bpf-preload: Switch to new preload registration method > bpf-preload: Generate code of kernel module to preload > bpf-preload: Do kernel mount to ensure that pinned objects don't > disappear > bpf-preload/selftests: Add test for automatic generation of preload > methods > bpf-preload/selftests: Preload a test eBPF program and check pinned > objects please use proper prefixes: bpf (for kernel-side changes), libbpf, bpftool, selftests/bpf, etc > > .../admin-guide/kernel-parameters.txt | 8 + > fs/namespace.c | 1 + > include/linux/bpf.h | 5 + > include/linux/bpf_preload.h | 37 ++ > init/main.c | 2 + > kernel/bpf/inode.c | 295 +++++++++-- > kernel/bpf/preload/Kconfig | 25 +- > kernel/bpf/preload/bpf_preload.h | 16 - > kernel/bpf/preload/bpf_preload_kern.c | 85 +--- > kernel/bpf/preload/iterators/Makefile | 9 +- > .../bpf/preload/iterators/iterators.lskel.h | 466 +++++++++++------- > kernel/bpf/syscall.c | 1 + > .../bpf/bpftool/Documentation/bpftool-gen.rst | 13 + > tools/bpf/bpftool/bash-completion/bpftool | 6 +- > tools/bpf/bpftool/gen.c | 331 +++++++++++++ > tools/bpf/bpftool/main.c | 7 +- > tools/bpf/bpftool/main.h | 1 + > tools/testing/selftests/bpf/Makefile | 32 +- > .../bpf/bpf_testmod_preload/.gitignore | 7 + > .../bpf/bpf_testmod_preload/Makefile | 20 + > .../gen_preload_methods.expected.diff | 97 ++++ > .../bpf/prog_tests/test_gen_preload_methods.c | 27 + > .../bpf/prog_tests/test_preload_methods.c | 69 +++ > .../selftests/bpf/progs/gen_preload_methods.c | 23 + > 24 files changed, 1246 insertions(+), 337 deletions(-) > create mode 100644 include/linux/bpf_preload.h > delete mode 100644 kernel/bpf/preload/bpf_preload.h > create mode 100644 tools/testing/selftests/bpf/bpf_testmod_preload/.gitignore > create mode 100644 tools/testing/selftests/bpf/bpf_testmod_preload/Makefile > create mode 100644 tools/testing/selftests/bpf/prog_tests/gen_preload_methods.expected.diff > create mode 100644 tools/testing/selftests/bpf/prog_tests/test_gen_preload_methods.c > create mode 100644 tools/testing/selftests/bpf/prog_tests/test_preload_methods.c > create mode 100644 tools/testing/selftests/bpf/progs/gen_preload_methods.c > > -- > 2.32.0 >
On Mon, Mar 28, 2022 at 10:52 AM Roberto Sassu <roberto.sassu@huawei.com> wrote: > > The first part of the preload code generation consists in generating the > static variables to be used by the code itself: the links and maps to be > pinned, and the skeleton. Generation of the preload variables and methods > is enabled with the option -P added to 'bpftool gen skeleton'. > > The existing variables maps_link and progs_links in bpf_preload_kern.c have > been renamed respectively to dump_bpf_map_link and dump_bpf_prog_link, to > match the name of the variables in the main structure of the light > skeleton. > > Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> > --- > kernel/bpf/preload/bpf_preload_kern.c | 35 +- > kernel/bpf/preload/iterators/Makefile | 2 +- > .../bpf/preload/iterators/iterators.lskel.h | 378 +++++++++--------- > .../bpf/bpftool/Documentation/bpftool-gen.rst | 5 + > tools/bpf/bpftool/bash-completion/bpftool | 2 +- > tools/bpf/bpftool/gen.c | 27 ++ > tools/bpf/bpftool/main.c | 7 +- > tools/bpf/bpftool/main.h | 1 + > 8 files changed, 254 insertions(+), 203 deletions(-) > [...] > +__attribute__((unused)) static void > +iterators_bpf__assert(struct iterators_bpf *s) > +{ > +#ifdef __cplusplus > +#define _Static_assert static_assert > +#endif > +#ifdef __cplusplus > +#undef _Static_assert > +#endif > +} > + > +static struct bpf_link *dump_bpf_map_link; > +static struct bpf_link *dump_bpf_prog_link; > +static struct iterators_bpf *skel; I don't understand what is this and what for? You are making an assumption that light skeleton can be instantiated just once, why? And adding extra bpftool option to light skeleton codegen just to save a bit of typing at the place where light skeleton is actually instantiated and used doesn't seems like a right approach. Further, even if this is the way to go, please split out bpftool changes from kernel changes. There is nothing requiring them to be coupled together. [...]
> From: Andrii Nakryiko [mailto:andrii.nakryiko@gmail.com] > Sent: Wednesday, March 30, 2022 1:51 AM > On Mon, Mar 28, 2022 at 10:51 AM Roberto Sassu > <roberto.sassu@huawei.com> wrote: [...] > > Patches 1-2 export some definitions, to build out-of-tree kernel modules > > with eBPF programs to preload. Patches 3-4 allow eBPF programs to pin > > objects by themselves. Patches 5-10 automatically generate the methods > for > > preloading in the light skeleton. Patches 11-14 make it possible to preload > > multiple eBPF programs. Patch 15 automatically generates the kernel > module > > for preloading an eBPF program, patch 16 does a kernel mount of the bpf > > filesystem, and finally patches 17-18 test the functionality introduced. > > > > This approach of moving tons of pretty generic code into codegen of > lskel seems suboptimal. Why so much code has to be codegenerated? > Especially that tiny module code? Hi Andrii the main goal of this patch set is to use the preloading mechanism to plug in securely LSMs implemented as eBPF programs. I have a use case, I want to plug in my eBPF program, DIGLIM eBPF. I started to modify the preloading code manually, and I realized how complicated the process is if you want to add something more than the existing iterators_bpf program. First, you have to look at which objects you want to preload, then write code for each of them. This process is repetitive and deterministic, this is why I immediately thought that it is a good case for automatic code generation. My idea is that, if this mechanism is accepted, an implementer of an LSM wishing to be preloaded at the very beginning, only has to write his eBPF code, the kernel and bpftool take care of the rest. Generation of the preloading code is optional, and need to be enabled with the -P option, in addition to -L. The light skeleton of DIGLIM eBPF looks like: https://github.com/robertosassu/linux/blob/bpf-preload-v1/kernel/bpf/preload/diglim/diglim.lskel.h The preloading interface is very similar to the one used by the security subsystem: an ordered list of eBPF programs to preload set in the kernel configuration, that can be overwritten with the kernel option bpf_preload_list=. The changes that would be required to preload DIGLIM eBPF look like: https://github.com/robertosassu/linux/commit/c07e1a78584ee688aeb812f07dc7ab3060ac6152 Thanks Roberto HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 Managing Director: Li Peng, Zhong Ronghua > Can you please elaborate on why it can't be done in a way that doesn't > require such extensive light skeleton codegen changes? > > > > Roberto Sassu (18): > > bpf: Export bpf_link_inc() > > bpf-preload: Move bpf_preload.h to include/linux > > bpf-preload: Generalize object pinning from the kernel > > bpf-preload: Export and call bpf_obj_do_pin_kernel() > > bpf-preload: Generate static variables > > bpf-preload: Generate free_objs_and_skel() > > bpf-preload: Generate preload() > > bpf-preload: Generate load_skel() > > bpf-preload: Generate code to pin non-internal maps > > bpf-preload: Generate bpf_preload_ops > > bpf-preload: Store multiple bpf_preload_ops structures in a linked > > list > > bpf-preload: Implement new registration method for preloading eBPF > > programs > > bpf-preload: Move pinned links and maps to a dedicated directory in > > bpffs > > bpf-preload: Switch to new preload registration method > > bpf-preload: Generate code of kernel module to preload > > bpf-preload: Do kernel mount to ensure that pinned objects don't > > disappear > > bpf-preload/selftests: Add test for automatic generation of preload > > methods > > bpf-preload/selftests: Preload a test eBPF program and check pinned > > objects > > please use proper prefixes: bpf (for kernel-side changes), libbpf, > bpftool, selftests/bpf, etc > > > > > > .../admin-guide/kernel-parameters.txt | 8 + > > fs/namespace.c | 1 + > > include/linux/bpf.h | 5 + > > include/linux/bpf_preload.h | 37 ++ > > init/main.c | 2 + > > kernel/bpf/inode.c | 295 +++++++++-- > > kernel/bpf/preload/Kconfig | 25 +- > > kernel/bpf/preload/bpf_preload.h | 16 - > > kernel/bpf/preload/bpf_preload_kern.c | 85 +--- > > kernel/bpf/preload/iterators/Makefile | 9 +- > > .../bpf/preload/iterators/iterators.lskel.h | 466 +++++++++++------- > > kernel/bpf/syscall.c | 1 + > > .../bpf/bpftool/Documentation/bpftool-gen.rst | 13 + > > tools/bpf/bpftool/bash-completion/bpftool | 6 +- > > tools/bpf/bpftool/gen.c | 331 +++++++++++++ > > tools/bpf/bpftool/main.c | 7 +- > > tools/bpf/bpftool/main.h | 1 + > > tools/testing/selftests/bpf/Makefile | 32 +- > > .../bpf/bpf_testmod_preload/.gitignore | 7 + > > .../bpf/bpf_testmod_preload/Makefile | 20 + > > .../gen_preload_methods.expected.diff | 97 ++++ > > .../bpf/prog_tests/test_gen_preload_methods.c | 27 + > > .../bpf/prog_tests/test_preload_methods.c | 69 +++ > > .../selftests/bpf/progs/gen_preload_methods.c | 23 + > > 24 files changed, 1246 insertions(+), 337 deletions(-) > > create mode 100644 include/linux/bpf_preload.h > > delete mode 100644 kernel/bpf/preload/bpf_preload.h > > create mode 100644 > tools/testing/selftests/bpf/bpf_testmod_preload/.gitignore > > create mode 100644 > tools/testing/selftests/bpf/bpf_testmod_preload/Makefile > > create mode 100644 > tools/testing/selftests/bpf/prog_tests/gen_preload_methods.expected.diff > > create mode 100644 > tools/testing/selftests/bpf/prog_tests/test_gen_preload_methods.c > > create mode 100644 > tools/testing/selftests/bpf/prog_tests/test_preload_methods.c > > create mode 100644 > tools/testing/selftests/bpf/progs/gen_preload_methods.c > > > > -- > > 2.32.0 > >
> From: Andrii Nakryiko [mailto:andrii.nakryiko@gmail.com] > Sent: Wednesday, March 30, 2022 1:52 AM > On Mon, Mar 28, 2022 at 10:52 AM Roberto Sassu > <roberto.sassu@huawei.com> wrote: > > > > The first part of the preload code generation consists in generating the > > static variables to be used by the code itself: the links and maps to be > > pinned, and the skeleton. Generation of the preload variables and > methods > > is enabled with the option -P added to 'bpftool gen skeleton'. > > > > The existing variables maps_link and progs_links in bpf_preload_kern.c > have > > been renamed respectively to dump_bpf_map_link and > dump_bpf_prog_link, to > > match the name of the variables in the main structure of the light > > skeleton. > > > > Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> > > --- > > kernel/bpf/preload/bpf_preload_kern.c | 35 +- > > kernel/bpf/preload/iterators/Makefile | 2 +- > > .../bpf/preload/iterators/iterators.lskel.h | 378 +++++++++--------- > > .../bpf/bpftool/Documentation/bpftool-gen.rst | 5 + > > tools/bpf/bpftool/bash-completion/bpftool | 2 +- > > tools/bpf/bpftool/gen.c | 27 ++ > > tools/bpf/bpftool/main.c | 7 +- > > tools/bpf/bpftool/main.h | 1 + > > 8 files changed, 254 insertions(+), 203 deletions(-) > > > > [...] > > > +__attribute__((unused)) static void > > +iterators_bpf__assert(struct iterators_bpf *s) > > +{ > > +#ifdef __cplusplus > > +#define _Static_assert static_assert > > +#endif > > +#ifdef __cplusplus > > +#undef _Static_assert > > +#endif > > +} > > + > > +static struct bpf_link *dump_bpf_map_link; > > +static struct bpf_link *dump_bpf_prog_link; > > +static struct iterators_bpf *skel; > > I don't understand what is this and what for? You are making an > assumption that light skeleton can be instantiated just once, why? And > adding extra bpftool option to light skeleton codegen just to save a > bit of typing at the place where light skeleton is actually > instantiated and used doesn't seems like a right approach. True, iterator_bpf is simple. Writing the preloading code for it is simple. But, what if you wanted to preload an LSM with 10 hooks or more? Ok, regarding where the preloading code should be, I will try to move the generated code to the kernel module instead of the light skeleton. Thanks Roberto HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 Managing Director: Li Peng, Zhong Ronghua > Further, even if this is the way to go, please split out bpftool > changes from kernel changes. There is nothing requiring them to be > coupled together. > > [...]
> From: Roberto Sassu > Sent: Wednesday, March 30, 2022 9:45 AM > > From: Andrii Nakryiko [mailto:andrii.nakryiko@gmail.com] > > Sent: Wednesday, March 30, 2022 1:52 AM > > On Mon, Mar 28, 2022 at 10:52 AM Roberto Sassu > > <roberto.sassu@huawei.com> wrote: > > > > > > The first part of the preload code generation consists in generating the > > > static variables to be used by the code itself: the links and maps to be > > > pinned, and the skeleton. Generation of the preload variables and > > methods > > > is enabled with the option -P added to 'bpftool gen skeleton'. > > > > > > The existing variables maps_link and progs_links in bpf_preload_kern.c > > have > > > been renamed respectively to dump_bpf_map_link and > > dump_bpf_prog_link, to > > > match the name of the variables in the main structure of the light > > > skeleton. > > > > > > Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> > > > --- > > > kernel/bpf/preload/bpf_preload_kern.c | 35 +- > > > kernel/bpf/preload/iterators/Makefile | 2 +- > > > .../bpf/preload/iterators/iterators.lskel.h | 378 +++++++++--------- > > > .../bpf/bpftool/Documentation/bpftool-gen.rst | 5 + > > > tools/bpf/bpftool/bash-completion/bpftool | 2 +- > > > tools/bpf/bpftool/gen.c | 27 ++ > > > tools/bpf/bpftool/main.c | 7 +- > > > tools/bpf/bpftool/main.h | 1 + > > > 8 files changed, 254 insertions(+), 203 deletions(-) > > > > > > > [...] > > > > > +__attribute__((unused)) static void > > > +iterators_bpf__assert(struct iterators_bpf *s) > > > +{ > > > +#ifdef __cplusplus > > > +#define _Static_assert static_assert > > > +#endif > > > +#ifdef __cplusplus > > > +#undef _Static_assert > > > +#endif > > > +} > > > + > > > +static struct bpf_link *dump_bpf_map_link; > > > +static struct bpf_link *dump_bpf_prog_link; > > > +static struct iterators_bpf *skel; > > > > I don't understand what is this and what for? You are making an > > assumption that light skeleton can be instantiated just once, why? And > > adding extra bpftool option to light skeleton codegen just to save a > > bit of typing at the place where light skeleton is actually > > instantiated and used doesn't seems like a right approach. > > True, iterator_bpf is simple. Writing the preloading code > for it is simple. But, what if you wanted to preload an LSM > with 10 hooks or more? > > Ok, regarding where the preloading code should be, I will > try to move the generated code to the kernel module instead > of the light skeleton. Done. I moved everything from the light skeleton to the kernel module. The changes now are also well separated, and regeneration of the kernel module occurs only after all the generation code is added to bpftool. I pushed a new branch: https://github.com/robertosassu/linux/commits/bpf-preload-v2-devel-v2 Roberto HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 Managing Director: Li Peng, Zhong Ronghua
On Mon, Mar 28, 2022 at 07:50:15PM +0200, Roberto Sassu wrote: > eBPF already allows programs to be preloaded and kept running without > intervention from user space. There is a dedicated kernel module called > bpf_preload, which contains the light skeleton of the iterators_bpf eBPF > program. If this module is enabled in the kernel configuration, its loading > will be triggered when the bpf filesystem is mounted (unless the module is > built-in), and the links of iterators_bpf are pinned in that filesystem > (they will appear as the progs.debug and maps.debug files). > > However, the current mechanism, if used to preload an LSM, would not offer > the same security guarantees of LSMs integrated in the security subsystem. > Also, it is not generic enough to be used for preloading arbitrary eBPF > programs, unless the bpf_preload code is heavily modified. > > More specifically, the security problems are: > - any program can be pinned to the bpf filesystem without limitations > (unless a MAC mechanism enforces some restrictions); > - programs being executed can be terminated at any time by deleting the > pinned objects or unmounting the bpf filesystem. So many things to untangle here. The above paragraphs are misleading and incorrect. The commit log sounds like there are security issues that this patch set is fixing. This is not true. Looks like there is a massive misunderstanding on what bpffs is. It's a file system to pin and get bpf objects with normal file access permissions. Nothing else. Do NOT use it to pin LSM or any other security sensitive bpf programs and then complain that root can unpin them. Yes. Root can and should be able to 'rm -rf' anything in bpffs instance. > The usability problems are: > - only a fixed amount of links can be pinned; where do you see this limit? > - only links can be pinned, other object types are not supported; really? progs, maps can be pinned as well. > - code to pin objects has to be written manually; huh? > Solve the security problems by mounting the bpf filesystem from the kernel, > by preloading authenticated kernel modules (e.g. with module.sig_enforce) > and by pinning objects to that filesystem. This particular filesystem > instance guarantees that desired eBPF programs run until the very end of > the kernel lifecycle, since even root cannot interfere with it. No. I suspect there is huge confusion on what these two "progs.debug" and "maps.debug" files are in a bpffs instance. They are debug files to pretty pring loaded maps and progs for folks who like to use 'cat' to examine the state of the system instead of 'bpftool'. The root can remove these files from bpffs. There is no reason for kernel module to pin its bpf progs. If you want to develop DIGLIM as a kernel module that uses light skeleton just do: #include <linux/init.h> #include <linux/module.h> #include "diglim.lskel.h" static struct diglim_bpf *skel; static int __init load(void) { skel = diglim_bpf__open_and_load(); err = diglim_bpf__attach(skel); } /* detach skel in __fini */ It's really that short. Then you will be able to - insmod diglim.ko -> will load and attach bpf progs. - rmmod diglim -> will detach them. Independantly from these two mistunderstandings of bpffs and light skel we've been talking about auto exposing loaded bpf progs, maps, links in a bpffs without incrementing refcnt of them. When progs get unloaded the files will disappear. Some folks believe that doing 'ls' in a directory and see one file for each bpf prog loaded and then doing 'cat' on that file would be useful for debugging. That idea wasn't rejected. We're still thinking what would be the best way to auto-expose all bpf objects for debugging and whether it's actually makes sense to do considering that bpftool already has commands to list all progs, maps, links, etc with great detail. It's pretty much an argument between 'cat+ls' believers and 'bpftool' cmdline believers. That discussion is orthogonal and should not be mixed with bpffs, lsm, security or anything else.
> From: Alexei Starovoitov [mailto:alexei.starovoitov@gmail.com] > Sent: Thursday, March 31, 2022 4:27 AM > On Mon, Mar 28, 2022 at 07:50:15PM +0200, Roberto Sassu wrote: > > eBPF already allows programs to be preloaded and kept running without > > intervention from user space. There is a dedicated kernel module called > > bpf_preload, which contains the light skeleton of the iterators_bpf eBPF > > program. If this module is enabled in the kernel configuration, its loading > > will be triggered when the bpf filesystem is mounted (unless the module is > > built-in), and the links of iterators_bpf are pinned in that filesystem > > (they will appear as the progs.debug and maps.debug files). > > > > However, the current mechanism, if used to preload an LSM, would not > offer > > the same security guarantees of LSMs integrated in the security > subsystem. > > Also, it is not generic enough to be used for preloading arbitrary eBPF > > programs, unless the bpf_preload code is heavily modified. > > > > More specifically, the security problems are: > > - any program can be pinned to the bpf filesystem without limitations > > (unless a MAC mechanism enforces some restrictions); > > - programs being executed can be terminated at any time by deleting the > > pinned objects or unmounting the bpf filesystem. > > So many things to untangle here. Hi Alexei thanks for taking the time to provide such detailed explanation. > The above paragraphs are misleading and incorrect. > The commit log sounds like there are security issues that this > patch set is fixing. > This is not true. I reiterate the goal: enforce a mandatory policy with an out-of-tree LSM (a kernel module is fine), with the same guarantees of LSMs integrated in the security subsystem. The root user is not part of the TCB (i.e. is untrusted), all the changes that user wants to make must be subject of decision by the LSM enforcing the mandatory policy. I thought about adding support for LSMs from kernel modules via a new built-in LSM (called LoadLSM), but to me it looks that the bpf LSM is closer to achieve the same goal. And in addition, eBPF significantly simplifies with its helpers writing an LSM. > Looks like there is a massive misunderstanding on what bpffs is. > It's a file system to pin and get bpf objects with normal > file access permissions. Nothing else. > Do NOT use it to pin LSM or any other security sensitive bpf programs > and then complain that root can unpin them. > Yes. Root can and should be able to 'rm -rf' anything in bpffs instance. > > > The usability problems are: > > - only a fixed amount of links can be pinned; > > where do you see this limit? static int populate_bpffs(struct dentry *parent) { struct bpf_preload_info objs[BPF_PRELOAD_LINKS] = {}; #define BPF_PRELOAD_LINKS 2 > > - only links can be pinned, other object types are not supported; > > really? progs, maps can be pinned as well. struct bpf_preload_info { char link_name[16]; struct bpf_link *link; }; > > - code to pin objects has to be written manually; > > huh? I meant if you want to extend the bpf_preload kernel module. > > Solve the security problems by mounting the bpf filesystem from the > kernel, > > by preloading authenticated kernel modules (e.g. with > module.sig_enforce) > > and by pinning objects to that filesystem. This particular filesystem > > instance guarantees that desired eBPF programs run until the very end of > > the kernel lifecycle, since even root cannot interfere with it. > > No. Ok. How can the goal I stated above be achieved properly? > I suspect there is huge confusion on what these two "progs.debug" > and "maps.debug" files are in a bpffs instance. > They are debug files to pretty pring loaded maps and progs for folks who > like to use 'cat' to examine the state of the system instead of 'bpftool'. > The root can remove these files from bpffs. > > There is no reason for kernel module to pin its bpf progs. > If you want to develop DIGLIM as a kernel module that uses light skeleton > just do: > #include <linux/init.h> > #include <linux/module.h> > #include "diglim.lskel.h" > > static struct diglim_bpf *skel; > > static int __init load(void) > { > skel = diglim_bpf__open_and_load(); > err = diglim_bpf__attach(skel); > } > /* detach skel in __fini */ > > It's really that short. > > Then you will be able to > - insmod diglim.ko -> will load and attach bpf progs. > - rmmod diglim -> will detach them. root can stop the LSM without consulting the security policy. The goal of having root untrusted is not achieved. Maybe there is another way to prevent unloading the kernel module. I didn't find it yet. If there was an LSM hook called when kernel modules are unloaded, that would be sufficient, I guess. My point was that pinning progs seems to be the recommended way of keeping them running. Pinning them to unreachable inodes intuitively looked the way to go for achieving the stated goal. Or maybe I should just increment the reference count of links and don't decrement during an rmmod? If there is something I'm missing, please let me know. Thanks Roberto HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 Managing Director: Li Peng, Zhong Ronghua
On Thu, Mar 31, 2022 at 08:25:22AM +0000, Roberto Sassu wrote: > > From: Alexei Starovoitov [mailto:alexei.starovoitov@gmail.com] > > Sent: Thursday, March 31, 2022 4:27 AM > > On Mon, Mar 28, 2022 at 07:50:15PM +0200, Roberto Sassu wrote: > > > eBPF already allows programs to be preloaded and kept running without > > > intervention from user space. There is a dedicated kernel module called > > > bpf_preload, which contains the light skeleton of the iterators_bpf eBPF > > > program. If this module is enabled in the kernel configuration, its loading > > > will be triggered when the bpf filesystem is mounted (unless the module is > > > built-in), and the links of iterators_bpf are pinned in that filesystem > > > (they will appear as the progs.debug and maps.debug files). > > > > > > However, the current mechanism, if used to preload an LSM, would not > > offer > > > the same security guarantees of LSMs integrated in the security > > subsystem. > > > Also, it is not generic enough to be used for preloading arbitrary eBPF > > > programs, unless the bpf_preload code is heavily modified. > > > > > > More specifically, the security problems are: > > > - any program can be pinned to the bpf filesystem without limitations > > > (unless a MAC mechanism enforces some restrictions); > > > - programs being executed can be terminated at any time by deleting the > > > pinned objects or unmounting the bpf filesystem. > > > > So many things to untangle here. > > Hi Alexei > > thanks for taking the time to provide such detailed > explanation. > > > The above paragraphs are misleading and incorrect. > > The commit log sounds like there are security issues that this > > patch set is fixing. > > This is not true. > > I reiterate the goal: enforce a mandatory policy with > an out-of-tree LSM (a kernel module is fine), with the > same guarantees of LSMs integrated in the security > subsystem. To make it 100% clear: Any in-kernel feature that benefits out-of-tree module will be rejected. > The root user is not part of the TCB (i.e. is untrusted), > all the changes that user wants to make must be subject > of decision by the LSM enforcing the mandatory policy. > > I thought about adding support for LSMs from kernel > modules via a new built-in LSM (called LoadLSM), but Such approach will be rejected. See above. > > I suspect there is huge confusion on what these two "progs.debug" > > and "maps.debug" files are in a bpffs instance. > > They are debug files to pretty pring loaded maps and progs for folks who > > like to use 'cat' to examine the state of the system instead of 'bpftool'. > > The root can remove these files from bpffs. > > > > There is no reason for kernel module to pin its bpf progs. > > If you want to develop DIGLIM as a kernel module that uses light skeleton > > just do: > > #include <linux/init.h> > > #include <linux/module.h> > > #include "diglim.lskel.h" > > > > static struct diglim_bpf *skel; > > > > static int __init load(void) > > { > > skel = diglim_bpf__open_and_load(); > > err = diglim_bpf__attach(skel); > > } > > /* detach skel in __fini */ > > > > It's really that short. > > > > Then you will be able to > > - insmod diglim.ko -> will load and attach bpf progs. > > - rmmod diglim -> will detach them. > > root can stop the LSM without consulting the security > policy. The goal of having root untrusted is not achieved. Out-of-tree module can do any hack. For example: 1. don't do detach skel in __fini rmmod will remove the module, but bpf progs will keep running. 2. do module_get(THIS_MODULE) in __init rmmod will return EBUSY and have some out-of-band way of dropping mod refcnt. 3. hack into sys_delete_module. if module_name==diglem return EBUSY. 4. add proper LSM hook to delete_module > My point was that pinning progs seems to be the > recommended way of keeping them running. Not quite. bpf_link refcnt is what keeps progs attached. bpffs is mainly used for: - to pass maps/links from one process to another when passing fd is not possible. - to solve the case of crashing user space. The user space agent will restart and will pick up where it's left by reading map, link, prog FDs from bpffs. - pinning bpf iterators that are later used to 'cat' such files. That is what bpf_preload is doing by creating two debug files "maps.debug" and "progs.debug". > Pinning > them to unreachable inodes intuitively looked the > way to go for achieving the stated goal. We can consider inodes in bpffs that are not unlinkable by root in the future, but certainly not for this use case. > Or maybe I > should just increment the reference count of links > and don't decrement during an rmmod? I suggest to abandon out-of-tree goal. Only then we can help and continue this discussion.
On Sat, Apr 2, 2022 at 1:55 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Thu, Mar 31, 2022 at 08:25:22AM +0000, Roberto Sassu wrote: > > > From: Alexei Starovoitov [mailto:alexei.starovoitov@gmail.com] > > > Sent: Thursday, March 31, 2022 4:27 AM > > > On Mon, Mar 28, 2022 at 07:50:15PM +0200, Roberto Sassu wrote: > > > > eBPF already allows programs to be preloaded and kept running without > > > > intervention from user space. There is a dedicated kernel module called > > > > bpf_preload, which contains the light skeleton of the iterators_bpf eBPF > > > > program. If this module is enabled in the kernel configuration, its loading > > > > will be triggered when the bpf filesystem is mounted (unless the module is > > > > built-in), and the links of iterators_bpf are pinned in that filesystem > > > > (they will appear as the progs.debug and maps.debug files). > > > > > > > > However, the current mechanism, if used to preload an LSM, would not > > > offer > > > > the same security guarantees of LSMs integrated in the security > > > subsystem. > > > > Also, it is not generic enough to be used for preloading arbitrary eBPF > > > > programs, unless the bpf_preload code is heavily modified. > > > > > > > > More specifically, the security problems are: > > > > - any program can be pinned to the bpf filesystem without limitations > > > > (unless a MAC mechanism enforces some restrictions); > > > > - programs being executed can be terminated at any time by deleting the > > > > pinned objects or unmounting the bpf filesystem. > > > > > > So many things to untangle here. > > > > Hi Alexei > > > > thanks for taking the time to provide such detailed > > explanation. > > > > > The above paragraphs are misleading and incorrect. > > > The commit log sounds like there are security issues that this > > > patch set is fixing. > > > This is not true. +1 these are not security issues. They are limitations of your MAC policy. > > > > I reiterate the goal: enforce a mandatory policy with > > an out-of-tree LSM (a kernel module is fine), with the > > same guarantees of LSMs integrated in the security > > subsystem. > > To make it 100% clear: > Any in-kernel feature that benefits out-of-tree module will be rejected. > > > The root user is not part of the TCB (i.e. is untrusted), > > all the changes that user wants to make must be subject > > of decision by the LSM enforcing the mandatory policy. > > > > I thought about adding support for LSMs from kernel > > modules via a new built-in LSM (called LoadLSM), but Kernel modules cannot implement LSMs, this has already been proposed on the lists and has been rejected. > > Such approach will be rejected. See above. > > > > I suspect there is huge confusion on what these two "progs.debug" > > > and "maps.debug" files are in a bpffs instance. > > > They are debug files to pretty pring loaded maps and progs for folks who > > > like to use 'cat' to examine the state of the system instead of 'bpftool'. > > > The root can remove these files from bpffs. > > > > > > There is no reason for kernel module to pin its bpf progs. > > > If you want to develop DIGLIM as a kernel module that uses light skeleton > > > just do: > > > #include <linux/init.h> > > > #include <linux/module.h> > > > #include "diglim.lskel.h" > > > > > > static struct diglim_bpf *skel; > > > > > > static int __init load(void) > > > { > > > skel = diglim_bpf__open_and_load(); > > > err = diglim_bpf__attach(skel); > > > } > > > /* detach skel in __fini */ > > > > > > It's really that short. > > > > > > Then you will be able to > > > - insmod diglim.ko -> will load and attach bpf progs. > > > - rmmod diglim -> will detach them. > > > > root can stop the LSM without consulting the security > > policy. The goal of having root untrusted is not achieved. Ofcourse, this is an issue, if you are using BPF to define a MAC policy, the policy needs to be comprehensive to prevent itself from being overridden. This is why We have so many LSM hooks. If you think some are missing, let's add them. This is why implementing a policy is not trivial, but we need to allow users to build such policies with the help from the kernel and not by using out-of-tree modules. I do think we can add some more helpers (e.g. for modifying xattrs from BPF) that would help us build complex policies. > > Out-of-tree module can do any hack. > For example: > 1. don't do detach skel in __fini > rmmod will remove the module, but bpf progs will keep running. > 2. do module_get(THIS_MODULE) in __init > rmmod will return EBUSY > and have some out-of-band way of dropping mod refcnt. > 3. hack into sys_delete_module. if module_name==diglem return EBUSY. > 4. add proper LSM hook to delete_module +1 I recommend this (but not from an out of tree module) > > > My point was that pinning progs seems to be the > > recommended way of keeping them running. > > Not quite. bpf_link refcnt is what keeps progs attached. > bpffs is mainly used for: > - to pass maps/links from one process to another > when passing fd is not possible. > - to solve the case of crashing user space. > The user space agent will restart and will pick up where > it's left by reading map, link, prog FDs from bpffs. > - pinning bpf iterators that are later used to 'cat' such files. > That is what bpf_preload is doing by creating two debug > files "maps.debug" and "progs.debug". > > > Pinning > > them to unreachable inodes intuitively looked the > > way to go for achieving the stated goal. > > We can consider inodes in bpffs that are not unlinkable by root > in the future, but certainly not for this use case. Can this not be already done by adding a BPF_LSM program to the inode_unlink LSM hook? > > > Or maybe I > > should just increment the reference count of links > > and don't decrement during an rmmod? > > I suggest to abandon out-of-tree goal. > Only then we can help and continue this discussion. +1
On Wed, Mar 30, 2022 at 12:44 AM Roberto Sassu <roberto.sassu@huawei.com> wrote: > > > From: Andrii Nakryiko [mailto:andrii.nakryiko@gmail.com] > > Sent: Wednesday, March 30, 2022 1:52 AM > > On Mon, Mar 28, 2022 at 10:52 AM Roberto Sassu > > <roberto.sassu@huawei.com> wrote: > > > > > > The first part of the preload code generation consists in generating the > > > static variables to be used by the code itself: the links and maps to be > > > pinned, and the skeleton. Generation of the preload variables and > > methods > > > is enabled with the option -P added to 'bpftool gen skeleton'. > > > > > > The existing variables maps_link and progs_links in bpf_preload_kern.c > > have > > > been renamed respectively to dump_bpf_map_link and > > dump_bpf_prog_link, to > > > match the name of the variables in the main structure of the light > > > skeleton. > > > > > > Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> > > > --- > > > kernel/bpf/preload/bpf_preload_kern.c | 35 +- > > > kernel/bpf/preload/iterators/Makefile | 2 +- > > > .../bpf/preload/iterators/iterators.lskel.h | 378 +++++++++--------- > > > .../bpf/bpftool/Documentation/bpftool-gen.rst | 5 + > > > tools/bpf/bpftool/bash-completion/bpftool | 2 +- > > > tools/bpf/bpftool/gen.c | 27 ++ > > > tools/bpf/bpftool/main.c | 7 +- > > > tools/bpf/bpftool/main.h | 1 + > > > 8 files changed, 254 insertions(+), 203 deletions(-) > > > > > > > [...] > > > > > +__attribute__((unused)) static void > > > +iterators_bpf__assert(struct iterators_bpf *s) > > > +{ > > > +#ifdef __cplusplus > > > +#define _Static_assert static_assert > > > +#endif > > > +#ifdef __cplusplus > > > +#undef _Static_assert > > > +#endif > > > +} > > > + > > > +static struct bpf_link *dump_bpf_map_link; > > > +static struct bpf_link *dump_bpf_prog_link; > > > +static struct iterators_bpf *skel; > > > > I don't understand what is this and what for? You are making an > > assumption that light skeleton can be instantiated just once, why? And > > adding extra bpftool option to light skeleton codegen just to save a > > bit of typing at the place where light skeleton is actually > > instantiated and used doesn't seems like a right approach. > > True, iterator_bpf is simple. Writing the preloading code > for it is simple. But, what if you wanted to preload an LSM > with 10 hooks or more? I suppose you'd write a straightforward code to do pinning ten times for ten different programs to ten different paths. But with this you don't have to establish a random set of conventions that might not apply in all the situations to anyone that would try to use this feature. Worst case, light skeleton can be extended to provide a way to iterate all programs programmatically. > > Ok, regarding where the preloading code should be, I will > try to move the generated code to the kernel module instead > of the light skeleton. > > Thanks > > Roberto > > HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 > Managing Director: Li Peng, Zhong Ronghua > > > Further, even if this is the way to go, please split out bpftool > > changes from kernel changes. There is nothing requiring them to be > > coupled together. > > > > [...]
On Sun, Apr 3, 2022 at 5:42 PM KP Singh <kpsingh@kernel.org> wrote: > > On Sat, Apr 2, 2022 at 1:55 AM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: ... > > > > > Pinning > > > them to unreachable inodes intuitively looked the > > > way to go for achieving the stated goal. > > > > We can consider inodes in bpffs that are not unlinkable by root > > in the future, but certainly not for this use case. > > Can this not be already done by adding a BPF_LSM program to the > inode_unlink LSM hook? > Also, beside of the inode_unlink... and out of curiosity: making sysfs/bpffs/ readonly after pinning, then using bpf LSM hooks sb_mount|remount|unmount... family combining bpf() LSM hook... isn't this enough to: 1. Restrict who can pin to bpffs without using a full MAC 2. Restrict who can delete or unmount bpf filesystem ?
> From: Djalal Harouni [mailto:tixxdz@gmail.com] > Sent: Monday, April 4, 2022 9:45 AM > On Sun, Apr 3, 2022 at 5:42 PM KP Singh <kpsingh@kernel.org> wrote: > > > > On Sat, Apr 2, 2022 at 1:55 AM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > ... > > > > > > > Pinning > > > > them to unreachable inodes intuitively looked the > > > > way to go for achieving the stated goal. > > > > > > We can consider inodes in bpffs that are not unlinkable by root > > > in the future, but certainly not for this use case. > > > > Can this not be already done by adding a BPF_LSM program to the > > inode_unlink LSM hook? > > > > Also, beside of the inode_unlink... and out of curiosity: making sysfs/bpffs/ > readonly after pinning, then using bpf LSM hooks > sb_mount|remount|unmount... > family combining bpf() LSM hook... isn't this enough to: > 1. Restrict who can pin to bpffs without using a full MAC > 2. Restrict who can delete or unmount bpf filesystem > > ? I'm thinking to implement something like this. First, I add a new program flag called BPF_F_STOP_ONCONFIRM, which causes the ref count of the link to increase twice at creation time. In this way, user space cannot make the link disappear, unless a confirmation is explicitly sent via the bpf() system call. Another advantage is that other LSMs can decide whether or not they allow a program with this flag (in the bpf security hook). This would work regardless of the method used to load the eBPF program (user space or kernel space). Second, I extend the bpf() system call with a new subcommand, BPF_LINK_CONFIRM_STOP, which decreases the ref count for the link of the programs with the BPF_F_STOP_ONCONFIRM flag. I will also introduce a new security hook (something like security_link_confirm_stop), so that an LSM has the opportunity to deny the stop (the bpf security hook would not be sufficient to determine exactly for which link the confirmation is given, an LSM should be able to deny the stop for its own programs). What do you think? Thanks Roberto HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 Managing Director: Li Peng, Zhong Ronghua
> From: KP Singh [mailto:kpsingh@kernel.org] > Sent: Saturday, April 2, 2022 3:03 AM > On Sat, Apr 2, 2022 at 1:55 AM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Thu, Mar 31, 2022 at 08:25:22AM +0000, Roberto Sassu wrote: > > > > From: Alexei Starovoitov [mailto:alexei.starovoitov@gmail.com] > > > > Sent: Thursday, March 31, 2022 4:27 AM > > > > On Mon, Mar 28, 2022 at 07:50:15PM +0200, Roberto Sassu wrote: > > > > > eBPF already allows programs to be preloaded and kept running > without > > > > > intervention from user space. There is a dedicated kernel module > called > > > > > bpf_preload, which contains the light skeleton of the iterators_bpf > eBPF > > > > > program. If this module is enabled in the kernel configuration, its > loading > > > > > will be triggered when the bpf filesystem is mounted (unless the > module is > > > > > built-in), and the links of iterators_bpf are pinned in that filesystem > > > > > (they will appear as the progs.debug and maps.debug files). > > > > > > > > > > However, the current mechanism, if used to preload an LSM, would > not > > > > offer > > > > > the same security guarantees of LSMs integrated in the security > > > > subsystem. > > > > > Also, it is not generic enough to be used for preloading arbitrary eBPF > > > > > programs, unless the bpf_preload code is heavily modified. > > > > > > > > > > More specifically, the security problems are: > > > > > - any program can be pinned to the bpf filesystem without limitations > > > > > (unless a MAC mechanism enforces some restrictions); > > > > > - programs being executed can be terminated at any time by deleting > the > > > > > pinned objects or unmounting the bpf filesystem. > > > > > > > > So many things to untangle here. > > > > > > Hi Alexei > > > > > > thanks for taking the time to provide such detailed > > > explanation. > > > > > > > The above paragraphs are misleading and incorrect. > > > > The commit log sounds like there are security issues that this > > > > patch set is fixing. > > > > This is not true. > > +1 these are not security issues. They are limitations of your MAC policy. > > > > > > > I reiterate the goal: enforce a mandatory policy with > > > an out-of-tree LSM (a kernel module is fine), with the > > > same guarantees of LSMs integrated in the security > > > subsystem. > > > > To make it 100% clear: > > Any in-kernel feature that benefits out-of-tree module will be rejected. > > > > > The root user is not part of the TCB (i.e. is untrusted), > > > all the changes that user wants to make must be subject > > > of decision by the LSM enforcing the mandatory policy. > > > > > > I thought about adding support for LSMs from kernel > > > modules via a new built-in LSM (called LoadLSM), but > > Kernel modules cannot implement LSMs, this has already been > proposed on the lists and has been rejected. Looking at commit cb80ddc67152 ("bpf: Convert bpf_preload.ko to use light skeleton."), I got that it is the most efficient way to load an eBPF program (does not even require libbpf). Another advantage was that we get integrity verification from the module infrastructure. This would have been the optimal solution in terms of dependencies. Enforcing integrity could be turned on with the module.sig_enforce kernel option. If we switch to user space, the choice would be IMA. However, in my use case (DIGLIM) it would be used just for the purpose of doing integrity verifications pre-init. Thinking which policy could be implemented for such purpose, maybe something like appraise every process that is not linked to an executable? And since there are no xattrs in the initial ram disk, could I append a module signature to an ELF binary? Thanks Roberto HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 Managing Director: Li Peng, Zhong Ronghua > > Such approach will be rejected. See above. > > > > > > I suspect there is huge confusion on what these two "progs.debug" > > > > and "maps.debug" files are in a bpffs instance. > > > > They are debug files to pretty pring loaded maps and progs for folks > who > > > > like to use 'cat' to examine the state of the system instead of 'bpftool'. > > > > The root can remove these files from bpffs. > > > > > > > > There is no reason for kernel module to pin its bpf progs. > > > > If you want to develop DIGLIM as a kernel module that uses light > skeleton > > > > just do: > > > > #include <linux/init.h> > > > > #include <linux/module.h> > > > > #include "diglim.lskel.h" > > > > > > > > static struct diglim_bpf *skel; > > > > > > > > static int __init load(void) > > > > { > > > > skel = diglim_bpf__open_and_load(); > > > > err = diglim_bpf__attach(skel); > > > > } > > > > /* detach skel in __fini */ > > > > > > > > It's really that short. > > > > > > > > Then you will be able to > > > > - insmod diglim.ko -> will load and attach bpf progs. > > > > - rmmod diglim -> will detach them. > > > > > > root can stop the LSM without consulting the security > > > policy. The goal of having root untrusted is not achieved. > > Ofcourse, this is an issue, if you are using BPF to define a MAC > policy, the policy > needs to be comprehensive to prevent itself from being overridden. This is > why > We have so many LSM hooks. If you think some are missing, let's add them. > > This is why implementing a policy is not trivial, but we need to allow > users to build > such policies with the help from the kernel and not by using > out-of-tree modules. > > I do think we can add some more helpers (e.g. for modifying xattrs > from BPF) that > would help us build complex policies. > > > > > Out-of-tree module can do any hack. > > For example: > > 1. don't do detach skel in __fini > > rmmod will remove the module, but bpf progs will keep running. > > 2. do module_get(THIS_MODULE) in __init > > rmmod will return EBUSY > > and have some out-of-band way of dropping mod refcnt. > > 3. hack into sys_delete_module. if module_name==diglem return EBUSY. > > 4. add proper LSM hook to delete_module > > +1 I recommend this (but not from an out of tree module) > > > > > > My point was that pinning progs seems to be the > > > recommended way of keeping them running. > > > > Not quite. bpf_link refcnt is what keeps progs attached. > > bpffs is mainly used for: > > - to pass maps/links from one process to another > > when passing fd is not possible. > > - to solve the case of crashing user space. > > The user space agent will restart and will pick up where > > it's left by reading map, link, prog FDs from bpffs. > > - pinning bpf iterators that are later used to 'cat' such files. > > That is what bpf_preload is doing by creating two debug > > files "maps.debug" and "progs.debug". > > > > > Pinning > > > them to unreachable inodes intuitively looked the > > > way to go for achieving the stated goal. > > > > We can consider inodes in bpffs that are not unlinkable by root > > in the future, but certainly not for this use case. > > Can this not be already done by adding a BPF_LSM program to the > inode_unlink LSM hook? > > > > > > Or maybe I > > > should just increment the reference count of links > > > and don't decrement during an rmmod? > > > > I suggest to abandon out-of-tree goal. > > Only then we can help and continue this discussion. > > +1
On Mon, Apr 4, 2022 at 10:21 AM Roberto Sassu <roberto.sassu@huawei.com> wrote: > > > From: Djalal Harouni [mailto:tixxdz@gmail.com] > > Sent: Monday, April 4, 2022 9:45 AM > > On Sun, Apr 3, 2022 at 5:42 PM KP Singh <kpsingh@kernel.org> wrote: > > > > > > On Sat, Apr 2, 2022 at 1:55 AM Alexei Starovoitov > > > <alexei.starovoitov@gmail.com> wrote: > > ... > > > > > > > > > Pinning > > > > > them to unreachable inodes intuitively looked the > > > > > way to go for achieving the stated goal. > > > > > > > > We can consider inodes in bpffs that are not unlinkable by root > > > > in the future, but certainly not for this use case. > > > > > > Can this not be already done by adding a BPF_LSM program to the > > > inode_unlink LSM hook? > > > > > > > Also, beside of the inode_unlink... and out of curiosity: making sysfs/bpffs/ > > readonly after pinning, then using bpf LSM hooks > > sb_mount|remount|unmount... > > family combining bpf() LSM hook... isn't this enough to: > > 1. Restrict who can pin to bpffs without using a full MAC > > 2. Restrict who can delete or unmount bpf filesystem > > > > ? > > I'm thinking to implement something like this. > > First, I add a new program flag called > BPF_F_STOP_ONCONFIRM, which causes the ref count > of the link to increase twice at creation time. In this way, > user space cannot make the link disappear, unless a > confirmation is explicitly sent via the bpf() system call. > > Another advantage is that other LSMs can decide > whether or not they allow a program with this flag > (in the bpf security hook). > > This would work regardless of the method used to > load the eBPF program (user space or kernel space). > > Second, I extend the bpf() system call with a new > subcommand, BPF_LINK_CONFIRM_STOP, which > decreases the ref count for the link of the programs > with the BPF_F_STOP_ONCONFIRM flag. I will also > introduce a new security hook (something like > security_link_confirm_stop), so that an LSM has the > opportunity to deny the stop (the bpf security hook > would not be sufficient to determine exactly for > which link the confirmation is given, an LSM should > be able to deny the stop for its own programs). > > What do you think? Hack upon a hack? Makes no sense.
On Tue, Apr 5, 2022 at 12:49 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Mon, Apr 4, 2022 at 10:21 AM Roberto Sassu <roberto.sassu@huawei.com> wrote: > > > > > From: Djalal Harouni [mailto:tixxdz@gmail.com] > > > Sent: Monday, April 4, 2022 9:45 AM > > > On Sun, Apr 3, 2022 at 5:42 PM KP Singh <kpsingh@kernel.org> wrote: > > > > > > > > On Sat, Apr 2, 2022 at 1:55 AM Alexei Starovoitov > > > > <alexei.starovoitov@gmail.com> wrote: > > > ... > > > > > > > > > > > Pinning > > > > > > them to unreachable inodes intuitively looked the > > > > > > way to go for achieving the stated goal. > > > > > > > > > > We can consider inodes in bpffs that are not unlinkable by root > > > > > in the future, but certainly not for this use case. > > > > > > > > Can this not be already done by adding a BPF_LSM program to the > > > > inode_unlink LSM hook? > > > > > > > > > > Also, beside of the inode_unlink... and out of curiosity: making sysfs/bpffs/ > > > readonly after pinning, then using bpf LSM hooks > > > sb_mount|remount|unmount... > > > family combining bpf() LSM hook... isn't this enough to: > > > 1. Restrict who can pin to bpffs without using a full MAC > > > 2. Restrict who can delete or unmount bpf filesystem > > > I like this approach better, you will have to restrict the BPF, if you want to implement MAC policy using BPF. Can you please try implementing something using these hooks? > > > ? > > > > I'm thinking to implement something like this. > > > > First, I add a new program flag called > > BPF_F_STOP_ONCONFIRM, which causes the ref count > > of the link to increase twice at creation time. In this way, > > user space cannot make the link disappear, unless a > > confirmation is explicitly sent via the bpf() system call. I don't like this approach, this just sounds like an intentional dangling reference, prone to refcounting errors and it does not really solve the purpose you want to achieve. And you will still need a policy around the BPF syscall, so why not just use the LSM hooks as suggested above? > > > > Another advantage is that other LSMs can decide > > whether or not they allow a program with this flag > > (in the bpf security hook). > > > > This would work regardless of the method used to > > load the eBPF program (user space or kernel space). > > > > Second, I extend the bpf() system call with a new > > subcommand, BPF_LINK_CONFIRM_STOP, which > > decreases the ref count for the link of the programs > > with the BPF_F_STOP_ONCONFIRM flag. I will also > > introduce a new security hook (something like > > security_link_confirm_stop), so that an LSM has the > > opportunity to deny the stop (the bpf security hook > > would not be sufficient to determine exactly for > > which link the confirmation is given, an LSM should > > be able to deny the stop for its own programs). > > > > What do you think? > > Hack upon a hack? Makes no sense.
On 4/4/2022 10:20 AM, Roberto Sassu wrote: >> From: Djalal Harouni [mailto:tixxdz@gmail.com] >> Sent: Monday, April 4, 2022 9:45 AM >> On Sun, Apr 3, 2022 at 5:42 PM KP Singh <kpsingh@kernel.org> wrote: >>> On Sat, Apr 2, 2022 at 1:55 AM Alexei Starovoitov >>> <alexei.starovoitov@gmail.com> wrote: >> ... >>>>> Pinning >>>>> them to unreachable inodes intuitively looked the >>>>> way to go for achieving the stated goal. >>>> We can consider inodes in bpffs that are not unlinkable by root >>>> in the future, but certainly not for this use case. >>> Can this not be already done by adding a BPF_LSM program to the >>> inode_unlink LSM hook? >>> >> Also, beside of the inode_unlink... and out of curiosity: making sysfs/bpffs/ >> readonly after pinning, then using bpf LSM hooks >> sb_mount|remount|unmount... >> family combining bpf() LSM hook... isn't this enough to: >> 1. Restrict who can pin to bpffs without using a full MAC >> 2. Restrict who can delete or unmount bpf filesystem >> >> ? > I'm thinking to implement something like this. > > First, I add a new program flag called > BPF_F_STOP_ONCONFIRM, which causes the ref count > of the link to increase twice at creation time. In this way, > user space cannot make the link disappear, unless a > confirmation is explicitly sent via the bpf() system call. > > Another advantage is that other LSMs can decide > whether or not they allow a program with this flag > (in the bpf security hook). > > This would work regardless of the method used to > load the eBPF program (user space or kernel space). > > Second, I extend the bpf() system call with a new > subcommand, BPF_LINK_CONFIRM_STOP, which > decreasres the ref count for the link of the programs > with the BPF_F_STOP_ONCONFIRM flag. I will also > introduce a new security hook (something like > security_link_confirm_stop), so that an LSM has the > opportunity to deny the stop (the bpf security hook > would not be sufficient to determine exactly for > which link the confirmation is given, an LSM should > be able to deny the stop for its own programs). Would you please stop referring to a set of eBPF programs loaded into the BPF LSM as an LSM? Call it a BPF security module (BSM) if you must use an abbreviation. An LSM is a provider of security_ hooks. In your case that is BPF. When you call the set of eBPF programs an LSM it is like calling an SELinux policy an LSM. > > What do you think? > > Thanks > > Roberto > > HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 > Managing Director: Li Peng, Zhong Ronghua
> From: Casey Schaufler [mailto:casey@schaufler-ca.com] > Sent: Tuesday, April 5, 2022 4:50 PM > On 4/4/2022 10:20 AM, Roberto Sassu wrote: > >> From: Djalal Harouni [mailto:tixxdz@gmail.com] > >> Sent: Monday, April 4, 2022 9:45 AM > >> On Sun, Apr 3, 2022 at 5:42 PM KP Singh <kpsingh@kernel.org> wrote: > >>> On Sat, Apr 2, 2022 at 1:55 AM Alexei Starovoitov > >>> <alexei.starovoitov@gmail.com> wrote: > >> ... > >>>>> Pinning > >>>>> them to unreachable inodes intuitively looked the > >>>>> way to go for achieving the stated goal. > >>>> We can consider inodes in bpffs that are not unlinkable by root > >>>> in the future, but certainly not for this use case. > >>> Can this not be already done by adding a BPF_LSM program to the > >>> inode_unlink LSM hook? > >>> > >> Also, beside of the inode_unlink... and out of curiosity: making > sysfs/bpffs/ > >> readonly after pinning, then using bpf LSM hooks > >> sb_mount|remount|unmount... > >> family combining bpf() LSM hook... isn't this enough to: > >> 1. Restrict who can pin to bpffs without using a full MAC > >> 2. Restrict who can delete or unmount bpf filesystem > >> > >> ? > > I'm thinking to implement something like this. > > > > First, I add a new program flag called > > BPF_F_STOP_ONCONFIRM, which causes the ref count > > of the link to increase twice at creation time. In this way, > > user space cannot make the link disappear, unless a > > confirmation is explicitly sent via the bpf() system call. > > > > Another advantage is that other LSMs can decide > > whether or not they allow a program with this flag > > (in the bpf security hook). > > > > This would work regardless of the method used to > > load the eBPF program (user space or kernel space). > > > > Second, I extend the bpf() system call with a new > > subcommand, BPF_LINK_CONFIRM_STOP, which > > decreasres the ref count for the link of the programs > > with the BPF_F_STOP_ONCONFIRM flag. I will also > > introduce a new security hook (something like > > security_link_confirm_stop), so that an LSM has the > > opportunity to deny the stop (the bpf security hook > > would not be sufficient to determine exactly for > > which link the confirmation is given, an LSM should > > be able to deny the stop for its own programs). > > Would you please stop referring to a set of eBPF programs > loaded into the BPF LSM as an LSM? Call it a BPF security > module (BSM) if you must use an abbreviation. An LSM is a > provider of security_ hooks. In your case that is BPF. When > you call the set of eBPF programs an LSM it is like calling > an SELinux policy an LSM. An eBPF program could be a provider of security_ hooks too. The bpf LSM is an aggregator, similarly to your infrastructure to manage built-in LSMs. Maybe, calling it second-level LSM or secondary LSM would better represent this new class. The only differences are the registration method, (SEC directive instead of DEFINE_LSM), and what the hook implementation can access. The implementation of a security_ hook via eBPF can follow the same structure of built-in LSMs, i.e. it can be uniquely responsible for enforcing and be policy-agnostic, and can retrieve the decisions based on a policy from a component implemented somewhere else. Hopefully, I understood the basic principles correctly. I let the eBPF maintainers comment on this. Thanks Roberto HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 Managing Director: Li Peng, Zhong Ronghua > > What do you think? > > > > Thanks > > > > Roberto > > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 > > Managing Director: Li Peng, Zhong Ronghua
On 4/5/2022 8:29 AM, Roberto Sassu wrote: >> From: Casey Schaufler [mailto:casey@schaufler-ca.com] >> Sent: Tuesday, April 5, 2022 4:50 PM >> On 4/4/2022 10:20 AM, Roberto Sassu wrote: >>>> From: Djalal Harouni [mailto:tixxdz@gmail.com] >>>> Sent: Monday, April 4, 2022 9:45 AM >>>> On Sun, Apr 3, 2022 at 5:42 PM KP Singh <kpsingh@kernel.org> wrote: >>>>> On Sat, Apr 2, 2022 at 1:55 AM Alexei Starovoitov >>>>> <alexei.starovoitov@gmail.com> wrote: >>>> ... >>>>>>> Pinning >>>>>>> them to unreachable inodes intuitively looked the >>>>>>> way to go for achieving the stated goal. >>>>>> We can consider inodes in bpffs that are not unlinkable by root >>>>>> in the future, but certainly not for this use case. >>>>> Can this not be already done by adding a BPF_LSM program to the >>>>> inode_unlink LSM hook? >>>>> >>>> Also, beside of the inode_unlink... and out of curiosity: making >> sysfs/bpffs/ >>>> readonly after pinning, then using bpf LSM hooks >>>> sb_mount|remount|unmount... >>>> family combining bpf() LSM hook... isn't this enough to: >>>> 1. Restrict who can pin to bpffs without using a full MAC >>>> 2. Restrict who can delete or unmount bpf filesystem >>>> >>>> ? >>> I'm thinking to implement something like this. >>> >>> First, I add a new program flag called >>> BPF_F_STOP_ONCONFIRM, which causes the ref count >>> of the link to increase twice at creation time. In this way, >>> user space cannot make the link disappear, unless a >>> confirmation is explicitly sent via the bpf() system call. >>> >>> Another advantage is that other LSMs can decide >>> whether or not they allow a program with this flag >>> (in the bpf security hook). >>> >>> This would work regardless of the method used to >>> load the eBPF program (user space or kernel space). >>> >>> Second, I extend the bpf() system call with a new >>> subcommand, BPF_LINK_CONFIRM_STOP, which >>> decreasres the ref count for the link of the programs >>> with the BPF_F_STOP_ONCONFIRM flag. I will also >>> introduce a new security hook (something like >>> security_link_confirm_stop), so that an LSM has the >>> opportunity to deny the stop (the bpf security hook >>> would not be sufficient to determine exactly for >>> which link the confirmation is given, an LSM should >>> be able to deny the stop for its own programs). >> Would you please stop referring to a set of eBPF programs >> loaded into the BPF LSM as an LSM? Call it a BPF security >> module (BSM) if you must use an abbreviation. An LSM is a >> provider of security_ hooks. In your case that is BPF. When >> you call the set of eBPF programs an LSM it is like calling >> an SELinux policy an LSM. > An eBPF program could be a provider of security_ hooks > too. No, it can't. If I look in /sys/kernel/security/lsm what you see is "bpf". The LSM is BPF. What BPF does in its hooks is up to it and its responsibility. > The bpf LSM is an aggregator, similarly to your > infrastructure to manage built-in LSMs. Maybe, calling > it second-level LSM or secondary LSM would better > represent this new class. It isn't an LSM, and adding a qualifier doesn't make it one and only adds to the confusion. > The only differences are the registration method, (SEC > directive instead of DEFINE_LSM), and what the hook > implementation can access. Those two things pretty well define what an LSM is. > The implementation of a security_ hook via eBPF can > follow the same structure of built-in LSMs, i.e. it can be > uniquely responsible for enforcing and be policy-agnostic, > and can retrieve the decisions based on a policy from a > component implemented somewhere else. The BPF LSM provides mechanism. The eBPF programs provide policy. > > Hopefully, I understood the basic principles correctly. > I let the eBPF maintainers comment on this. > > Thanks > > Roberto > > HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 > Managing Director: Li Peng, Zhong Ronghua > >>> What do you think? >>> >>> Thanks >>> >>> Roberto >>> >>> HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 >>> Managing Director: Li Peng, Zhong Ronghua
On Tue, Apr 5, 2022 at 6:22 PM Casey Schaufler <casey@schaufler-ca.com> wrote: > > On 4/5/2022 8:29 AM, Roberto Sassu wrote: > >> From: Casey Schaufler [mailto:casey@schaufler-ca.com] > >> Sent: Tuesday, April 5, 2022 4:50 PM > >> On 4/4/2022 10:20 AM, Roberto Sassu wrote: > >>>> From: Djalal Harouni [mailto:tixxdz@gmail.com] > >>>> Sent: Monday, April 4, 2022 9:45 AM > >>>> On Sun, Apr 3, 2022 at 5:42 PM KP Singh <kpsingh@kernel.org> wrote: > >>>>> On Sat, Apr 2, 2022 at 1:55 AM Alexei Starovoitov > >>>>> <alexei.starovoitov@gmail.com> wrote: > >>>> ... > >>>>>>> Pinning > >>>>>>> them to unreachable inodes intuitively looked the > >>>>>>> way to go for achieving the stated goal. > >>>>>> We can consider inodes in bpffs that are not unlinkable by root > >>>>>> in the future, but certainly not for this use case. > >>>>> Can this not be already done by adding a BPF_LSM program to the > >>>>> inode_unlink LSM hook? > >>>>> > >>>> Also, beside of the inode_unlink... and out of curiosity: making > >> sysfs/bpffs/ > >>>> readonly after pinning, then using bpf LSM hooks > >>>> sb_mount|remount|unmount... > >>>> family combining bpf() LSM hook... isn't this enough to: > >>>> 1. Restrict who can pin to bpffs without using a full MAC > >>>> 2. Restrict who can delete or unmount bpf filesystem > >>>> > >>>> ? > >>> I'm thinking to implement something like this. > >>> > >>> First, I add a new program flag called > >>> BPF_F_STOP_ONCONFIRM, which causes the ref count > >>> of the link to increase twice at creation time. In this way, > >>> user space cannot make the link disappear, unless a > >>> confirmation is explicitly sent via the bpf() system call. > >>> > >>> Another advantage is that other LSMs can decide > >>> whether or not they allow a program with this flag > >>> (in the bpf security hook). > >>> > >>> This would work regardless of the method used to > >>> load the eBPF program (user space or kernel space). > >>> > >>> Second, I extend the bpf() system call with a new > >>> subcommand, BPF_LINK_CONFIRM_STOP, which > >>> decreasres the ref count for the link of the programs > >>> with the BPF_F_STOP_ONCONFIRM flag. I will also > >>> introduce a new security hook (something like > >>> security_link_confirm_stop), so that an LSM has the > >>> opportunity to deny the stop (the bpf security hook > >>> would not be sufficient to determine exactly for > >>> which link the confirmation is given, an LSM should > >>> be able to deny the stop for its own programs). > >> Would you please stop referring to a set of eBPF programs > >> loaded into the BPF LSM as an LSM? Call it a BPF security > >> module (BSM) if you must use an abbreviation. An LSM is a > >> provider of security_ hooks. In your case that is BPF. When > >> you call the set of eBPF programs an LSM it is like calling > >> an SELinux policy an LSM. > > An eBPF program could be a provider of security_ hooks > > too. > > No, it can't. If I look in /sys/kernel/security/lsm what > you see is "bpf". The LSM is BPF. What BPF does in its > hooks is up to it and its responsibility. > > > The bpf LSM is an aggregator, similarly to your > > infrastructure to manage built-in LSMs. Maybe, calling > > it second-level LSM or secondary LSM would better > > represent this new class. > > It isn't an LSM, and adding a qualifier doesn't make it > one and only adds to the confusion. > > > The only differences are the registration method, (SEC > > directive instead of DEFINE_LSM), and what the hook > > implementation can access. > > Those two things pretty well define what an LSM is. > > > The implementation of a security_ hook via eBPF can > > follow the same structure of built-in LSMs, i.e. it can be > > uniquely responsible for enforcing and be policy-agnostic, > > and can retrieve the decisions based on a policy from a > > component implemented somewhere else. > > The BPF LSM provides mechanism. The eBPF programs provide policy. Yeah, let's stick what we call an LSM in the kernel, Here, "bpf" is the LSM like selinux,apparmor and this is what you set in CONFIG_LSM or pass on cmdline as lsm= and can be seen in /sys/kernel/security/lsm Calling your BPF programs an LSM will just confuse people. > > > > > Hopefully, I understood the basic principles correctly. > > I let the eBPF maintainers comment on this. > > > > Thanks > > > > Roberto > > > > HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 > > Managing Director: Li Peng, Zhong Ronghua > > > >>> What do you think? > >>> > >>> Thanks > >>> > >>> Roberto > >>> > >>> HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063 > >>> Managing Director: Li Peng, Zhong Ronghua