Message ID | 20230630183908.32148-1-mkoutny@suse.com |
---|---|
Headers | show |
Series | cpuset: Allow setscheduler regardless of manipulated task | expand |
On 6/30/23 14:39, Michal Koutný wrote: > When we migrate a task between two cgroups, one of the checks is a > verification whether we can modify task's scheduler settings > (cap_task_setscheduler()). > > An implicit migration occurs also when enabling a controller on the > unified hierarchy (think of parent to child migration). The > aforementioned check may be problematic if the caller of the migration > (enabling a controller) has no permissions over migrated tasks. > For instance, a user's cgroup that ends up running a process of a > different user. Although cgroup permissions are configured favorably, > the enablement fails due to the foreign process [1]. > > Change the behavior by relaxing the permissions check on the unified > hierarchy (or in v2 mode). This is in accordance with unified hierarchy > attachment behavior when permissions of the source to target cgroups are > decisive whereas the migrated task is opaque (as opposed to more > restrictive check in __cgroup1_procs_write()). > > [1] https://github.com/systemd/systemd/issues/18293#issuecomment-831205649 > > Signed-off-by: Michal Koutný <mkoutny@suse.com> > --- > kernel/cgroup/cpuset.c | 13 ++++++++++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c > index 58e6f18f01c1..41d3ed14b0f4 100644 > --- a/kernel/cgroup/cpuset.c > +++ b/kernel/cgroup/cpuset.c > @@ -2505,9 +2505,16 @@ static int cpuset_can_attach(struct cgroup_taskset *tset) > ret = task_can_attach(task); > if (ret) > goto out_unlock; > - ret = security_task_setscheduler(task); > - if (ret) > - goto out_unlock; > + > + /* > + * Skip rights over task check in v2, migration permission derives > + * from hierarchy ownership in cgroup_procs_write_permission()). > + */ > + if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys)) { > + ret = security_task_setscheduler(task); > + if (ret) > + goto out_unlock; > + } I am somewhat hesitant to skip the security_task_setscheduler() check for all cgroup v2 task migrations. The check is controlled by SElinux which is a different subsystem. I believe the scheduler property here refer's to the task cpu affinity and node mask. If you look at cpuset_attach(), we have actually skipped the task iteration process to change them if cpu affinity and node mask aren't changed at all. I don't want to introduce a possible security vulnerability because of this relaxation. I would suggest you skip it under the same condition of no change to cpu affinity and node mask for cgroup v2. Thanks, Longman